How would you go about finding the value of \sqrt{3} if you didn’t have a square root button on your calculator? Well, the most obvious thing might be to try some values, based on your knowledge of the square root function. You are being asked to find that x for which:




or, in other words, that x, which, when squared  gives 3. We have to be a little careful here because we know that there will actually be two numbers which satisfy this (one positive, one negative), and we are interested in the positive one only.


So, we try some values, but we don’t do it randomly, we can see that because 1^2=1 and 2^2=4 that whatever number squared gives 3 must be between 1 and 2. We can try something called a binary intersection. This just means taking the values which we know bound the right answer (ie. we know that 1<x<2), and trying the number in the middle. Well, that number is 1.5, and 1.5^2=2.25 which is less than 3, so 1.5<x<2. Find the number in the middle again: 1.75^2=3.0625. So now we know that 1.5<x<1.75. Again, the number in the middle 1.625^2=2.64063, so 1.625<x<1.75. Once more…1.6875^2=2.84766.


One thing to note here is that we actually got pretty close with 1.75 and we could have just lowered it a little to get a better answer, but it looks like this method is taking quite a few steps to get a very accurate answer. In fact, to get an answer correct to three decimal places using this method would take some 17 steps, which isn’t very efficient.


Note that here we are taking  a function which we could easily calculate using a calculator, because we know how to take square roots. However, what we are really saying here will be applicable to functions which aren’t as easy to invert as f(x)=x^2 is. This is really what we are doing. Although we spoke about the square root to start with, all of the calculations above are actually squaring numbers. Squaring is easily inverted, other functions may not be.


Ok, so the conclusion of the above is that this method is not very efficient. What else can we do? Well, we will start by rephrasing our question. So far we are trying to calculate the x for which x^2=3. We could also write this as the x for which x^2-3=0. Let’s give this function x^2-3 a name and call it y(x). So we have some function y(x) and we want to know when it is equal to 0. Well, that sounds very much like a question about when the graph of y(x) cuts the x-axis. Let’s plot this function, but let’s plot it by ‘hand’ – because we don’t yet know when it cuts through the x-axis.




Note that we’ve implemented the xkcd Mathematica style using the code here:


ok, so we’re going to start with a guess of the value at which it cuts the axis, and we’re going to guess x=2. We draw a line up to the graph at y(2)=2^2-3=1 and then we do something a bit strange. We now make an approximation of the graph. We are going to say, what if our function wasn’t this square function, but was just the line given by the tangent to the graph at the point (2,1). If that were the case, where would it cut the axis? Let’s zoom in and take a look…



The yellow line in the above is the tangent line. Well, without doing any squaring we can actually calculate exactly where this line crosses the axis. Let’s see what we need for this:

  1. What’s the gradient of our function at our first guess point (ie. at x=2)?:
  2. The derivative of our function is y'(x)=2x, so the gradient is 4.
  3. The tangent line is thus (using y-y_1=m(x-x_1), or your favourite straight line formula): y=4(x-2)+1=4x-7
  4. Where does this cross the x-axis?
  5. Set y=0: 0=4x-7, which gives x=\frac{7}{4} (which is actually 1.75).

This point (the point at which the yellow line above crosses the x-axis is already a pretty good approximation of \sqrt{3} which is actually what we’re trying to find. We can do better though. Let’s try this one more time. But let’s take this new point x=\frac{7}{4} as our starting guess. Repeating the above:


  1. What’s the gradient of our function at this new point (ie. at x=\frac{7}{4})?:
  2. The derivative of our function is the same as before. ie y'(x)=2x, so the gradient is \frac{7}{2}.
  3. The tangent line is thus y=\frac{7}{2}\left(x-\frac{7}{4}\right)+\left(\frac{7}{4}\right)^2-3=\frac{7x}{2}-\frac{97}{16}
  4. Where does this cross the x-axis?
  5. Set y=0: 0=\frac{7x}{2}-\frac{97}{16}, which gives x=\frac{97}{56}


In fact already, after just two steps our answer is correct to three decimal places. That’s pretty remarkable, and what’s even more remarkable is that in this case we didn’t even need to square anything!


So, let’s look through what we did. There were really two main processes going on. First we were asked to find the square root of 3. We converted this question into a question about when a given function was equal to 0.


  1. Trying to find x=\sqrt{3}.
  2. This is the same as solving x^2=3 (at least the same as finding the positive solution).
  3. Which is the same as solving x^2-3=0.
  4. Which is the same as asking when the function y(x)=x^2-3 is equal to zero.
  5. Which is the same as asking when the graph of the function y(x)=x^2-3 goes through the x-axis.


Now, this latter question is answered in a number of steps:


  1. Make a guess at the root of our function (ie. the value of x for which the function gives zero).
  2. Find the tangent to the function at our guess value of x.
  3. See when this tangent crosses the x-axis.
  4. Use this value as our next guess.
  5. Find the tangent line to our original function at this new value.
  6. See when this tangent crosses the x-axis.


Ok, so can we write this down in a somewhat more general way? Let’s say that we have some function f(x) and we want to know when it is equal to 0, without inverting it, which is often very difficult. Let’s say that our first guess is x_0. The tangent line to f(x) at this point is given by:




This cuts the x-axis when: 0=f'(x_0)(x-x_0)+f(x_0). Call this value x_1:




Now the next best approximation is going to be (because we could have had x_1 as our first guess had we happened to chose that):




This is then an iterative procedure. In general, given the i^{th} term, to find the (i+1)^{th} term, we just have:




We can see that what we’re doing is changing our guess each time by a fraction which is related to the value of the function and its gradient at the previous guess.


Note that in this we didn’t specifically talk about finding a square root of a number. This is pretty general. Let’s take a more complicated example.


We could certainly find more roots of other numbers, ie. what is 3.2^\frac{1}{5} by finding the zero of the function y(x)=x^5-3.2 but we’re going to look at a more interesting application here.


What if we want to find the roots of e^x-5 x (ie. the values for which this gives zero, or equivalently for which its graph cuts the x-axis). It turns out that for a function like this we cannot invert it (as we could with finding the root of x^2, or sin x for instance). We’re going to find the root of this function without using any inverse functions at all.


Well, to start with we can find that this function has a turning point (by calculating first and second derivatives), and that it must have two roots (it’s continuous, has one minimum which is a global minimum, below zero, and asymptotes at \pm\infty to \infty). It thus has two roots. We start with a guess for the first root at x=5. Let’s blindly use Newton-Raphson and see what happens. Well if our first guess is x_0=5, then:




Now we use this as our next guess:



Repeating this another four times gets us to x=2.54266. And when we plug this into our original function, we get 0.00011. Which is pretty close to zero.


Note that what we’ve done here is not do away with the calculator, but we’ve managed to get a very good approximation to an equation which is not solvable exactly. As an exercise, try and find the other root of this function.


Note that the initial guess you make for your function is very important, and there are some functions which will simply not work using this method naively. Try finding the root of \sqrt{x} using this method and see what happens.


If you are a programmer, see what the shortest piece of code is which could implement the Newton-Raphson method.

How clear is this post?