TANGENT LINES AND CONVERGENCE OF ITERATION PROCEDURES

The problem of computing the slope of tangent line to a planar curve -- say the graph of a function -- at point on this curve is the central problem of the differential calculus. It is not so obvious at the outset why this is an important problem to be able to solve. But it is important because it leads to the solution of a whole host of other problems which may at first seem unrelated. You will see this over and over again in your study of the calculus.

In this section of the course notes, we will focus on one example which serves as a convenient context in which to review some basics about working with functions and their graphs.

The problem we will take up is that of finding good rational approximations to the square root of 2, accurate to as many decimal places as desired. The square root of 2 is irrational, so it can never be given exactly in terminating decimal form, but there are many ways of producing a sequence of rational numbers "converging" to it.

One the first things we will do in this course is develop a precise mathematical understanding of what "converging" means. This is the subject of the theory of limits, to which we shall turn in another section of notes, pretty soon. Now let's work with the notion on an intutitive level: a sequence of numbers, with xn denoting the nth number in the sequence, is said to converge to a number z if given any small interval of values around z, there is a point in the sequence after which all of the xn lie inside this interval. Of course, the point in the sequence depends on the size of the interval.

For example, if the interval is

[z - 10-6,z + 10-6]

this means that there is some N so that after the Nth term in the sequence, each xn has the same decimal expansion as z, up through the sixth place to the right of zero. Decimal places after that will in general keep changing, at least for a while, but if the sequence is convergent, there is another point in the sequence, further out, at which the first ten decimal places to the right of zero "freeze forever", and another still further out when the first hundred "freeze forever", and so on.

In this section of the notes, we will work on this intuitive basis with the problem of finding sequences of rational numbers that converge to the square root of two. In dooing so, we shall develop some experience and concrete examples of "convergent sequences" with which we can approach the general theory later.

We shall examine several iterative procedures for producing such sequences. Iterative procedures are procedures that take a given approximation as input, and properly used give a better approximation as output. Running them again and again, you "should" be able to get as much accuracy as you want.

Therefore it is important to determine the circumstances under which they can be successfully used. And it turns out that in understanding when the procedures work, and how well they work, the key lies in knowing the slopes of certain tangent lines.

The main thing we are after here is not the procedures themselves, but an understanding of the role of tangent line slope in the analysis of their effectiveness and workability. The ancient Babylonians had an excellent iterative procedure for computing square roots thousands of years ago. But our analysis of its efficacy will equip us to develop new iterative procedures for finding all sorts of things -- not just square roots, and in fact, not just numbers.

Without further ado, let's introduce the first of these proceures:

Take the polynomial x2-2 and divivde it by x+1. (You can divide by other things as well; we'll consider other possibilities in a moment.) Dividing, you get:

(x2 -2)/(x+1) = x - (x+2)/(x+1)

Clearly the left hand side is zero if and only if x = sqrt(2), or if x = -sqrt(2). So it is exactly at these two values of x that

x = f(x)

where f(x) is defined by

f(x) = (x+2)/(x+1)

This leads us to the following definition.

Definition of Fixed Points

A point x on the real line is called a fixed point of the function f(x) in case

f(x) = x

To find the square roots of two, all we have to do is to find the fixed points of this function f(x) = (x+2)/(x+1).

One way to approach this is graphically. Let's draw the graphs of

y = f(x)

and

y = x

Where the two graphs cross, x = y = f(x), so the x coordinate (or the y coordinate) of a crossing of the graphs is a fixed point.

The problem with this is that you have to draw a very accuate graph to get an accurate value for the fixed point, and a lot -- really a lot -- of computation must be done to draw an accurate graph.

The graph above was drawn by computer, using matlab, a program in your student software package. Clearly, y = f(x) is graphed in blue, and y = x in green. From the graph you can see that the square root of two is about 1.4. But what we want to learn from the graph is not so much the value itself, but why the following iteration method for finding the square root of two works.

The procedure is this: Pick any starting approximation x0, which need not be particularly close. Let's take, say, x0 = 1. Then for all integers n greater than zero, we reculsively define

xn = f(xn-1).

Here are the first 25 terms of the sequence this prescription produces.

     1.000000000000000e+00
     1.500000000000000e+00
     1.400000000000000e+00
     1.416666666666667e+00
     1.413793103448276e+00
     1.414285714285714e+00
     1.414201183431953e+00
     1.414215686274510e+00
     1.414213197969543e+00
     1.414213624894870e+00
     1.414213551646055e+00
     1.414213564213564e+00
     1.414213562057320e+00
     1.414213562427273e+00
     1.414213562363799e+00
     1.414213562374690e+00
     1.414213562372821e+00
     1.414213562373142e+00
     1.414213562373087e+00
     1.414213562373096e+00
     1.414213562373095e+00
     1.414213562373095e+00
     1.414213562373095e+00
     1.414213562373095e+00
     1.414213562373095e+00 
     

They are given to 16 decimal places, and note that the last few are the same to this accuracy. This provides a pretty good picture of convergence, and very rapid convergence at that! Less that 25 step to freeze 16 decimal places! However, we are going to do much better. (The ancient Babylonians already did, and we'd better do at least as well.) This iteration was produced by matlab as well. The m-files for producing these and other examples in these notes are available on a code page for this section. You can copy the files from there if you'd like to do your own experimants, though we'l have another, simpler way with java applets a little later in the notes.

Note that as far as the first 16 decimal places are concerned,

x25 = x24.

But by the way they were produced,

x25 = f(x24).

So at this level of precision,

x24 = f(x24).

which means that x24 is a fixed point. Squaring it, you find that indeed it is the sqaure root of two to 16 decimal places.

You can experiment with this iteration method in matlab, but there is a more convenient alternative. There is a java applet that was designed just for this section of notes. It isn't versitaile tool like matlab; it just does one thing: graphically iterate functions of your choosing. But since it only does one thing, it is set up to make that one thing very easy to do.

To use it, you go to the web page containing the applet, and enter the function you want to iterate at the bottom. It graphs it, as above. Then click on your desired starting point. It shows up as a red dot. Then press the step button, and you can "step through" the iteration.

Here is what you'll see:

Notice that the function we are studing is entered in at the bottom. You can change this in your experimants. Above the text field for the function, there are three "radio buttons". The one on the right is activated, which displays controls for stepping through the iteration in the panel on the right. The middle radio button put info and controls on the range of the graph in the right panel. You can use these to "zoom in" or "zoom out". It has been set up here to show the region 0 < x < 4 and 0 < y < 4. The cross-hairs are centered on (2,2), not the origin. All of this information is displayed in the panel on the right if you click on "graph range controls".

Finally, if you click on "cursor position" you bring up a display of the coordinates of the curdor as you move it over the graph. In this way, you can easily read the coordinates of any feature on the graph without the need for axes.

Click the "cursor" button, and move the cursor around until the x value is at the desired starting value, x0. Then click the mouse button. The point (x0,x0) will be marked with a red dot, and the panel at right changes to the "step" display.

We got as close to x0 = 1 as we could on this resolution. (You could always zoom in.) Now hit the step button. You see:

A second dot has been drawn at (x1,x1), as well as some lines that will help us understand the iteration geometrically.

To see a little, better, let's "zoom in":

Now, let's understand the lines: There is a geometric way to understand what the iteration is going to do from the graphs, without dealing in numbers at all! You don't need the formula for f(x), just its graph!

Here's how you get from x0 to x1 using the geometry of the graph alone: You leave (x0,x0)going parallel to the y-axis -- i.e., keeping the x-value fixed at x0 -- until you hit the blue graph -- the graph of f(x). Clearly, you hit it at y = f(x0) = x1, the new x-value.

To bring it back to the green line, move parallel to the x-axis i.e., keeping the y-value fixed at y = x1, until you hit the green line. Since it is the graph of y = x, the point you hit is (x1,x1).

This may seem a little complicated, but notice what it gives us: a simple rule for figuring out the effects of the iteration geometrically. all you need is the graphs; you don't need any numbers at all to see if the iteration is going to converge or not! Just "bounce back and forth between the green amd blue curves, traveling vertically from green to blue, and horizontally from blue to green.

As we've seen above, this one will converge very fast. We need to keep "zooming in" to see what is going on.

Having zoomed way in on the fifth iteration, here is what we see:

If you keep on following it, you will see that the red lines spiral tightly about the intersection point.

There is another thing to notice: as we zoom in closer and closer, the piece of the blue graph we see becomes less and less curved -- more and more indistinguishable from a piece of a straight line. This is a key thing to understand: under sufficiently close magnification around some point on a curve, the graph of a curve and of its tangent line at that point are indistinguishable. And that's great, because lines are always given by simple eqautions of the form y = m*x + b. No matter how complicated the equation for the curve is, the equation for the tangent line is very, very simple. So if their graphs are indistinguishable, why not simply work with the equation for the line?

The theory of limits will enable us to not only get good approximations using this strategy, but exact answers. That will come a bit later.

You are strongly encouraged at this point to try the applet out, changing the function and the starting points. Pay attention to how linear things get when you zoom in.

Let's do some of that here. Let's try some other values for x0, but keeping f the same.

First of all, observe that there is another fixed point, namely -sqrt(2). Here is another graph, showing this fixed point.

Notice that there are two branches to the graph of f(x), separated by a singularity at x = -1, where the denominator in f(x) vanishes. First, let's try to find a starting point that leads us to this fixed point, say

x0 = -1.4.

One might guess that this will produce a sequence converging to -sqrt(2), especially since it starts so close. But here are the results:

    -1.400000000000000e+00
    -1.500000000000001e+00
    -9.999999999999973e-01
     3.752999689475423e+14
     1.000000000000003e+00
     1.499999999999999e+00
     1.400000000000000e+00
     1.416666666666667e+00
     1.413793103448276e+00
     1.414285714285714e+00
     1.414201183431953e+00
     1.414215686274510e+00
     1.414213197969543e+00
     1.414213624894870e+00
     1.414213551646055e+00
     1.414213564213564e+00
     1.414213562057320e+00
     1.414213562427273e+00
     1.414213562363799e+00
     1.414213562374690e+00
     1.414213562372821e+00
     1.414213562373142e+00
     1.414213562373087e+00
     1.414213562373096e+00
     1.414213562373095e+00

The sequence is tending to the positive sqaure root of 2! And notice the size of fourth term in the sequence. The steady progression to the fixed point that we saw the first time just isn't here.

Well, then let's try a starting point on the other side of the negative fixed point, say

x0 = -2.

The first 25 terms of the sequence this produces are

    -2.000000000000000e+00
                         0
     2.000000000000000e+00
     1.333333333333333e+00
     1.428571428571429e+00
     1.411764705882353e+00
     1.414634146341463e+00
     1.414141414141414e+00
     1.414225941422594e+00
     1.414211438474870e+00
     1.414213926776741e+00
     1.414213499851323e+00
     1.414213573100136e+00
     1.414213560532626e+00
     1.414213562688870e+00
     1.414213562318917e+00
     1.414213562382391e+00
     1.414213562371500e+00
     1.414213562373369e+00
     1.414213562373048e+00
     1.414213562373103e+00
     1.414213562373094e+00
     1.414213562373095e+00
     1.414213562373095e+00
     1.414213562373095e+00

Again! It seems we just can't hit the negative fixed point.

And things can get even worse for other functions f. In fact, let's consider one of the simplest:

f(x) = x2.

This has fixed points at x=0 and at x=1. Let's take 2 as our starting point, and think about what happens. The sqaure of 2 is 4, and the sqaure of that is 16. Clearly, the terms of the sequence will just increase without bound. There is no convergence to anything at all in this case.

These experiments raise a couple of questions:

(1.) Why could we find the positive fixed point by iteration, and not the negative one?

(2.) Which starting values x0 produce sequences that converge to a fixed point?

and probably most important

(3.) How can we change f to get a procedure that converges as quickly as possible from as many starting points as possible?

The key to all of the is the "slope" at which the graph of y = f(x) cuts across the graph of y = x at the fixed point.

We had to state this a little loosely, using quotes, because we don't really have a precise definition for the "slope" of a graph -- unless it is the graph of a linear function. But probably you are at least vaguely familiar with the notion of the tangent line to a graph at some point on it. We identify the slope of a graph at a point with the slope of its tangent line there.

In this course you will learn a lot about how to compute these tangent lines, but for now it will suffice to estimate them graphically.

Definition of Attractive and Repulsive Fixed Points

A fixed point of the function f(x) is called attractive in case the slope of the graph of f(x) at that point has a value strictly between -1 and 1.

A fixed point of the function f(x) is called repulsive in case the slope of the graph of f(x) at that point has a value strictly less than -1, or strictly greater than 1.

Theorem on Attractive and Repulsive Fixed Points

Suppose that z is a fixed point of some function f(x),; i.e., z = f(z). Suppose that z is attractive. Then there is some positve number a so that for all x with

|x-z| < a

|f(x) - z| < |x-z|.

That is, each successive application of f brings one closer to the fixed point. On the other hand, if z is repulsive, there is some positve number a so that for all x with

|x-z| < a

|f(x) - z| > |x-z|.

That is, each successive application of f takes one further away from the fixed point, at least until one leaves the region |x-z| < a.

We will now more or less prove this theorem. We won't be able to dot the i's and cross the t's at this stage of the course, but we can at least see why it is true, and gain some insight into the role of the slope of the tangent line.

The first step of the proof is where we will need to be a bit formal at this stage: Zoom in on the graph until the graph of y = f(x) is essentially linear. This is where the choice of a comes in. One weekness of the theorem as it is stated is that it gives us no information on how big or small a might be. From your experiments, you know it has something to do with the curvature of the graph of y = f(x), but at this point in the course, we don't have the tools to deal with this. Later we shall, and then we shall come back and address the question of how big a is. This after all is a very intereating question: its answer tells us how good our initial guess -- our starting point for the iteration -- has to be.

For now though, lets just suppose that you have found a value of a so that when you graph in the corresponding region, the picture you see is linear.

Here is the picture, with a lot of annotation:

The fixed point, (z,z) is at the center of the graph, where the horizontal line y = z and the vertical line x = z cross. Two more lines pass through this point: the green line y = x, and the graph of y = f(x) -- that by hypothesis is effectively a line on this scale -- and is graphed in dark blue.

Now focus on the triagle with vertices

(xn,xn+1) (xn,z) and (z,z)

This is clearly a right triangle. The length of its vertical side is |z - xn+1|, and the length of its horizontal side is |z - xn|. The third side is the hypotenuse. Now, let phi denote the angle in the vertex opposite to the vertical side. By the definition of the tangent function, i.e., the ratio of the opposite to the adjacent side lengths,

|tan(phi)| = |z - xn+1|/|z - xn|

which is the same as

|z - xn+1| = |tan(phi)|*|z - xn|

Now, what is the value of tan(phi)? Well, it is just the absolute value of the slope of the blue line -- rise over run. So let s be this slope. Then we have

|z - xn+1| = |s|*|z - xn|

Clearly you see that if |s| < 1, then the successive values of |z - xn| are getting smaller and smaller -- reduced by a factor of |s| each step.

On the other hand, if |s| > 1, then the successive values of |z - xn| are getting bigger and bigger -- increased by a factor of |s| each step.

This completes the proof.

We can learn something from the proof that wasn't stated in the theorem; that is, we can see from the proof that the closer the slope is to zero, the more attractive the fixed point is; i.e, the faster the resulting sequence will close in on the limiting value.

This suggests an answer to our third question: we should try to arrange things so that the slope of the function that we are iterating has a small a slope as possible at the fixed point, hopefully even zero slope!

Therefore, let's go back to our function f(x) = (x+2)/(x+1), and change it so that the square root of two is still a fixed point, but so that the slope is smaller.

How can we do this? Well, just think back to where it came from in the first place. The function f(x) is, after a sign change, the remainder one gets dividing x2-2 by x+1. Now the value 1 in the x+1 is quite arbitrary. So let's make it a variable.

Let's divide by x+b for as yet unspecified b, and then let's try to choose b to make sqrt(2) as attractive as possible.

This works since

(x2 -2)/(x+b) = x - (b*x+2)/(x+b)

Again, the left side vanishes exactly at x = sqrt(2) and at x = -sqrt(2), so at exactly these values of x do we have 0 = x - (b*x+2)/(x+b).

We therefore define

fb(x) = (2+b*x)/(x+b)

Later in this course we will learn how to compute the slope of fb(x) at x = sqrt(2), but for now we will have to rely on graphs to see what changes we should make in the value of b.

Here are some graphs for several values of b.

Here b = 2. Notice this time the slope is positive at sqrt(2).

Here b = 1.5. Notice this time the slope is positive at sqrt(2), but very small. (The vertical line in the graph shouldn't really be there! It comes from the singularity when the denominator of our function vanishes. Just ignore it.)

Here b = 1.4. Notice this time the slope is negative at sqrt(2), but really very small.

(Again, the vertical line in the graph comes from the singularity when the denominator of our function vanishes. Just ignore it.)

We would expect very rapid convergence with this value of b. Here are the first 25 terms in the sequence, starting from 1:

     1.000000000000000e+00
     1.416666666666667e+00
     1.414201183431953e+00
     1.414213624894870e+00
     1.414213562057320e+00
     1.414213562374690e+00
     1.414213562373087e+00
     1.414213562373095e+00
     1.414213562373095e+00
     1.414213562373095e+00
     1.414213562373095e+00
     1.414213562373095e+00
     1.414213562373095e+00
     1.414213562373095e+00
     1.414213562373095e+00
     1.414213562373095e+00
     1.414213562373095e+00
     1.414213562373095e+00
     1.414213562373095e+00
     1.414213562373095e+00
     1.414213562373095e+00
     1.414213562373095e+00
     1.414213562373095e+00
     1.414213562373095e+00
     1.414213562373095e+00
     

Notice that the first sixteen digits are frozen after only the 8th term! Clearly, the closer b is to sqrt(2), the more attractive sqrt(2) is. So what value of b should we use in our iteration procedure? The best rational approximation we have, of course!. And what is that? Well, if we've computed xn so far, but not xn+1, then xn is our best available approximation to sqrt(2), and that is what we should use for b in the next step.

That is, we should define

xn+1 = fxn(xn).

We can put this in standard form if we define

g(x) = fx(x) = (2+x*x)/(x+x) = (2+x2)/(2*x)

Then our iteration procedure becomes simply

xn+1 = g(xn).

Starting from x0 = 1, the first 25 terms in the sequence are:


     1.000000000000000e+00
     1.500000000000000e+00
     1.416666666666667e+00
     1.414215686274510e+00
     1.414213562374690e+00
     1.414213562373095e+00
     1.414213562373095e+00
     1.414213562373095e+00
     1.414213562373095e+00
     1.414213562373095e+00
     1.414213562373095e+00
     1.414213562373095e+00
     1.414213562373095e+00
     1.414213562373095e+00
     1.414213562373095e+00
     1.414213562373095e+00
     1.414213562373095e+00
     1.414213562373095e+00
     1.414213562373095e+00
     1.414213562373095e+00
     1.414213562373095e+00
     1.414213562373095e+00
     1.414213562373095e+00
     1.414213562373095e+00
     1.414213562373095e+00

Note that all 16 digits stabilize after the firt six steps! The fixed point sqrt(2) is really attractive for g! And if we graph it, you see that the slope of g at sqrt(2) is zero!

Something else interesting has happend. The other fixed point, -sqrt(2) has been made attractive! Try it - though by now you know what you have to find: Since the geomerty of the crossing is the same (up to a reflection) for the two fixed points, they behave the same way under interation.

The procedure is easily modified to extract the square roots of numbers other than 2. To find the square root of a postive number c, redefine g(x) by

g(x) = (c+x2)/(2*x)

You should be able to see, after some analysis, that sqrt(c) is an attractive fixed point for this g(x). This procedure for computing square roots was known to the ancient Babylonians.

PROBLEMS AND EXERCISES