(1) f(x,y) = 0
and
(2) g(x,y) = 0
find all points (x,y) that satisfy these equations.
The method takes certain input -- namely a good educated guess
for about where each solution should be.
Usually, one arrives at this guess by graphical methods. Let's take that part for granted here, and suppose we already have
a point (x0,y0) which approximately
solves the system (i.e., makes the right hand sides of (1) and (2)
both "small".
How do we get from here to a better solution; i.e. one which works to some more decimal places? This is the problem toward which Newton's method is aimed.
The key idea is this: To get a good approximate solution,
we replace the exact problem above with the approximate problem
h(x,y) = 0
and
k(x,y) = 0
where h(x,y) and k(x,y) are the best linear approximations to
f(x,y) and g(x,y) at the point (x0,y0).
In geometric terms, this that means h(x,y) and k(x,y) are the linear functions such that
z= h(x,y)
and
z = k(x,y)
are the equations for the tangent planes to the graphs
of f and g respectively at the point
(x0,y0)
Linear eqautions are easy to deal with, and the plan is to use their
solution to get a better
approximate solution. The hope then is that
repeating this often enough, we can get as good an approximation as we require.
Now you will notice that this description of Newton's method does not
contain any formulas really. This is an important point. I want to make
a strong distinction here between a method and an
algorithm.
There are however many ways to turn Newton's method into
an algorithm. Each has its own advantages and disadvantages, so which one is
the best one depends on the application. A typical good book on numerical analysis will contain five or six implementations of the method as an algortihm that
can be programed on a computer.
Here we will present two implementations of Newton's method, and then we will study
their relative strengths.
h(x,y) = 0
and
k(x,y) = 0
where h(x,y) and k(x,y) are the best linear approximations to
f(x,y) and g(x,y) at the point (xj-1,yj-1).
In the worked problems below we will generate the first two terms in the
approximating sequence by hand. We will then generate more with the aid of a computer program.
Start with, say, f. We want to find a good approximate solution of f(x,y) = 0.
Think of f(x,y) as representing the "altitude" at the point (x,y), and suppose
for the sake of the argument that this is positive -- i.e., we're above sea level. To get down to sea level as quickly as possible, you'd go down hill
by the steepest slope. (Well, maybe not, that might be a cliff -- but on
paper we're not troubled by that sort of detail...)
So just go down hill against the gradient. That is, the steepest way down hill
is opposite to the direction the gradient points, so go in this direction.
Let n denote the unit vector that points in this direction.
Now we not only need to know which way to go, but how far to go in that direction. Let d denote this distance, measured
"as the crow flies".
So the question that we need to answer is:
what is n, and what is d?
We can easily express the answers in terms of the gradient; here's how:
First d. Well, our altitude is f(x0,y0), and we need to bring this to zero. The slope in our chosen direction is
|gradf(x0,y0)|
so the distance d we should go (slope is rise over run) is
d = f(x0,y0)/
|gradf(x0,y0)|
The unit vector n in the direction we will go is
n = -gradf(x0,y0)/
|gradf(x0,y0)|
Hence the new point should be
(x1,y1) = (x0,y0) + d n
or
(1) (x1,y1) = (x0,y0) -
f(x0,y0)grad
f(x0,y0)/
|gradf(x0,y0)|2
This is the key formula in this approach. We derived it assuming that
f(x0,y0) was positive, so that we needed to go down hill,
but it works in the other case too: if f(x0,y0)
is negative, then
-f(x0,y0)grad
f(x0,y0)/
|gradf(x0,y0)|2
is a positvie multiple of the gradient, so the displacement
dn will be in the direction of the gradient;,i.e., the uphill
direction as it should be. So the key formula (1) works no matter what the
sign of f(x0,y0) is.
Now, we would have f(x1,y1) = 0
exactly if f were linear. In a region where f doesn't deviate much from being linear, i.e., where its graph is close to the tangent plane at
(x0,y0), then this will be correspondingly close to being satisfied. Hence, we've got what we want for f: a point (x1,y1) at which f(x1,y1) is
pretty close to zero.
So now that we've got the value of f in good shape, what about g?
Just take the new point as a fresh starting point, and apply the same procedure for g.
Then go back to f, and so on , alternating.
We summarize as follows:
(xj,yj) = (xj-1,yj-1) -
f(xj-1,yj-1)grad
f(xj-1,yj-1)/
|gradf(xj-1,yj-1)|2
for j odd, and by
and
(xj,yj) = (xj-1,yj-1) -
g(xj-1,yj-1)grad
g(xj-1,yj-1)/
|gradg(xj-1,yj-1)|2
for j even.
In the worked problems below. we will generate the first three terms in the
approximating sequence by hand.
But first, let's see why it works. A picture helps. In the following graph,
the green curve is part of the solution set of the cubic eqaution
y-x3 +2*x +1 = 0
and the blue curve is part of the circle
x2 + y2 - 3 = 0
The blue and green curves cross near to
(1.46,-1.04)
which means there is a solution near there. The graph is a close-up on this region. The dots in the picture are the succesive points generated
by the "one-at-a-time" implementation of Newton's method, starting from the point off to the right of the curves. The next point, after this starting guess
is the one you get to by going directly toward the green curve. The next
is the one you get going from there directly toward the blue curve. And so on.
Notice how the trail of dots "bounces back and forth" in the wedge, closing in on the intersection point as it goes.
How fast the sequence converges to the crossing point -- the solution --
depends on what the angle of crossing is.
What is better -- an angle close to 0, or close to pi/2?
Think about it!
One thing is for sure: if you are doing things "by hand", each step of the "one at a time" method is easoer to do.
But that is not such a good answer -- your'e not likely to do to many of these by hand. In fact, you want to do just enough to get a good feeling for what is going on.
To compare them in the context of a computer program,
having an interactive computer program with the two implementations is a big
help.
But with a computer program, we can experiment painlessly!
Here is
a link to an applet
that will let you experiment. It is the applet used to generate this pictur up above.
The applet interface looks like this:
What you see are the two equations graphed with the functions given in the boxes
below the graph. From the graph, we see there are four solutions in this example.
A radio button changes the panel on the right and lets you select the imlementation that you want to use.
At the bottom are two text fields in which you to specify the two functions
to be set equal to zero. The solution curves will be graphed in the colors indicated in the squares beside the functions.
If you select the "graph range controls" radio button, you will be able to
specify the region that you want graphed.
If you then select the "step button" radio button, you will bring up a panel
with a -- you guessed it -- step button. You can then click on your choice of a starting point, and then press this step button to step through the sequence of points
generated by whichever implementation of Newton's method you are using.
Clicking the "algorithm" radio button brings up a menu of implementations.
There are four of them to choose from, including the two discussed in these notes. Ignore the third for now. The fourth is a compbination of the first two. It starts out doing the one at a time implementation, and switches to the two at a time implementation at a carefully chosen stage in the process. See the lab
notes for more information on this and source code for doing your own implementation.
Try it out. You will see that the solution point you finally converge to -- when
you do converge -- is not always the one that is closest to where you started.
In fact, the question of which point one finally converges to has a suprisingly
interesting and complicated answer -- which depends on the implementation.
Here are some computer generated pictures that show what can happen.
These are gennerated by another applet, that you can also experiment with. Here is what is going on in them.
The applet takes the point corresponding to each pixel
as a starting point. It runs the selected implementation of Newton's method
until |g(x,y)| + |f(x,y)| is less than 10-9,
or 80 steps have been taken. After 80 steps, it gives up then no matter what the size of
|g(x,y)| + |f(x,y)| is.
If convergence didn't happen by 80 steps, it colors the pixel white, and goes
on the the next one. If however |g(x,y)| + |f(x,y)| got below
10-9, it takes the point at which the iteration was stopped -- a solution point -- and compares it with a list of previously found
solutions. If the point is in the list already, it was assigned a color (the first one to get listed gets red as its color, the second gets green, the third blue... and so on). If it is found on the list, the pixel gets the previosly assigned color. If not, it is added to the list and assigned a color. The pixel gets that color.
This goes on, until all the pixels are covered in a row by row left to right sweep. At the end, one has a picture like those below.
The one on the left is for the one function at a time approach. (White comes out as gray here because of the way I produced this picture...) The next one
is for the two at a time approach. I'll say what the third one is a bit later.
But first notice that:
f(x,y) = y - x3 + 2*x =1
and
f(x,y) = x2 + y2 -1
(b) We begine by computing the values of the functions and their gradients at
the starting point:
gradf(x,y) = (-3*x2 + 2,1)
gradg(x,y) = (2*x,2*y)
Therefore
f(1,1) = 1, gradf(1,1) = (-1,1)
and
g(1,1) = 1, gradg(1,1) = (2,2)
and hence, the linearization of f(x,y) at (1,1) is:
1 - (x-1) + (y-1) = -x + y + 1
and the linearization of g(x,y) at (1,1) is:
1 +2*(x-1) + 2*(y-1) = 2*x + 2*y -3
We now approximate by setting the linearizations -- instead of the functions themselves -- equal to zero. We have
-x + y = -1
2*x + 2*y = 3
Adding twice the first equations to the second yields the solution
y = 1/4 and x = 5/4
This is one complete iteration. At the new point (5/4, 1/4)
we readily compute that
f(5/4, 1/4) = -13/64 and g(5/4, 1/4) = 5/8
Which is an improvement -- both values got closer to zero.
We can check this with the applet. Pushing the "step" button in the applet
yields:
The numerical results agree exactly with our "by hand" calculations. (So you see you can use the applet to check the homework!)
On to the next iteration!!!
We've already computed the values of the functions at the new point, so we next need the values of the gradients. We compute:
f(5/4,1/4) = -13/64, gradf(5/4,1/4) = (-43/16,1)
and
g(5/4,1/4) = 5/8, gradg(5/4,1/4) = (5/2,1/2)
and hence, the linearization of f(x,y) at (1,1) is:
-13/64 - (43/64)(x-5/4) + (y-1/4)
and the linearization of g(x,y) at (1,1) is:
5/8 + (5/2)*(x-5/4) + (1/2)*(y-1/4)
We now approximate by setting the linearizations -- instead of the functions themselves -- equal to zero. After multiplying through to get rid of the denomonators, and simplifying, we get:
-86*x + 32*y = -93
20*x + 4*y = 21
Solving this, one has
x = 261/246 and y = 17/572
Now you notice two things: (x,y) is close to (1,0)
which we know is the answer, but the fractions are getting messy! It is
time to let a computer take over!
Here is the third itteration:
and the fourth
and the fifth:
After five iterations, we have four decimal places of accuracy. But notice
that there was a really big improvement from the fourth to the fifth.
What happens next?
Try the applet and find out!!
(c) We will start with f. As before
f(1,1) = 1, gradf(1,1) = (-1,1)
So the new point we move to (x1,y1) is
(x1,y1) = (1,1) - (1/(1+1))*(-1,1) = (1,1) - (-1/2,1/2) = (3/2,1/2)
That's it, the first iteration. At this point (3/2,1/2), we have
f(3/2,1/2) = -7/8 and g(3/2,1/2) = 3/2
So we overshot a bit, and made g worse. But then we were working only with f. So now lets work with g.
We have:
g(3/2,1/2) = 3/2, gradf(3/2,1/2) = (3,1)
So the new point we move to (x2,y2) is
(x2,y2) = (3/2,1/2) - ((3/2)/(9+1))*(3,1) = (3/2,1/2) - (-9/20,3/20) = (21/20,7/20)
At this point (21/20,7/20), we have
f(21/20,7/20) = 2339/8000 and g(21/20,7/20) = 9/40
That is, we've reduced the size of both f and g by a factor of
four in just two steps. Here are pictures generated by the applet. First, the starting point:
Next, after the first step (Noyice that the numerical values agree exactly with
our "by-hand" calculations):
And finally, after eight (count 'em!) steps:
So it is taking quite a while to get two decimal places of accuracy here.
But each step is much simpler...
a, b, c, d, e, f, g, h, i, j
as in
f(x,y) = a*x3 + b*x2*y + c*x*y2 +
d*y3 + e*x2 + f*x*y + g*y2 +
h*x + i*y + j
Assign values to the coefficients as follows: Toss a coin. If it comes up heads,
put
a=1
Otherwise, put
a=0
Do the same for
b, c, d, e, f, g, h, i, j
except with j. Here use -1 instead of 1 for heads.
(Why the minus sign for j? This is in case you toss, say,
T T T T H T H T T H
With these rules the eqaution f(x,y) =0 becomes
f(x,y) = x2 + y2 -1 = 0
which has solutions, as opposed to
f(x,y) = x2 + y2 +1 = 0
which doesn't have solutions.)
Now that you have the eqaution f(x,y) = 0, use the same rule to randomly specify g(x,y) = 0.
Then
Repeat the procedure for a new pair, again, and again,...
There are 220 different problems that can be generated this way. That is more than one million.
So you have an effectively infinite store of practice problems here! Try several.
You be suprised perhaps by the variety of graphs you see.
The Problem To Be Solved
Given a system of two equations
First Implementation of Newton's Method as an Algortihm
The most straightforward approach is the following algorithm for producing a sequence of sucessive approximations starting from the first approximation
(x0,y0):
Newton's Method: Two-At-A-Time Approach
For each positive integer j define
(xj,yj)
to be the solution of
Second Implementation of Newton's Method as an Algortihm
Here is another, simpler, algorithm for producing a sequence of sucessive approximations starting from the first approximation
(x0,y0):
In this simpler implementation, you work with one function at a time. Here are the steps.
Newton's Method: One-At-A-Time Approach
For each positive integer j define
(xj,yj)
to be given by
Comparison of the Two Implementations
Now, how do the two methods compare?
WORKED PROBLEMS
Worked Problem 1
Let
Solution to Worked Problem 1
(a) Below is a graph showing the two curves -- blue for g(x,y) = 0
and green for f(x,y) = 0. As you see, there are two solutions. The center of the circle is at the origin, and the red dot -- our starting point -- is at (1,1). If you wnat more coordinate values, go to the applet, and bring up the cursor position panel... The graph suggests that the two solutions shouls be near (0,1) and (1,0). And in fact, these are solutions! So it is not necesary to approximate them. On the other hand, right now we are trying to learn about the approximation itself, and then it is good to know the exact answer that you are approximating. That's why this problems was cooked up to be solvable in closed form.
Next, after the second step (Noyice that again the numerical values agree exactly with
our "by-hand" calculations):
Next, after the third step:
POSED PROBLEMS
Posed Problem 1
For this problem, consider the general third degree polynomial in two variables.
This has 10 coefficients