There is a very simple idea behind many of the methods of multivariable calculus: one studies functions of several variables by applying single variable calculus to one dimensional "slices" of them.
The simplest version of this would be to work one variable at a time; that is, to use vertical or horizontal slices. But that would be a bit too simple to take us very far.
A good way to do this, which will take us far, is to build single variable functions F(t) by combining multivariable functions f(x,y) with parameterized paths (x(t),y(t)) as follows: We define
F(t) = f(x(t),y(t))
For example, if
f(x,y) = x^2 + y^4 and (x(t),y(t)) = (t^2 + 1,t)
then
F(t) = 2*t^4 + 2*t^2 + 1
Now F(t) is a simple function of one variable -- namely t -- and we can differentiate it or integrate it with ease.
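The composition can be checked numerically. The following is a minimal Python sketch, not part of the original notes; the helper names `path`, `F_closed_form` and `derivative` are made up for illustration, and the derivative is only a central difference approximation.

```python
# Compose f(x, y) = x^2 + y^4 with the path (x(t), y(t)) = (t^2 + 1, t).

def f(x, y):
    return x**2 + y**4

def path(t):
    # The parameterized path from the example.
    return (t**2 + 1, t)

def F(t):
    # Single variable function obtained by following the path.
    x, y = path(t)
    return f(x, y)

def F_closed_form(t):
    # The simplified expression stated in the text.
    return 2 * t**4 + 2 * t**2 + 1

def derivative(g, t, step=1e-5):
    # Central difference approximation to g'(t) (an approximation only).
    return (g(t + step) - g(t - step)) / (2 * step)

for t in (-1.0, 0.0, 0.5, 2.0):
    print(t, F(t), F_closed_form(t), derivative(F, t))
```

The first two printed columns after t should agree, and the last column approximates F'(t) = 8*t^3 + 4*t.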
But why is this useful?
The best way to answer this is by giving an application. We will do this here by using this method to derive the several variable version of Taylor's theorem from the one variable version -- very easily.
Recall that for functions F(t) of one variable, Taylor's theorem
with remainder tells us which polynomial of degree N in t
is the "best" approximation to
F(t) at a given point t0, and moreover,
it tells us how good the approximation is.
Consider a function
f(x,y)
and a "base point"
(x0,y0)
about which we shall expand.
Now here is where the path comes in. Fix any other point (x,y)
and define the parameterized path
(x(t),y(t)) = (x0 + t*(x-x0), y0 + t*(y-y0))
which runs along the line segment from (x0,y0) at t = 0 to (x,y) at t = 1. As before, let
F(t) = f(x(t),y(t))
Then
F(0) = f(x0,y0)
F(1) = f(x,y).
In what follows -- and, in fact, in most applications anywhere -- the main interest is in the Taylor approximations of degrees one and two. So we concentrate on these.
By Taylor's Theorem applied to F(t) up to first order, expanding about t0,
we have that
F(t) = F(t0) + F'(t0)*(t-t0) + R1(t,t0)
where
R1(t,t0) = (1/2)*F''(s)*(t-t0)^2
for some s between t0 and t.
And by Taylor's Theorem applied to F(t) up to second order,
we have that
F(t) = F(t0) + F'(t0)*(t-t0) + (1/2)*F''(t0)*(t-t0)^2 + R2(t,t0)
where
R2(t,t0) = (1/6)*F'''(s)*(t-t0)^3
for some s between t0 and t.
If we work out what this says for F(t) = f(x(t),y(t))
with t0 = 0 and t = 1, we get,
first writing things still in terms of F,
F(1) = F(0) + F'(0) + R1(1,0)
in the first order case, and
F(1) = F(0) + F'(0) + (1/2)*F''(0) + R2(1,0)
in the second order case.
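To make the two remainders concrete, here is a small Python sketch that reuses F(t) = 2*t^4 + 2*t^2 + 1 from the earlier example with t0 = 0 and t = 1. It is an illustration under assumptions, not part of the original notes: the derivative functions `dF`, `d2F` and `d3F` were worked out by hand, and the bounds use the fact that |F''| and |F'''| are increasing on [0, 1] for this particular F.

```python
# One-variable Taylor remainders for F(t) = 2*t^4 + 2*t^2 + 1, t0 = 0, t = 1.

def F(t):
    return 2 * t**4 + 2 * t**2 + 1

def dF(t):
    return 8 * t**3 + 4 * t

def d2F(t):
    return 24 * t**2 + 4

def d3F(t):
    return 48 * t

t0, t = 0.0, 1.0

# Remainders, defined as "true value minus Taylor polynomial".
R1 = F(t) - (F(t0) + dF(t0) * (t - t0))
R2 = F(t) - (F(t0) + dF(t0) * (t - t0) + 0.5 * d2F(t0) * (t - t0) ** 2)

# Bounds from the remainder formulas. For this F, |F''| and |F'''| are
# largest at the endpoint t = 1 of the interval [0, 1].
bound_R1 = 0.5 * abs(d2F(t)) * (t - t0) ** 2
bound_R2 = (1.0 / 6.0) * abs(d3F(t)) * abs(t - t0) ** 3

print("R1 =", R1, "bound =", bound_R1)
print("R2 =", R2, "bound =", bound_R2)
```

Both remainders stay within the bounds given by the formulas for R1 and R2, as the theorem requires.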
We now express these in terms of the original function f. We
have already said what F(0) and F(1) are in terms of f(x,y). For the
derivatives of F(t), we just need the chain rule. First,
F'(t) = fx(x(t),y(t))*x'(t) + fy(x(t),y(t))*y'(t)
Evaluating this at t = 0 gives us
F'(0) = fx(x0,y0)*x'(0) + fy(x0,y0)*y'(0)
Finally, we easily compute from the definition of the path that
x'(0) = (x-x0) and y'(0) = (y-y0)
so that
F'(0) = fx(x0,y0)*(x-x0) + fy(x0,y0)*(y-y0)
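The chain rule computation of F'(0) can be compared against a direct numerical derivative. This Python sketch is an illustration only: the function f, the base point and the target point are arbitrary choices, the partial derivatives `fx` and `fy` were computed by hand, and `derivative` is a central difference approximation.

```python
# Compare the chain rule value of F'(0) with a numerical derivative of F.

def f(x, y):
    return x**3 + 2 * x * y + y**2 - 1

def fx(x, y):
    # Partial derivative of f with respect to x, computed by hand.
    return 3 * x**2 + 2 * y

def fy(x, y):
    # Partial derivative of f with respect to y, computed by hand.
    return 2 * x + 2 * y

x0, y0 = 1.0, -1.0   # base point (illustrative choice)
x, y = 1.5, -0.25    # the other point (illustrative choice)

def F(t):
    # f restricted to the straight path from (x0, y0) to (x, y).
    return f(x0 + t * (x - x0), y0 + t * (y - y0))

def derivative(g, t, step=1e-5):
    # Central difference approximation to g'(t).
    return (g(t + step) - g(t - step)) / (2 * step)

chain_rule_value = fx(x0, y0) * (x - x0) + fy(x0, y0) * (y - y0)
numeric_value = derivative(F, 0.0)
print(chain_rule_value, numeric_value)
```

The two printed numbers should agree up to the accuracy of the finite difference.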
Computing F''(0) is the same in principle, but requires a bit more care.
Differentiating the first derivative again yields
F''(t) = fxx(x(t),y(t))*(x'(t))^2 + 2*fxy(x(t),y(t))*x'(t)*y'(t) + fyy(x(t),y(t))*(y'(t))^2
Note that there are no terms involving x''(t) or
y''(t) because x(t) and y(t) are
linear functions of t. Evaluating this at t = 0 gives us
F''(0) = fxx(x0,y0)*(x'(0))^2 + 2*fxy(x0,y0)*x'(0)*y'(0) + fyy(x0,y0)*(y'(0))^2
Finally, evaluating x'(0) and y'(0) as before
gives us
F''(0) = fxx(x0,y0)*(x-x0)^2 + 2*fxy(x0,y0)*(x-x0)*(y-y0) + fyy(x0,y0)*(y-y0)^2
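The expression for F''(0) as a quadratic form in (x-x0) and (y-y0) can be checked the same way. In this Python sketch the function f and the two points are arbitrary illustrative choices, the second partials were computed by hand, and `second_derivative` is a standard second difference approximation.

```python
# Compare the quadratic form for F''(0) with a numerical second derivative.

def f(x, y):
    return x**3 + 2 * x * y + y**2 - 1

# Second partial derivatives of this f, computed by hand.
def fxx(x, y):
    return 6 * x

def fxy(x, y):
    return 2.0

def fyy(x, y):
    return 2.0

x0, y0 = 1.0, -1.0   # base point (illustrative choice)
x, y = 1.5, -0.25    # the other point (illustrative choice)

def F(t):
    # f restricted to the straight path from (x0, y0) to (x, y).
    return f(x0 + t * (x - x0), y0 + t * (y - y0))

def second_derivative(g, t, step=1e-4):
    # Second difference approximation to g''(t).
    return (g(t + step) - 2 * g(t) + g(t - step)) / step**2

formula_value = (fxx(x0, y0) * (x - x0) ** 2
                 + 2 * fxy(x0, y0) * (x - x0) * (y - y0)
                 + fyy(x0, y0) * (y - y0) ** 2)
numeric_value = second_derivative(F, 0.0)
print(formula_value, numeric_value)
```

Note the factor 2 on the mixed term: it comes from differentiating both fx*x'(t) and fy*y'(t), each of which contributes one fxy term.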
This takes care of everything but the remainder terms. Working them out is pretty much more of the same. The details are left as an exercise.
In the meantime we state the results, and discuss how we shall use them.
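Taylor's Theorem In Two Variables: First Order Case
For any function f(x,y) with continuous second order derivatives,
f(x,y) = f(x0,y0) + fx(x0,y0)*(x-x0) + fy(x0,y0)*(y-y0) + R1
where
|R1| <= M*((x - x0)^2 + (y - y0)^2)
and M is an upper bound on
|fxx|, |fxy| and |fyy|
along the line segment connecting (x0,y0) and
(x,y).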
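Taylor's Theorem In Two Variables: Second Order Case
For any function f(x,y) with continuous third order derivatives,
f(x,y) = f(x0,y0) + fx(x0,y0)*(x-x0) + fy(x0,y0)*(y-y0)
+ (1/2)*[ fxx(x0,y0)*(x-x0)^2 + 2*fxy(x0,y0)*(x-x0)*(y-y0) + fyy(x0,y0)*(y-y0)^2 ]
+ R2
where
|R2| <= M*((x - x0)^2 + (y - y0)^2)^(3/2)
and M is an upper bound on
|fxxx|, |fxxy|, |fxyy| and |fyyy|
along the line segment connecting (x0,y0) and
(x,y). The factor 1/2 on the second order terms comes from the (1/2)*F''(0) term in the one variable expansion.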
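The practical content of the two remainder bounds is how fast the error shrinks as (x,y) approaches (x0,y0): roughly like the square of the distance for the first order expansion and like the cube for the second order one. The Python sketch below illustrates this for one hypothetical test function, f(x,y) = exp(x)*sin(y) at (0,0), which does not appear in the notes; its partial derivatives were computed by hand, and the second order terms carry the factor 1/2 coming from (1/2)*F''(0).

```python
import math

# Illustrative test function (an assumption, not from the notes).
def f(x, y):
    return math.exp(x) * math.sin(y)

def partials(x, y):
    # First and second partial derivatives of f, computed by hand.
    ex = math.exp(x)
    return {
        "fx": ex * math.sin(y),
        "fy": ex * math.cos(y),
        "fxx": ex * math.sin(y),
        "fxy": ex * math.cos(y),
        "fyy": -ex * math.sin(y),
    }

def taylor1(x0, y0, x, y):
    # First order Taylor polynomial about (x0, y0).
    p = partials(x0, y0)
    return f(x0, y0) + p["fx"] * (x - x0) + p["fy"] * (y - y0)

def taylor2(x0, y0, x, y):
    # Second order Taylor polynomial; note the factor 1/2.
    p = partials(x0, y0)
    dx, dy = x - x0, y - y0
    quadratic = p["fxx"] * dx**2 + 2 * p["fxy"] * dx * dy + p["fyy"] * dy**2
    return taylor1(x0, y0, x, y) + 0.5 * quadratic

x0, y0 = 0.0, 0.0
direction = (0.6, 0.8)  # unit vector, so r below is the distance

def errors(r):
    # Sizes of R1 and R2 at distance r from the base point.
    x, y = x0 + r * direction[0], y0 + r * direction[1]
    e1 = abs(f(x, y) - taylor1(x0, y0, x, y))
    e2 = abs(f(x, y) - taylor2(x0, y0, x, y))
    return e1, e2

for r in (0.02, 0.01, 0.005):
    print(r, *errors(r))
```

Each time r is halved, the first error should drop by a factor close to 4 and the second by a factor close to 8.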
You may be wondering what happened to the factors of 1/N! in the
remainder term. They got eaten up -- partially -- by effects of the multidimensionality. Actually, one could do better. Though this is not so important
for small N, it becomes useful for large N. For more details,
see the proof and derivation of the bound on the remainder.
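Best Linear and Quadratic Approximation
Now we recall one of the basic truths of mathematics: The nicest functions are the linear functions. After that, the nicest are the quadratic functions; i.e., the second degree polynomials.
In two variables, a function h(x,y) is linear
in case it has the form:
h(x,y) = a*x + b*y + c
for constants a, b and c.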
Notice that
h(x,y) = a*(x - x0) + b*(y - y0) + c
is also linear because the terms can be regrouped into the specified form.
Linear functions are so much easier to work with than non-linear functions
that we often want to approximate non-linear functions with linear ones.
And, as we will see soon when we study Newton's method in several variables,
we can get answers of arbitrarily good accuracy this way.
But to do this well, we generally need to choose the best linear approximation. This is given to us by Taylor's formula, just as it is in one dimension.
Definition of Approximation of Functions
Let h(x,y) and f(x,y) be two functions. If
|f(x,y) - h(x,y)|
gets closer and closer to 0,
as (x,y) gets closer and closer
to (x0,y0), then we say that
h(x,y) approximates f(x,y) at the point
(x0,y0).
Now clearly any linear function k(x,y)
of the form
k(x,y) = dot((a,b),(x-x0,y-y0)) + f(x0,y0)
approximates f(x,y) at the point
(x0,y0). But there is something special
about the case (a,b) = grad f(x0,y0) = (fx(x0,y0),fy(x0,y0)): this choice gives us the best linear approximation to f(x,y) at
(x0,y0).
To explain this, we have to say what it means for one approximation to be better
than another.
A formal definition will follow, but let's try to grasp the point with an example first.
Consider the function
f(x,y) = x^2 + y^2
and let's compute the Taylor expansion at (1,1).
This is:
f(x,y) = 2 + 2*(x-1) + 2*(y-1) + R1 = 2*x + 2*y - 2 + R1
The best linear approximation is therefore
h(x,y) = 2 + 2*(x-1) + 2*(y-1)
Now let's consider another function
k(x,y) = 2 + 3*(x-1) + 2*(y-1)
where we've changed one of the coefficients. Note that
k(1,1) = h(1,1) = f(1,1) = 2
so both k(x,y) and h(x,y) are approximations to
f(x,y) at (1,1).
However,
|f(x,y)-h(x,y)| = |R1|
while, since k(x,y) = h(x,y) + (x-1),
|f(x,y)-k(x,y)| = |R1 - (x-1)|
Now, which of these discrepancies is bigger?
Well, for (x,y) close to (1,1) it is typically going to be the second one. Consider
(x,y) = (1.000001,1) = (1 + 10^-6,1)
Then it is easy to see from the bound on the size of R1
in our theorem that
|R1| < 10^-11
but that
|R1 - (x-1)| is almost equal to 10^-6
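The comparison is easy to reproduce numerically. This short Python sketch evaluates both discrepancies at the point from the text and at a second nearby point displaced in y instead; the functions f, h and k are the ones defined above, and the printed values are subject to ordinary floating point rounding.

```python
# Discrepancies |f - h| and |f - k| near the base point (1, 1).

def f(x, y):
    return x**2 + y**2

def h(x, y):
    # Linear function obtained by truncating the Taylor expansion at (1, 1).
    return 2 + 2 * (x - 1) + 2 * (y - 1)

def k(x, y):
    # Same as h, but with the x coefficient changed from 2 to 3.
    return 2 + 3 * (x - 1) + 2 * (y - 1)

for point in ((1.000001, 1.0), (1.0, 1.000001)):
    x, y = point
    print(point, abs(f(x, y) - h(x, y)), abs(f(x, y) - k(x, y)))
```

At the first point the discrepancy for k is about a million times larger than the one for h. At the second point the two agree, because the altered coefficient multiplies (x-1), which vanishes there.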
Here is the point: The size of R1 is always at most a fixed multiple of the square of the distance from the base point. If we modify any of the
linear coefficients in the linear function we get by truncating the Taylor expansion, we will get an error that is proportional to the distance
at some point close by. (Note that this happened for
(x,y) = (1.000001,1) in our example, but wouldn't have
for (x,y) = (1,1.000001).)
Now when the distance is small, it is better to have an error going like the square of the distance, since the square of a small number is really small --
as with 10^-6 and 10^-12.
That is the idea. If you understand that, you don't really need the
definition that follows.
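Definition of Better Approximation
Let h(x,y) and k(x,y) be two functions that
approximate f(x,y) at the point (x0,y0).
Let R > 0 be a given radius, and let H(R) denote the
"worst case" disagreement between h(x,y) and f(x,y)
for (x,y) in the disk of radius R centered at
(x0,y0). That is,
H(R) is the maximum value of |f(x,y) - h(x,y)|
for (x,y) in the disk of radius R centered at
(x0,y0). Let K(R) be the
corresponding quantity for the function k(x,y).
Then h(x,y) is a better approximation to
f(x,y) at (x0,y0) than
k(x,y) is provided there is an R0 > 0 such that
H(R) < K(R)
for all R < R0.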
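The quantities H(R) and K(R) can be estimated by sampling the disk. The Python sketch below does this for the example functions f, h and k used earlier, with base point (1,1). It is only an estimate under assumptions: `worst_case` is a hypothetical helper that takes the largest discrepancy over a polar grid of sample points, so in general it can only underestimate the true maximum over the disk.

```python
import math

def f(x, y):
    return x**2 + y**2

def h(x, y):
    return 2 + 2 * (x - 1) + 2 * (y - 1)

def k(x, y):
    return 2 + 3 * (x - 1) + 2 * (y - 1)

def worst_case(approx, x0, y0, radius, n_radii=20, n_angles=72):
    # Largest sampled value of |f - approx| over a polar grid covering
    # the disk of the given radius centered at (x0, y0).
    worst = 0.0
    for i in range(1, n_radii + 1):
        r = radius * i / n_radii
        for j in range(n_angles):
            theta = 2 * math.pi * j / n_angles
            x = x0 + r * math.cos(theta)
            y = y0 + r * math.sin(theta)
            worst = max(worst, abs(f(x, y) - approx(x, y)))
    return worst

for R in (0.1, 0.01, 0.001):
    H = worst_case(h, 1.0, 1.0, R)
    K = worst_case(k, 1.0, 1.0, R)
    print(R, H, K)
```

For this example H(R) behaves like R^2 while K(R) behaves like R + R^2, so H(R) < K(R) for every radius tried, in line with h being the better approximation.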
Of course we say that a function is the best approximation if it is better than any other. We get the best approximations, in the sense defined above, by simply throwing out the remainder terms in Taylor's theorem:
Theorem on the Best Linear Approximation
The best linear approximation to f(x,y) at the point
(x0,y0) is
h(x,y) = a*(x - x0) + b*(y - y0) + c
with
a = fx(x0,y0)
b = fy(x0,y0)
c = f(x0,y0)
The proof of this theorem is just a simple analysis of the error term in Taylor's Theorem. Similarly, we have
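Theorem on the Best Quadratic Approximation
The best quadratic approximation to f(x,y) at the point
(x0,y0) is
h(x,y) = A*(x - x0)^2 + B*(y - y0)^2 + C*(x - x0)*(y - y0) + a*(x - x0) + b*(y - y0) + c
with
2*A = fxx(x0,y0)
2*B = fyy(x0,y0)
C = fxy(x0,y0)
and a, b and c are as above.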
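When the partial derivatives are awkward to compute by hand, the six coefficients can be estimated with finite differences. The following Python sketch is one possible way to do that, not a method from the notes: `quadratic_coefficients` is a hypothetical helper, the step size is an arbitrary choice, and the results are approximations whose accuracy depends on that step.

```python
def quadratic_coefficients(f, x0, y0, step=1e-4):
    # Estimate a, b, c, A, B, C of the best quadratic approximation
    #   A*(x-x0)^2 + B*(y-y0)^2 + C*(x-x0)*(y-y0) + a*(x-x0) + b*(y-y0) + c
    # using central differences for the partial derivatives of f.
    d = step
    c = f(x0, y0)
    a = (f(x0 + d, y0) - f(x0 - d, y0)) / (2 * d)          # approximates fx
    b = (f(x0, y0 + d) - f(x0, y0 - d)) / (2 * d)          # approximates fy
    fxx = (f(x0 + d, y0) - 2 * c + f(x0 - d, y0)) / d**2
    fyy = (f(x0, y0 + d) - 2 * c + f(x0, y0 - d)) / d**2
    fxy = (f(x0 + d, y0 + d) - f(x0 + d, y0 - d)
           - f(x0 - d, y0 + d) + f(x0 - d, y0 - d)) / (4 * d**2)
    # 2*A = fxx, 2*B = fyy, C = fxy, as in the theorem.
    return {"A": fxx / 2, "B": fyy / 2, "C": fxy, "a": a, "b": b, "c": c}

def example(x, y):
    # Illustrative polynomial; any smooth function could be used here.
    return x**3 + 2 * x * y + y**2 - 1

coefficients = quadratic_coefficients(example, 0.0, 0.0)
for name in ("A", "B", "C", "a", "b", "c"):
    print(name, round(coefficients[name], 6))
```

For the polynomial used here at (0,0), the estimates should be close to A = 0, B = 1, C = 2, a = 0, b = 0 and c = -1, which corresponds to the quadratic y^2 + 2*x*y - 1.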
In the next section of the notes we will study the geometry of best linear approximations to f(x,y) -- and see the connection with tangent planes to the graph of
f(x,y).
WORKED PROBLEMS
Worked Problem 1
Let
f(x,y) = x^2 - y^2 - 2*x - 2*y
Notice that this function is quadratic, so you should already have a pretty good
idea about what the best linear and quadratic approximations are.
Solution to Worked Problem 1
At the first point, (0,0), one easily computes that
f(0,0) = 0, fx(0,0) = -2, fy(0,0) = -2
fxx(0,0) = 2, fyy(0,0) = -2, fxy(0,0) = 0
Therefore, the best linear approximation h(x,y) is
h(x,y) = -2*x - 2*y
and the best quadratic approximation q(x,y) is
q(x,y) = x^2 - y^2 - 2*x - 2*y
which is just f(x,y) as you should have expected.
For the next point, (1,-1), we have fx(x,y) = 2*x - 2 and fy(x,y) = -2*y - 2, so
one easily computes that
f(1,-1) = 0, fx(1,-1) = 0, fy(1,-1) = 0
fxx(1,-1) = 2, fyy(1,-1) = -2, fxy(1,-1) = 0
Therefore, the best linear approximation at (1,-1), h(x,y), is the constant function
h(x,y) = 0
and the best quadratic approximation at (1,-1), q(x,y), is
q(x,y) = (x-1)^2 - (y+1)^2
which is another way of writing f(x,y).
Worked Problem 2
Let
f(x,y) = x^3 + 2*x*y + y^2 - 1
Notice that this function is not quadratic, but perhaps you already have a pretty good
idea about what the best linear and quadratic approximations are.
Solution to Worked Problem 2
At the first point, (0,0), one easily computes that
f(0,0) = -1, fx(0,0) = 0, fy(0,0) = 0
fxx(0,0) = 0, fyy(0,0) = 2, fxy(0,0) = 2
Therefore, the best linear approximation h(x,y) is
h(x,y) = -1
and the best quadratic approximation q(x,y) is
q(x,y) = y^2 + 2*x*y - 1
which is just what you should have expected.
For the next point, (1,-1),
one easily computes that
f(1,-1) = -1, fx(1,-1) = 1, fy(1,-1) = 0
fxx(1,-1) = 6, fyy(1,-1) = 2, fxy(1,-1) = 2
Therefore, the best linear approximation at (1,-1), h(x,y), is
h(x,y) = (x-1) - 1 = x - 2
and the best quadratic approximation at (1,-1), q(x,y), is
q(x,y) = 3*(x-1)^2 + (y+1)^2 + 2*(x-1)*(y+1) + (x-1) - 1
Here the coefficient of (x-1)*(y+1) is fxy(1,-1) = 2.
Is this what you expected?
POSED PROBLEMS
Posed Problem 1
Let
f(x,y) = x^4*y + y^4*x - 2*x*y + 3
Notice that this function is not quadratic, but perhaps you already have a pretty good
idea about what the best linear and quadratic approximations are.
Posed Problem 2
Let
f(x,y) = e^(x^2 + y^2)
Notice that this function is not a polynomial, so it is harder to
find the approximations by algebra.
The problem to be considered here is: how can we do this for functions of several variables?
Taylor's Theorem In Two Variables: First Order Case
For any function f(x,y) with continuous second order derivatives
Taylor's Theorem In Two Variables: Second Order Case
For any function f(x,y) with continuous third order derivatives
Best Linear and Quadratic Approximation
Now we recall one of the basic truths of mathematics: The nicest functions are the linear functions. After that, the nicest are the quadratic functions; i.e., the second degree polynomials.
Definition of Approximation of Functions
Let h(x,y) and f(x,y) be two functions. Then
if
Definition of Better Approximation
Let h(x,y) and k(x,y) be two functions that
approximate f(x,y) at the point (x0,y0).
Let R > 0 be a given radius, and let H(R) denote the
"worst case" disagreement between h(x,y) and f(x,y)
for (x,y) in the disk of radius R centered at
(x0,y0). That is,
H(R) is the maximum value of |f(x,y) - h(x,y)|
for (x,y) in the disk of radius R centered at
(x0,y0). Let K(R) be the
corresponding quantity for the function k(x,y).
Theorem on the Best Linear Approximation
The best linear approximation to f(x,y) at the point
(x0,y0) is
Theorem on the Best Quadratic Approximation
The best quadratic approximation to f(x,y) at the point
(x0,y0) is
WORKED PROBLEMS
Worked Problem 1
Let
Solution to Worked Problem 1
One easily computes that
Worked Problem 2
Let
Solution to Worked Problem 2
One easily computes that
POSED PROBLEMS
Posed Problem 1
Let
Posed Problem 2
Let