Again recall the very simple idea that is behind many of the methods of multivariable calculus. Namely, one studies functions of several variables by applying the single variable calculus to one dimensional "slices" of them.
We have seen how this leads directly from Taylor's Theorem for a single variable to Taylor's Theorem for several variables. This led us to the notion of best linear approximation. Now let's understand the geometry behind this.
Recall that the strategy is to build single variable functions F(t) by combining multivariable functions f(x,y) with parameterized paths (x(t),y(t)) as follows: We define
F(t) = f(x(t),y(t))
Now F(t) is a function of a single variable -- namely t -- and we can differentiate it or integrate it with the many single variable methods at our disposal.
We started out with a function f(x,y) of two variables. What does differentiating F(t) tell us about f(x,y)? We can do it, but what is the point?
You can understand, in non-mathematical terms, what this tells you if you take the following point of view: Think of a piece of land that is to be surveyed. Think of x and y as the coordinates on a map of this piece of land, and finally think of f(x,y) as giving the altitude at the point (x,y).
Imagine you walk along a path that goes all over this piece of land, taking notes on the rate of change in your altitude, i.e., the slope, at points all along your path.
If you took enough notes, you could afterwards construct an accurate three dimensional model of the region. That is, you can learn everything there is to know about the altitude function f(x,y) this way. And since slopes are given by derivatives, you can see how by studying F'(t) for lots of parameterized paths (x(t),y(t)) -- which amounts to just another way of saying "keeping track of how your altitude changes as you walk all over the place" -- you can learn all about the altitude function.
Now, how do we make mathematics out of this? In crude terms, how do we turn these musings about strategy into formulas?
The key to this is to extract the geometrical content from F'(t). Geometry will be the key to everything we do in this course.
To carry this discussion back to mathematics as such, let's consider a path (x(t),y(t)) that passes through some point (x0,y0) at t=0, and let's keep track of how our altitude changes as we walk along the path. Now let's compute the derivative of F(t) = f(x(t),y(t)) at t = 0. By the chain rule,
F'(0) = fx(x0,y0)*x'(0) + fy(x0,y0)*y'(0)
This is a combination of four numbers, namely
fx(x0,y0), x'(0), fy(x0,y0), y'(0)
Two came from f, and two are determined by the particular path through the point (x0,y0), namely the two components of the velocity vector
V = (x'(0),y'(0))
For different velocity vectors, we get a different value for the derivative. Suppose for example that V = (3,-1). Then we have
F'(0) = fx(x0,y0)*3 + fy(x0,y0)*(-1)
But suppose the velocity vector had been V = (2,4)? Then we would have had
F'(0) = fx(x0,y0)*2 + fy(x0,y0)*4
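The two computations above can be checked numerically. The following Python sketch is only an illustration: the altitude function f(x,y) = x^2*y + y, the base point (1,2), and all helper names are assumptions that do not come from the text. It compares the chain rule value fx*x'(0) + fy*y'(0) with a finite-difference estimate of F'(0) along a straight path with the given velocity.

```python
def f(x, y):
    # Hypothetical altitude function, chosen only for illustration.
    return x**2 * y + y

def fx(x, y):
    # Partial derivative of f with respect to x, computed by hand.
    return 2 * x * y

def fy(x, y):
    # Partial derivative of f with respect to y, computed by hand.
    return x**2 + 1

def chain_rule_value(x0, y0, v):
    # F'(0) = fx(x0,y0)*x'(0) + fy(x0,y0)*y'(0), where v = (x'(0), y'(0)).
    return fx(x0, y0) * v[0] + fy(x0, y0) * v[1]

def slice_derivative(x0, y0, v, h=1e-6):
    # Central-difference estimate of F'(0) for the slice
    # F(t) = f(x0 + v[0]*t, y0 + v[1]*t).
    def F(t):
        return f(x0 + v[0] * t, y0 + v[1] * t)
    return (F(h) - F(-h)) / (2 * h)

# The two velocity vectors used in the text, at the assumed point (1, 2).
for v in [(3, -1), (2, 4)]:
    print(v, chain_rule_value(1, 2, v), slice_derivative(1, 2, v))
```

The two printed values for each velocity should agree up to the error of the finite-difference estimate.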
There are clearly infinitely many such derivatives, since there are infinitely many possible velocity vectors V. Each one of these derivatives is called a directional derivative. Specifically, we make the following definition: Given a vector (a,b), the directional derivative of f in the direction (a,b) at the point (x0,y0) is the number
fx(x0,y0)*a + fy(x0,y0)*b
or, what is the same, the derivative at t = 0 of the function
F(t) = f(x0 + a*t,y0 + b*t)
To see where this last bit came from, suppose we are given an arbitrary vector (a,b).
Now clearly there are infinitely many paths through (x0,y0)
that have this velocity vector (a,b) at t=0.
But a particularly simple one is the straight-line path
(x(t),y(t)) =(x0 + a*t,y0 + b*t)
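To illustrate that only the velocity at t = 0 matters, and not the particular path, here is a small Python sketch. The sample function, the base point, and the curved path are all hypothetical choices made for this example. It estimates F'(0) along the straight-line path and along a curved path with the same velocity (a,b), and compares both with fx*a + fy*b.

```python
import math

def f(x, y):
    # Hypothetical sample function, chosen only for illustration.
    return math.sin(x) * y + x * y**2

def derivative_at_zero(F, h=1e-5):
    # Central-difference estimate of F'(0).
    return (F(h) - F(-h)) / (2 * h)

x0, y0 = 0.5, 1.0   # base point (an arbitrary choice)
a, b = 3.0, -1.0    # prescribed velocity vector

def straight(t):
    # Straight-line path through (x0, y0) with velocity (a, b).
    return (x0 + a * t, y0 + b * t)

def curved(t):
    # A curved path through (x0, y0) with the same velocity (a, b) at t = 0.
    return (x0 + a * t + 4 * t**2, y0 + math.sin(b * t))

d_straight = derivative_at_zero(lambda t: f(*straight(t)))
d_curved = derivative_at_zero(lambda t: f(*curved(t)))

# Value predicted by fx*a + fy*b, with the partials computed by hand.
fx0 = math.cos(x0) * y0 + y0**2
fy0 = math.sin(x0) + 2 * x0 * y0
predicted = fx0 * a + fy0 * b

print(d_straight, d_curved, predicted)
```

All three numbers should agree up to finite-difference error, which is why the straight-line path is a convenient representative.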
The definition of directional derivatives
becomes a bit more intelligible if we remember two things:
First, the vector (x'(0),y'(0)) is the velocity of the
parameterized curve (x(t),y(t)) at t = 0. Second, given
two vectors X = (a,b) and Y = (c,d),
the dot product dot(X,Y) is given by
dot(X,Y) = a*c + b*d
Remembering these two things, we make the following definition:
the gradient of f at (x0,y0), denoted by Gradf(x0,y0), is the vector
Gradf(x0,y0) = (fx(x0,y0),fy(x0,y0))
Then, if we write
V = (x'(0),y'(0))
for our velocity vector,
we can rewrite our directional derivatives in the following
geometrical form:
F'(0) = dot(Gradf(x0,y0),V)
In fact, we will usually suppress explicit mention of the arguments t, x0 and y0
-- which after all are taking on "general", i.e., non-specific, values in our discussion anyway --
and simply write
F'(0) = dot(Gradf,V)
This last form of the chain rule certainly looks more geometric, written out
in terms of vectors and a dot product. It also separates what comes from f, and what comes from the path. In doing so, it singles out
the data we need from f to compute directional derivatives. The reason
for defining the gradient as a vector is exactly that it contains all the
data from f that we need to compute arbitrary directional
derivatives, and a simple vector operation can be used to
do the computation.
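The separation between "what comes from f" and "what comes from the path" can be sketched in a few lines of Python. The function f(x,y) = x^2*y + y and the point (1,2) are assumptions made only for this illustration. The gradient is computed once, and every directional derivative is then a single dot product.

```python
def grad_f(x, y):
    # Gradient of the hypothetical sample f(x,y) = x**2*y + y,
    # with both partial derivatives computed by hand.
    return (2 * x * y, x**2 + 1)

def dot(X, Y):
    # Dot product of two plane vectors.
    return X[0] * Y[0] + X[1] * Y[1]

def directional_derivative(x0, y0, V):
    # dot(Gradf(x0,y0), V): the data from f enters only through the gradient.
    return dot(grad_f(x0, y0), V)

g = grad_f(1, 2)  # computed once at the assumed point (1, 2)
for V in [(3, -1), (2, 4), (1, 0), (0, 1)]:
    print(V, dot(g, V))
```

The velocities (1,0) and (0,1) return the two partial derivatives themselves, as one would expect.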
A good mathematical definition is never made for mysterious reasons. It
is made for practical reasons with a particular class of computations in view.
The nomenclature "directional derivative" may seem a bit inappropriate, since
the value of F'(0) depends not only on the direction of the velocity
vector, but also on its magnitude. However, if we double the magnitude of the
velocity vector, the directional derivative doubles too. On the other hand, if we change
the direction by, say, doubling the angle the velocity vector makes with the x axis,
anything at all can happen to the size of the directional derivative. So the
name is not so bad: the really interesting dependence on V
is in the dependence on the direction of V. Once we know
the directional derivatives of f for all unit vector
velocities, we know them for all velocities.
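A short Python sketch of this scaling behavior follows. The gradient value (4, 2) and the velocity (3, -1) are hypothetical numbers chosen for illustration, and the velocity is assumed to be nonzero so that it can be normalized.

```python
import math

def dot(X, Y):
    # Dot product of two plane vectors.
    return X[0] * Y[0] + X[1] * Y[1]

def unit_and_speed(V):
    # Split a nonzero velocity V into its unit direction and its magnitude.
    speed = math.hypot(V[0], V[1])
    return (V[0] / speed, V[1] / speed), speed

grad = (4.0, 2.0)   # hypothetical gradient value at some point
V = (3.0, -1.0)     # hypothetical velocity vector

U, speed = unit_and_speed(V)

d_V = dot(grad, V)                           # directional derivative along V
d_2V = dot(grad, (2 * V[0], 2 * V[1]))       # doubling V doubles the derivative
d_from_unit = speed * dot(grad, U)           # recovered from the unit direction

print(d_V, d_2V, d_from_unit)

# Dependence on direction alone: unit vectors at a few angles.
for k in range(8):
    theta = k * math.pi / 4
    print(round(theta, 3), dot(grad, (math.cos(theta), math.sin(theta))))
```

The loop at the end shows how differently the derivative behaves as the direction rotates, in contrast to the simple proportional dependence on magnitude.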
Now that we know about partial derivatives, we ask: what is the derivative of a function
f(x,y) of two variables?
For a function g(x) of one variable, we were always working with the derivative
g'(x). Now we've got all these infinitely many directional derivatives. Isn't there something we
can single out as the derivative? It may seem we can make progress by
considering the two numbers fx(x0,y0)
and fy(x0,y0). This is better than infinitely many,
but these are just two particular directional derivatives: namely those corresponding to
the directions (1,0) and (0,1) respectively. Any special place given to these directions would be an
artifact of our coordinate system.
There is a good notion of the derivative of a function f(x,y) of two variables.
To find it we just have to stop thinking in terms of numbers, and to think instead in
terms of geometry.
Recall that geometrically speaking the derivative g'(x0) for a function g(x)
of a single variable x specifies the tangent line
to the graph y= g(x) at x0.
For a function f(x,y) of two variables, the corresponding quantity of interest is the
tangent plane to the graph z = f(x,y) at the point (x0,y0).
Clearly, this is a single, well defined geometrical entity, and whatever specifies this plane has
every bit as much right to be considered the derivative of f(x,y) as what specifies the
tangent line of g(x) -- namely g'(x).
So, what does specify this plane, and how is it related to the directional derivatives that we have been
considering? That is the question we shall now answer.
To do this, recall that the equation of a plane takes the general form
z = a*x + b*y + c
and note that this plane is the graph of a linear function; i.e. the function h(x,y)
h(x,y) = a*x + b*y + c
So what we are looking for is a way to compute the numbers a, b and c that give the
tangent plane to f at (x0,y0).
This will be easy as soon as we have a precise definition of what it means for the graphs
of two functions to be tangent to one another.
Let f(x,y) and h(x,y) be two functions of x and y.
Consider their graphs; i.e., z = f(x,y) and z = h(x,y). These graphs are said to be
tangent to one another at (x0,y0) provided
(1) f(x0,y0) = h(x0,y0)
(2) For every vector (a,b), the directional derivatives of f and h along (a,b) are equal at
(x0,y0).
That is, not only do the values of the functions agree at the point of tangency, but
also their slopes in any direction.
Now suppose the graphs of f(x,y) and h(x,y) are tangent to one another at (x0,y0).
Suppose further that h(x,y) is linear; i.e., for some a,
b and c,
h(x,y) = a*x + b*y + c
Then by (1), plugging in x0 and y0 we get
f(x0,y0) = a*x0 + b*y0 + c
or
c = f(x0,y0) - a*x0 - b*y0
Now, the directional derivative of f(x,y) in the direction (1,0) is fx(x0,y0),
while the directional derivative of h(x,y) in the direction (1,0) is h_x(x0,y0) = a.
In the same way,
the directional derivative of f(x,y) in the direction (0,1) is fy(x0,y0),
while the directional derivative of h(x,y) in the direction
(0,1) is h_y(x0,y0) = b.
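So we must have
a = fx(x0,y0) and b = fy(x0,y0)
Conversely, with this choice the directional derivatives of f and h agree along every vector (u,v),
since both are equal to fx(x0,y0)*u + fy(x0,y0)*v.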
Returning to our expression for c, we have
c = f(x0,y0) - fx(x0,y0)*x0 - fy(x0,y0)*y0
Now h(x,y) = a*x + b*y + c becomes
h(x,y) = fx(x0,y0)*x + fy(x0,y0)*y + [f(x0,y0) - fx(x0,y0)*x0 - fy(x0,y0)*y0]
or what is the same
h(x,y) = fx(x0,y0)*(x - x0) + fy(x0,y0)*(y - y0) + f(x0,y0)
Note that h(x,y) is just the best linear approximation to
f(x,y) at (x0,y0) as defined in the
section on the multivariable Taylor's theorem.
We can, however, rewrite this a bit more clearly in vector terms, using the definitions made above:
the tangent plane to f at (x0,y0) is the graph of the linear function
h(x,y) = dot(Gradf(x0,y0),(x-x0,y-y0)) + f(x0,y0)
which is of course the plane
z - dot(Gradf(x0,y0),(x-x0,y-y0)) - f(x0,y0) = 0
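The tangent plane formula can be tried out with a small Python sketch. The function f(x,y) = x^2*y + y, the point (1,2), and the helper names are hypothetical choices for illustration. The sketch builds h from the gradient and then checks the two tangency conditions, with slopes estimated by central differences.

```python
def f(x, y):
    # Hypothetical sample function, chosen only for illustration.
    return x**2 * y + y

def grad_f(x, y):
    # Gradient of f, with both partial derivatives computed by hand.
    return (2 * x * y, x**2 + 1)

def dot(X, Y):
    # Dot product of two plane vectors.
    return X[0] * Y[0] + X[1] * Y[1]

def tangent_plane(x0, y0):
    # Returns h(x,y) = dot(Gradf(x0,y0), (x - x0, y - y0)) + f(x0,y0).
    g = grad_f(x0, y0)
    z0 = f(x0, y0)
    def h(x, y):
        return dot(g, (x - x0, y - y0)) + z0
    return h

def slope_along(func, x0, y0, V, step=1e-6):
    # Central-difference estimate of the directional derivative of func along V.
    forward = func(x0 + V[0] * step, y0 + V[1] * step)
    backward = func(x0 - V[0] * step, y0 - V[1] * step)
    return (forward - backward) / (2 * step)

h = tangent_plane(1, 2)

# Condition (1): the values agree at the point of tangency.
print(f(1, 2), h(1, 2))

# Condition (2): the slopes agree in several sample directions.
for V in [(1, 0), (0, 1), (3, -1), (2, 4)]:
    print(V, slope_along(f, 1, 2, V), slope_along(h, 1, 2, V))
```

Near the point of tangency the values of h stay close to those of f, which is the sense in which h is the best linear approximation.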
Definition of directional derivatives
Given a vector (a,b), the directional derivative of f in the direction (a,b) at the point (x0,y0) is the number
fx(x0,y0)*a + fy(x0,y0)*b

Definition of the gradient
The gradient of f at (x0,y0), denoted by Gradf(x0,y0), is defined to be the vector
Gradf(x0,y0) = (fx(x0,y0),fy(x0,y0))

Tangency and Tangent Planes
Definition of tangency for graphs in two variables
Let f(x,y) and h(x,y) be two functions of x and y.
Consider their graphs; i.e., z = f(x,y) and z = h(x,y). These graphs are said to be
tangent to one another at (x0,y0) provided
(1) f(x0,y0) = h(x0,y0)
(2) For every vector (a,b), the directional derivatives of f and h along (a,b) are equal at (x0,y0).

Theorem on the tangent plane of f in terms of Gradf
The tangent plane to f at (x0,y0) is the graph of the linear function
h(x,y) = dot(Gradf(x0,y0),(x-x0,y-y0)) + f(x0,y0)

WORKED PROBLEMS

Worked Problem 1
Let
f(x,y) = x^2 - y^2 - 2*x - 2*y
Find the gradient of f and the equation of the tangent plane at the points (0,0) and (1,-1).

Solution to Worked Problem 1
One easily computes that
f(0,0) = 0    fx(0,0) = -2    fy(0,0) = -2
Therefore, the gradient Gradf(0,0) is
Gradf(0,0) = (-2,-2)
and the equation of the tangent plane at (0,0) is
z = -2*x - 2*y
For the next point, (1,-1), one easily computes that
f(1,-1) = 0    fx(1,-1) = 0    fy(1,-1) = 0
Therefore, the gradient Gradf(1,-1) is
Gradf(1,-1) = (0,0)
and the equation of the tangent plane at (1,-1) is
z = 0

Worked Problem 2
Let
f(x,y) = x^3 + 2*x*y + y^2 - 1
Find the gradient of f and the equation of the tangent plane at the points (0,0) and (1,-1).

Solution to Worked Problem 2
One easily computes that
f(0,0) = -1    fx(0,0) = 0    fy(0,0) = 0
fxx(0,0) = 0    fyy(0,0) = 2    fxy(0,0) = 2
Therefore, the gradient Gradf(0,0) is
Gradf(0,0) = (0,0)
and the equation of the tangent plane at (0,0) is
z = -1
For the next point, (1,-1), one easily computes that
f(1,-1) = -1    fx(1,-1) = 1    fy(1,-1) = 0
fxx(1,-1) = 6    fyy(1,-1) = 2    fxy(1,-1) = 2
Therefore, the gradient Gradf(1,-1) is
Gradf(1,-1) = (1,0)
and the equation of the tangent plane at (1,-1) is
z = 1*(x - 1) + 0*(y + 1) - 1 = x - 2

POSED PROBLEMS

Posed Problem 1
Let
f(x,y) = x^4*y + y^4*x - 2*x*y + 3

Posed Problem 2
Let
f(x,y) = e^(x^2 + y^2)
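The two worked problems can be checked numerically. The following Python sketch is an illustration with hypothetical helper names: it estimates the partial derivatives by central differences rather than computing them by hand, and returns the coefficients (a, b, c) of the tangent plane z = a*x + b*y + c, using c = f(x0,y0) - a*x0 - b*y0 as derived in the text.

```python
def f1(x, y):
    # Function from Worked Problem 1.
    return x**2 - y**2 - 2 * x - 2 * y

def f2(x, y):
    # Function from Worked Problem 2.
    return x**3 + 2 * x * y + y**2 - 1

def gradient(f, x0, y0, h=1e-5):
    # Central-difference estimates of fx(x0,y0) and fy(x0,y0).
    fx = (f(x0 + h, y0) - f(x0 - h, y0)) / (2 * h)
    fy = (f(x0, y0 + h) - f(x0, y0 - h)) / (2 * h)
    return (fx, fy)

def tangent_plane_coefficients(f, x0, y0):
    # Coefficients (a, b, c) of the tangent plane z = a*x + b*y + c.
    a, b = gradient(f, x0, y0)
    c = f(x0, y0) - a * x0 - b * y0
    return (a, b, c)

for name, f in [("Worked Problem 1", f1), ("Worked Problem 2", f2)]:
    for point in [(0, 0), (1, -1)]:
        a, b, c = tangent_plane_coefficients(f, *point)
        print(name, point, round(a, 6), round(b, 6), round(c, 6))
```

Because the derivatives are only estimated, the printed coefficients match the exact ones up to a small numerical error.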