TOPIC: LAGRANGE MULTIPLIERS

The problem to be considered:

We are given a constraint equation

g(x,y) = 0

and an optimizee function f(x,y).

The goal is to find the maximum and minimum values of f among the values taken on by f along the constraint curve; i.e., the set of points for which g(x,y) = 0. Moreover, we would like to find all of the points (x,y) at which these maxima and minima are attained.

An Example Problem

Such problems often arise in applications. Here is an example. Suppose you want to make cylindrical tin cans that hold, say, one quart of some liquid. One thing you might like to do is to minimize the amount of material needed to make the can. This is proportional to the total surface area. If the radius of the can is R, and the height is H, the surface area is

2*pi*R2 + 2*pi*R*H

while the volume Vis

pi*R2*H = V

The problem is to vary R and H so that we keep the volume constant.

We can write the constraint equation above in the form g(x,y) =0 by taking

x = R y = H and introducing a = V/pi.

Then, subtracting the right side of the constraint equation from the left, we get

g(x,y) = x2*y -a = 0

Then, ignoring a constant factor of 2*pi, the function to be minimized is

f(x,y) = x2 + x*y

Note something important here: what we want to know is what x and y are at the minimum, since that tells us what shape to make the can. And the place (x,y) at which the minimum occurs doesn't depend on whether we take our cost function as it is, or multiply it by any constant -- that just changes the minimum value, not where the minimization occurs. Think about it. This will be clear also from the equations we derive to solve this problem. But that comes later.

Also note that since we are considering our variables as lengths, we are only concerend with

x > 0 and y > 0

Graphical Solution of this Problem

Now lets solve this problem graphicly, before developing the theory. The Lagrange multipliers applet lets you enter a cost function f(x,y) and a constraint function g(x,y), to be used in the constraint equation g(x,y) = 0. It then graphs f(x,y) in color, indicating the highes values with white pixels, low values with green pixels, and in between heights get inbetween colors. (Think of deep green valeys and snowy mountain tops.) The constraint curve is drawn in blue. You can also set the region to be graphed. Here we have chosen

-4 < x,y < 4

Of course, we also have to fix the value of a. We have taken a = 1 to produce these graphs.

Clicking on points draws in the contour curves for the cost function through those points. Here are some pictures:

The first picture shows four contour curves at large heights, i.e., values of f -- see the "snow" in the upper right hand corner? As we "walk along the blue path" from right to left in the picture, we are crossing to lower and lower levels.

The next picture shows some contour curves for some really low values. These are lower than anything you can get on the constraint curve -- clearly there is a minimum amount of metal you can use to make a can holding a quart...

The final picture has just one contour curve that is pretty much just right: Move it any fursther to the left, towards lower values, and it wouldn't touch the constraint curve.

Let's change the scale and zoom in to get a better look:

In the first one, the contour curve in the middle just touches the constraint curve. The place wehre they touch is where the minimum occurs.

Think about this until it is really clear.

In the applet, if you put the cursor over a point it tells you the coordinate at that point, and the value of f(x,y) there. Do this and you find it says that the mimimum occurs near to

x = 0.8 and y = 1.6

Actually it is a bit hard to tell, since both the constraint curve and the contour curve have very little curvature at this point. In fact, the curves seem pretty close to touching for all y in the range

1.15 < y < 2.95

The second picture above is a closeup, and you see it doesn't help us pin things down much better.

But here is what we have learned graphicaly: We should make our can such that the height is about twice the radius, i.e., the height and diameter should be the same. It turns out, as we shall see, that this is the exact answer, and it is independent of a; i.e., it is true independent of the volume. This will come out of the analytic approach, but it can be seen directly. It has to do with the different scaling properties of area and volume -- basicly, a is always 1 in some system of units, so our result is valid in those units. But since the statement of this result -- namely, that the height and diameter should be the same -- doesn't involve the units, we have the result in any other system of units as well.

This kind of reasoning is called a scaling argument, and is fundamentally important in the mathematical and physical sciences.

How can we derive this without relying so much on pictures -- which require a lot of computational resources to draw?

Let's develop an analytic appraoch to the problem.

A First Analytic Approach

We can reduce this to a familiar one variable optimization problem by parameterizing the constraint curve.

This is easy, but only because of the particularly simple constraint equation that we are faced with. If we put

(x(t),y(t)) = (1/sqrt(t),t)

Then

g(x(t),y(t)) = (1/sqrt(t))2*t - 1 = 0

so this parameterized curve does trace out the constraint curve. We could have also used

(x(t),y(t)) = (t,1/t2)

but we will be able to make an important connection with the nearly vertical region of tangency in the graphs above if y = t.

Now as usual, define

F(t) = f(x(t),y(t)) = 1/t + sqrt(t)

Since this gives us the area used at the point (x(t),y(t)) on the constraint curve, we may as well be looking for the minimum of F(t) over all t > 0 . To find it, first differentiate:

F'(t) = -1/t2 + 1/(2*sqrt(t)) Setting this eqaul to zero, one easily solves to find

t = 22/3

At this value of t, the path is at (x,y) = (2-1/3,22/3)

We aren't done yet -- we have to consider the "limiting ends" of our constraint curve. But clearly from the formula for it, F(t) tends to infinity as either t tends to zero or infinity. So the minimum occurs at

(x0,y0) = (2-1/3,22/3)

This is the answer, but notice that there is another way of expressing it since y0 = 2*x0, i.e., the height should be twice the radius. As we have explained above, this staement which is indpeendent of units applies to all volumes.

Now, why don't you see all the cans in the supermarket looking like this? The answer has to do with the broad region of tangency that we found in the graphs above. This can be seen analytically, but it we digressive to do so here.

A Second Analytic Approach

The problem with the first approach is that it is often hard to parameterize the constraint curve, or to work with the equations one gets that way.

If for example

g(x,y) = x2 + y2 - 1 and f(x,y) = x4 + y6 +2*x*y

then we can parameterize the constraint curve, which is the unit circle with sin(t) and cos(t) in the usual way:

(x(t),y(t)) = (cos(t),sin(t))

This leads to

F(t) = f(x(t),y(t)) = cos4(t) + sin6(t) + 2*cos2(t)*sin2(t)

Differentialting this doesn't lead to a particulalry friendly equation.

However, let's go back and look at what F'(t) is geometricly. Recall that we can write is as

F'(t) = dot(gradf,V)

where V is the velocity vector to the contstraint curve at the point (x(t),y(t)). This is progress:

We have that F'(t) = 0 whenever gradf and V are orthogonal. To go further we need to get a formula for V in terms of g. This we can do, because the constraint curve is a contour curve of g. That is the point of writing the constraint equation in the form g(x,y) = 0.

We know from our analysis of gradients and contour curves that the tangential directions to the contour curve, such as V, are orthogonal to the gradient of g. Applying the perp operation to gradg, we get a vector that points in the tangent direction; i.e., direction of V. Thus,

V = r*gradgperp

for some multiple r. Now

F'(t) = dot(gradf,V)

r*dot(gradf,gradgperp)

So that

F'(t) = 0 whenever dot(gradf,gradgperp) = 0

Since our short list of points is given by finding all the points where F'(t) = 0, we won't miss anything -- we might get some extras, but O.K. --if we find the short list of all points where dot(gradf,gradgperp) = 0.

Now that the geometry has served its purpose -- which was to avoid a parameterization, let's work out a short formula. Since

gradgperp = (-gy,gx)

dot(gradf,gradgperp) = -fxgy + fygx

So we find our short list by setting this last quantity eqaul to zero. That is only one equation in two unknowns, but we have another: the constraint equation itself!

We have derived the following result:

Theorem on Constrained Optimization in Two Variables

If there is a maimum or a minimum of the of the values that the function f(x,y) assumes on the constraint curve g(x,y) = 0, then it occurs at a point at which

(1) fx(x,y)gy(x,y) = fy(x,y)gx(x,y)

and

(2) g(x,y) = 0

Notice that this is two equations in two unknowns, so we can expect that often there will be just a finite number of solutions. The list of all of the solutions is our short list. Note that finding these solutions may not be easy, but we have a method: Newton's method.

Now, where does the term "Lagrange multipliers" come form?

A short answer is that it was the multiple r that came and went quickly in our derivation.

A better answer is that there is another way to write the equation (1) that genralizes to the situation in which there are three or more variables. Equation (1) is almost equivalent to the pair of equations

(3) gradf(x,y) = r*gradg(x,y)

for some value r. These equations are equivalent if we make the assumption that gradg(x,y) does not equal zero anywhere on the constraint curve. That is, there are no gritical points of g on the constraint curve g = 0. This is the "usual case", but there are eaxample when it doesn't hold.

The equation (3) is a pair of equations because equality means eqaulity in each of the two components of the vectors.

The value r that multiplies gradg(x,y) is called a Lagrange multiplier.

So an equivalent system is the system of eqautions (2) and (3), which is three equations in three unknowns: x, y and r. Eliminating r, one arrives at the simpler set of equations (1) and (2).

So why ever introduce this multiplier variable in the first place?

One answer is that the eqaution in multiplier form has a geomentric meaning: Namely, that the contour curves of f and g are tangent at a point where it holds , or else one is at a critical point of f. Another answer is that the perping operation is a two-variable thing, and doesn't generalize to three or more vaiables. We have the following:

Theorem on Constrained Optimization in Three Variables with one Constraint

If there is a maimum or a minimum of the values that the function f(x,y,z) assumes on the constraint surface g(x,y,z) = 0, then it occurs at a point at which

(4) gradf(x,y,z) = r*gradg(x,y,z)

and

(5) g(x,y,z) = 0 r. Notice that (4) gives us three equations, one for each component, so we have four eqautions for four unknowns, as we would expect if we hope to get a finite, non-empty, list of solutions.

WORKED PROBLEMS

Worked Problem 1

Let

f(x,y) = x*y2 +x + 2*y

and

g(x,y) = x*y-1

Find the minimum and maximum values, if they exist, of f(x,y) subject to the constraint g(x,y) = 0 with x>0. In the case that they do exist, identify all of the points (x,y) at which these values are attained.

Notice that the constraint curve y=1/x has two separate branches: one for x>0, and one for x<0 Here we are concenred only with the first, because we have in mind the following interpretation:

Think of x and y as giving the width and height of a rectangle. Then the constraint equation g(x,y) = 0 is the constraint that the area of the rectangle is 1. The function f is then a cost function that contains terms that go up with the sqaure of the height, and also linearly. You can tink of examples like this from building construction with a little effort.

Solution to Worked Problem 1

We easily compute that

gradf(x,y) = (y2+1,2+2*x*y)

and

gradg(x,y) = (y,x)

Hence our equation

fxgy = fygx

becomes

x*y2 + x = 2*y + 2*x*y2

Combining this with the constraint equation x*y = 1 and simplifying yields

y2 = 1/3

Taking sqaure roots, we get two solutions for y. For each of these the corresponding x is 1/y, according to the constraint equation.

Thus, we have two solutions:

(sqrt(3),1/sqrt(3)) and (-sqrt(3),-1/sqrt(3))

Only the first one lies on our constraint curve, becuase we are asked to only consider the x>0 branch in the statement of the problem. Hence there is only one point on our short list:

(sqrt(3),1/sqrt(3))

Now, is there a maximum or a minimum?

There can't be both, because they'd have to occur at places on our short list, and there is only one place there.

We handle this part of the quaestion as usual: by considering what happens for very large and small values of x. Since y=1/x, on the constraint curve, the cost function is

f(x,1/x) = 3/x + x

Notice that this is bounded below -- it is always positive since x itself is -- but that it tends to infinity as x tends to either zero or infinity. Thus there will be no maximum, but there will be a minimum.

Since there is only one point on our short list, this must be the location of the minimum.

So the answer is that the minimum value of the cost function is attained at (sqrt(3),1/sqrt(3)), and this minimum value is 2*sqrt(3).

Here are the pictures from the Lagrange multipliers applet, which confirm this result as correct:

The first show the region -10 < x,y < 10, and a few contour curves have been added. As you see, the fourth from the left looks like it just touches the constraint curve tangentially, Reading off the location of this point with the cursor, and zooming in, we get the second picture. The middle contour curve here is tangent, and the tangency occurs about at (1.73,0.569), as it should.

Things to Notice

  • The constraint equation here was so simple that we could solve it for y. Whenever you can do this, you can solve the problem by single vartiable methods -- because that is all that is left after you eliminate the second one. Indeed, we already obsrved that on the constraint curve, the cost function was

    f(x,1/x) = 3/x + x

    Let's call the right hand side F(x). Minimizing it by single vartiable methods, we solve F'(x) = 0 But

    F'(x) = -3/x2 + 1

    which leads us once more to x = sqrt(3). In the next problem, there is no easy way to use the constraint eqaution to eliminate a variable.

  • Worked Problem 2

    Let

    f(x,y) = (x+y)4 +(x-y)2

    and

    g(x,y) = x2 + y2-1

    Find the minimum and maximum values, if they exist, of f(x,y) subject to the constraint g(x,y) = 0 with x>0. In the case that they do exist, identify all of the points (x,y) at which these values are attained.

    Solution to Worked Problem 2

    We observe right at the beginning that since the constraint curve is bounded -- it is the unit circle -- there is no problem with limiting values of the vabiables. So there will be a maximum and a minimum.

    We compute:

    gradf(x,y) = (4*(x+y)3 + 2*(x-y),(4*(x+y)3 - 2*(x-y))

    and

    gradg(x,y) = (2*x,2*y)

    Hence our equation

    fxgy = fygx

    becomes (after throwing out some comon factors of 2):

    2*(x+y)3*y + (x-y)*y = 2*(x+y)3*x - (x-y)*x

    or

    (x+y)(x-y) = 2*(x-y)*(x+y)3

    This is solved if either

    x = y or x = -y.

    Going back to our constraint equation, in either case we can now eliminate y in our constraint eqaution to get

    2*x2 = 1

    Putting it al together, we have four points for our short list so far:

    (1/sqrt(2),1/sqrt(2)) (-1/sqrt(2),-1/sqrt(2)) (1/sqrt(2),-1/sqrt(2)) (-1/sqrt(2),1/sqrt(2))

    where the first two came from x = y, and the next two from y = -x

    To find the rest, assume that neither x = y nor x = -y. Then we can cancel (x+y)(x-y) from both sides of our equation, yoelding

    1 = 2*(x+y)2

    So now we have to solve the system given by this equation and our constraint eqaution

    x2 + y2 =1

    To do this, note that

    2*(x+y)2 = 2*(x2 + y2 + 2*x*y) = 2*(1 + 2*x*y) using the constraint equation. Thus

    1 = 2*(x+y)2

    leads to

    y = -1/4*x

    Using this to eliminate y in the constraint equation, we get

    x2 + (1/16)x-2 =1

    Multiplying through by x2, we get a quadratic eqaution in x2. Comleteing the square, this is

    (x2 + 1/2)2 = 5/16.

    This has the two solutions x = sqrt(sqrt(5)/4 - 1/2) and x = -sqrt(sqrt(5)/4. Then from y = -1/4*x, which tells us that y has the opposite sign as x, and enables us to compute y in terms of x, we find two more points on our short list:

    (sqrt(sqrt(5)/4 - 1/2),-sqrt(3/2 - sqrt(5))) and (-sqrt(sqrt(5)/4 - 1/2),sqrt(3/2 - sqrt(5)))

    Thus, so far, our short list has 6 points on it. Now notice that our eqautions are symmetric in interchanging x and y. We made an arbitrary choice above when we eliminated y. Switching the roles of the two variables leads to the final two solutions -- no more arbitrary choices remain.

    (-sqrt(3/2 - sqrt(5)),sqrt(sqrt(5)/4 - 1/2)) and (sqrt(3/2 - sqrt(5)),-sqrt(sqrt(5)/4 - 1/2))

    To find the maxima and minima, we list these with the corresponding values of f:

    (1/sqrt(2),1/sqrt(2)) and (-1/sqrt(2),-1/sqrt(2)) at which f = 4

    (-1/sqrt(2),1/sqrt(2)) and (1/sqrt(2),-1/sqrt(2)) at which f = 2

    (sqrt(sqrt(5)/4 - 1/2),-sqrt(3/2 - sqrt(5))) and (-sqrt(sqrt(5)/4 - 1/2),sqrt(3/2 - sqrt(5))) at which f = 7/4

    (-sqrt(3/2 - sqrt(5)),sqrt(sqrt(5)/4 - 1/2)) and (sqrt(3/2 - sqrt(5)),-sqrt(sqrt(5)/4 - 1/2)) at which f = 7/4

    Evidently, the maximum value is 4, and it occurs at the first two points in our list, while the minimum value is 7/4, and it occurs at the last four points on our list. Here are the pictures from the Lagrange multipliers applet, which confirm this result as correct:

    These show the three contour curves -- for f(x,y) = 7/4, f(x,y) = 2 and f(x,y) = 4 -- that have points tangent to the constraint curve. Two of them are very close and get in the way of each other, so they are on three separate graphs this time.