Unit 4: Vector Calculus
Chapter 18: The Gradient Vector
Sections18.4: Lagrange multipliers
Copyright
Copyright * 2001 by Addison Wesley Longman, Inc.
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior written permission of the publisher. Printed in the United States of America.
Initializations
> restart;
>
with(plots):
with(plottools):
with(linalg):
with(student):
read(`C:/Program Files/Maple 6.01/pvac.txt`):
Warning, the name changecoords has been redefined
Warning, the protected names norm and trace have been redefined and unprotected
>
Constrained Optimization
Elementary calculus considers constrained optimization problems of the form
Find the rectangular box of maximum area if the perimeter is fixed at 100.
Initially, these problems sre solved by eliminating one of the variables. For example, with the area of the box written as
> A := x*y;
>
and the perimeter constraint as
> P := 2*x + 2*y = 100;
>
the constraint can be solved, for
, that is, for
> Y := solve(P,y);
>
so that
> Ax := subs(y=Y,A);
>
The ordinary techniques of differentiation (or plotting) now lead to the maximum of 625 at
= 25, as we see from the graph
> plot(Ax,x=0..50);
>
or by the analytical device of
> X[max] := solve(diff(Ax,x)=0,x);
>
The corresponding y -coordinate is
> Y[max] := subs(x=X[max],Y);
>
The rectangle is a square of side 25, and area
.
>
The Lagrange Multiplier Method
The Lagrange multiplier technique is an alternate method for solving constrained optimization problems. By making an elegant application of the gradient vector, it avoids the algebra of using the constraint to eliminate one of the variables. The method of Lagrange multipliers is illustrated through the following five examples.
>
Example 18.7
In this first example, we show how the Lagrange-multiplier method selects points at which the level curves of the objective function
are tangent to the constraint curve. To do this, we will find the extreme value of
along the constraining ellipse
.
The objective function (what is maximized or minimized) is therefore
> f := x*y;
>
and the constraint curve is defined by the function
> g := x^2 + 4*y^2 - 8;
>
with the actual constraint equation being
> constraint_equation := g = 0;
>
Figure 8.11 shows the constraint ellipse in the
-plane, along with the surface
.
>
a := sqrt(8):
b := sqrt(2):
xt := a*cos(t):
yt := b*sin(t):
ft := subs({x=xt, y=yt},f):
p1 := spacecurve([xt,yt,.01], t=0..2*Pi, color = black, thickness = 3):
p2 := plot3d(f,x=-2..2,y=-2..2,style=hidden,color=black):
p3 := plot3d(0, x = -a..a, y = -2..2, color = cyan, style=wireframe):
display([p||(1..3)],scaling=constrained, tickmarks=[3,3,3], labels=[`x `,` y`,`z `], labelfont=[TIMES,ITALIC,12],axes=frame, orientation =[-25,70]);
>
By way of interpretation, imagine being restricted to walking on this ellipse in the
xy
-plane. Overhead, the function
determines the shape of the ceiling. You want to know where, on the elliptic path being walked, you will find the highest and lowest points of the ceiling.
The interSectionsof the cylinder
and the surface
is the space curve (thin black) shown in Figure 18.12, below. Included is a rendition of the ellipse (thick black) in the
-plane. Clearly, there are two maxima and two minima.
>
p1 := spacecurve([xt,yt,ft], t=0..2*Pi, color = black, thickness = 3):
p2 := spacecurve([xt,yt,0], t=0..2*Pi, color = black, thickness = 6):
p3 := plot3d(0, x = -a..a, y = -2..2, color = cyan, style=wireframe):
dots:=spacecurve({[[-3,-2,-2],[-3,2,-2],[3,2,-2]]}, color=black, linestyle=2):
display([p||(1..3),dots], axes = frame, orientation = [45,70], scaling = constrained, labels=[` x`,`y `,`z `], labelfont=[TIMES,ITALIC,12], tickmarks=[5,5,5], orientation=[-40,75]);
>
Maple Graphics - Generating Figure 8.12
Write the constraint ellipse in standard form by dividing through by 8 so that the form
is obtained.
> constraint_equation/8;
>
In this form, we realize that
>
a := sqrt(8);
b := sqrt(2);
>
and a convenient parametric form of the ellipse is then
>
xt := a*cos(t);
yt := b*sin(t);
>
The function
, evaluated along this ellipse, is then
> ft := subs({x=xt, y=yt},f);
>
Plot the heights of
above the ellipse as a space curve. Include the ellipse itself in the figure.
>
p1 := spacecurve([xt,yt,ft], t=0..2*Pi, color = black, thickness = 3):
p2 := spacecurve([xt,yt,0], t=0..2*Pi, color = green, thickness = 6):
p3 := plot3d(0, x = -a..a, y = -2..2, color = red, style=wireframe):
display([p||(1..3)], axes = boxed, orientation = [45,75], scaling = constrained);
>
The green path is the constraint ellipse along which we walk. The black curve is the contour of the ceiling directly above the constraint ellipse.
>
The contour map in Figure 8.13 is yet another way of looking at this problem. The level curves of
(black) and the constraint ellipse (red) are all drawn in the
-plane.
>
gx := implicitplot(g = 0, x = -3..3, y = -2..2, color=red):
p1 := contourplot(f, x =-3..3, y = -3..3, contours = [1,2,3,-1,-2,-3], color = black):
p2 := textplot({[-2.1,2.5,`f(x,y) < 0`],[2.2,2.5,`f(x,y) > 0`],
[-2.1,-2.5,`f(x,y) > 0`], [2.2,-2.5,`f(x,y) < 0`], [0,1,`g(x,y) = 0`]}, font=[TIMES,ROMAN,12]):
p3 := display([p1,p2,gx], axes = boxed, scaling = constrained, xtickmarks=7, ytickmarks=[-1,0,1], labels=[x,`y `], labelfont=[TIMES,ITALIC,12]):
p3;
>
Where a level curve of
cuts through the graph of g = 0, there can't be a stationary value.
Where the level curve of
is tangent to the graph of g = 0, the value of
becomes stationary because it "pauses" momentarily.
The method of
Lagrange
multipliers
seeks points were the level curves of
are tangent to the constraint curve
.
It does this by looking for points where the gradient vectors of
and
are colinear. This happens where
is a multiple of
where we obtain the gradients in Maple as
>
gf := grad(f, [x,y]);
gg := grad(g, [x,y]);
>
Figure 18.14 (below) shows the gradient vectors at the four points where they are colinear.
>
gft := subs({x=xt,y=yt},op(gf)):
ggt := subs({x=xt,y=yt},op(gg)):
f1 := z -> arrow(subs(t=z,[xt,yt]), subs(t=z,op(gft)),.2,.4,.3, color=black):
f2 := z -> arrow(subs(t=z,[xt,yt]), subs(t=z,op(ggt)),.2,.4,.3, color=cyan):
f3 := z -> display([f1(z),f2(z)]):
p4 := display([f3(Pi/4),f3(3*Pi/4),f3(5*Pi/4),f3(7*Pi/4)]):
display([p4,gx,p1], scaling=constrained, axes=box, xtickmarks=7, ytickmarks=[-1,0,1], labels=[x,`y `], labelfont=[TIMES,ITALIC,12]);
>
We can do better with the live medium of a computer screen. We can simulate a trip around the ellipse, showing the two gradient vectors encountered at each point. Thus, we evaluate
and
on the ellipse via
>
gft := subs({x=xt,y=yt},op(gf));
ggt := subs({x=xt,y=yt},op(gg));
>
then generate the following animation. The constraint ellipse is drawn in red, but grad( g ) is drawn in cyan. The level curves of f are drawn in black, as is grad( f ).
>
f1 := z -> arrow(subs(t=z,[xt,yt]), subs(t=z,op(gft)),.2,.4,.3, color=black):
f2 := z -> arrow(subs(t=z,[xt,yt]), subs(t=z,op(ggt)),.2,.4,.3, color=cyan):
f3 := z -> display([f1(z),f2(z)]):
p3 := display([seq(f3(.05*Pi*k),k=0..39)],insequence=true):
display([gx,p1,p3], scaling=constrained);
>
This example shows two things. First,
(black) never equals
(cyan) which is considerably longer than
. Second, at two points of tangency,
points in the same direction as
, but at the other two points of tangency, the two gradients point in exactly opposite directions.
The analytic condition which expresses the colinearity of the gradients is
which stands for the pair of component equations
The factor of proportionality,
, is called the
Lagrange multiplier
.
There are three unknowns, namely, the coordinates
and
, and the Lagrange multiplier
. These are determined by the two equations in
and the constraint equation
Table 18.1 lists the equations, the solutions, and the value of
at each extreme point.
Equations
=========== === ====== =====
2
2
__________________________________________________
Table 18.1
The solutions in Table 18.1 are obtained in Maple as follows.
The simplest way to equate coefficients of the two gradient vectors is with the
equate
command from the
student
package. Note how we use the notationally simpler Lagrange multiplier
m
instead of
.
> q := equate(gf, m*gg);
>
So far, we have two equations in the three unknowns x , y , and the Lagrange multiplier m .
The third equation is the constraint equation g = 0. Consequently, we solve three equations in three unknowns. (Note that the first two equations are already in a set called
, so to insert the remaining equation into that set, use the operator for the union of sets, thereby generating a set of three equations.)
> q1 := solve(q union {g = 0}, {x, y, m});
>
There are 4 possible extreme values corresponding to the critical points (2, 1), (-2, -1), -2, 1), and (2, -1)
>
for k from 1 to 4 do
P||k := subs(q1[k],[x,y]);
od;
>
The values of
at the critical points are
>
for k from 1 to 4 do
subs(q1[k],[[x,y],f]);
od;
>
Hence, there are two places on the constraint curve
where
attains a maximum and two places on the constraint curve where
attains a minimum.
f(2,1) = 2
f(-2,-1) = 2
f(-2,1) = -2
f(2, -1) = -2
>
Example 18.8
In this second example, we will find the Lagrange multiplier is zero, so an extreme point occurs where the constraint is not operative. The extreme point is on the constraint, but it is also a critical point for the unconstrained objective function. To illustrate this, we find the extreme values of
> f := x^2*y;
>
subject to the constraint
, where
is given by
> g := x + y -3;
>
Figure 18.15 (below) shows the surface
, the
-plane, constraint (line)
, and the interSectionsof the plane
with the surface
.
>
F := subs(y = 3-x,f):
p4 := spacecurve([x,3-x,F, x = -2..4], color = black, thickness=3):
p5 := spacecurve([x,3-x,0,x=-2..4], color = red, thickness=3):
p6 := plot3d(0,x=-2..4,y=-1..5,color=cyan):
p7 := plot3d(f,x = -2..4,y=-1..5,color=black,style=wireframe):
display([p||(4..7)],axes=frame, labels=[x,y,z], view=[-2..4,-1..5,-2..20], orientation=[-60,60], tickmarks=[7,7,5], labels=[`x `,` y`,`z `], labelfont=[TIMES,ITALIC,12]);
>
The space curve in which the plane
intersects the surface, namely,
> Z := subs(y = 3-x,f);
>
is shown in Figure 18.16.
> plot(Z, x = -2..4, color=black, xtickmarks=7, ytickmarks=8, labels=[x,`z `], labelfont=[TIMES,ITALIC,12]);
>
It suggests
has extreme values at
and
along the constraint line.
Figure 18.17 shows a contour plot of the surface
and in red, a graph of the constraint line
.
>
p8 := contourplot(f, x = -2..4, y = -1..5, contours = [0,.1,1,2,3,4,5,6,7,8], color = black):
p9 := implicitplot(g,x=-2..4,y=-1..5,color=red):
display([p8,p9], axes=boxed, xtickmarks=7, ytickmarks=7, labels=[x,`y `], labelfont=[TIMES,ITALIC,12],scaling=constrained);
>
This view suggests we will find extreme values of
at
x
= 0 and
x
= 2.
This view is significant since it shows but a single point of tangency at about (2,1).
Table 18.2 lists the equations
, and the solutions for
, and
, and the values of
at the extreme points.
Equations
======== === ===== ======
4
4
0
0
_____________________________________________
Table 18.2
When
, the constraint does not apply. The corresponding extreme point is a critical point for the unconstrained optimization problem, and would have been found without the constraint. That is why Figure 18.17 shows but one point of tangency between the constraint line and the level curves of the objective function
.
Carrying out the computations in Maple, we have the gradient vectors of both
and
given by
>
gf := grad(f, [x,y]);
gg := grad(g, [x,y]);
>
I f we then equate the components of
with a multiple of
, we obtain
> q := equate(gf, m*gg);
>
Solving three equations in three unknowns, we obtain
> q1 := solve(q union {g = 0}, {x, y, m});
>
There are two critical points:
(2, 1) with
(0, 3) with
with corresponding function values
>
subs(q1[1],[[x,y],f]);
subs(q1[2],[[x,y],f]);
>
Example 18.9
In this third example, the objective function must be constructed from a verbal description provided by the problem statement, namely, the requirement to find the (shortest) distance from the origin to the plane
.
Thus, the quantity to be minimized is the distance from the origin to a point (
) on the plane. It's generally easier, however, to miminize the
square
of the distance.
The constraint
is the equation of the plane. We herefore have
> f := x^2 + y^2 + z^2;
>
and
> g := 2*x + y - z - 5;
>
The gradients of
and of
are, respectively,
>
gf := grad(f, [x,y,z]);
gg := grad(g, [x,y,z]);
>
There are three equations in
, and four unknowns, namely,
, and
. Table 18.3 lists the four equations and their single solution.
Equations
============ === ========== ===========
________________________________________________________
Table 18.3
To obtain the equations
in Maple, equate the components of
with a multiple of
, using the more convenient letter
instead of
. We obtain
> q := equate(gf, m*gg);
>
Solve four equations in the four unknowns
, again using the
union
command to add the constraint equation to the set of three equations obtained from the gradient condition. We find
> q1 := solve(q union {g = 0}, {x,y,z,m});
>
Evaluate the distance
at the critical point, obtaining
> simplify(subs(q1,[[x,y,z],sqrt(f)]));
>
Example 18.10
A pentagon is formed from a rectangle surmounted by an isosceles triangle. What dimensions give the pentagon least perimeter if the area is fixed at the value a ? (See the following figure, Figure 18.18.)
>
a:='a':
p10 := plot([[0,0],[6,0],[6,3],[3,5],[0,3],[0,0]], color=black):
p11 := plot({[[0,3],[6,3]],[[3,5],[3,3]]}, linestyle = 3, color=black):
p12 := textplot({[1.5,2.7,`x`], [4.5,2.7,`x`], [3.2,4,`z`], [5.4,4.5,`sqrt(z^2 + x^2)`], [6.2,1.5,`y`], [3,-.5,`2x`]}, font=[TIMES,ROMAN,12]):
display([p||(10..12)],axes=none,scaling=constrained);
>
In this fourth example, the constraint contains a symbolic parameter, and the algebra becomes significantly more complicated. The point of the example is to show how to navigate through such seemingly complexity.
>
From Figure 18.18 the area of the triangle, the area of the rectangle, the total area, and the perimeter are given by
Area of Triangle: 2 [
x
z
]
Area of Rectangle:
Total Area:
=
a
Perimeter:
The objective function is the perimeter, so we define
> f := 2*sqrt(z^2 + x^2) + 2*x + 2*y;
>
The constraint is the fixed area, so we define
> g := 2*x*y + x*z - a;
>
The gradient vectors grad( f ) and grad( g ) are then
>
gf := grad(f, [x,y,z]);
gg := grad(g, [x,y,z]);
>
Table 18.4 lists the four equations arising from
and the constraint
. There are four possible solutions for
, and
; computed exactly in Maple (below), they are listed in floating-point form in table 18.4. Only one solution gives all three dimensions as positive.
Equations
====================== === ====================== ========
______________________________________________________________________________
Table 18.4
The third solution is the physically meaningful one, and can be given exactly as
where
The minimum value of the perimeter can also be given exactly as
It is surprising how much the presence of the symbolic parameter
complicates the algebra. But the reader should not let the additional complexity in the algebra obscure the underlying simplicity of the basic technique of Lagrange multipliers.
To execute these calculations in Maple, equate the coefficients of
with a multiple of
, obtaining (after again replacing
with the more convenient letter
),
> q := equate(gf, m*gg);
>
Solve four equations in the four unknowns
, obtaining
> q1 := solve(q union {g = 0}, {x,y,z,m});
>
where again, the union command adds the constraint equation to the set of three equations obtained from the gradient condition.
The cure for the RootOf structure is allvalues .
> q2 := allvalues(q1);
>
There are 4 possible critical points and we will have to investigate each one. Execute the following Maple instruction to isolate and name each critical point.
>
for k from 1 to 4 do
s||k := q2[k];
od;
>
A useful view of the critical points is afforded by a conversion to floating-point form. Thus, for the first critical point, we have
> evalf(s1);
>
Reject this solution since z (a length) is negative.
The next critical point to examine is
> evalf(s2);
>
Again, reject this solution since at least one variable (a length) is negative.
The situation is the same for the last critical point, as we see from
> evalf(s4);
>
where both
and
(which are lengths) are negative.
The remaining critical point evaluates to
> evalf(s3);
>
and we see this is viable since all three variables (lengths) are positive. The function value at this candidate is
> f3 := simplify(subs(s3,f),symbolic);
>
Thus, there is just the one solution, s3, for which the point (
) is
> P := subs(s3,[x,y,z]);
>
The function value at this point is
> f3;
>
An alternate view of this value is given by
> evalf(f3);
>
Maple can be coerced into simplifying the optimal value.
> factor(expand(rationalize(f3)));
>
Maple can also be coerced into simplifying (slightly) the representation of the extreme point.
> P1 := factor(expand(rationalize(P)));
>
Maple resists compressing this solution to the form
where
However, Maple does help with enough of the algebra so that with just a bit of intervention, this form can be achieved.
Let us begin by defining the quantity
> u := a*(2-sqrt(3));
>
whose square root is
. (Thus,
.)
In as many places as Maple will allow, insert
where this quantity appears. (Since we have just assigned to the letter
, we can't substitute this letter into the expression, so we substitute the alternate letter,
.)
We obtain
> PP1 := algsubs(u=U,P1);
>
Next, replace
with
.
> PP2 := subs(sqrt(U)=beta,PP1);
>
But
=
which we can inject via
> radnormal(subs(U^(3/2)=u*beta,PP2));
>
Except for Maple's tendency to rationalize
to
, we have achieved the desired form.
>
Example 18.11
In this fifth example we extend the method to the case of two constraints by finding the shortest distance from the origin to the interSectionsof the two planes
that is,
>
g1 := 2*x - 3*y + 5*z - 9;
g2 := 6*x + y - 7*z - 12;
>
We give two solutions, the first with the use of Lagrange multipliers, and the second, without. For both solutions, the objective function is
that is,
>
f := x^2 + y^2 + z^2;
>
the square of the distance from the origin to the point
.
As explained below, the Lagrange-multiplier method generalizes to
f
f
=
f
+
f
that is, to
so now, there are five equations in the five unknowns
Alternatively, the function
could be defined, and the same set of equations obtained from
f
=
=
0
The equations, and their single solution, are listed in Table 18.5. The point closest to the origin is designated by
.
Equations
============== ========== =============== =========
_____________________________________________________________________
Table 18.5
Implementing this first solution in Maple begins with the computation of the gradients of the objective function and the constraints, which we designate as
>
grad_f := grad(f,[x,y,z]);
grad_g1 := grad(g1,[x,y,z]);
grad_g2 := grad(g2,[x,y,z]);
>
The generalization of the Lagrange-multiplier technique relevant here is
=
This form of the Lagrange-multiplier method conforms well to the available Maple syntax, and is implemented with
> q := equate(grad_f,m1*grad_g1 + m2*grad_g2);
>
which forms three equations in the five unknowns
. The additional two equations needed to make a determinate system are the constraints themselves. The five relevant equations are then solved in Maple with the syntax
> q1 := solve(q union {g1,g2}, {x,y,z,m1,m2});
>
The point on the interSectionsof the two planes which is closest to the origin is then
> P:=subs(q1,[x,y,z]);
>
and the distance of this point from the origin is
> f1 := simplify(subs(q1,sqrt(f)));
>
or
> evalf(f1);
>
as a floating-point number.
At the point
, we find
to be the vector
> F := subs(q1,op(grad_f));
>
Of course, at
, this vector will agree with
, as we see by the following calculation.
> subs(q1,evalm(m1*grad_g1 + m2*grad_g2));
>
A second solution to this problem can be framed within the confines of simple multivariable calculus operations. First, a vector along the line of interSectionsof the two constraint planes is given by the cross product of normals to the two planes. Thus, we compute the vector
V
=
x
that is,
> V := crossprod(grad_g1,grad_g2);
>
which lies along the line of interSectionsof the constraint planes. This vector is perpendicular to
F
=
at the point
, as we can see from
> dotprod(F,V);
>
Hence, the line of interSectionsof the constraint planes is tangent to the level surface determined by the objective function,
. This is the analog of the geometry which applies in the case of the objective function
constrained by a single constraint curve
.
To find a point on the line of interSectionsof the two constraint planes, set
in both planes to find where the line passes through the
-plane. The solution of the equations
>
eq1 := subs(z=0,g1);
eq2 := subs(z=0,g2);
>
is found to be
> q2 := solve({eq1,eq2},{x,y});
>
so the line of interSectionscan be given parametrically by the vector
> R := evalm(subs(q2,[x,y,0]) + t*V);
>
If we evaluate
along this line, we obtain
, a function of a single variable. This function is
> ft := simplify(subs(x=R[1],y=R[2],z=R[3],f));
>
and the techniques of elementary differential calculus, setting the derivative to zero and solving, gives the critical value of
as
> T := solve(diff(ft,t),t);
>
Evaluating
R
at this value of
gives
> subs(t=T,op(R));
>
which is the vector form of the point
found by the Lagrange-multiplier method.
Finally, we construct Figure 18.19, a sketch of the sphere
> f = f1^2;
