Implicit Differentiation

In principle, under very general circumstances, systems of equations in any number of variables can be solved for some of the variables in terms of the others, i.e. as functions of the others (but not necessarily elementary functions).¹ These functional relationships among the variables are said to be implicitly defined by the equations. A precise statement which discusses the existence, uniqueness, and differentiability of such implicitly defined functions is called the Implicit Function Theorem (see below).

If there are $m$ equations and $m + n$ variables, a solution will involve $m$ dependent variables (one for each equation) and $n$ independent variables. The dependent variables are the ones we solve for, and they can, in general, be chosen in an arbitrary manner. Suppose the variables are $x_1, \dots, x_n, y_1, \dots, y_m$ with $x = (x_1, \dots, x_n)$ independent and $y = (y_1, \dots, y_m)$ dependent. Then the equations can be written in the form $F_1(x, y) = 0, \dots, F_m(x, y) = 0$ and a solution will have the form $y_1 = Y_1(x), \dots, y_m = Y_m(x)$. Assuming the existence and differentiability of the solutions, we are concerned here with how to find the various derivatives $\partial y_i/\partial x_j$ without having to actually solve the equations. The method is referred to as implicit differentiation.

First, as far as notation is concerned, it must be stressed that if, say, $w$ and $t$ are two of the variables appearing in the system of equations, then the symbol $\partial w/\partial t$ does not make sense until it is stated which variables in addition to $t$ have been chosen as the independent variables (or, equivalently, which variables in addition to $w$ have been chosen as the dependent variables), since we must know which variables are being held fixed when we perform the differentiation with respect to $t$. The result is generally different in different contexts.

Example 1.
\[ x + y + u - v = 0, \qquad 3x + 3y + u + v = 0. \]
If $x$ and $y$ are taken to be independent and we solve for $u$ and $v$, we obtain $u = -2x - 2y$, $v = -x - y$. In this context, then, $\partial u/\partial x = -2$. However, if $x$ and $v$ are taken to be independent and we solve for $u$ and $y$, then $u = 2v$, $y = -x - v$, and now we find that $\partial u/\partial x = 0$.
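The two answers can be checked numerically; here is a minimal sketch in Python, using the explicit solutions derived above and central-difference quotients (the function names are illustrative):

```python
# du/dx depends on which variables are taken as independent.

def du_dx_holding_y(x, y):
    # x, y independent: solving the system gives u = -2x - 2y, so du/dx = -2.
    h = 1e-6
    u = lambda x, y: -2 * x - 2 * y
    return (u(x + h, y) - u(x - h, y)) / (2 * h)

def du_dx_holding_v(x, v):
    # x, v independent: solving gives u = 2v, which does not involve x,
    # so du/dx = 0.
    h = 1e-6
    u = lambda x, v: 2 * v
    return (u(x + h, v) - u(x - h, v)) / (2 * h)

print(du_dx_holding_y(1.0, 1.0))  # approximately -2
print(du_dx_holding_v(1.0, 1.0))  # approximately 0
```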

¹ When we say, "In principle, . . . , these systems of equations can be solved", we simply mean that the solutions described exist: computing them is another (generally difficult) matter.


A commonly used notation to distinguish the various contexts is to list the other independent variables as subscripts after the differentiation symbol, e.g. $\left(\frac{\partial w}{\partial t}\right)_{r,s,\dots}$. So, in Example 1, $\left(\frac{\partial u}{\partial x}\right)_y = -2$ and $\left(\frac{\partial u}{\partial x}\right)_v = 0$.

To explain the method of implicit differentiation, let us consider the case of two equations in four variables: $F(x, y, u, v) = 0$, $G(x, y, u, v) = 0$, where $u, v$ are dependent and $x, y$ independent. To find $\partial u/\partial x$ and $\partial v/\partial x$, we regard the equations as functional identities in which $u$ and $v$ are certain differentiable functions of $x$ and $y$. We assume also that $F$ and $G$ are differentiable. Using the Chain Rule, we differentiate with respect to $x$, holding the other independent variable $y$ fixed, and obtain
\[ F_x \cdot 1 + F_y \cdot 0 + F_u u_x + F_v v_x = 0 \quad\text{and}\quad G_x \cdot 1 + G_y \cdot 0 + G_u u_x + G_v v_x = 0. \]
The result is a system of two linear equations in the two unknowns $u_x$ and $v_x$, namely
\[ F_u u_x + F_v v_x = -F_x, \]
\[ G_u u_x + G_v v_x = -G_x. \]

If we solve this system by using Cramer's Method, then $u_x = D_1/D$ and $v_x = D_2/D$, where
\[ D = \det\begin{pmatrix} F_u & F_v \\ G_u & G_v \end{pmatrix}, \quad D_1 = \det\begin{pmatrix} -F_x & F_v \\ -G_x & G_v \end{pmatrix}, \quad D_2 = \det\begin{pmatrix} F_u & -F_x \\ G_u & -G_x \end{pmatrix}. \]
The determinant $D$ is called the Jacobian of $(F, G)$ with respect to $(u, v)$ and is denoted by $J\!\left(\frac{F,G}{u,v}\right)$ or $\frac{\partial(F,G)}{\partial(u,v)}$.

In this notation then, if $F(x, y, u, v) = 0$ and $G(x, y, u, v) = 0$ define $u$ and $v$ as differentiable functions of $x$ and $y$, then
\[ \frac{\partial u}{\partial x} = -\frac{J\!\left(\frac{F,G}{x,v}\right)}{J\!\left(\frac{F,G}{u,v}\right)} \quad\text{and}\quad \frac{\partial v}{\partial x} = -\frac{J\!\left(\frac{F,G}{u,x}\right)}{J\!\left(\frac{F,G}{u,v}\right)}, \]
providing $J\!\left(\frac{F,G}{u,v}\right) \neq 0$. Similarly,
\[ \frac{\partial u}{\partial y} = -\frac{J\!\left(\frac{F,G}{y,v}\right)}{J\!\left(\frac{F,G}{u,v}\right)} \quad\text{and}\quad \frac{\partial v}{\partial y} = -\frac{J\!\left(\frac{F,G}{u,y}\right)}{J\!\left(\frac{F,G}{u,v}\right)}. \]
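These formulas can be applied without solving the system; here is a minimal sketch, approximating the partials of $F$ and $G$ by central differences (the function names and the variable ordering $(x, y, u, v)$ are illustrative assumptions):

```python
# Compute du/dx from du/dx = -J(F,G / x,v) / J(F,G / u,v),
# with each partial derivative approximated by a central difference.

def partial(f, args, i, h=1e-6):
    """Central-difference approximation to the i-th partial of f at args."""
    a = list(args)
    a[i] += h
    hi = f(*a)
    a[i] -= 2 * h
    lo = f(*a)
    return (hi - lo) / (2 * h)

def du_dx(F, G, point):
    """du/dx with (x, y) independent, for F(x,y,u,v) = G(x,y,u,v) = 0.
    Variable order in `point` is (x, y, u, v)."""
    Fx, Fu, Fv = (partial(F, point, i) for i in (0, 2, 3))
    Gx, Gu, Gv = (partial(G, point, i) for i in (0, 2, 3))
    J_xv = Fx * Gv - Fv * Gx      # J(F,G / x,v)
    J_uv = Fu * Gv - Fv * Gu      # J(F,G / u,v)
    return -J_xv / J_uv

# Example 1's system: F = x + y + u - v, G = 3x + 3y + u + v,
# evaluated at the solution point (x, y, u, v) = (1, 1, -4, -2).
F = lambda x, y, u, v: x + y + u - v
G = lambda x, y, u, v: 3 * x + 3 * y + u + v
print(du_dx(F, G, (1.0, 1.0, -4.0, -2.0)))  # approximately -2
```

This reproduces $\partial u/\partial x = -2$ without ever solving for $u$ and $v$.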

Example 2. With reference to Example 1 above,
\[ \left(\frac{\partial u}{\partial x}\right)_y = -\frac{J\!\left(\frac{F,G}{x,v}\right)}{J\!\left(\frac{F,G}{u,v}\right)} = -\frac{\det\begin{pmatrix} 1 & -1 \\ 3 & 1 \end{pmatrix}}{\det\begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}} = -\frac{4}{2} = -2 \]
and
\[ \left(\frac{\partial u}{\partial x}\right)_v = -\frac{J\!\left(\frac{F,G}{x,y}\right)}{J\!\left(\frac{F,G}{u,y}\right)} = -\frac{\det\begin{pmatrix} 1 & 1 \\ 3 & 3 \end{pmatrix}}{\det\begin{pmatrix} 1 & 1 \\ 1 & 3 \end{pmatrix}} = -\frac{0}{2} = 0, \]
confirming what we found above.

In the general case we considered at the outset, $F_1(x, y) = 0, \dots, F_m(x, y) = 0$, we define the Jacobian of $F = (F_1, \dots, F_m)$ with respect to $y = (y_1, \dots, y_m)$ to be
\[ \det\left(\frac{\partial F_i}{\partial y_j}\right)_{m \times m} = \det\begin{pmatrix} \frac{\partial F_1}{\partial y_1} & \cdots & \frac{\partial F_1}{\partial y_m} \\ \vdots & & \vdots \\ \frac{\partial F_m}{\partial y_1} & \cdots & \frac{\partial F_m}{\partial y_m} \end{pmatrix} \]
and denote it by
\[ J\!\left(\frac{F_1, \dots, F_m}{y_1, \dots, y_m}\right) \quad\text{or}\quad \frac{\partial(F_1, \dots, F_m)}{\partial(y_1, \dots, y_m)} \quad\text{or}\quad \frac{\partial F}{\partial y}. \]
Then, proceeding as we did above, using Cramer's Method, we obtain the following formula for $\partial y_i/\partial x_j$: let $z$ be obtained from $y$ by replacing $y_i$ by $x_j$, i.e. $z_k = y_k$ for $k \neq i$ but $z_i = x_j$; then
\[ \frac{\partial y_i}{\partial x_j} = -\frac{\partial F/\partial z}{\partial F/\partial y} = -\frac{\partial(F_1, \dots, F_i, \dots, F_m)/\partial(y_1, \dots, x_j, \dots, y_m)}{\partial(F_1, \dots, F_i, \dots, F_m)/\partial(y_1, \dots, y_i, \dots, y_m)}. \]
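The general rule lends itself to a short routine. Below is a sketch under the assumption that each $F_k$ takes the independent variables first, then the dependent ones, using a naive cofactor determinant (adequate for small $m$); the test case $x^2 + y^2 = 1$ with $dy/dx = -x/y$ is illustrative, not from the discussion above:

```python
# Sketch of dy_i/dx_j = -(dF/dz)/(dF/dy): build the two Jacobian determinants
# from central-difference partials.

def det(M):
    """Determinant by cofactor expansion along the first row (fine for small m)."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

def partial(f, args, i, h=1e-6):
    a = list(args); a[i] += h; hi = f(*a)
    a[i] -= 2 * h; lo = f(*a)
    return (hi - lo) / (2 * h)

def dy_dx(Fs, x, y, i, j):
    """dy_i/dx_j for the system F_k(x_1..x_n, y_1..y_m) = 0."""
    n, pt = len(x), tuple(x) + tuple(y)
    Jy = [[partial(F, pt, n + l) for l in range(len(y))] for F in Fs]  # dF/dy
    Jz = [row[:] for row in Jy]
    for k, F in enumerate(Fs):          # replace the column for y_i
        Jz[k][i] = partial(F, pt, j)    # by the partials with respect to x_j
    return -det(Jz) / det(Jy)

# One-equation illustration: x^2 + y^2 = 1 gives dy/dx = -x/y,
# which at (x, y) = (0.6, 0.8) is -0.75.
F = lambda x, y: x * x + y * y - 1
print(dy_dx([F], [0.6], [0.8], 0, 0))  # approximately -0.75
```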

Remark. In the case of one equation, $F(x_1, \dots, x_n, y) = 0$, the above reduces to $\frac{\partial y}{\partial x_j} = -\frac{\partial F/\partial x_j}{\partial F/\partial y}$. For example, if $x^2 + y^2 + z^2 = xyz$ is written in the form $F(x, y, z) = 0$ where $F(x, y, z) = x^2 + y^2 + z^2 - xyz$, then
\[ \frac{\partial z}{\partial x} = -\frac{F_x}{F_z} = -\frac{2x - yz}{2z - xy}, \]
providing $2z - xy \neq 0$ (and, of course, $F(x, y, z) = 0$).

Example 3. The curve given by the equations
\[ x^2 + y^2 + z^2 = 4, \qquad x + y + z = 1 \]
is a circle, since it is the section of a sphere by a plane. On this circle, if we take $x$ as the parameter, then $x$ is the independent variable and both $y$ and $z$ are functions of $x$. To find $\frac{dy}{dx}$ and $\frac{dz}{dx}$, we write the given equations in the form $F = 0$, $G = 0$ and obtain
\[ \frac{dy}{dx} = -\frac{J\!\left(\frac{F,G}{x,z}\right)}{J\!\left(\frac{F,G}{y,z}\right)} = -\frac{\det\begin{pmatrix} 2x & 2z \\ 1 & 1 \end{pmatrix}}{\det\begin{pmatrix} 2y & 2z \\ 1 & 1 \end{pmatrix}} = -\frac{x - z}{y - z} = \frac{z - x}{y - z} \]


for $y \neq z$, and similarly,
\[ \frac{dz}{dx} = -\frac{J\!\left(\frac{F,G}{y,x}\right)}{J\!\left(\frac{F,G}{y,z}\right)} = -\frac{\det\begin{pmatrix} 2y & 2x \\ 1 & 1 \end{pmatrix}}{\det\begin{pmatrix} 2y & 2z \\ 1 & 1 \end{pmatrix}} = -\frac{y - x}{y - z} = \frac{x - y}{y - z} \]
for $y \neq z$.

Example 4. The following three equations in five variables
\[ x = uv, \qquad y = u^2 + v^2, \qquad z = u^2 - v^2 \]

can be solved for $u$, $v$, $z$ as functions of $x$ and $y$. To find $\frac{\partial z}{\partial x}$, we let
\[ F(x, y; u, v, z) = uv - x, \quad G(x, y; u, v, z) = u^2 + v^2 - y, \quad H(x, y; u, v, z) = u^2 - v^2 - z. \]
Then
\[ \frac{\partial z}{\partial x} = -\frac{J\!\left(\frac{F,G,H}{u,v,x}\right)}{J\!\left(\frac{F,G,H}{u,v,z}\right)} = -\frac{\det\begin{pmatrix} v & u & -1 \\ 2u & 2v & 0 \\ 2u & -2v & 0 \end{pmatrix}}{\det\begin{pmatrix} v & u & 0 \\ 2u & 2v & 0 \\ 2u & -2v & -1 \end{pmatrix}} = -\frac{8uv}{2(u^2 - v^2)} = \frac{4uv}{v^2 - u^2} = -\frac{4x}{z} \]
for $z \neq 0$ (note that the denominator is $2(u^2 - v^2) = 2z$ on the solution set, which is why $z \neq 0$ is the right condition). Similarly,
\[ \frac{\partial z}{\partial y} = -\frac{J\!\left(\frac{F,G,H}{u,v,y}\right)}{J\!\left(\frac{F,G,H}{u,v,z}\right)} = -\frac{\det\begin{pmatrix} v & u & 0 \\ 2u & 2v & -1 \\ 2u & -2v & 0 \end{pmatrix}}{\det\begin{pmatrix} v & u & 0 \\ 2u & 2v & 0 \\ 2u & -2v & -1 \end{pmatrix}} = -\frac{-(2u^2 + 2v^2)}{2(u^2 - v^2)} = \frac{u^2 + v^2}{u^2 - v^2} = \frac{y}{z} \]
for $z \neq 0$.
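Here the answers can be confirmed directly, since eliminating $u$ and $v$ gives $y^2 - z^2 = (u^2 + v^2)^2 - (u^2 - v^2)^2 = 4u^2v^2 = 4x^2$, i.e. $z = \pm\sqrt{y^2 - 4x^2}$. A sketch comparing finite differences of this explicit solution with the formulas just obtained:

```python
import math

# On the branch z > 0 we have z = sqrt(y^2 - 4x^2). Finite differences of this
# explicit solution should reproduce dz/dx = -4x/z and dz/dy = y/z.

def z_of(x, y):
    return math.sqrt(y * y - 4 * x * x)

u, v = 2.0, 1.0
x, y, z = u * v, u * u + v * v, u * u - v * v   # (x, y, z) = (2, 5, 3)

h = 1e-7
dz_dx = (z_of(x + h, y) - z_of(x - h, y)) / (2 * h)
dz_dy = (z_of(x, y + h) - z_of(x, y - h)) / (2 * h)

print(dz_dx, -4 * x / z)   # both approximately -8/3
print(dz_dy, y / z)        # both approximately 5/3
```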

Here is the theorem on which the method discussed above is based, the Implicit Function Theorem:

Theorem. Let $x = (x_1, \dots, x_n)$, $y = (y_1, \dots, y_m)$ and suppose that
\[ F_1(x, y) = 0, \dots, F_m(x, y) = 0 \tag{1} \]
is a system of $m$ equations in the $m + n$ variables $x_1, \dots, x_n, y_1, \dots, y_m$ and
(a) $P_0 = (a, b)$ is a solution of (1),
(b) all the first partials of $F_1, \dots, F_m$ exist and are continuous in a neighbourhood of $P_0$, and
(c) $\frac{\partial(F_1, \dots, F_m)}{\partial(y_1, \dots, y_m)} \neq 0$ at $P_0$.


Then, in a neighbourhood of $P_0$, the system (1) defines $y_1, \dots, y_m$ as functions of $x_1, \dots, x_n$ (or, one could say, $y$ as a function of $x$); more precisely, there exist neighbourhoods $U$ of $a$ and $V$ of $b$ and functions $Y_1(x), \dots, Y_m(x)$ defined on $U$ such that, for all $x$ in $U$, $(x, y)$ is a solution of (1) with $y$ in $V$ if and only if $y = (Y_1(x), \dots, Y_m(x))$. So, for $x$ in $U$ and $y$ in $V$, the system (1) is equivalent to $y_1 = Y_1(x), \dots, y_m = Y_m(x)$. Moreover, each of the partials $\frac{\partial Y_i}{\partial x_j}(x)$ exists and is continuous on $U$ and
\[ \frac{\partial Y_i}{\partial x_j}(x) = -\frac{\frac{\partial F}{\partial z}(x, Y(x))}{\frac{\partial F}{\partial y}(x, Y(x))} \]
where
\[ \frac{\partial F}{\partial y} = \frac{\partial(F_1, \dots, F_m)}{\partial(y_1, \dots, y_m)}, \qquad Y(x) = (Y_1(x), \dots, Y_m(x)) \]
and $z$ is obtained from $y$ by replacing $y_i$ by $x_j$.

Exercises:
(1) (Cramer's Rule) Verify the following and express the results in terms of determinants.
(i) The system of 3 linear equations $a_i x + b_i y + c_i z = d_i$ ($i = 1, 2, 3$) can be written as a single vector equation, namely
\[ x\mathbf{a} + y\mathbf{b} + z\mathbf{c} = \mathbf{d}. \tag{*} \]
When $\mathbf{a} \cdot (\mathbf{b} \times \mathbf{c}) \neq 0$, there is a unique solution:
\[ x = \frac{\mathbf{d} \cdot (\mathbf{b} \times \mathbf{c})}{\mathbf{a} \cdot (\mathbf{b} \times \mathbf{c})}, \quad y = \frac{\mathbf{a} \cdot (\mathbf{d} \times \mathbf{c})}{\mathbf{a} \cdot (\mathbf{b} \times \mathbf{c})}, \quad z = \frac{\mathbf{a} \cdot (\mathbf{b} \times \mathbf{d})}{\mathbf{a} \cdot (\mathbf{b} \times \mathbf{c})}. \]
(Hint: For $x$, dot both sides of (*) with $\mathbf{b} \times \mathbf{c}$.)
(ii) For 2 linear equations, $a_1 x + b_1 y = d_1$, $a_2 x + b_2 y = d_2$, we can obtain a similar result by using (i) with $a_3 = b_3 = d_3 = 0$ and $c_1 = c_2 = 0$, $c_3 = 1$.
(2) In the following, compute the indicated derivatives. At any point that satisfies the equations, what condition on the variables guarantees that the implied functional relationship exists in some neighbourhood of the point?
(a) $\sin(xyz) + x^2 + y^2 + z^2 = 2$; $\frac{\partial z}{\partial x}$.
(b) $2x^2 + 3y^2 + 4z^2 = 12$, $x - yz = 2$; $\frac{dy}{dx}$ and $\frac{dz}{dx}$.
(c) $x = u^2 - v^2$, $y = 2uv$; $u_x$, $u_y$, $v_x$, $v_y$, $u_{xx}$, where in each case $u$ and $v$ are functions of $x$ and $y$.
(d) $xu^2 + v = y^3$, $2yu - xv^3 = 4x$; $\left(\frac{\partial u}{\partial x}\right)_y$, $\left(\frac{\partial x}{\partial y}\right)_v$.

(3) $xy^2 + zu + v^2 = 3$, $x^3 z + 2y - uv = 2$, $xu + yv - xyz = 1$.


Show that these equations define $x$, $y$, $z$ as functions of $u$ and $v$ in a neighbourhood of the point $P_0(x, y, z, u, v) = (1, 1, 1, 1, 1)$ and find $\frac{\partial y}{\partial u}$ when $(u, v) = (1, 1)$.
(4) (a) Show that in a neighbourhood of $P_0(x, y, u, v) = (1, 1, 2, 1)$, the equations $xu + yv - uv = 1$, $xv + yu - xy = 2$ define $u$ and $v$ as functions of $(x, y)$, and $u_x(1, 1) = -2$.
(b) Show that in a neighbourhood of $P_0$ the equations in (a) define $y$ and $v$ as functions of $(x, u)$, and $\frac{\partial y}{\partial x}(1, 2) = -1$.
(5) Given 4 equations in 7 variables $x, y, \dots$, how many possible interpretations are there for $\frac{\partial y}{\partial x}$?
(6) Suppose $y_1, \dots, y_m$ are differentiable functions of $x_1, \dots, x_n$. Define the Jacobian matrix of $y = (y_1, \dots, y_m)$ with respect to $x = (x_1, \dots, x_n)$ to be the $m \times n$ matrix whose $(i, j)$th entry is $\frac{\partial y_i}{\partial x_j}$. We denote it by either $D\!\left(\frac{y_1, \dots, y_m}{x_1, \dots, x_n}\right)$ or $D_x y$. If now $x_1, \dots, x_n$ are differentiable functions of $t = (t_1, \dots, t_p)$, show that
(a) $D_t y = D_x y \, D_t x$ (matrix multiplication);
(b) if $m = n = p$, then $\frac{\partial y}{\partial t} = \frac{\partial y}{\partial x} \frac{\partial x}{\partial t}$;
(c) if $m = n = p$ and $t = y$ and the functions $x \mapsto y$ and $y \mapsto x$ are inverses of each other, then $D_x y$ and $D_y x$ are inverses of each other, and $\frac{\partial y}{\partial x}$ and $\frac{\partial x}{\partial y}$ are reciprocals of each other.
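For systems like the one in exercise (3), hypothesis (c) of the theorem can be checked numerically. The sketch below (the names F, G, H for the three equations and the use of central-difference partials are assumptions of this sketch; it verifies the hypothesis only, not the whole exercise) evaluates $\partial(F,G,H)/\partial(x,y,z)$ at $P_0 = (1,1,1,1,1)$:

```python
# Check condition (c): the Jacobian d(F,G,H)/d(x,y,z) must be nonzero at P0.

def det3(M):
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
          - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
          + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

def partial(f, args, i, h=1e-6):
    a = list(args); a[i] += h; hi = f(*a)
    a[i] -= 2 * h; lo = f(*a)
    return (hi - lo) / (2 * h)

F = lambda x, y, z, u, v: x * y**2 + z * u + v**2 - 3
G = lambda x, y, z, u, v: x**3 * z + 2 * y - u * v - 2
H = lambda x, y, z, u, v: x * u + y * v - x * y * z - 1

P0 = (1.0, 1.0, 1.0, 1.0, 1.0)
# Dependent variables are x, y, z, i.e. argument positions 0, 1, 2.
J = [[partial(f, P0, i) for i in (0, 1, 2)] for f in (F, G, H)]
print(round(det3(J)))  # 4, which is nonzero, so the theorem applies at P0
```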