Wednesday, December 3, 2014

Chain rules

Suppose we want to differentiate the expression $f(x,y)=x^2y^3$ with respect to $t$, where $x$ and $y$ are both functions of $t$. The multivariable calculus method is to use the multivariable chain rule:
$$\frac{df}{dt}=\frac{\partial f}{\partial x}\frac{dx}{dt}+\frac{\partial f}{\partial y}\frac{dy}{dt},$$
which would leave us with $2xy^3 x'+3x^2y^2y'$. However, one could do this completely with single variable calculus:
\begin{align*}
\frac{d}{dt}(x^2y^3)&=\left(\frac{d}{dt} x^2\right)y^3+x^2\left(\frac{d}{dt}y^3\right)\\
&=\left(2x \frac{dx}{dt}\right) y^3+x^2\left(3y^2\frac{dy}{dt}\right)
\end{align*}
which gives the same answer. So the question arises whether one can derive the multivariable chain rule this way. At least for a function built up from standard functions (addition, multiplication, exponentiation, trig functions and so forth), one can: use the product rule, the one-variable chain rule and so forth to push the derivative through the expression until it hits pure functions of $x$ or $y$. Taking the derivative at this final stage multiplies by $\frac{dx}{dt}$ or $\frac{dy}{dt}$ accordingly. Collecting the coefficients of $\frac{dx}{dt}$ gathers exactly those instances where the derivative hit a pure function of $x$, and these add up to $\frac{\partial f}{\partial x}$; similarly, the coefficients of $\frac{dy}{dt}$ add up to $\frac{\partial f}{\partial y}$.
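
To see the bookkeeping on something slightly less trivial, here is a small worked example. The function $\sin(x^2y)$ is chosen here purely for illustration; only the one-variable chain rule and the product rule are used:
\begin{align*}
\frac{d}{dt}\sin(x^2y)&=\cos(x^2y)\,\frac{d}{dt}(x^2y)\\
&=\cos(x^2y)\left(2x\frac{dx}{dt}\,y+x^2\frac{dy}{dt}\right)\\
&=\underbrace{2xy\cos(x^2y)}_{\partial f/\partial x}\,\frac{dx}{dt}+\underbrace{x^2\cos(x^2y)}_{\partial f/\partial y}\,\frac{dy}{dt}
\end{align*}
The coefficient of $\frac{dx}{dt}$ is exactly what one gets by differentiating $\sin(x^2y)$ in $x$ with $y$ held fixed, and likewise for $\frac{dy}{dt}$.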

One could make this argument more rigorous by arguing recursively: show that the rule holds for pure functions of $x$ and of $y$, and then show that if it holds for given functions, it still holds after applying a one-variable function to one of them, multiplying two of them together, etc. For example, suppose that we know the multivariable chain rule for $f(x,y)$ and $g(x,y)$. Then we can show it holds for $f(x,y)g(x,y)$ by differentiating in the one-variable sense, where subscripts denote derivatives, so that $f_t$ is the derivative of $f(x(t),y(t))$ with respect to $t$:
\begin{align*}
\frac{d}{dt}(f(x,y)g(x,y))&=f_t g+f g_t\\
&=(f_x x_t+f_y y_t)g+f(g_xx_t+g_yy_t)\\
&=(f_x g+f g_x)x_t+(f_yg+fg_y)y_t\\
&=(fg)_xx_t+(fg)_yy_t
\end{align*}
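
The step for applying a one-variable function goes the same way. As a sketch, assume $h$ is a differentiable function of one variable (the name $h$ is introduced here only for illustration) and that the multivariable chain rule is already known for $f(x,y)$:
\begin{align*}
\frac{d}{dt}h(f(x,y))&=h'(f)f_t\\
&=h'(f)(f_xx_t+f_yy_t)\\
&=(h'(f)f_x)x_t+(h'(f)f_y)y_t\\
&=(h\circ f)_xx_t+(h\circ f)_yy_t
\end{align*}
The last line uses the one-variable chain rule once more, this time in $x$ with $y$ held fixed (and in $y$ with $x$ held fixed), to recognize $h'(f)f_x$ as $(h\circ f)_x$ and $h'(f)f_y$ as $(h\circ f)_y$.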

I don't see a way to prove the multivariable chain rule as a consequence of the one-variable case for general functions, as opposed to some recursively constructed class of functions. I would be interested if anyone has any ideas.
