The chain rule
Before reading this guide, it is recommended that you read Guide: Introduction to differentiation and the derivative.
Narration of study guide: |
What is the chain rule?
In Guide: Introduction to differentiation and the derivative, you saw how valuable the idea of a derivative of a function is in determining the behaviour of that function. For instance, the derivative can be used to show if a function is increasing or decreasing at a point. Differentiation is commonly used in many subjects (physics, chemistry, biology, economics to name a few) to analyse behaviour of systems that change, and equations involving derivatives can be solved to explain this behaviour.
It was mentioned in that same guide that you are able to differentiate certain combinations of functions, such as the sum and difference of two functions, or scalar multiple of a single function. You need extra techniques to differentiate products, quotients, and compositions of functions; you will need the product rule, quotient rule, and chain rule respectively.
This guide will look at the chain rule for differentiation in order to find the derivative of a composition \(u(v(x))\) of two functions. This guide explains the rule, where it comes from, and how the chain rule can be used.
The summary table of key derivatives from Guide: Introduction to differentiation and the derivative is reproduced here for reference:
Function \(f(x)\) | Derivative \(f'(x)\) | Notes |
---|---|---|
\(f(x) = c\) | \(f'(x) = 0\) | |
\(f(x) = ax + b\) | \(f'(x) = a\) | |
\(f(x) = ax^n\) | \(f'(x) = anx^{n-1}\) | \(n \neq 0\) |
\(f(x) = ae^{bx}\) | \(f'(x) = abe^{bx}\) | |
\(f(x) = a\sin(bx)\) | \(f'(x) = ab\cos(bx)\) | |
\(f(x) = a\cos(bx)\) | \(f'(x) = -ab\sin(bx)\) | |
\(f(x) = a\ln(bx)\) | \(f'(x) = \dfrac{a}{x}\) |
Statement of the chain rule
Here is the statement of the chain rule:
Let \(u(x)\) and \(v(x)\) be two differentiable functions. Then the chain rule says that \[\left(u\circ v\right)'(x) = \frac{\mathrm{d}}{\mathrm{d}x}\left(u(v(x))\right) = u'(v(x))\cdot v'(x)\] that is, the derivative of \(u(x)\) composed with \(v(x)\) with respect to \(x\) is equal to the product of the derivative of \(u\) with respect to \(v\) and the derivative of \(v\) with respect to \(x\).
This can also be written as \[\frac{\mathrm{d}}{\mathrm{d}x}\left(u(v(x))\right) = \frac{\textrm{d}u}{\textrm{d}v}\cdot \frac{\textrm{d}v}{\textrm{d}x}.\]
It’s really important to keep track of variables throughout the chain rule.
The discovery of the chain rule is often credited to Gottfried Leibniz (link to Mactutor biography, external site), one of the co-founders of calculus (along with Isaac Newton); although he only used it on functions of the form \(\sqrt{a + bx + cx^2}\). The first ‘modern’ mention of the chain rule, over a century after Leibniz’ original usage, is due to Joseph-Louis Lagrange (link to Mactutor biography, external site).
Unlike the product rule (see Guide: The product rule), the choice of functions for the chain rule is prescribed; \(u(x)\) is the function on the ‘outside’ and \(v(x)\) is the function on the ‘inside’.
Sometimes \(f(x)\) and \(g(x)\) are written instead of \(u(x)\) and \(v(x)\) in the statement of the product rule. The reason that \(u(x)\) and \(v(x)\) is used here is that sometimes the function itself is called \(f(x)\); and you then can’t use it again!
To see why this really is the derivative of the quotient of two functions, please see Proof sheet: Rules of differentiation.
Examples
Although the statement of the chain rule uses more than one variable, all of your final answers should be in terms of the variable on the inside. This means that any derivative of \(f(g(x))\) must be in terms of \(x\), any derivative of \(u(v(t))\) must be in terms of \(t\) and so on.
What is the derivative of \(y = e^{x^3}\)?
In this case, you have a composition of two functions making \(y\). The idea is to write \(y\) as \(e^v\) where \(v\) is a function of \(x\). So here, the two functions are determined to be \(y = e^v\) and \(v(x) = x^3\). In order to use the chain rule on \(y\), you could differentiate \(y\) with respect to \(u\) and \(v\) with respect to \(x\) beforehand and then substitute them into the statement of the quotient rule. Doing this gives
For \(y(u) = e^u\), then \(\dfrac{\mathrm{d}y}{\mathrm{d}u} = e^u\).
For \(u(x) = x^3\), then \(\dfrac{\mathrm{d}u}{\mathrm{d}x} = 3x^2\).
Putting these into the statement of the chain rule gives
\[ \begin{aligned} \frac{\textrm{d}y}{\textrm{d}x} &= \dfrac{\mathrm{d}y}{\mathrm{d}v}\cdot \dfrac{\mathrm{d}v}{\mathrm{d}x} = e^u\cdot 3x^2 \end{aligned} \] but as mentioned, you should always leave your answer in terms of \(x\). Luckily, \(u = x^3\), and putting this in gives \[\frac{\textrm{d}y}{\textrm{d}x} = 3x^2e^{x^3}\] and this is your final answer.
You don’t need to be so rigorous in your own working. Here’s another example of using the chain rule.
What is the derivative of \(f(x) = \ln(x^5)\) with respect to \(x\)?
You can use the chain rule on \(f(x)\) to differentiate with respect to \(x\). You can infer that \(u(v) = \ln(v)\) and \(v(x) = x^5\). Differentiating \(u\) with respect to \(v\) gives \(u'(v(x)) = 1/v = 1/x^5\) and differentiating \(v\) with respect to \(x\) gives \(v'(x) = 5x^4\). Putting these into the statement of the chain rule gives
\[ \begin{aligned} f'(x) &= u'(v(x))\cdot v'(x) = \dfrac{1}{x^5}\cdot 5x^4 = \dfrac{5x^4}{x^5} = \frac{5}{x} \end{aligned} \]
and this is your answer. Here, the answer has been simplified by using the laws of indices.
Here’s an example which will save you a lot of tedious bracket expanding!
What is the derivative of \(y = (x^3 + 5)^7\) with respect to \(x\)?
Prior to the chain rule, this may have required using the binomial theorem to expand these brackets and differentiate term by term (see [Guide: Binomial theorem] for more). However, you can use the chain rule on \(y\) to differentiate with respect to \(x\). You can infer that \(u = x^3+5\) and \(y= u^7\). Differentiating \(y\) with respect to \(u\) gives \(\dfrac{\mathrm{d}y}{\mathrm{d}u} = 7u^6\), and differentiating \(u\) with respect to \(x\) gives \(\dfrac{\mathrm{d}u}{\mathrm{d}x} = 3x^2\). Putting these into the statement of the chain rule gives
\[ \begin{aligned} \dfrac{\mathrm{d}y}{\mathrm{d}x} &= \dfrac{\mathrm{d}y}{\mathrm{d}u}\cdot \dfrac{\mathrm{d}u}{\mathrm{d}x} = 7u^6\cdot 3x^2 = 21x^2(x^3 + 5)^6 \end{aligned} \]
and this is your answer.
The moral of this story (as it is also in the discussion of the product rule; see Guide: The product rule) is as follows:
If you are differentiating a function comprising of brackets to a large power, don’t expand the brackets! This only leads to extra work - and more opportunities for mistakes. Use the chain rule instead!
Here’s an example that combines the sum rule and the chain rule.
Find the derivative of \(y = \sqrt{4t^4 - 1} + (t^4 + \sqrt[3]{t} + t^{-2})^3\).
In this case, you can see that there are two functions comprising \(y\). Here, \(y = f + g\) where \[f(t) = \sqrt{2t^4 - 1} = (2t^4 - 1)^{1/2}\quad\textsf{ and }\quad g(t) = (t^4 + \sqrt{t} + t^{-2})^{3}\] You can notice that both \(f\) and \(g\) are compositions of functions. So you’ll need to use the chain rule on each function separately in order to find the derivative of \(y\). Let’s do these in separate work areas, so that things aren’t confused.
For the derivative of \(f\), you can notice that \[f(u) = \sqrt{u} = u^{1/2}\quad\textsf{ and } u= 2t^4 - 1\] Differentiating \(f\) with respect to \(u\) and \(u\) with respect to \(t\) gives \[\dfrac{\mathrm{d}f}{\mathrm{d}u} = (1/2)u^{-1/2} = \frac{1}{2\sqrt{u}}\quad\textsf{ and }\quad \dfrac{\mathrm{d}u}{\mathrm{d}t} = 8t^3\] Putting this into the formula for the chain rule and simplifying gives \[\begin{aligned}\dfrac{\mathrm{d}f}{\mathrm{d}t} &= \dfrac{\mathrm{d}f}{\mathrm{d}u}\cdot \dfrac{\mathrm{d}u}{\mathrm{d}t} = \frac{1}{2\sqrt{u}}\cdot 8t^3 = \frac{8t^3}{2\sqrt{2t^4 + 1}} \end{aligned}\] and this is the derivative of \(f\) with respect to \(t\).
For the derivative of \(g\), you can write that \[g(v) = v^3 \quad\textsf{ and }\quad v = t^4 + \sqrt{t} + t^{-2}\] (Notice the use of \(v\) instead of \(u\) here!) Differentiating \(g\) with respect to \(v\) and \(v\) with respect to \(t\) gives \[\dfrac{\mathrm{d}g}{\mathrm{d}v} = 3v^2 \quad\textsf{ and }\quad \dfrac{\mathrm{d}v}{\mathrm{d}t} = 4t^3 + \frac{t^{-1/2}}{2} -2t^{-3}\] Putting this into the formula for the chain rule and simplifying gives
\[ \begin{aligned} \frac{\mathrm{d}g}{\mathrm{d}t} &= \frac{\mathrm{d}g}{\mathrm{d}v}\cdot \frac{\mathrm{d}v}{\mathrm{d}t} = (3v^2)\cdot (4t^3 + \frac{t^{-1/2}}{2} -2t^{-3})\\[0.5em] &= \left(12t^3 + \frac{3t^{-1/2}}{2} - 6t^{-3}\right)(t^4 + \sqrt{t} + t^{-2})^{2} \end{aligned} \] and this is the derivative of \(g\) with respect to \(t\).
Finally, using the sum rule, you can say that \[ \begin{aligned} \frac{\mathrm{d}y}{\mathrm{d}t} &= \frac{\mathrm{d}f}{\mathrm{d}t} + \frac{\mathrm{d}g}{\mathrm{d}t}\\[0.5em] &= \frac{8t^3}{2\sqrt{2t^4 + 1}} + \left(12t^3 + \frac{3t^{-1/2}}{2} - 6t^{-3}\right)(t^4 + \sqrt{x} + x^{-2})^{2} \end{aligned} \] and this is your final answer.
Here’s another example:
Find the derivative of the function \(f(x) = \cos^4(x) - \sin^4(x)\).
Both terms of the function \(\cos^4(x)\) and \(\sin^4(x)\) require the chain rule to differentiate. As in Example 3, you will need to be careful with this and do each term separately.
To differentiate \(\cos^4(x)\), you can recognize this as a function \(g(h) = h^4\) (so \(g'(h(x)) = 4h^3\)) where \(h(x) = \cos(x)\) (so \(h'(x) = -\sin(x)\)). Using the chain rule to differentiate this gives \[\begin{aligned} (g(h(x))' &= g'(h(x))\cdot h'(x)\\[0.5em] &= 4h^3\cdot (-\sin(x)) = -4\cos^3(x)\sin(x) \end{aligned}\] and so the derivative of \(\cos^4(x)\) is \(-4\cos^3(x)\sin(x)\).
To differentiate \(\sin^4(x)\), you can recognize this as a function \(u(v) = v^4\) (so \(u'(v(x)) = 4v^3\)) where \(v(x) = \sin(x)\) (so \(v'(x) = \cos(x)\)). Using the chain rule to differentiate this gives \[\begin{aligned} (u(v(x))' &= u'(v(x))\cdot v'(x)\\[0.5em] &= 4v^3\cdot (\cos(x)) = 4\sin^3(x)\sin(x) \end{aligned}\] and so the derivative of \(\sin^4(x)\) is \(4\sin^3(x)\cos(x)\).
Therefore, putting these together using the difference rule gives: \[f'(x) = -(4\cos^3(x)\sin(x) + 4\sin^3(x)\cos(x)\] and this could be your final answer.
However, a little bit of factorization and knowledge of trigonometric identities (see Guide: Trigonometric identities (radians)) you can trim down this answer. As \(\cos^2(x) + \sin^2(x) = 1\) and \(\sin(2x) = 2\cos(x)\sin(x)\), you can write
\[ \begin{aligned} f'(x) &= -(4\cos^3(x)\sin(x) + 4\sin^3(x)\cos(x)\\[0.5em] &= -4\cos(x)\sin(x)\left(\cos^2(x) + \sin^2(x)\right)\\[0.5em] &= -4\cos(x)\sin(x) = -2\sin(2x) \end{aligned} \]
and this is another acceptable final answer.
The trimmed answer in Example 5 is nice; you can recognize this as the derivative of \(\cos(2x)\). So was all of this necessary? The answer is no; you don’t always need the chain rule to differentiate if the expression you are trying to differentiate can be written in a different way.
For instance, the function in Example 4 is \[f(x) = \cos^4(x) - \sin^4(x)\] which, by the difference of two squares is equal to \[f(x) = (\cos^2(x) + \sin^2(x))(\cos^2(x) - \sin^2(x)).\] Using trigonometric identities, you can say that \(\cos^2(x) + \sin^2(x) = 1\) and \(\cos(2x) = \cos^2(x)-\sin^2(x)\), and so \(f(x) = \cos(2x)\). This can then be differentiated, with the same answer as in Example 4!
In short, it’s always good to explore alternative ways of writing mathematical expressions to save you a little work.
Here’s another example which requires multiple uses of the chain rule.
Find the derivative of \(y = \sin\left(\ln\left(x^{-2}\right)\right)\).
In this situation you can write \(y\) as \(y(u(x))\), where \[y(u) = \sin(u) \quad \textsf{ and }\quad u(x) = \ln(x^{-2})\] Differentiating \(y(u)\) with respect to \(u\) is something that you can do, but you can see that in order to differentiate \(u(x)\) with respect to \(x\), you’ll need to apply the chain rule!
So, to find the derivative of \(u\) with respect to \(x\), you can write \[u(v) = \ln(v) \quad \textsf{ and }\quad v(x) = x^{-2}\] Differentiating \(u\) with respect to \(v\) and \(v\) with respect to \(x\) gives \[\frac{\textrm{d}u}{\textrm{d}v} = \frac{1}{v} \quad \textsf{ and }\quad \frac{\textrm{d}v}{\textrm{d}x} = -2x^{-3}\] Putting this into the formula for the chain rule (and carefully using the laws of indices) gives \[\frac{\textrm{d}u}{\textrm{d}x} = \frac{\textrm{d}u}{\textrm{d}v}\cdot\frac{\textrm{d}v}{\textrm{d}x} = \frac{1}{v}\cdot (-2x^{-3}) = \frac{-2x^{-3}}{x^{-2}} = -2x^{-1} = -\frac{2}{x}\]
Now, you can go back to the initial problem. Thanks to your work differentiating \(u\) with respect to \(x\), you can say that \[\frac{\textrm{d}y}{\textrm{d}u} = \cos(u) \quad \textsf{ and }\quad \frac{\textrm{d}u}{\textrm{d}x} = -\frac{2}{x}\] Finally, putting this into the formula for the chain rule, and remembering to give your answer only in terms of \(x\), gives \[\frac{\textrm{d}y}{\textrm{d}x} = \frac{\textrm{d}y}{\textrm{d}u}\cdot\frac{\textrm{d}u}{\textrm{d}x} = \cos(u)\cdot\left(-\frac{2}{x}\right) = \frac{-2\cos(\ln(x^{-2}))}{x}\]
and this is your final answer.
Since \(\frac{\textrm{d}u}{\textrm{d}x} = \frac{\textrm{d}u}{\textrm{d}v}\cdot\frac{\textrm{d}v}{\textrm{d}x}\), it follows that for the composition of three functions \(y = u(v(x))\), the derivative is given by: \[\frac{\textrm{d}y}{\textrm{d}x} = \frac{\textrm{d}y}{\textrm{d}u}\cdot\frac{\textrm{d}u}{\textrm{d}v}\cdot\frac{\textrm{d}v}{\textrm{d}x}\] In fact, you can use repeated applications of the chain rule to find the derivative of any finite number of composed functions.
Applications of the chain rule
Derivatives of other common functions
You can use the chain rule to work out some derivatives of some other trigonometric functions. In Guide: Trigonometry (radians), you saw that \[\tan(x) = \frac{\sin(x)}{\cos(x)}\qquad \sec(x) = \frac{1}{\cos(x)}\qquad \csc(x) = \frac{1}{\sin(x)}\qquad \cot(x) = \frac{1}{\tan(x)} = \frac{\cos(x)}{\sin(x)}\] are the other circular trigonometric functions. By using the chain rule, you can find the derivatives of \(\sec(x)\) and \(\csc(x)\).
What is the derivative of \(f(x) = \dfrac{1}{\cos(x)}\) with respect to \(x\)?
You can use the chain rule on \(f(x)\) to differentiate with respect to \(x\). You can infer that \(u(v) = 1/v = v^{-1}\) and \(v(x) = \cos(x)\). Differentiating \(u\) with respect to \(v\) gives \(u'(v(x)) = -v^{-2} = -1/\cos^2(x)\) and differentiating \(v\) with respect to \(x\) gives \(v'(x) = -\sin(x)\). Putting these into the statement of the chain rule gives
\[ \begin{aligned} f'(x) &= u'(v(x))\cdot v'(x) = \left(-\dfrac{1}{\cos^2(x)}\right)\cdot \left(-\sin(x)\right) = \frac{\sin(x)}{\cos^2(x)} \end{aligned} \]
and this is your answer. You could also use the fact that \(\tan(x) = \sin(x)/\cos(x)\) to say that \[f'(x) = \dfrac{\sin(x)}{\cos^2(x)} = \sec(x)\tan(x).\]
In a similar way, you can find the derivative of \(f(x) = \csc(x)\) to be equal to \[f'(x) = -\csc(x)\cot(x).\]
Other rules for differentiation
The process of differentiating \(\sec(x)\) and \(\csc(x)\) can be generalized to find the derivative of the reciprocal of any function, not only \(\cos(x)\) and \(\sin(x)\). If you can use the chain rule to demonstrate a general formula for the derivative of \(1/f(x)\), you can not only use this general formula to differentiate \(\sec(x)\) and \(\csc(x)\), but any other reciprocal function. This is a common technique amongst mathematicians; find the most general formula possible to apply to as many situations as you can. Let’s take a look at the reciprocal rule:
You can use the chain rule to prove the reciprocal rule, which says that if \(f(x)\) is not equal to \(0\) for all \(x\), then \[\left(\frac{1}{f(x)}\right)' = \frac{-f(x)}{(f(x))^2}.\]
The function you are trying to differentiate is \(y = 1/f(x)\), so you can use the chain rule with \(y(u) = 1/u = u^{-1}\) and \(u(x) = f(x)\). Differentiating \(y\) with respect to \(u\) gives \(\dfrac{\mathrm{d}y}{\mathrm{d}u} = -u^{-2} = -1/u^2\), and differentiating \(u\) with respect to \(x\) gives \(\dfrac{\mathrm{d}u}{\mathrm{d}x} = f'(x)\). Putting these into the chain rule gives:
\[ \begin{aligned} \frac{\textrm{d}y}{\textrm{d}x} &= \dfrac{\mathrm{d}y}{\mathrm{d}u}\cdot \dfrac{\mathrm{d}u}{\mathrm{d}x} = \left(-\frac{1}{u^2}\right)\cdot f'(x) = \frac{-f'(x)}{(f(x))^2} \end{aligned} \]
and this is exactly the reciprocal rule.
You could also use the quotient rule to prove the reciprocal rule is true. See Guide: The quotient rule for more.
For instance, you can use the reciprocal rule with \(f(x) = \sin(x)\) (so \(f'(x) = \cos(x)\)) to write that \[\frac{\mathrm{d}}{\mathrm{d}x}\left(\csc(x)\right) = \frac{\mathrm{d}}{\mathrm{d}x}\left(\frac{1}{\sin(x)}\right) = \frac{-(\cos(x))}{\sin^2(x)} = -\csc(x)\cot(x).\]
In fact, you can use the chain rule to prove ‘rules’ for differentation of functions composed with other functions. For instance, in Example 1, you saw that the derivative of \(e^{x^3}\) is equal to \(3x^2e^{x^3}\). Taking \(f(x) = x^3\), you can see that this is equal to the derivative \(f'(x) = 3x^2\) multiplied by \(e^{f(x)}\). It follows that you can use the chain rule to prove that, for any function \(f(x)\): \[\frac{\mathrm{d}}{\mathrm{d}x}\left(e^{f(x)}\right) = f'(x)e^{f(x)}\]
Similarly, in Example 2, you saw that the derivative of \(f(x) = \ln(x^5)\) was \(5x^4/x^5\), which is the derivative of \(x^5\) divided by \(x^5\). This is no coincidence; using the chain rule, you can show that for any function \(f(x)\): \[\frac{\mathrm{d}}{\mathrm{d}x}\left(\ln(f(x))\right) = \frac{f'(x)}{f(x)}\]
Derivatives of scaled functions
In Guide: Introduction to differentiation and the derivative, a list of common derivatives was given, and then almost immediately expanded upon - the list is reproduced above. It was mentioned that the chain rule was used to find these expanded derivatives; but why was this the case? The answer is that is because of the \(bx\) in the arguments of the function. Technically, an expression like \(f(x) = a\cos(bx)\) is the composition of two functions, \(f(u) = a\cos(u)\) and \(u(x) = bx\). Since the derivative of \(u = bx\) is \(u' = b\), this is where the \(b\) comes from in \(f'(x) = -ab\sin(bx)\).
In general, you can apply this to any function of \(x\). Using the chain rule, you can prove that if \(f\) is any differentiable function, and \(a,b,c\) are real numbers with \(b\neq 0\), then \[\frac{\textrm{d}}{\textrm{d}x}\left(af(bx + c)\right) = abf'(bx + c).\]
This then allows you to quickly work out derivatives of powers of linear expressions \(bx + c\):
You can use the chain rule to show that: \[\frac{\textrm{d}}{\textrm{d}x}\left((bx+c)^n\right) = bn(bx+c)^{n-1}.\]
The function you are trying to differentiate is \(y = (bx+c)^n\), so you can use the chain rule with \(y(u) = u^n\) and \(u(x) = bx+c\). Differentiating \(y\) with respect to \(u\) gives \(\dfrac{\mathrm{d}y}{\mathrm{d}u} = nu^{n-1}\), and differentiating \(u\) with respect to \(x\) gives \(\dfrac{\mathrm{d}u}{\mathrm{d}x} = b\). Putting these into the chain rule gives:
\[ \begin{aligned} \frac{\textrm{d}y}{\textrm{d}x} &= \dfrac{\mathrm{d}y}{\mathrm{d}u}\cdot \dfrac{\mathrm{d}u}{\mathrm{d}x} = nu^{n-1}\cdot b = bnu^{n-1} = bn(bx+c)^{n-1} \end{aligned} \]
and this is exactly as said above. A similar method can be applied to powers of any function you know how to differentiate.
Quick check problems
Using the chain rule, match the following six functions to their derivatives with respect to \(x\). One of the derivatives given does not match a function; you should select which one in the final sentence below.
Function | Derivative | |
---|---|---|
1. | \(3\cos(3x^2)\) | |
2. | \(2\sin(2x^3)\) | |
3. | \(2\cos(3x^3)\) | |
4. | \(3\sin(3x^3)\) | |
5. | \(3\sin(3x^2)\) | |
6. | \(3\sin(2x^3)\) |
where
- \(-18x^2\sin(3x^3)\)
- \(18x\cos(2x^3)\)
- \(-18x\sin(3x^2)\)
- \(27x^2\cos(3x^3)\)
- \(-18x^2\sin(3x^2)\)
- \(18x\cos(3x^2)\)
- \(12x^2\cos(2x^3)\)
So is the odd derivative out.
Further reading
For more questions on the subject, please go to Questions: The chain rule.
For more about techniques of differentiation, please see Guide: The product rule, and Guide: The quotient rule.
For more about why the rules and techniques of differentiation are true, please see Proof sheet: Rules of differentiation.
Version history
v1.0: initial version created 05/25 by tdhc.