CommeUnJeu · L2 MP
Probability spaces
Example
Compute the derivative of \(f(x) = (x^3 - 2 x + 1)/(x^2 + 1)\) on \(\mathbb{R}\).
Since \(x^2 + 1 > 0\) for every real \(x\), the denominator never vanishes and \(f\) is differentiable on \(\mathbb{R}\). By the quotient rule: $$ f'(x) = \frac{(3 x^2 - 2)(x^2 + 1) - (x^3 - 2 x + 1)(2 x)}{(x^2 + 1)^2} = \frac{x^4 + 5 x^2 - 2 x - 2}{(x^2 + 1)^2}. $$
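As a sanity check on the algebra above, the closed form can be compared with a symmetric difference quotient at a few points. This is only a numerical sketch: the helper name `central_difference`, the step `h` and the sample points are illustrative choices, not part of the course.

```python
def f(x):
    return (x**3 - 2 * x + 1) / (x**2 + 1)


def f_prime(x):
    # Closed form obtained by the quotient rule in the example.
    return (x**4 + 5 * x**2 - 2 * x - 2) / (x**2 + 1) ** 2


def central_difference(func, x, h=1e-6):
    # Symmetric difference quotient, a numerical stand-in for the derivative.
    return (func(x + h) - func(x - h)) / (2 * h)


sample_points = [-2.0, -0.5, 0.0, 1.0, 3.0]
# Largest gap between the numerical slope and the closed form.
max_gap = max(abs(central_difference(f, x) - f_prime(x)) for x in sample_points)
print(max_gap)
```

A small value of `max_gap` is consistent with the simplified numerator \(x^4 + 5 x^2 - 2 x - 2\); it is not a proof.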
Proposition — Chain rule
Let \(f : I \to \mathbb{R}\) with \(f(I) \subset J\), and \(g : J \to \mathbb{R}\). If \(f\) is differentiable at \(a \in I\) and \(g\) is differentiable at \(b = f(a) \in J\), then \(g \circ f\) is differentiable at \(a\) and $$ \textcolor{colorprop}{(g \circ f)'(a) = g'(f(a)) \cdot f'(a)}. $$
Define an auxiliary function \(\tau : J \to \mathbb{R}\) by $$ \tau(y) = \begin{cases} (g(y) - g(b))/(y - b) & \text{if } y \ne b, \\
g'(b) & \text{if } y = b. \end{cases} $$ By the differentiability of \(g\) at \(b\), \(\tau(y) \to g'(b) = \tau(b)\) as \(y \to b\), so \(\tau\) is continuous at \(b\). By definition, \(g(y) - g(b) = \tau(y) (y - b)\) holds for all \(y \in J\) (including \(y = b\), both sides being \(0\)).
Substitute \(y = f(a + h)\) for \(h\) such that \(a + h \in I\): $$ g(f(a + h)) - g(f(a)) = \tau(f(a + h)) \, (f(a + h) - f(a)). $$ For \(h \ne 0\), divide by \(h\): $$ \tau_a^{g \circ f}(h) = \tau(f(a + h)) \cdot \tau_a^f(h). $$ As \(h \to 0\): \(f\) is continuous at \(a\) (P1.2) so \(f(a + h) \to b\), and \(\tau\) is continuous at \(b\) so \(\tau(f(a + h)) \to g'(b)\); finally \(\tau_a^f(h) \to f'(a)\). The product tends to \(g'(b) f'(a)\).
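The factorisation used in the proof can be observed numerically. The sketch below takes the illustrative choice \(f(x) = x^3 + 1\), \(g(y) = y^5\), \(a = 1\) (so \(b = 2\) and \(g'(b) f'(a) = 80 \cdot 3 = 240\)); the function names are assumptions made for this example only.

```python
def f(x):
    return x**3 + 1


def g(y):
    return y**5


def g_prime(y):
    return 5 * y**4


a = 1.0
b = f(a)


def tau(y):
    # Slope function of g at b, extended by g'(b) at y = b as in the proof.
    if y == b:
        return g_prime(b)
    return (g(y) - g(b)) / (y - b)


def rate_f(h):
    # Difference quotient of f at a.
    return (f(a + h) - f(a)) / h


def rate_composite(h):
    # Difference quotient of g o f at a.
    return (g(f(a + h)) - g(f(a))) / h


steps = [10.0 ** (-k) for k in range(1, 7)]
factored = [tau(f(a + h)) * rate_f(h) for h in steps]
direct = [rate_composite(h) for h in steps]
for h, p, d in zip(steps, factored, direct):
    print(h, p, d)
```

Both columns agree up to rounding and approach \(240\) as \(h\) shrinks, which is the content of the identity \(\tau_a^{g \circ f}(h) = \tau(f(a + h)) \cdot \tau_a^f(h)\).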
I
Tribes and probability spaces
Example
Compute the derivatives of \((x^3 + 1)^5\) and \(\sqrt{x^2 + 1}\) on \(\mathbb{R}\) via the chain rule.
For \(f(x) = (x^3 + 1)^5\): take \(u(x) = x^3 + 1\) (\(u'(x) = 3 x^2\)), \(v(y) = y^5\) (\(v'(y) = 5 y^4\)). By chain rule, \(f'(x) = v'(u(x)) \cdot u'(x) = 5 (x^3 + 1)^4 \cdot 3 x^2 = 15 x^2 (x^3 + 1)^4\).
For \(g(x) = \sqrt{x^2 + 1}\): take \(u(x) = x^2 + 1\) (\(u'(x) = 2 x\)), \(v(y) = \sqrt{y}\) defined on \(y > 0\). Derivative of \(v\) at \(a > 0\): for \(h \ne 0\) with \(a + h > 0\), $$ \frac{\sqrt{a + h} - \sqrt{a}}{h} = \frac{(\sqrt{a + h} - \sqrt{a})(\sqrt{a + h} + \sqrt{a})}{h (\sqrt{a + h} + \sqrt{a})} = \frac{1}{\sqrt{a + h} + \sqrt{a}}. $$ As \(h \to 0\), this tends to \(1/(2 \sqrt{a})\), so \(v'(a) = 1/(2 \sqrt{a})\). Then \(g'(x) = v'(u(x)) \cdot u'(x) = (2 x)/(2 \sqrt{x^2 + 1}) = x/\sqrt{x^2 + 1}\).
I.1
Tribe and probabilisable space
Proposition — Derivative of the inverse map
Let \(I\) be an interval, \(f : I \to \mathbb{R}\) continuous and strictly monotone on \(I\), \(J = f(I)\). Suppose \(f\) is differentiable at \(a \in I\) with \(f'(a) \ne 0\). Then \(f^{-1} : J \to I\) is differentiable at \(b = f(a)\) and $$ \textcolor{colorprop}{(f^{-1})'(b) = \frac{1}{f'(a)} = \frac{1}{f'(f^{-1}(b))}}. $$
- Step 1: \(f^{-1}\) is continuous at \(b\). By the bijection theorem (Limits and continuity T7.2), \(f^{-1} : J \to I\) is continuous and strictly monotone on \(J\).
- Step 2: limit of the inverse difference quotient. Take \(k\) such that \(b + k \in J\) and \(k \ne 0\), and set \(h = f^{-1}(b + k) - a\), so \(a + h = f^{-1}(b + k) \in I\) and \(h \ne 0\) (strict monotonicity of \(f^{-1}\) gives \(h \ne 0\) when \(k \ne 0\)). Then \(f(a + h) = b + k\), so \(f(a + h) - f(a) = k\), hence $$ \tau_b^{f^{-1}}(k) = \frac{f^{-1}(b + k) - f^{-1}(b)}{k} = \frac{h}{f(a + h) - f(a)} = \frac{1}{\tau_a^f(h)}. $$ As \(k \to 0\), \(h = f^{-1}(b + k) - a \to 0\) by continuity of \(f^{-1}\) at \(b\) (Step 1); hence \(\tau_a^f(h) \to f'(a) \ne 0\), and \(1/\tau_a^f(h) \to 1/f'(a)\).
- Step 3: conclusion. \(f^{-1}\) is differentiable at \(b\) and \((f^{-1})'(b) = 1/f'(a) = 1/f'(f^{-1}(b))\).
Figure --- symmetry between graphs of \(f\) and \(f^{-1}\)
Example
Show that \(g : \mathbb{R} \to \mathbb{R}\), \(g(y) = \sqrt[3]{y} = y^{1/3}\), is differentiable on \(\mathbb{R} \setminus \{0\}\) and compute \(g'(b)\) for \(b \ne 0\).
\(g\) is the inverse of \(f : \mathbb{R} \to \mathbb{R}\), \(f(x) = x^3\) (continuous, strictly increasing on \(\mathbb{R}\)). \(f\) is differentiable on \(\mathbb{R}\) with \(f'(x) = 3 x^2\), which vanishes only at \(x = 0\). For \(b \ne 0\), let \(a = g(b) = \sqrt[3]{b} \ne 0\), then \(f'(a) = 3 a^2 \ne 0\) and P2.5 gives $$ g'(b) = \frac{1}{f'(a)} = \frac{1}{3 a^2} = \frac{1}{3 (\sqrt[3]{b})^2}. $$ At \(b = 0\): \(f'(0) = 0\), the hypothesis of P2.5 fails; \(g\) admits a vertical tangent at \(0\) (the tangent of \(f\) at \(0\) being horizontal).
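The formula \(g'(b) = 1/(3 (\sqrt[3]{b})^2)\) and the vertical tangent at \(0\) can both be seen on difference quotients. The following is a minimal sketch; `cube_root`, `cube_root_prime` and `difference_quotient` are hypothetical helper names introduced here.

```python
import math


def cube_root(y):
    # Real cube root, also valid for negative inputs.
    return math.copysign(abs(y) ** (1.0 / 3.0), y)


def cube_root_prime(b):
    # Inverse-map formula: 1 / f'(a) with a = cube_root(b) and f(x) = x**3.
    a = cube_root(b)
    return 1.0 / (3.0 * a * a)


def difference_quotient(func, b, k):
    return (func(b + k) - func(b)) / k


for b in (8.0, -1.0, 0.5):
    numeric = difference_quotient(cube_root, b, 1e-7)
    print(b, numeric, cube_root_prime(b))

# At b = 0 the quotient equals k**(-2/3) and grows without bound as k -> 0.
blow_up = [difference_quotient(cube_root, 0.0, 10.0 ** (-k)) for k in (3, 6, 9)]
print(blow_up)
```

The growing values in `blow_up` illustrate why the hypothesis \(f'(a) \ne 0\) cannot be dropped at \(b = 0\).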
Forward reference. This same method, applied with \(f = \exp\), \(f = \sin_{|[-\pi/2, \pi/2]}\), \(f = \tan_{|]-\pi/2, \pi/2[}\), will give rigorously \((\ln)' = 1/x\), \((\arcsin)' = 1/\sqrt{1 - x^2}\), \((\arctan)' = 1/(1 + x^2)\) in Standard functions.
Method — Compute a derivative using the operation toolbox
For a function \(f\) built from elementary blocks:
- Identify the building blocks (polynomial, rational, \(\sqrt{\cdot}\), composition, …).
- Apply linearity / product / quotient / chain / inverse rules in the order suggested by the structure.
- Simplify the result and state the domain of validity (where denominators do not vanish, etc.).
Skills to practice
- Expressing events with set operations
I.2
Probability and probability space
Ex 3
Ex 4
Ex 5
Ex 6
At an interior point of \(I\) where \(f\) is differentiable, an extremum forces \(f'\) to vanish: this is Fermat's theorem, the bridge from differentiability to the next section on Rolle's theorem and the mean value theorem. The secondary-school notion of « stationary point » is called here a « critical point ». Crucially, the hypothesis « interior point » cannot be dropped, and the converse is false.
Skills to practice
- Applying \(\sigma\)-additivity
II
Properties and limit theorems
Definition — Local extremum
\(f : I \to \mathbb{R}\), \(a \in I\). We say \(a\) is a local maximum (resp. local minimum) of \(f\) if there exists \(\eta > 0\) such that for all \(x \in I \cap [a - \eta, a + \eta]\), \(f(x) \le f(a)\) (resp. \(\ge\)). A local extremum is a local maximum or local minimum.
II.1
Elementary properties of a probability
Definition — Critical point
\(f : I \to \mathbb{R}\) differentiable at \(a \in I\). We say \(a\) is a critical point of \(f\) if \(\textcolor{colordef}{f'(a) = 0}\).
Theorem — Fermat
Let \(f : I \to \mathbb{R}\) and \(a\) an interior point of \(I\) where \(f\) is differentiable. If \(a\) is a local extremum of \(f\), then \(\textcolor{colorprop}{f'(a) = 0}\).
Treat the case of a local maximum; the local minimum case is symmetric (replace \(f\) by \(-f\)). Let \(\eta > 0\) such that \([a - \eta, a + \eta] \subset I\) (possible because \(a\) is interior) and \(f(a + h) \le f(a)\) for every \(h \in [-\eta, \eta]\).
For \(0 < h \le \eta\), we have \(f(a + h) \le f(a)\), so $$ \tau_a(h) = \frac{f(a + h) - f(a)}{h} \le 0. $$ Passing to the limit as \(h \to 0^+\) (limits preserve non-strict inequalities, Limits and continuity P4.1) gives \(f'_d(a) \le 0\) for the right derivative.
For \(-\eta \le h < 0\), we again have \(f(a + h) \le f(a)\), but now \(h < 0\), so $$ \tau_a(h) = \frac{f(a + h) - f(a)}{h} \ge 0. $$ Passing to the limit as \(h \to 0^-\) gives \(f'_g(a) \ge 0\).
Since \(f\) is differentiable at the interior point \(a\), the two one-sided derivatives are equal to \(f'(a)\): \(f'_d(a) = f'_g(a) = f'(a)\). Hence \(f'(a) \le 0\) and \(f'(a) \ge 0\), so \(f'(a) = 0\).
Example
Counter-example 1 --- converse of Fermat is false. Show that \(f(x) = x^3\) satisfies \(f'(0) = 0\) but \(0\) is not a local extremum of \(f\).
\(f'(x) = 3 x^2\) so \(f'(0) = 0\). But for any \(\eta > 0\), \(f(-\eta/2) = -\eta^3/8 < 0 < \eta^3/8 = f(\eta/2)\), and \(f(0) = 0\), so \(0\) is neither a local maximum nor a local minimum of \(f\). The point \(0\) is a critical point of \(f\) but not an extremum (it is an inflection point with horizontal tangent).
Skills to practice
- Computing with the probability properties
II.2
Monotone continuity and sub-additivity
Example
Counter-example 2 --- the « interior » hypothesis is essential. Consider \(f : [0, 1] \to \mathbb{R}\), \(f(x) = x\). The maximum of \(f\) is reached at \(1\), an endpoint of \([0, 1]\) (not an interior point), so Fermat does not apply. In fact, for \(h < 0\) with \(1 + h \in [0, 1]\), $$ \tau_1(h) = \frac{f(1 + h) - f(1)}{h} = \frac{(1 + h) - 1}{h} = 1. $$ Thus \(f'_g(1) = 1 \ne 0\).
Theorem — Rolle
Let \(a < b\) and \(f : [a, b] \to \mathbb{R}\) continuous on \([a, b]\), differentiable on the open interval \(]a, b[\), with \(f(a) = f(b)\). Then there exists \(c \in {]}a, b{[}\) such that \(\textcolor{colorprop}{f'(c) = 0}\).
By the extreme value theorem (Limits and continuity T7.1), \(f\) admits a maximum \(M\) and a minimum \(m\) on the segment \([a, b]\). Distinguish two cases.
- Case 1: \(M = m\). Then \(f\) is constant on \([a, b]\), hence \(f' \equiv 0\) on \(]a, b[\), and any \(c \in {]}a, b{[}\) works.
- Case 2: \(M \ne m\). Then \(M > m\). If both extremes were reached only at the endpoints \(a\) and \(b\), both would equal \(f(a) = f(b)\), contradicting \(M \ne m\). Hence at least one of \(M, m\) is reached at an interior point \(c \in {]}a, b{[}\). Then \(c\) is a local extremum (in fact a global one), \(c\) is interior, and \(f\) is differentiable at \(c\), so by Fermat T3.1, \(f'(c) = 0\).
Example
Three counter-examples isolating each hypothesis of Rolle.
- Continuity at the endpoints fails: \(f : [0, 1] \to \mathbb{R}\), \(f(x) = x\) for \(x \in [0, 1[\) and \(f(1) = 0\). Then \(f(0) = f(1) = 0\), \(f\) is differentiable on \(]0, 1[\), but \(f\) is not continuous at \(1\); and \(f'(x) = 1\) never vanishes.
- Interior differentiability fails: \(f(x) = |x - 1/2|\) on \([0, 1]\) --- continuous, \(f(0) = f(1) = 1/2\), but not differentiable at \(1/2\); and \(f'\) vanishes nowhere on \(]0, 1[ \setminus \{1/2\}\).
- \(f(a) \ne f(b)\): \(f(x) = x\) on \([0, 1]\) --- \(f\) is continuous and differentiable, but \(f(0) = 0 \ne 1 = f(1)\), and \(f'(x) = 1\) never vanishes.
Skills to practice
- Applying monotone continuity
II.3
Negligible events and complete systems
Remark --- Rolle is false over \(\mathbb{C}\)
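Rolle does NOT extend to complex-valued functions. A standard counterexample is \(f(t) = e^{i t}\) on \([0, 2\pi]\): \(f(0) = f(2\pi) = 1\), yet \(f'(t) = i e^{i t}\) never vanishes. This counterexample will be revisited in the complex-valued extension section to explain why the equality form of the mean value theorem has no complex-valued analogue.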
Theorem — Mean value theorem (equality form)
Let \(a < b\) and \(f : [a, b] \to \mathbb{R}\) continuous on \([a, b]\) and differentiable on \(]a, b[\). Then there exists \(c \in {]}a, b{[}\) such that $$ \textcolor{colorprop}{f(b) - f(a) = f'(c) (b - a)}. $$
Define the affine function \(d : [a, b] \to \mathbb{R}\) corresponding to the chord: $$ d(x) = f(a) + \frac{f(b) - f(a)}{b - a} (x - a), \qquad d(a) = f(a), \quad d(b) = f(b). $$ Set \(\varphi(x) = f(x) - d(x)\). Then \(\varphi\) is continuous on \([a, b]\) (difference of two continuous functions), differentiable on \(]a, b[\) with \(\varphi'(x) = f'(x) - (f(b) - f(a))/(b - a)\), and \(\varphi(a) = \varphi(b) = 0\). By Rolle T4.1 applied to \(\varphi\), there exists \(c \in {]}a, b{[}\) with \(\varphi'(c) = 0\), i.e. \(f'(c) = (f(b) - f(a))/(b - a)\), which rearranges to the conclusion.
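The theorem only asserts that some \(c\) exists. When \(f'\) happens to be continuous and \(\varphi' = f' - \text{(chord slope)}\) changes sign on \([a, b]\), such a point can be located by bisection. The sketch below makes exactly that extra assumption, on the illustrative choice \(f(x) = x^3\) on \([0, 2]\), where \(3 c^2 = 4\) gives \(c = 2/\sqrt{3}\); the function name `mean_value_point` is hypothetical.

```python
def mean_value_point(f_prime, slope, lo, hi, tol=1e-12):
    # Bisection on phi'(x) = f'(x) - slope.
    # Assumption (not part of the theorem): f' is continuous and phi'
    # takes values of opposite signs at lo and hi.
    def phi_prime(x):
        return f_prime(x) - slope

    if phi_prime(lo) * phi_prime(hi) > 0:
        raise ValueError("phi' does not change sign on [lo, hi]")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if phi_prime(lo) * phi_prime(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2


def f(x):
    return x**3


def f_prime(x):
    return 3 * x**2


a, b = 0.0, 2.0
chord_slope = (f(b) - f(a)) / (b - a)
c = mean_value_point(f_prime, chord_slope, a, b)
print(c, f_prime(c), chord_slope)
```

The printed values of `f_prime(c)` and `chord_slope` should coincide up to the tolerance, matching \(f(b) - f(a) = f'(c)(b - a)\).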
Figure --- Mean value theorem
Three consequences of the mean value theorem: the mean value inequality (a Lipschitz bound from a bound on \(|f'|\)); the link between the sign of \(f'\) and monotonicity; and the theorem of the limit of the derivative (a rigorous \(C^1\) extension at a tricky point). Combined with a contraction constant \(k < 1\), the mean value inequality also yields the convergence rate of the recurrent sequences studied in Real sequences.
Skills to practice
- Manipulating negligible and almost-sure events
III
Conditional probability and independence
Theorem — Mean value inequality
Let \(I\) be an interval and \(f : I \to \mathbb{R}\) continuous on \(I\), differentiable on \(\mathring{I}\) (the interior of \(I\)). If \(|f'(x)| \le K\) for all \(x \in \mathring{I}\), then \(f\) is \(K\)-Lipschitz on \(I\): $$ \textcolor{colorprop}{\forall (x, y) \in I^2, \quad |f(y) - f(x)| \le K |y - x|}. $$
For \(x, y \in I\), treat \(x \ne y\) (if \(x = y\) the inequality is \(0 \le 0\)). WLOG \(x < y\). Then \([x, y] \subset I\), \(f\) is continuous on \([x, y]\) and differentiable on \(]x, y[ \subset \mathring{I}\). By the mean value theorem T4.2 applied on \([x, y]\), there exists \(c \in {]}x, y{[}\) with \(f(y) - f(x) = f'(c) (y - x)\). Hence $$ |f(y) - f(x)| = |f'(c)| \cdot |y - x| \le K |y - x|. $$
III.1
Conditional probability
Method — Lipschitzianity from a bound on \(f'\)
To show \(f\) is \(K\)-Lipschitz on an interval \(I\):
- check \(f\) is continuous on \(I\) and differentiable on \(\mathring{I}\);
- bound \(|f'(x)| \le K\) for \(x \in \mathring{I}\);
- conclude by T5.1.
Example
Show that \(f : [0, +\infty[ \to \mathbb{R}\), \(f(x) = \sqrt{x + 1}\), is \((1/2)\)-Lipschitz.
\(f\) is continuous on \([0, +\infty[\) and differentiable on \(]0, +\infty[\) with \(f'(x) = 1/(2 \sqrt{x + 1})\). For \(x \in ]0, +\infty[\), \(x + 1 > 1\) so \(\sqrt{x + 1} > 1\) and \(|f'(x)| = 1/(2 \sqrt{x+1}) \le 1/2\). By T5.1, \(f\) is \((1/2)\)-Lipschitz.
Method — Contraction --- bridge to the recurrent sequences of Real sequences
Setup. \(f : [a, b] \to [a, b]\) continuous (so \([a, b]\) is stable by \(f\)), differentiable on \(]a, b[\), with \(|f'(x)| \le k\) on \(]a, b[\) for some \(k \in [0, 1[\). Then \(f\) has a unique fixed point \(\ell \in [a, b]\), and for every \(u_0 \in [a, b]\) the recurrent sequence \(u_{n+1} = f(u_n)\) stays in \([a, b]\) and converges geometrically to \(\ell\) with \(|u_n - \ell| \le k^n |u_0 - \ell|\). Standard recipe:
- Existence. Set \(g(x) = f(x) - x\) on \([a, b]\). Stability gives \(g(a) \ge 0\) and \(g(b) \le 0\); the intermediate value theorem (Limits and continuity T6.1) yields \(\ell \in [a, b]\) with \(f(\ell) = \ell\).
- Stability of \((u_n)\). Induction: \(u_0 \in [a, b]\); if \(u_n \in [a, b]\), then \(u_{n+1} = f(u_n) \in f([a, b]) \subset [a, b]\).
- Geometric rate. Apply T5.1 with \(K = k\) on \([\min(u_n, \ell), \max(u_n, \ell)] \subset [a, b]\): \(|u_{n+1} - \ell| \le k |u_n - \ell|\), hence \(|u_n - \ell| \le k^n |u_0 - \ell| \to 0\).
- Uniqueness. If \(\ell'\) is another fixed point, taking \(u_0 = \ell'\) makes the sequence constant equal to \(\ell'\) while it converges to \(\ell\), so \(\ell' = \ell\).
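The recipe above can be tried on a concrete contraction. The sketch below uses the illustrative choice \(f = \cos\) on \([0, 1]\): \(\cos([0, 1]) = [\cos 1, 1] \subset [0, 1]\) and \(|\cos'(x)| = |\sin x| \le \sin 1 < 1\) on \(]0, 1[\), so \(k = \sin 1\) works. The helper `iterate` and the use of a late iterate as a stand-in for \(\ell\) are assumptions of this example.

```python
import math


def iterate(f, u0, n_steps):
    # Returns [u_0, u_1, ..., u_{n_steps}] with u_{n+1} = f(u_n).
    seq = [u0]
    for _ in range(n_steps):
        seq.append(f(seq[-1]))
    return seq


k = math.sin(1.0)  # bound on |cos'| over ]0, 1[
u = iterate(math.cos, 1.0, 200)
ell = u[-1]  # numerical stand-in for the fixed point

# Check the geometric bound |u_n - ell| <= k**n * |u_0 - ell|
# (a tiny slack absorbs floating-point rounding).
bound_holds = all(
    abs(u[n] - ell) <= k**n * abs(u[0] - ell) + 1e-12 for n in range(50)
)
stays_inside = all(0.0 <= x <= 1.0 for x in u)
print(ell, bound_holds, stays_inside)
```

In practice the observed errors decay faster than \(k^n\), since \(|f'|\) near \(\ell\) is smaller than its supremum on the whole interval; the bound from T5.1 is a worst case.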
Proposition — Strict monotonicity --- useful complement
Under the same hypotheses as P5.1, \(f\) is strictly increasing on \(I\) \(\iff\) \(\textcolor{colorprop}{f' \ge 0}\) on \(\mathring{I}\) and \(f'\) is not identically zero on any non-trivial subinterval of \(\mathring{I}\).
- \((\Rightarrow)\) If \(f\) is strictly increasing, then \(f' \ge 0\) on \(\mathring{I}\) by P5.1(b). Moreover, if \(f' \equiv 0\) on some non-trivial subinterval \(J \subset \mathring{I}\), then \(f\) would be constant on \(J\) by P5.1(a), contradicting strict monotonicity.
- \((\Leftarrow)\) From \(f' \ge 0\) on \(\mathring{I}\) we get \(f\) increasing on \(I\) (P5.1(b)). Suppose for contradiction that \(f(x) = f(y)\) for some \(x < y\) in \(I\). Since \(f\) is increasing, this forces \(f \equiv f(x)\) on \([x, y]\); then \(f' \equiv 0\) on \({]}x, y{[}\) by P5.1(a), contradicting the hypothesis.
Remark --- \(I\) must be an interval
The hypothesis « \(I\) is an interval » is essential in P5.1. Counterexample: \(D = \,]-\infty, 0[ \,\cup\, ]0, +\infty[ = \mathbb{R}^*\) (which is not an interval --- the same punctured domain that appears naturally when studying the Heaviside example below). Define \(f : D \to \mathbb{R}\) by \(f(x) = 0\) for \(x < 0\) and \(f(x) = 1\) for \(x > 0\). Then \(f' \equiv 0\) on \(D\) (constant on each connected component) but \(f\) is not constant on \(D\).
Example
Study the variations of \(f : \mathbb{R} \to \mathbb{R}\), \(f(x) = x/(1 + x^2)\).
\(f\) is differentiable on \(\mathbb{R}\) (rational function, denominator never zero) with $$ f'(x) = \frac{(1 + x^2) - x \cdot 2 x}{(1 + x^2)^2} = \frac{1 - x^2}{(1 + x^2)^2}. $$ Sign of \(f'(x)\): same sign as \(1 - x^2 = (1 - x)(1 + x)\). Positive for \(x \in {]}-1, 1{[}\), negative outside. By P5.1, \(f\) is increasing on \([-1, 1]\) and decreasing on \(]-\infty, -1]\) and \([1, +\infty[\). Since \(f(x) < 0\) for \(x < 0\) and \(f(x) > 0\) for \(x > 0\), these extrema are global: global maximum at \(x = 1\), \(f(1) = 1/2\); global minimum at \(x = -1\), \(f(-1) = -1/2\).
Skills to practice
- Computing conditional probabilities
III.2
The composed, total and Bayes formulas
Theorem — Theorem of the limit of the derivative
Let \(I\) be an interval, \(a \in I\), \(f : I \to \mathbb{R}\) continuous on \(I\) and differentiable on \(I \setminus \{a\}\). Suppose \(f'(x) \to \ell \in \overline{\mathbb{R}}\) as \(x \to a\) (with \(x \in I \setminus \{a\}\)). Then:
- If \(\ell \in \mathbb{R}\): \(f\) is differentiable at \(a\) and \(\textcolor{colorprop}{f'(a) = \ell}\). If moreover \(f'\) is continuous on \(I \setminus \{a\}\), then \(f' : I \to \mathbb{R}\) (extended by \(f'(a) := \ell\)) is continuous at \(a\), hence \(f\) is \(C^1\) on a neighborhood of \(a\) in \(I\).
- If \(\ell = \pm \infty\): the difference quotient satisfies \(\tau_a(h) \to \ell\) as \(h \to 0\), and the graph of \(f\) admits a vertical (half-)tangent at \(a\).
- Finite case \(\ell \in \mathbb{R}\). Let \(h \ne 0\) be such that \(a + h \in I\). By the mean value theorem T4.2 applied to \(f\) on the segment with endpoints \(a\) and \(a + h\) (continuous on the closed segment, differentiable on the open segment \(\subset I \setminus \{a\}\)), there exists \(c_h\) strictly between \(a\) and \(a + h\) such that $$ \tau_a(h) = \frac{f(a + h) - f(a)}{h} = f'(c_h). $$ As \(h \to 0\), \(c_h\) is squeezed between \(a\) and \(a + h\), hence \(c_h \to a\). By hypothesis \(f'(c_h) \to \ell\) (composition of limits), so \(\tau_a(h) \to \ell\). Thus \(f\) is differentiable at \(a\) with \(f'(a) = \ell\). If \(a\) is an endpoint of \(I\), the same argument applies with \(h\) on the single side that fits inside \(I\), and the conclusion holds as a one-sided derivative.
- Case \(\ell = \pm \infty\). The same argument gives \(\tau_a(h) = f'(c_h)\) with \(c_h \to a\) as \(h \to 0\). Hence \(\tau_a(h) \to \pm \infty\), which means the graph admits a vertical tangent at \(a\) (with the convention \(f'(a) = \pm \infty\), not a real derivative).
Remark --- continuity of \(f\) is essential
This theorem is not the same as merely « extending \(f'\) continuously »: the hypothesis « \(f\) continuous at \(a\) » in T5.2 is essential. Counterexample: the Heaviside function \(H(x) = 0\) for \(x < 0\) and \(H(x) = 1\) for \(x \ge 0\). Then \(\lim_{x \to 0^-} H(x) = 0 \ne 1 = H(0)\), so \(H\) is discontinuous at \(0\) and the hypothesis of T5.2 fails. The naive « plug in \(\lim H' = 0\) » conclusion would be wrong: although \(H' \equiv 0\) on \(\mathbb{R}^* = \mathbb{R} \setminus \{0\}\), hence \(H'(x) \to 0\) as \(x \to 0\), the difference quotient at \(0\) reflects the discontinuity: $$ \tau_0(h) = \frac{H(0 + h) - H(0)}{h} = \frac{H(h) - 1}{h} \to \begin{cases} +\infty & (h \to 0^-) \\
0 & (h \to 0^+) \end{cases} $$ so \(H\) is not differentiable at \(0\). Always verify continuity of \(f\) at \(a\) before invoking T5.2.
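The two one-sided behaviours of \(\tau_0(h)\) can be tabulated directly. This is a small numerical illustration only; the names `heaviside` and `quotient_at_zero` are introduced here for the example.

```python
def heaviside(x):
    # H(x) = 0 for x < 0 and H(x) = 1 for x >= 0, as in the remark.
    return 1.0 if x >= 0 else 0.0


def quotient_at_zero(h):
    # Difference quotient of H at 0 for a non-zero step h.
    return (heaviside(h) - heaviside(0.0)) / h


left = [quotient_at_zero(-(10.0 ** (-k))) for k in range(1, 6)]
right = [quotient_at_zero(10.0 ** (-k)) for k in range(1, 6)]
print(left)   # grows without bound as h -> 0 from the left
print(right)  # identically zero for h > 0
```

The left-hand values equal \(-1/h\) with \(h < 0\), so they increase without bound, while the right-hand values vanish: the quotient has no finite limit at \(0\) even though \(H'\) tends to \(0\).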
Example
Show that \(f : \mathbb{R} \to \mathbb{R}\) defined by \(f(x) = x^2\) for \(x \le 0\) and \(f(x) = x^3\) for \(x > 0\) is differentiable at \(0\), via M5.2.
Continuity at \(0\): \(\lim_{x \to 0^-} x^2 = 0 = f(0)\) and \(\lim_{x \to 0^+} x^3 = 0 = f(0)\), so \(f\) is continuous at \(0\).
Differentiability on \(\mathbb{R}^*\): \(f\) is polynomial on each open half-line, hence differentiable with \(f'(x) = 2x\) for \(x < 0\) and \(f'(x) = 3x^2\) for \(x > 0\).
Limit of \(f'\): \(\lim_{x \to 0^-} 2x = 0\) and \(\lim_{x \to 0^+} 3x^2 = 0\), so \(\lim_{x \to 0} f'(x) = 0\) (finite).
Apply T5.2: the hypotheses of T5.2 hold (\(f\) continuous on \(\mathbb{R}\), differentiable on \(\mathbb{R}^*\), \(f'\) has a finite limit at \(0\)), so \(f\) is differentiable at \(0\) with \(f'(0) = 0\).
Skills to practice
- Applying the composed, total and Bayes formulas
III.3
Independence of events
Ex 16
Ex 17
Ex 18
Ex 19
Ex 20
Ex 21
Skills to practice
- Establishing independence of events
IV
Discrete probability spaces
Iterating differentiation. The class \(C^k\) formalises smoothness. We state and prove the Leibniz formula first (a direct calculation), then deduce the stability of \(C^k\) under product as a corollary. Composition and inverse stability are admitted at this level.
IV.1
Discrete distributions and their support
Definition — \(n\)-th derivative
Recursive definition: \(f^{(0)} = f\). If \(f^{(n)}\) is defined on \(I\) and differentiable on \(I\), then \(f^{(n+1)} = (f^{(n)})'\). The function \(f\) is said to be \(n\)-times differentiable on \(I\) if \(f^{(n)}\) exists on \(I\).
Definition — Classes \(C^k\) and \(C^\infty\)
For \(k \in \mathbb{N}\), \(f\) is of class \(C^k\) on \(I\) if \(f^{(k)}\) exists on \(I\) and is continuous on \(I\). \(f\) is of class \(C^\infty\) on \(I\) if \(f\) is \(C^k\) for every \(k \in \mathbb{N}\). Notation: \(\textcolor{colordef}{C^k(I, \mathbb{R})}\), \(\textcolor{colordef}{C^\infty(I, \mathbb{R})}\).
Example
Polynomials are \(C^\infty\) on \(\mathbb{R}\).
For \(P(x) = x^n\), by induction \(P^{(k)}(x) = n (n-1) \cdots (n - k + 1) x^{n-k}\) for \(k \le n\), and \(P^{(k)} \equiv 0\) for \(k > n\). Each \(P^{(k)}\) is a polynomial, hence continuous on \(\mathbb{R}\). Same for any polynomial \(\sum c_k x^k\) by linearity. Forward reference: \(\exp\), \(\sin\), \(\cos\) are also \(C^\infty\) on \(\mathbb{R}\); their derivatives are rigorously justified in Standard functions.
Theorem — Leibniz formula
Let \(f, g\) be \(n\) times differentiable on \(I\). Then \(f g\) is \(n\) times differentiable on \(I\) and $$ \textcolor{colorprop}{(f g)^{(n)} = \sum_{p = 0}^{n} \binom{n}{p} f^{(p)} g^{(n - p)}}. $$
Induction on \(n\).
- Base \(n = 0\). \((f g)^{(0)} = f g = \binom{0}{0} f^{(0)} g^{(0)}\), formula holds.
- Inductive step. Assume the formula holds at rank \(n\) for all \(n\)-times differentiable functions, and let \(f, g\) be \((n+1)\)-times differentiable. In particular \(f, g\) are \(n\)-times differentiable, so \((f g)^{(n)} = \sum_{p = 0}^{n} \binom{n}{p} f^{(p)} g^{(n - p)}\). Differentiate once more, using linearity (P2.1) and the product rule (P2.2): $$ \begin{aligned} (f g)^{(n+1)} &= \sum_{p = 0}^{n} \binom{n}{p} \big( f^{(p+1)} g^{(n - p)} + f^{(p)} g^{(n - p + 1)} \big) \\ &= \underbrace{\sum_{p = 0}^{n} \binom{n}{p} f^{(p+1)} g^{(n - p)}}_{S_1} + \underbrace{\sum_{p = 0}^{n} \binom{n}{p} f^{(p)} g^{(n - p + 1)}}_{S_2}. \end{aligned} $$ Re-index \(S_1\) with \(q = p + 1\) (so \(q\) runs from \(1\) to \(n + 1\)); rename \(q\) as \(p\): $$ S_1 = \sum_{p = 1}^{n+1} \binom{n}{p - 1} f^{(p)} g^{(n + 1 - p)}. $$ Keep \(S_2\) as it is. Adding \(S_1 + S_2\), the \(p = 0\) term comes from \(S_2\) alone (coefficient \(\binom{n}{0} = 1 = \binom{n+1}{0}\)), the \(p = n + 1\) term from \(S_1\) alone (coefficient \(\binom{n}{n} = 1 = \binom{n+1}{n+1}\)), and for \(1 \le p \le n\) Pascal's relation \(\binom{n}{p - 1} + \binom{n}{p} = \binom{n + 1}{p}\) (chapter Sums, products and binomial coefficients) groups the two contributions. Hence $$ (f g)^{(n+1)} = \sum_{p = 0}^{n+1} \binom{n+1}{p} f^{(p)} g^{(n + 1 - p)}, $$ which is the formula at rank \(n + 1\).
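The Leibniz formula can be checked exactly on polynomials, represented here as coefficient lists (index = degree) with integer arithmetic. The helpers `poly_mul`, `poly_add`, `poly_diff`, `trim` and `leibniz` are hypothetical names written for this sketch, and the two polynomials are arbitrary test data.

```python
from math import comb


def poly_mul(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def poly_add(p, q):
    n = max(len(p), len(q))
    return [(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
            for i in range(n)]


def poly_diff(p, order=1):
    # Differentiate `order` times; the zero polynomial is stored as [0].
    for _ in range(order):
        p = [i * c for i, c in enumerate(p)][1:] or [0]
    return p


def trim(p):
    # Drop trailing zero coefficients so equal polynomials compare equal.
    while len(p) > 1 and p[-1] == 0:
        p = p[:-1]
    return p


def leibniz(p, q, n):
    # Right-hand side: sum over k of C(n, k) * p^(k) * q^(n - k).
    total = [0]
    for k in range(n + 1):
        term = poly_mul(poly_diff(p, k), poly_diff(q, n - k))
        total = poly_add(total, [comb(n, k) * c for c in term])
    return total


f = [1, 2, 0, 1]   # 1 + 2x + x^3
g = [3, 0, -1]     # 3 - x^2
checks = [trim(leibniz(f, g, n)) == trim(poly_diff(poly_mul(f, g), n))
          for n in range(7)]
print(checks)
```

Each entry of `checks` compares \((f g)^{(n)}\), computed by differentiating the product directly, with the Leibniz sum for the same \(n\).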
Skills to practice
- Identifying a discrete distribution
IV.2
The distribution--probability correspondence
Proposition — Stability of \(C^k\)
Let \(f, g \in C^k(I, \mathbb{R})\), \(\lambda, \mu \in \mathbb{R}\). Then:
- \(\textcolor{colorprop}{\lambda f + \mu g \in C^k(I)}\) ;
- \(\textcolor{colorprop}{f g \in C^k(I)}\) ;
- if \(g\) does not vanish on \(I\), \(\textcolor{colorprop}{f / g \in C^k(I)}\) ;
- if \(\varphi : J \to I\) is \(C^k\), then \(\textcolor{colorprop}{f \circ \varphi \in C^k(J)}\) ;
- if \(k \ge 1\), \(f \in C^k(I, \mathbb{R})\) with \(f : I \to f(I)\) bijective and \(f'\) not vanishing on \(I\), then \(\textcolor{colorprop}{f^{-1} \in C^k(f(I), \mathbb{R})}\).
- Linear combination. Induction on \(k\). Base \(k = 0\): \(\lambda f + \mu g\) is continuous (sum of continuous functions). Inductive step \(k \to k + 1\): if \(f, g\) are \(C^{k+1}\), they are differentiable and \(f', g'\) are \(C^k\); by P2.1 \((\lambda f + \mu g)' = \lambda f' + \mu g'\) is \(C^k\) by induction, hence \(\lambda f + \mu g\) is \(C^{k+1}\).
- Product. Same induction. Base \(k = 0\): \(f g\) is continuous. Inductive step: \(f, g\) are \(C^{k+1}\), so \((f g)' = f' g + f g'\) (P2.2). Each summand is a product of \(C^k\) functions, hence \(C^k\) by induction; their sum is \(C^k\) by linearity; thus \(f g\) is \(C^{k+1}\).
- Quotient. First show by induction on \(k\) that if \(g\) is \(C^k\) and does not vanish on \(I\), then \(1/g\) is \(C^k\) on \(I\). Base \(k = 0\): \(1/g\) is continuous (continuity of \(g\) and non-vanishing). Inductive step: if \(g\) is \(C^{k+1}\), then \((1/g)' = -g'/g^2\) (P2.3); by the product case \(g^2\) is \(C^k\), \(g^2\) does not vanish, so by the induction hypothesis \(1/g^2\) is \(C^k\), and \(-g'/g^2 = -g' \cdot (1/g^2)\) is \(C^k\) as a product; hence \(1/g\) is \(C^{k+1}\). Then \(f/g = f \cdot (1/g)\) is \(C^k\) as a product.
- Composition and inverse: proofs admitted. The proofs concerning composition and inverse are not required at this level. The results themselves remain in scope and may be used freely.
Method — Compute a higher-order derivative
Three classical patterns:
- Pattern A --- \(P(x) \cdot \exp(a x)\). Apply Leibniz; only the first \(\deg P + 1\) terms are non-zero since \(P^{(k)} = 0\) for \(k > \deg P\).
- Pattern B --- rational fraction. Decompose into partial fractions (chapter Rational fractions), then use \((1/(x - a))^{(n)} = (-1)^n n! / (x - a)^{n + 1}\) --- proved by direct induction.
- Pattern C --- trig power. Linearise first (chapter Trigonometry), then differentiate the linearised expression term by term.
Example
Compute \(\big(1/(x^2 - 1)\big)^{(n)}\) on \(\mathbb{R} \setminus \{-1, 1\}\) via Pattern B.
Partial fractions: \(1/(x^2 - 1) = 1/((x - 1)(x + 1)) = (1/2) \big(1/(x - 1) - 1/(x + 1)\big)\). By the identity \((1/(x - a))^{(n)} = (-1)^n n! / (x - a)^{n + 1}\) (induction), with linearity (P2.1 extended to higher orders): for every \(x \in \mathbb{R} \setminus \{-1, 1\}\) (equivalently, on each interval of the domain), $$ \left(\frac{1}{x^2 - 1}\right)^{(n)} = \frac{(-1)^n n!}{2} \left( \frac{1}{(x - 1)^{n + 1}} - \frac{1}{(x + 1)^{n + 1}} \right). $$
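Two quick exact checks of this closed form, using rational arithmetic. At \(x = 0\), the geometric series \(1/(x^2 - 1) = -\sum_{m \ge 0} x^{2m}\) (valid for \(|x| < 1\)) shows that the \(n\)-th derivative equals \(-n!\) for even \(n\) and \(0\) for odd \(n\); at \(x = 2\) and \(n = 1\), direct differentiation gives \(-2x/(x^2 - 1)^2 = -4/9\). The function name `nth_derivative` is an illustrative choice.

```python
from fractions import Fraction
from math import factorial


def nth_derivative(n, x):
    # Closed form from the example, evaluated exactly at a rational x
    # different from 1 and -1.
    x = Fraction(x)
    sign = -1 if n % 2 else 1
    return Fraction(sign * factorial(n), 2) * (
        1 / (x - 1) ** (n + 1) - 1 / (x + 1) ** (n + 1)
    )


# Values at 0 predicted by the geometric series expansion.
expected_at_zero = [(-factorial(n) if n % 2 == 0 else 0) for n in range(8)]
computed_at_zero = [nth_derivative(n, 0) for n in range(8)]
print(computed_at_zero == expected_at_zero)
print(nth_derivative(1, 2))
```

Agreement at these points does not replace the induction proof of \((1/(x - a))^{(n)} = (-1)^n n! / (x - a)^{n + 1}\), but it catches sign and index slips in the partial-fraction step.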
Skills to practice
- Defining a probability by its distribution