CommeUnJeu · L2 MP

Probability spaces

⌚ ~83 min ▢ 10 blocks ✓ 21 exercises ➣ Prerequisites : Probabilities on a finite universe, Countability

Example

Compute the derivative of $f(x) = (x^3 - 2 x + 1)/(x^2 + 1)$ on $\mathbb{R}$.

Since $x^2 + 1 > 0$ on $\mathbb{R}$, the denominator never vanishes and $f$ is differentiable on $\mathbb{R}$. By the quotient rule: $$ f'(x) = \frac{(3 x^2 - 2)(x^2 + 1) - (x^3 - 2 x + 1)(2 x)}{(x^2 + 1)^2} = \frac{x^4 + 5 x^2 - 2 x - 2}{(x^2 + 1)^2}. $$
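As a sanity check on the algebra, the closed form can be compared with a symmetric difference quotient. This is only a numerical sketch: the helper name `central_difference`, the step `h` and the sample points are choices made for this illustration, not part of the course.

```python
def f(x):
    return (x**3 - 2 * x + 1) / (x**2 + 1)


def f_prime(x):
    # Closed form obtained from the quotient rule in the example.
    return (x**4 + 5 * x**2 - 2 * x - 2) / (x**2 + 1) ** 2


def central_difference(func, x, h=1e-5):
    # Symmetric difference quotient (hypothetical helper for this check).
    return (func(x + h) - func(x - h)) / (2 * h)


sample_points = (-2.0, -0.5, 0.0, 1.0, 3.0)
for x in sample_points:
    print(f"x={x:5.1f}  formula={f_prime(x):.8f}  "
          f"difference quotient={central_difference(f, x):.8f}")
```

The two columns should agree to several decimal places; a visible disagreement would point to an expansion error in the numerator.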

Proposition — Chain rule

Let $f : I \to \mathbb{R}$ with $f(I) \subset J$, and $g : J \to \mathbb{R}$. If $f$ is differentiable at $a \in I$ and $g$ is differentiable at $b = f(a) \in J$, then $g \circ f$ is differentiable at $a$ and $$ \textcolor{colorprop}{(g \circ f)'(a) = g'(f(a)) \cdot f'(a)}. $$

Proof

Define an auxiliary function $\tau : J \to \mathbb{R}$ by $$ \tau(y) = \begin{cases} (g(y) - g(b))/(y - b) & \text{if } y \ne b, \\ g'(b) & \text{if } y = b. \end{cases} $$ By the differentiability of $g$ at $b$, $\tau(y) \to g'(b) = \tau(b)$ as $y \to b$, so $\tau$ is continuous at $b$. By definition, $g(y) - g(b) = \tau(y) (y - b)$ holds for all $y \in J$ (including $y = b$, both sides being $0$).
Substitute $y = f(a + h)$ for $h$ such that $a + h \in I$: $$ g(f(a + h)) - g(f(a)) = \tau(f(a + h)) \, (f(a + h) - f(a)). $$ For $h \ne 0$, divide by $h$: $$ \tau_a^{g \circ f}(h) = \tau(f(a + h)) \cdot \tau_a^f(h). $$ As $h \to 0$: $f$ is continuous at $a$ (P1.2) so $f(a + h) \to b$, and $\tau$ is continuous at $b$ so $\tau(f(a + h)) \to g'(b)$; finally $\tau_a^f(h) \to f'(a)$. The product tends to $g'(b) f'(a)$.

I Tribes and probability spaces

Example

Compute the derivatives of $(x^3 + 1)^5$ and $\sqrt{x^2 + 1}$ on $\mathbb{R}$ via the chain rule.

Answer

For $f(x) = (x^3 + 1)^5$: take $u(x) = x^3 + 1$ ($u'(x) = 3 x^2$), $v(y) = y^5$ ($v'(y) = 5 y^4$). By chain rule, $f'(x) = v'(u(x)) \cdot u'(x) = 5 (x^3 + 1)^4 \cdot 3 x^2 = 15 x^2 (x^3 + 1)^4$.
For $g(x) = \sqrt{x^2 + 1}$: take $u(x) = x^2 + 1$ ($u'(x) = 2 x$), $v(y) = \sqrt{y}$ defined on $y > 0$. Derivative of $v$ at $a > 0$: for $h \ne 0$ with $a + h > 0$, $$ \frac{\sqrt{a + h} - \sqrt{a}}{h} = \frac{(\sqrt{a + h} - \sqrt{a})(\sqrt{a + h} + \sqrt{a})}{h (\sqrt{a + h} + \sqrt{a})} = \frac{1}{\sqrt{a + h} + \sqrt{a}}. $$ As $h \to 0$, this tends to $1/(2 \sqrt{a})$, so $v'(a) = 1/(2 \sqrt{a})$. Then $g'(x) = v'(u(x)) \cdot u'(x) = (2 x)/(2 \sqrt{x^2 + 1}) = x/\sqrt{x^2 + 1}$.
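The decomposition into $u$ and $v$ used in both answers can be mirrored in code. The sketch below assumes a small helper `chain_rule` (a name introduced here for illustration) that builds $x \mapsto v'(u(x)) \cdot u'(x)$ and compares it with the simplified formulas found above.

```python
import math


def chain_rule(u, u_prime, v_prime):
    # Derivative of v o u, assembled exactly as in the proposition.
    return lambda x: v_prime(u(x)) * u_prime(x)


# f(x) = (x^3 + 1)^5 with u(x) = x^3 + 1 and v(y) = y^5
f_prime = chain_rule(lambda x: x**3 + 1,
                     lambda x: 3 * x**2,
                     lambda y: 5 * y**4)

# g(x) = sqrt(x^2 + 1) with u(x) = x^2 + 1 and v(y) = sqrt(y)
g_prime = chain_rule(lambda x: x**2 + 1,
                     lambda x: 2 * x,
                     lambda y: 1 / (2 * math.sqrt(y)))


def f_prime_simplified(x):
    return 15 * x**2 * (x**3 + 1) ** 4


def g_prime_simplified(x):
    return x / math.sqrt(x**2 + 1)


for x in (-1.5, 0.0, 0.5, 2.0):
    print(x, f_prime(x), f_prime_simplified(x), g_prime(x), g_prime_simplified(x))
```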

I.1 Tribe and probabilisable space

Proposition — Derivative of the inverse map

Let $I$ be an interval, $f : I \to \mathbb{R}$ continuous and strictly monotone on $I$, $J = f(I)$. Suppose $f$ is differentiable at $a \in I$ with $f'(a) \ne 0$. Then $f^{-1} : J \to I$ is differentiable at $b = f(a)$ and $$ \textcolor{colorprop}{(f^{-1})'(b) = \frac{1}{f'(a)} = \frac{1}{f'(f^{-1}(b))}}. $$

Proof

Step 1: $f^{-1}$ is continuous at $b$. By the bijection theorem (Limits and continuity T7.2), $f^{-1} : J \to I$ is continuous and strictly monotone on $J$.
Step 2: limit of the inverse difference quotient. Take $k$ such that $b + k \in J$ and $k \ne 0$, and set $h = f^{-1}(b + k) - a$, so $a + h = f^{-1}(b + k) \in I$ and $h \ne 0$ (strict monotonicity of $f^{-1}$ gives $h \ne 0$ when $k \ne 0$). Then $f(a + h) = b + k$, so $f(a + h) - f(a) = k$, hence $$ \tau_b^{f^{-1}}(k) = \frac{f^{-1}(b + k) - f^{-1}(b)}{k} = \frac{h}{f(a + h) - f(a)} = \frac{1}{\tau_a^f(h)}. $$ As $k \to 0$, $h = f^{-1}(b + k) - a \to 0$ by continuity of $f^{-1}$ at $b$ (Step 1); hence $\tau_a^f(h) \to f'(a) \ne 0$, and $1/\tau_a^f(h) \to 1/f'(a)$.
Step 3: conclusion. $f^{-1}$ is differentiable at $b$ and $(f^{-1})'(b) = 1/f'(a) = 1/f'(f^{-1}(b))$.

Figure --- symmetry between graphs of $f$ and $f^{-1}$

The graphs of $f$ and $f^{-1}$ are symmetric about the line $y = x$. The tangent of $f$ at $(a, b)$ and the tangent of $f^{-1}$ at $(b, a)$ are mirror images. When $f'(a) = 0$ (horizontal tangent of $f$), the tangent of $f^{-1}$ at $(b, a)$ is vertical: $f^{-1}$ is not differentiable at $b$ but admits a vertical tangent.

Example

Show that $g : \mathbb{R} \to \mathbb{R}$, $g(y) = \sqrt[3]{y} = y^{1/3}$, is differentiable on $\mathbb{R} \setminus \{0\}$ and compute $g'(b)$ for $b \ne 0$.

Answer

$g$ is the inverse of $f : \mathbb{R} \to \mathbb{R}$, $f(x) = x^3$ (continuous, strictly increasing on $\mathbb{R}$). $f$ is differentiable on $\mathbb{R}$ with $f'(x) = 3 x^2$, which vanishes only at $x = 0$. For $b \ne 0$, let $a = g(b) = \sqrt[3]{b} \ne 0$, then $f'(a) = 3 a^2 \ne 0$ and P2.5 gives $$ g'(b) = \frac{1}{f'(a)} = \frac{1}{3 a^2} = \frac{1}{3 (\sqrt[3]{b})^2}. $$ At $b = 0$: $f'(0) = 0$, the hypothesis of P2.5 fails; $g$ admits a vertical tangent at $0$ (the tangent of $f$ at $0$ being horizontal).
Forward reference. The same method, applied with $f = \exp$, $f = \sin_{|[-\pi/2, \pi/2]}$, $f = \tan_{|]-\pi/2, \pi/2[}$, will rigorously give $(\ln)' = 1/x$, $(\arcsin)' = 1/\sqrt{1 - x^2}$, $(\arctan)' = 1/(1 + x^2)$ in Standard functions.
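A numerical illustration of the cube-root example, under the assumption that floating-point difference quotients are an acceptable stand-in for limits. The helper names (`cube_root`, `central_difference`, `difference_quotient`) are introduced for this sketch only. Away from $0$ the formula $1/(3 (\sqrt[3]{b})^2)$ matches the difference quotient; at $0$ the quotient grows without bound, which is the vertical tangent.

```python
import math


def cube_root(y):
    # Real cube root, also valid for negative y.
    return math.copysign(abs(y) ** (1.0 / 3.0), y)


def cube_root_prime(b):
    # Formula from the inverse-map proposition: 1 / (3 a^2), a = cube_root(b).
    a = cube_root(b)
    return 1.0 / (3.0 * a * a)


def central_difference(func, x, h=1e-6):
    return (func(x + h) - func(x - h)) / (2 * h)


def difference_quotient(func, x, h):
    return (func(x + h) - func(x)) / h


for b in (-8.0, -1.0, 0.5, 27.0):
    print(b, cube_root_prime(b), central_difference(cube_root, b))

# At b = 0 the quotient equals h^(-2/3) and blows up as h -> 0.
for h in (1e-3, 1e-6, 1e-9):
    print(h, difference_quotient(cube_root, 0.0, h))
```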

Method — Compute a derivative using the operation toolbox

For a function $f$ built from elementary blocks:

Identify the building blocks (polynomial, rational, $\sqrt{\cdot}$, composition, …).
Apply linearity / product / quotient / chain / inverse rules in the order suggested by the structure.
Simplify the result and state the domain of validity (where denominators do not vanish, etc.).

Ex 1 Ex 2

Skills to practice

Expressing events with set operations

I.2 Probability and probability space

Ex 3 Ex 4 Ex 5 Ex 6

At an interior point of $I$ where $f$ is differentiable, an extremum forces $f'$ to vanish: this is Fermat's theorem, the bridge from differentiability to the next section on Rolle's theorem and the mean value theorem. The secondary-school notion of « stationary point » becomes here the « critical point ». Crucial: the hypothesis « interior point » is not removable, and the converse is false.

Skills to practice

Applying $\sigma$-additivity

II Properties and limit theorems

Definition — Local extremum

$f : I \to \mathbb{R}$, $a \in I$. We say $a$ is a local maximum (resp. local minimum) of $f$ if there exists $\eta > 0$ such that for all $x \in I \cap [a - \eta, a + \eta]$, $f(x) \le f(a)$ (resp. $\ge$). A local extremum is a local maximum or local minimum.

II.1 Elementary properties of a probability

Definition — Critical point

$f : I \to \mathbb{R}$ differentiable at $a \in I$. We say $a$ is a critical point of $f$ if $\textcolor{colordef}{f'(a) = 0}$.

Theorem — Fermat

Let $f : I \to \mathbb{R}$ and $a$ an interior point of $I$ where $f$ is differentiable. If $a$ is a local extremum of $f$, then $\textcolor{colorprop}{f'(a) = 0}$.

Proof

Treat the case of a local maximum; the local minimum case is symmetric (replace $f$ by $-f$). Let $\eta > 0$ be such that $[a - \eta, a + \eta] \subset I$ (possible because $a$ is interior) and $f(a + h) \le f(a)$ for every $h \in [-\eta, \eta]$.
For $0 < h \le \eta$, we have $f(a + h) \le f(a)$, so $$ \tau_a(h) = \frac{f(a + h) - f(a)}{h} \le 0. $$ Passing to the limit as $h \to 0^+$ (non-strict inequalities are preserved by limits of functions, Limits and continuity P4.1) gives $f'_d(a) \le 0$.
For $-\eta \le h < 0$, we again have $f(a + h) \le f(a)$, but now $h < 0$, so $$ \tau_a(h) = \frac{f(a + h) - f(a)}{h} \ge 0. $$ Passing to the limit as $h \to 0^-$ gives $f'_g(a) \ge 0$.
Since $f$ is differentiable at the interior point $a$, the two lateral derivatives are equal to $f'(a)$. Hence $f'(a) \le 0$ and $f'(a) \ge 0$, so $f'(a) = 0$.

Example

Counter-example 1 --- converse of Fermat is false. Show that $f(x) = x^3$ satisfies $f'(0) = 0$ but $0$ is not a local extremum of $f$.

Answer

$f'(x) = 3 x^2$ so $f'(0) = 0$. But for any $\eta > 0$, $f(-\eta/2) = -\eta^3/8 < 0 < \eta^3/8 = f(\eta/2)$, and $f(0) = 0$, so $0$ is neither a local maximum nor a local minimum of $f$. The point $0$ is a critical point of $f$ but not an extremum (it is an inflection point with horizontal tangent).

Skills to practice

Computing with the probability properties

II.2 Monotone continuity and sub-additivity

Example

Counter-example 2 --- the « interior » hypothesis is essential. Consider $f : [0, 1] \to \mathbb{R}$, $f(x) = x$. The maximum of $f$ is reached at $1$, an endpoint of $[0, 1]$ (not an interior point), so Fermat does not apply. In fact, for $h < 0$ with $1 + h \in [0, 1]$, $$ \tau_1(h) = \frac{f(1 + h) - f(1)}{h} = \frac{(1 + h) - 1}{h} = 1. $$ Thus $f'_g(1) = 1 \ne 0$.

Ex 7 Ex 8

Theorem — Rolle

Let $a < b$ and $f : [a, b] \to \mathbb{R}$ continuous on $[a, b]$, differentiable on the open interval $]a, b[$, with $f(a) = f(b)$. Then there exists $c \in {]}a, b{[}$ such that $\textcolor{colorprop}{f'(c) = 0}$.

Proof

By the extreme value theorem (Limits and continuity T7.1), $f$ admits a maximum $M$ and a minimum $m$ on the segment $[a, b]$. Distinguish two cases.

Case 1: $M = m$. Then $f$ is constant on $[a, b]$, hence $f' \equiv 0$ on $]a, b[$, and any $c \in {]}a, b{[}$ works.
Case 2: $M \ne m$. Then $M > m$. If both extremes were reached only at the endpoints $a$ and $b$, both would equal $f(a) = f(b)$, contradicting $M \ne m$. Hence at least one of $M, m$ is reached at an interior point $c \in {]}a, b{[}$. Then $c$ is a local extremum (in fact a global one), $c$ is interior, and $f$ is differentiable at $c$, so by Fermat T3.1, $f'(c) = 0$.

Example

Three counter-examples isolating each hypothesis of Rolle.

Continuity at the endpoints fails: $f : [0, 1] \to \mathbb{R}$, $f(x) = x$ for $x \in [0, 1[$ and $f(1) = 0$. Then $f(0) = f(1) = 0$, $f$ is differentiable on $]0, 1[$, but $f$ is not continuous at $1$; and $f'(x) = 1$ never vanishes.
Interior differentiability fails: $f(x) = |x - 1/2|$ on $[0, 1]$ --- continuous, $f(0) = f(1) = 1/2$, but not differentiable at $1/2$; and $f'$ vanishes nowhere on $]0, 1[ \setminus \{1/2\}$.
$f(a) \ne f(b)$: $f(x) = x$ on $[0, 1]$ --- $f$ is continuous and differentiable, but $f(0) = 0 \ne 1 = f(1)$, and $f'(x) = 1$ never vanishes.

None of the three hypotheses can be dropped.

Skills to practice

Applying monotone continuity

II.3 Negligible events and complete systems

Remark --- Rolle is false over $\mathbb{C}$

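Rolle does NOT extend to complex-valued functions. For instance, $f : [0, 2\pi] \to \mathbb{C}$, $f(t) = e^{i t}$, is continuous on $[0, 2\pi]$ and differentiable on $]0, 2\pi[$ with $f(0) = f(2\pi) = 1$, yet $f'(t) = i e^{i t}$ never vanishes. This counterexample will be revisited in the complex-valued extension section to explain why the equality form of the mean value theorem has no complex-valued analogue.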

Theorem — Mean value theorem (equality form)

Let $a < b$ and $f : [a, b] \to \mathbb{R}$ continuous on $[a, b]$ and differentiable on $]a, b[$. Then there exists $c \in {]}a, b{[}$ such that $$ \textcolor{colorprop}{f(b) - f(a) = f'(c) (b - a)}. $$

Proof

Define the affine function $d : [a, b] \to \mathbb{R}$ corresponding to the chord: $$ d(x) = f(a) + \frac{f(b) - f(a)}{b - a} (x - a), \qquad d(a) = f(a), \quad d(b) = f(b). $$ Set $\varphi(x) = f(x) - d(x)$. Then $\varphi$ is continuous on $[a, b]$ (sum of two continuous), differentiable on $]a, b[$ with $\varphi'(x) = f'(x) - (f(b) - f(a))/(b - a)$, and $\varphi(a) = \varphi(b) = 0$. By Rolle T4.1 applied to $\varphi$, there exists $c \in {]}a, b{[}$ with $\varphi'(c) = 0$, i.e. $f'(c) = (f(b) - f(a))/(b - a)$, which rearranges to the conclusion.

Figure --- Mean value theorem

Mean value theorem: there exists an interior point $c$ where the tangent to the graph of $f$ is parallel to the chord joining $(a, f(a))$ and $(b, f(b))$.
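The theorem only asserts that $c$ exists; it gives no way to compute it. For a concrete function one can still locate such a point numerically. The sketch below takes $f(x) = x^3$ on $[0, 2]$ (an example chosen here, not taken from the course) and runs a bisection on $f'(x) - \frac{f(b) - f(a)}{b - a}$. Bisection additionally assumes that this quantity is continuous and changes sign on $[a, b]$, which holds for this example but is not guaranteed by the theorem in general.

```python
import math


def mean_value_point(f_prime, slope, lo, hi, tol=1e-12):
    # Bisection on f'(x) - slope; assumes a sign change on [lo, hi].
    g_lo = f_prime(lo) - slope
    g_hi = f_prime(hi) - slope
    if g_lo * g_hi > 0:
        raise ValueError("no sign change: bisection does not apply")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        g_mid = f_prime(mid) - slope
        if g_lo * g_mid <= 0:
            hi = mid                 # a root lies in [lo, mid]
        else:
            lo, g_lo = mid, g_mid    # a root lies in [mid, hi]
    return (lo + hi) / 2


def f(x):
    return x**3


def f_prime(x):
    return 3 * x**2


a, b = 0.0, 2.0
slope = (f(b) - f(a)) / (b - a)      # slope of the chord, here 4
c = mean_value_point(f_prime, slope, a, b)
print(c, f_prime(c), slope)
```

Here $3 c^2 = 4$, so the computed value should be close to $2/\sqrt{3}$.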

Ex 9 Ex 10 Ex 11 Ex 12 Ex 13

Three consequences of the mean value theorem: the mean value inequality --- Lipschitz bound from a bound on $|f'|$; the link sign-of-$f'$ $\leftrightarrow$ monotonicity; the theorem of the limit of the derivative --- a rigorous $C^1$ extension at a tricky point. The mean value inequality + contraction $k < 1$ also yields the convergence rate of the recurrent sequences of Real sequences.

Skills to practice

Manipulating negligible and almost-sure events

III Conditional probability and independence

Theorem — Mean value inequality

Let $I$ be an interval and $f : I \to \mathbb{R}$ continuous on $I$, differentiable on $\mathring{I}$ (the interior of $I$). If $|f'(x)| \le K$ for all $x \in \mathring{I}$, then $f$ is $K$-Lipschitz on $I$: $$ \textcolor{colorprop}{\forall (x, y) \in I^2, \quad |f(y) - f(x)| \le K |y - x|}. $$

Proof

For $x, y \in I$, treat $x \ne y$ (if $x = y$ the inequality is $0 \le 0$). WLOG $x < y$. Then $[x, y] \subset I$, $f$ is continuous on $[x, y]$ and differentiable on $]x, y[ \subset \mathring{I}$. By the mean value theorem T4.2 applied on $[x, y]$, there exists $c \in {]}x, y{[}$ with $f(y) - f(x) = f'(c) (y - x)$. Hence $$ |f(y) - f(x)| = |f'(c)| \cdot |y - x| \le K |y - x|. $$

III.1 Conditional probability

Method — Lipschitzianity from a bound on $f'$

To show $f$ is $K$-Lipschitz on an interval $I$:

check $f$ is continuous on $I$ and differentiable on $\mathring{I}$;
bound $|f'(x)| \le K$ for $x \in \mathring{I}$;
conclude by T5.1.

Example

Show that $f : [0, +\infty[ \to \mathbb{R}$, $f(x) = \sqrt{x + 1}$, is $(1/2)$-Lipschitz.

Answer

$f$ is continuous on $[0, +\infty[$ and differentiable on $]0, +\infty[$ with $f'(x) = 1/(2 \sqrt{x + 1})$. For $x \in ]0, +\infty[$, $x + 1 > 1$ so $\sqrt{x + 1} > 1$ and $|f'(x)| = 1/(2 \sqrt{x+1}) \le 1/2$. By T5.1, $f$ is $(1/2)$-Lipschitz.
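A quick empirical check of the Lipschitz bound, offered as an illustration rather than a proof: on a finite grid of $[0, 50]$ (grid and helper name `largest_ratio` are choices made for this sketch), the ratio $|f(y) - f(x)| / |y - x|$ never exceeds $1/2$.

```python
import math


def f(x):
    return math.sqrt(x + 1.0)


def largest_ratio(points):
    # Largest value of |f(y) - f(x)| / |y - x| over pairs of distinct points.
    best = 0.0
    for x in points:
        for y in points:
            if x != y:
                best = max(best, abs(f(y) - f(x)) / abs(y - x))
    return best


points = [i / 10 for i in range(0, 501)]   # 0.0, 0.1, ..., 50.0
ratio = largest_ratio(points)
print(ratio)
```

The largest ratio is attained for the two smallest grid points, consistent with $|f'|$ being largest near $0$.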

Method — Contraction --- bridge to the recurrent sequences of Real sequences

Setup. $f : [a, b] \to [a, b]$ continuous (so $[a, b]$ is stable by $f$), differentiable on $]a, b[$, with $|f'(x)| \le k$ on $]a, b[$ for some $k \in [0, 1[$. Then $f$ has a unique fixed point $\ell \in [a, b]$, and for every $u_0 \in [a, b]$ the recurrent sequence $u_{n+1} = f(u_n)$ stays in $[a, b]$ and converges geometrically to $\ell$ with $|u_n - \ell| \le k^n |u_0 - \ell|$. Standard recipe:

Existence. Set $g(x) = f(x) - x$ on $[a, b]$. Stability gives $g(a) \ge 0$ and $g(b) \le 0$; the intermediate value theorem (Limits and continuity T6.1) yields $\ell \in [a, b]$ with $f(\ell) = \ell$.
Stability of $(u_n)$. Induction: $u_0 \in [a, b]$; if $u_n \in [a, b]$, then $u_{n+1} = f(u_n) \in f([a, b]) \subset [a, b]$.
Geometric rate. Apply T5.1 with $K = k$ on $[\min(u_n, \ell), \max(u_n, \ell)] \subset [a, b]$: $|u_{n+1} - \ell| \le k |u_n - \ell|$, hence $|u_n - \ell| \le k^n |u_0 - \ell| \to 0$.
Uniqueness. If $\ell'$ is another fixed point, taking $u_0 = \ell'$ makes the sequence constant equal to $\ell'$ while it converges to $\ell$, so $\ell' = \ell$.
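The recipe can be tried on $f = \cos$ with $[a, b] = [0, 1]$, an example chosen for this sketch: $\cos([0, 1]) = [\cos 1, 1] \subset [0, 1]$ and $|f'(x)| = |\sin x| \le \sin 1 < 1$ on $]0, 1[$, so $k = \sin 1$ is admissible. The fixed point is approximated by iterating many times, and a small tolerance is added to the geometric bound to absorb rounding; both are assumptions of the illustration.

```python
import math


def iterate(f, u0, n):
    # Return [u_0, u_1, ..., u_n] for the recursion u_{j+1} = f(u_j).
    terms = [u0]
    for _ in range(n):
        terms.append(f(terms[-1]))
    return terms


f = math.cos
k = math.sin(1.0)                  # bound on |f'| over ]0, 1[
u0 = 0.0

ell = iterate(f, u0, 200)[-1]      # numerical stand-in for the fixed point
terms = iterate(f, u0, 30)

# Check |u_n - ell| <= k^n |u_0 - ell| for n = 0, ..., 30.
bound_holds = all(
    abs(u - ell) <= k**n * abs(u0 - ell) + 1e-12
    for n, u in enumerate(terms)
)
print(ell, bound_holds)
```

In practice the convergence is faster than $k^n$, because $|f'|$ near the fixed point is smaller than its bound over the whole interval.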

Proposition — Strict monotonicity --- useful complement

Under the same hypotheses as P5.1, $f$ is strictly increasing on $I$ $\iff$ $\textcolor{colorprop}{f' \ge 0}$ on $\mathring{I}$ and $f'$ is not identically zero on any non-trivial subinterval of $\mathring{I}$.

Proof

$(\Rightarrow)$ If $f$ is strictly increasing, then $f' \ge 0$ on $\mathring{I}$ by P5.1(b). Moreover, if $f' \equiv 0$ on some non-trivial subinterval $J \subset \mathring{I}$, then $f$ would be constant on $J$ by P5.1(a), contradicting strict monotonicity.
$(\Leftarrow)$ From $f' \ge 0$ on $\mathring{I}$ we get $f$ increasing on $I$ (P5.1(b)). Suppose for contradiction that $f(x) = f(y)$ for some $x < y$ in $I$. Since $f$ is increasing, this forces $f \equiv f(x)$ on $[x, y]$; then $f' \equiv 0$ on ${]}x, y{[}$ by P5.1(a), contradicting the hypothesis.

Remark --- $I$ must be an interval

The hypothesis « $I$ is an interval » is essential in P5.1. Counterexample: $D = \,]-\infty, 0[ \,\cup\, ]0, +\infty[ = \mathbb{R}^*$ (which is not an interval --- the same punctured domain that appears naturally when studying the Heaviside example below). Define $f : D \to \mathbb{R}$ by $f(x) = 0$ for $x < 0$ and $f(x) = 1$ for $x > 0$. Then $f' \equiv 0$ on $D$ (constant on each connected component) but $f$ is not constant on $D$.

Example

Study the variations of $f : \mathbb{R} \to \mathbb{R}$, $f(x) = x/(1 + x^2)$.

Answer

$f$ is differentiable on $\mathbb{R}$ (rational function, denominator never zero) with $$ f'(x) = \frac{(1 + x^2) - x \cdot 2 x}{(1 + x^2)^2} = \frac{1 - x^2}{(1 + x^2)^2}. $$ Sign of $f'(x)$: same sign as $1 - x^2 = (1 - x)(1 + x)$, that is, positive for $x \in {]}-1, 1{[}$, zero at $x = \pm 1$, negative for $|x| > 1$. By P5.1, $f$ is increasing on $[-1, 1]$ and decreasing on $]-\infty, -1]$ and on $[1, +\infty[$. Since $f(x) < 0$ for $x < 0$ and $f(x) > 0$ for $x > 0$, the global maximum is reached at $x = 1$, with $f(1) = 1/2$, and the global minimum at $x = -1$, with $f(-1) = -1/2$.
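The table of variations can be cross-checked on a grid. This is a sketch with an arbitrary window $[-10, 10]$ and step $0.01$, both chosen for illustration; it confirms the location of the extrema on that window only.

```python
def f(x):
    return x / (1 + x * x)


def f_prime(x):
    return (1 - x * x) / (1 + x * x) ** 2


grid = [i / 100 for i in range(-1000, 1001)]   # -10.00, -9.99, ..., 10.00
values = [f(x) for x in grid]

x_max = grid[values.index(max(values))]
x_min = grid[values.index(min(values))]
print(x_max, max(values), x_min, min(values))
```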

Skills to practice

Computing conditional probabilities

III.2 The composed, total and Bayes formulas

Theorem — Theorem of the limit of the derivative

Let $I$ be an interval, $a \in I$, $f : I \to \mathbb{R}$ continuous on $I$ and differentiable on $I \setminus \{a\}$. Suppose $f'(x) \to \ell \in \overline{\mathbb{R}}$ as $x \to a$ (with $x \in I \setminus \{a\}$). Then:

If $\ell \in \mathbb{R}$: $f$ is differentiable at $a$ and $\textcolor{colorprop}{f'(a) = \ell}$. If moreover $f'$ is continuous on $I \setminus \{a\}$, then $f' : I \to \mathbb{R}$ (extended by $f'(a) := \ell$) is continuous at $a$, hence $f$ is $C^1$ on a neighborhood of $a$ in $I$.
If $\ell = \pm \infty$: the difference quotient satisfies $\tau_a(h) \to \ell$ as $h \to 0$, and the graph of $f$ admits a vertical (half-)tangent at $a$.

Proof

Finite case $\ell \in \mathbb{R}$. Let $h \ne 0$ be such that $a + h \in I$. By the mean value theorem T4.2 applied to $f$ on the segment with endpoints $a$ and $a + h$ (continuous on the closed segment, differentiable on the open segment $\subset I \setminus \{a\}$), there exists $c_h$ strictly between $a$ and $a + h$ such that $$ \tau_a(h) = \frac{f(a + h) - f(a)}{h} = f'(c_h). $$ As $h \to 0$, $c_h$ is squeezed between $a$ and $a + h$, hence $c_h \to a$. By hypothesis $f'(c_h) \to \ell$ (composition of limits), so $\tau_a(h) \to \ell$. Thus $f$ is differentiable at $a$ with $f'(a) = \ell$. If $a$ is an endpoint of $I$, the same argument applies with $h$ on the single side that fits inside $I$, and the conclusion holds as a one-sided derivative.
Case $\ell = \pm \infty$. The same argument gives $\tau_a(h) = f'(c_h)$ with $c_h \to a$ as $h \to 0$. Hence $\tau_a(h) \to \pm \infty$, which means the graph admits a vertical tangent at $a$ (with the convention $f'(a) = \pm \infty$, not a real derivative).

Remark --- continuity of $f$ is essential

Distinct from « extending $f'$ continuously ». The hypothesis « $f$ continuous at $a$ » in T5.2 is essential. Counterexample: the Heaviside function $H(x) = 0$ for $x < 0$ and $H(x) = 1$ for $x \ge 0$. Then $\lim_{x \to 0^-} H(x) = 0 \ne 1 = H(0)$, so $H$ is discontinuous at $0$ and the hypothesis of T5.2 fails. The naive « plug in $\lim H' = 0$ » conclusion would be wrong: although $H' \equiv 0$ on $\mathbb{R}^* = \mathbb{R} \setminus \{0\}$, hence $H'(x) \to 0$ as $x \to 0$, the difference quotient at $0$ reflects the discontinuity: $$ \tau_0(h) = \frac{H(0 + h) - H(0)}{h} = \frac{H(h) - 1}{h} \to \begin{cases} +\infty & (h \to 0^-) \\ 0 & (h \to 0^+) \end{cases} $$ so $H$ is not differentiable at $0$. Always verify continuity of $f$ at $a$ before invoking T5.2.
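The one-sided behaviour of the difference quotient of $H$ at $0$ is easy to tabulate. The sketch below uses hypothetical helper names `heaviside` and `tau0` and steps $h = \pm 10^{-k}$.

```python
def heaviside(x):
    return 1.0 if x >= 0 else 0.0


def tau0(h):
    # Difference quotient of H at 0, for h != 0.
    return (heaviside(h) - heaviside(0.0)) / h


left = [tau0(-(10.0 ** -k)) for k in range(1, 7)]    # h -> 0 from the left
right = [tau0(10.0 ** -k) for k in range(1, 7)]      # h -> 0 from the right
print(left)
print(right)
```

The left-hand values grow like $1/|h|$ while the right-hand values are identically $0$, matching the two cases displayed above.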

Example

Show that $f : \mathbb{R} \to \mathbb{R}$ defined by $f(x) = x^2$ for $x \le 0$ and $f(x) = x^3$ for $x > 0$ is differentiable at $0$, via T5.2.

Answer

Continuity at $0$: $\lim_{x \to 0^-} x^2 = 0 = f(0)$ and $\lim_{x \to 0^+} x^3 = 0 = f(0)$, so $f$ is continuous at $0$.
Differentiability on $\mathbb{R}^*$: $f$ is polynomial on each open half-line, hence differentiable with $f'(x) = 2x$ for $x < 0$ and $f'(x) = 3x^2$ for $x > 0$.
Limit of $f'$: $\lim_{x \to 0^-} 2x = 0$ and $\lim_{x \to 0^+} 3x^2 = 0$, so $\lim_{x \to 0} f'(x) = 0$ (finite).
Apply T5.2: the hypotheses of T5.2 hold ($f$ continuous on $\mathbb{R}$, differentiable on $\mathbb{R}^*$, $f'$ has a finite limit at $0$), so $f$ is differentiable at $0$ with $f'(0) = 0$.
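For comparison with the Heaviside case, the same tabulation on this piecewise function shows difference quotients at $0$ shrinking on both sides, in line with $f'(0) = 0$. The helper `tau0` and the chosen steps are assumptions of this sketch.

```python
def f(x):
    return x**2 if x <= 0 else x**3


def tau0(h):
    # Difference quotient of f at 0, for h != 0.
    return (f(h) - f(0.0)) / h


steps = [10.0 ** -k for k in range(1, 7)]
left = [tau0(-h) for h in steps]     # equals -h up to rounding
right = [tau0(h) for h in steps]     # equals h^2 up to rounding
print(left)
print(right)
```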

Ex 14 Ex 15

Skills to practice

Applying the composed, total and Bayes formulas

III.3 Independence of events

Ex 16 Ex 17 Ex 18 Ex 19 Ex 20 Ex 21

Skills to practice

Establishing independence of events

IV Discrete probability spaces

Iterating differentiation. The class $C^k$ formalises smoothness. We state and prove the Leibniz formula first (a direct calculation), then deduce the stability of $C^k$ under product as a corollary. Composition and inverse stability are admitted at this level.

IV.1 Discrete distributions and their support

Definition — $n$-th derivative

Recursive definition: $f^{(0)} = f$. If $f^{(n)}$ is defined on $I$ and differentiable on $I$, then $f^{(n+1)} = (f^{(n)})'$. The function $f$ is said to be $n$-times differentiable on $I$ if $f^{(n)}$ exists on $I$.

Definition — Classes $C^k$ and $C^\infty$

For $k \in \mathbb{N}$, $f$ is of class $C^k$ on $I$ if $f^{(k)}$ exists on $I$ and is continuous on $I$. $f$ is of class $C^\infty$ on $I$ if $f$ is $C^k$ for every $k \in \mathbb{N}$. Notation: $\textcolor{colordef}{C^k(I, \mathbb{R})}$, $\textcolor{colordef}{C^\infty(I, \mathbb{R})}$.

Example

Polynomials are $C^\infty$ on $\mathbb{R}$.

Answer

For $P(x) = x^n$, by induction $P^{(k)}(x) = n (n-1) \cdots (n - k + 1) x^{n-k}$ for $k \le n$, and $P^{(k)} \equiv 0$ for $k > n$. Each $P^{(k)}$ is a polynomial, hence continuous on $\mathbb{R}$. Same for any polynomial $\sum c_k x^k$ by linearity. Forward reference: $\exp$, $\sin$, $\cos$ are also $C^\infty$ on $\mathbb{R}$; their derivatives are rigorously justified in Standard functions.

Theorem — Leibniz formula

Let $f, g$ be $n$ times differentiable on $I$. Then $f g$ is $n$ times differentiable on $I$ and $$ \textcolor{colorprop}{(f g)^{(n)} = \sum_{p = 0}^{n} \binom{n}{p} f^{(p)} g^{(n - p)}}. $$

Proof

Induction on $n$.

Base $n = 0$. $(f g)^{(0)} = f g = \binom{0}{0} f^{(0)} g^{(0)}$, formula holds.
Inductive step. Assume the formula holds at rank $n$ for all $n$-times differentiable functions, and let $f, g$ be $(n+1)$-times differentiable. In particular $f, g$ are $n$-times differentiable, so $(f g)^{(n)} = \sum_{p = 0}^{n} \binom{n}{p} f^{(p)} g^{(n - p)}$. Differentiate once more, using linearity (P2.1) and the product rule (P2.2): $$ \begin{aligned} (f g)^{(n+1)} &= \sum_{p = 0}^{n} \binom{n}{p} \big( f^{(p+1)} g^{(n - p)} + f^{(p)} g^{(n - p + 1)} \big) \\ &= \underbrace{\sum_{p = 0}^{n} \binom{n}{p} f^{(p+1)} g^{(n - p)}}_{S_1} + \underbrace{\sum_{p = 0}^{n} \binom{n}{p} f^{(p)} g^{(n - p + 1)}}_{S_2}. \end{aligned} $$ Re-index $S_1$ with $q = p + 1$ (so $q$ runs from $1$ to $n + 1$); rename $q$ as $p$: $$ S_1 = \sum_{p = 1}^{n+1} \binom{n}{p - 1} f^{(p)} g^{(n + 1 - p)}. $$ Keep $S_2$ as it is. Adding $S_1 + S_2$, the $p = 0$ term comes from $S_2$ alone (coefficient $\binom{n}{0} = 1 = \binom{n+1}{0}$), the $p = n + 1$ term from $S_1$ alone (coefficient $\binom{n}{n} = 1 = \binom{n+1}{n+1}$), and for $1 \le p \le n$ Pascal's relation $\binom{n}{p - 1} + \binom{n}{p} = \binom{n + 1}{p}$ (chapter Sums, products and binomial coefficients) groups the two contributions. Hence $$ (f g)^{(n+1)} = \sum_{p = 0}^{n+1} \binom{n+1}{p} f^{(p)} g^{(n + 1 - p)}, $$ which is the formula at rank $n + 1$.
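The formula can be checked exactly on polynomials, where derivatives and products are integer computations. In the sketch below a polynomial is a list of coefficients in increasing degree; this representation and all helper names are assumptions made for the illustration.

```python
from math import comb


def poly_add(p, q):
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [a + b for a, b in zip(p, q)]


def poly_mul(p, q):
    if not p or not q:
        return []
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def poly_diff(p, order=1):
    # Differentiate `order` times; the empty list is the zero polynomial.
    for _ in range(order):
        p = [i * c for i, c in enumerate(p)][1:]
    return p


def trim(p):
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p


def leibniz(f, g, n):
    # Right-hand side: sum over p of C(n, p) f^(p) g^(n - p).
    total = []
    for p in range(n + 1):
        term = poly_mul(poly_diff(f, p), poly_diff(g, n - p))
        total = poly_add(total, [comb(n, p) * c for c in term])
    return trim(total)


f = [1, -2, 0, 1]   # x^3 - 2x + 1
g = [1, 0, 1]       # x^2 + 1
for n in range(7):
    direct = trim(poly_diff(poly_mul(f, g), n))
    print(n, direct == leibniz(f, g, n))
```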

Skills to practice

Identifying a discrete distribution

IV.2 The distribution--probability correspondence

Proposition — Stability of $C^k$

Let $f, g \in C^k(I, \mathbb{R})$, $\lambda, \mu \in \mathbb{R}$. Then:

$\textcolor{colorprop}{\lambda f + \mu g \in C^k(I)}$ ;
$\textcolor{colorprop}{f g \in C^k(I)}$ ;
if $g$ does not vanish on $I$, $\textcolor{colorprop}{f / g \in C^k(I)}$ ;
if $\varphi : J \to I$ is $C^k$, then $\textcolor{colorprop}{f \circ \varphi \in C^k(J)}$ ;
if $k \ge 1$, $f \in C^k(I, \mathbb{R})$ with $f : I \to f(I)$ bijective and $f'$ not vanishing on $I$, then $\textcolor{colorprop}{f^{-1} \in C^k(f(I), \mathbb{R})}$.

Proof

Linear combination. Induction on $k$. Base $k = 0$: $\lambda f + \mu g$ is continuous (sum of continuous functions). Inductive step $k \to k + 1$: if $f, g$ are $C^{k+1}$, they are differentiable and $f', g'$ are $C^k$; by P2.1 $(\lambda f + \mu g)' = \lambda f' + \mu g'$ is $C^k$ by induction, hence $\lambda f + \mu g$ is $C^{k+1}$.
Product. Same induction. Base $k = 0$: $f g$ is continuous. Inductive step: $f, g$ are $C^{k+1}$, so $(f g)' = f' g + f g'$ (P2.2). Each summand is a product of $C^k$ functions, hence $C^k$ by induction; their sum is $C^k$ by linearity; thus $f g$ is $C^{k+1}$.
Quotient. First show by induction on $k$ that if $g$ is $C^k$ and does not vanish on $I$, then $1/g$ is $C^k$ on $I$. Base $k = 0$: $1/g$ is continuous (continuity of $g$ and non-vanishing). Inductive step: if $g$ is $C^{k+1}$, then $(1/g)' = -g'/g^2$ (P2.3); by the product case $g^2$ is $C^k$, $g^2$ does not vanish, so by the induction hypothesis $1/g^2$ is $C^k$, and $-g'/g^2 = -g' \cdot (1/g^2)$ is $C^k$ as a product; hence $1/g$ is $C^{k+1}$. Then $f/g = f \cdot (1/g)$ is $C^k$ as a product.
Composition and inverse: proofs admitted. The proofs concerning composition and inverse are not required at this level. The results themselves remain in scope and may be used freely.

Method — Compute a higher-order derivative

Three classical patterns:

Pattern A --- $P(x) \cdot \exp(a x)$. Apply Leibniz with $f = P$ and $g : x \mapsto \exp(a x)$; at most $\deg P + 1$ terms are non-zero (those with $p \le \deg P$), since $P^{(p)} = 0$ for $p > \deg P$.
Pattern B --- rational fraction. Decompose into partial fractions (chapter Rational fractions), then use $(1/(x - a))^{(n)} = (-1)^n n! / (x - a)^{n + 1}$ --- proved by direct induction.
Pattern C --- trig power. Linearise first (chapter Trigonometry), then differentiate the linearised expression term by term.

Admitted iterated derivatives (forward references). $(\exp(a x))^{(k)} = a^k \exp(a x)$ on $\mathbb{R}$ for all $k \in \mathbb{N}$, admitted from Real functions: lycée recap. Iterated derivatives of $\sin$, $\cos$ are admitted from Standard functions.

Example

Compute $\big(1/(x^2 - 1)\big)^{(n)}$ on $\mathbb{R} \setminus \{-1, 1\}$ via Pattern B.

Answer

Partial fractions: $1/(x^2 - 1) = 1/((x - 1)(x + 1)) = (1/2) \big(1/(x - 1) - 1/(x + 1)\big)$. By the identity $(1/(x - a))^{(n)} = (-1)^n n! / (x - a)^{n + 1}$ (induction), with linearity (P2.1 extended to higher orders): for every $x \in \mathbb{R} \setminus \{-1, 1\}$ (equivalently, on each interval of the domain), $$ \left(\frac{1}{x^2 - 1}\right)^{(n)} = \frac{(-1)^n n!}{2} \left( \frac{1}{(x - 1)^{n + 1}} - \frac{1}{(x + 1)^{n + 1}} \right). $$
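The closed form can be verified exactly with rational arithmetic. The sketch below encodes a term $c/(x - a)^m$ as a triple `(c, a, m)` (a representation chosen here), differentiates term by term using $\big(c/(x - a)^m\big)' = -m c/(x - a)^{m + 1}$, and compares the result with the formula at a few rational points.

```python
from fractions import Fraction
from math import factorial


def differentiate(terms):
    # (c, a, m) stands for c / (x - a)^m; its derivative is -m c / (x - a)^(m + 1).
    return [(-m * c, a, m + 1) for (c, a, m) in terms]


def evaluate(terms, x):
    return sum(c / (x - a) ** m for (c, a, m) in terms)


def nth_derivative_by_iteration(n, x):
    # Start from the partial fractions (1/2)/(x - 1) - (1/2)/(x + 1).
    terms = [(Fraction(1, 2), 1, 1), (Fraction(-1, 2), -1, 1)]
    for _ in range(n):
        terms = differentiate(terms)
    return evaluate(terms, x)


def nth_derivative_closed_form(n, x):
    coefficient = Fraction((-1) ** n * factorial(n), 2)
    return coefficient * (1 / (x - 1) ** (n + 1) - 1 / (x + 1) ** (n + 1))


test_points = [Fraction(3), Fraction(1, 2), Fraction(-5, 3)]
for n in range(6):
    print(n, all(nth_derivative_by_iteration(n, x) == nth_derivative_closed_form(n, x)
                 for x in test_points))
```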

Ex 22 Ex 23 Ex 24 Ex 25

Skills to practice

Defining a probability by its distribution