CommeUnJeu · L1 PCSI

Convex functions

⌚ ~67 min ▢ 8 blocks ✓ 24 exercises ➣ Prerequisites : Differentiability, Real functions: recap

Convexity captures, in a single inequality, the geometric idea of a curve that always lies below its chords. From this one definition flows a toolbox: a slope-monotonicity characterization, a derivative-based test ($f'$ nondecreasing or $f'' \ge 0$), and the dual "graph above its tangents" statement that fuels classical bounds like $e^x \ge 1 + x$. For concave functions the inequalities reverse: the tangent inequality gives $\ln(1+x) \le x$, and the chord inequality yields bounds like $\sin x \ge 2x/\pi$ on $[0, \pi/2]$. The chapter closes with a brief, scoped treatment of inflection points.
Throughout, $I$ denotes an interval of $\mathbb{R}$ (possibly unbounded, possibly closed or open). The duality "concave $\Leftrightarrow$ $-f$ convex" is stated once at the start and used implicitly thereafter; concave inequalities reverse the convex ones. Inflection points are treated in the closing section, where only a safe sufficient criterion is given.

I Convex functions --- definition and slope characterization

The geometric idea: a function is convex if its graph stays below every chord drawn between two of its points. We translate this into the chord inequality, then show the slope-monotonicity characterization.

Definition — Convex and concave function

Let $f : I \to \mathbb{R}$. We say that $f$ is convex on $I$ if $$ \textcolor{colordef}{\forall x, y \in I, \ \forall \lambda \in [0, 1], \quad f((1 - \lambda) x + \lambda y) \le (1 - \lambda) f(x) + \lambda f(y)}. $$ We say that $f$ is concave on $I$ if $-f$ is convex on $I$, equivalently $$ \forall x, y \in I, \ \forall \lambda \in [0, 1], \quad f((1 - \lambda) x + \lambda y) \ge (1 - \lambda) f(x) + \lambda f(y). $$

Definition — Chord and secant line

For $x, y \in I$ with $x < y$, the chord of $f$ on $[x, y]$ is the line segment in the plane joining $(x, f(x))$ and $(y, f(y))$. The secant line through these two points is the (full) line containing the chord.

Figure --- convex and concave graphs

Side by side: a parabola $y = \tfrac{1}{2} x^2$ (convex, graph below the chord between any two of its points) and a (translated) logarithm $y = \ln x + 0.3$ (concave, graph above the chord between any two of its points). The vertical scaling and shift are cosmetic --- convexity/concavity is preserved.

The convex graph is below the chord $AB$; the concave graph is above the chord $AB$.

Example

The function $f : \mathbb{R} \to \mathbb{R}$, $f(x) = |x|$, is convex on $\mathbb{R}$.

Answer

For $x, y \in \mathbb{R}$ and $\lambda \in [0, 1]$, the triangle inequality gives $$ |(1 - \lambda) x + \lambda y| \le |(1 - \lambda) x| + |\lambda y| = (1 - \lambda) |x| + \lambda |y|, $$ using $1 - \lambda \ge 0$ and $\lambda \ge 0$ to drop the absolute values on the scalars. Hence $f((1 - \lambda) x + \lambda y) \le (1 - \lambda) f(x) + \lambda f(y)$, so $f$ is convex on $\mathbb{R}$.
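The chord inequality can also be spot-checked numerically. The sketch below is only a sanity check on a finite sample (it cannot prove convexity); the helper name `max_chord_violation` and the sample grids are illustrative choices, not part of the course.

```python
from itertools import product


def max_chord_violation(f, points, lambdas):
    """Largest value of f((1-l)x + l*y) - ((1-l)f(x) + l*f(y)) over the sample.

    For a convex function the result is <= 0 (up to rounding error).
    """
    worst = float("-inf")
    for x, y, lam in product(points, points, lambdas):
        mid = (1 - lam) * x + lam * y
        gap = f(mid) - ((1 - lam) * f(x) + lam * f(y))
        worst = max(worst, gap)
    return worst


points = [k / 4 for k in range(-12, 13)]   # sample of [-3, 3]
lambdas = [k / 8 for k in range(9)]        # sample of [0, 1]

# |x|: no violation of the chord inequality on the sample.
print(max_chord_violation(abs, points, lambdas) <= 1e-12)
# -|x| is not convex: some chord lies strictly below the graph.
print(max_chord_violation(lambda t: -abs(t), points, lambdas) > 0)
```

A positive result exhibits a concrete triple $(x, y, \lambda)$ that violates the definition, which is how non-convexity is usually proved by hand.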

Proposition — Position of the graph with respect to a secant

Let $f : I \to \mathbb{R}$ be convex on $I$, and let $x, y \in I$ with $x < y$. Then

the graph of $f$ lies \textcolor{colorprop}{below} the secant line through $(x, f(x))$ and $(y, f(y))$ on the segment $[x, y]$;
the graph of $f$ lies \textcolor{colorprop}{above} that same secant line on $I \setminus [x, y]$.

Proof

The secant has equation $L(t) = f(x) + \frac{f(y) - f(x)}{y - x} (t - x)$.

Below on $[x, y]$. For $t \in [x, y]$, write $t = (1 - \lambda) x + \lambda y$ with $\lambda = (t - x)/(y - x) \in [0, 1]$. By convexity of $f$ (Definition D1.1), $$ f(t) \le (1 - \lambda) f(x) + \lambda f(y) = L(t). $$
Above on $I \setminus [x, y]$. Take $t \in I$ with $t > y$ (the case $t < x$ is analogous). Set $\theta = (y - x)/(t - x) \in \,]0, 1[$. Then $$ y = (1 - \theta) x + \theta t, $$ a genuine convex combination of $x$ and $t$. By convexity of $f$, $$ f(y) \le (1 - \theta) f(x) + \theta f(t). $$ Hence $$ f(t) \ge \frac{f(y) - (1 - \theta) f(x)}{\theta}. $$ Substituting $\theta = (y - x)/(t - x)$ and $1 - \theta = (t - y)/(t - x)$ and rearranging: \begin{align*} f(t) &\ge \frac{(t - x) f(y) - (t - y) f(x)}{y - x} && \text{(substitute $\theta$, $1-\theta$)}
&= f(x) + \frac{f(y) - f(x)}{y - x} (t - x) && \text{(rearrange into secant form)}
&= L(t). \end{align*}

Figure --- graph below/above its secant

The graph of a convex function lies below its secant on $[x, y]$ and above the secant outside $[x, y]$.
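Both halves of the proposition can be observed on $\exp$ with the secant through $(0, 1)$ and $(1, e)$. This is a minimal numerical sketch; the helper `secant`, the sample points and the tolerance `tol` are assumptions made for the illustration.

```python
import math


def secant(f, x, y):
    """Affine function L whose graph is the secant through (x, f(x)) and (y, f(y))."""
    slope = (f(y) - f(x)) / (y - x)
    return lambda t: f(x) + slope * (t - x)


x, y = 0.0, 1.0
L = secant(math.exp, x, y)

inside = [x + k * (y - x) / 10 for k in range(11)]   # points of [x, y]
outside = [-2.0, -1.0, -0.5, 1.5, 2.0, 3.0]          # points outside [x, y]

tol = 1e-12  # tolerance for floating-point rounding
below_inside = all(math.exp(t) <= L(t) + tol for t in inside)
above_outside = all(math.exp(t) >= L(t) - tol for t in outside)
print(below_inside, above_outside)
```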

Proposition — Slope-monotonicity characterization

Let $f : I \to \mathbb{R}$. The following are equivalent:

(i) $f$ is convex on $I$;
(ii) for every $a \in I$ and every $u, v \in I \setminus \{a\}$ with $u < v$, $$ \textcolor{colorprop}{\frac{f(u) - f(a)}{u - a} \le \frac{f(v) - f(a)}{v - a}}. $$

In other words, (ii) says that for every $a \in I$ the slope function $t \mapsto (f(t) - f(a))/(t - a)$ is nondecreasing on $I \setminus \{a\}$.

Proof

$(i) \Rightarrow (ii)$. Fix $a \in I$ and $u < v$ in $I \setminus \{a\}$.
- Case $a < u < v$. Write $u = (1 - \lambda) a + \lambda v$ with $\lambda = (u - a)/(v - a) \in {]}0, 1[$. Convexity gives $$ f(u) \le (1 - \lambda) f(a) + \lambda f(v), $$ i.e. $f(u) - f(a) \le \lambda (f(v) - f(a))$. Dividing by $u - a = \lambda (v - a) > 0$ yields the claim.
- Case $u < v < a$. Symmetric: write $v = (1 - \mu) u + \mu a$ with $\mu = (v - u)/(a - u) \in {]}0, 1[$. Convexity gives $f(v) \le (1 - \mu) f(u) + \mu f(a)$, equivalently $f(v) - f(a) \le (1 - \mu)(f(u) - f(a))$. Dividing by $v - a = -(1 - \mu)(a - u) < 0$ flips the inequality and produces the claim.
- Case $u < a < v$. Write $$ a = (1 - \theta) u + \theta v, \qquad \theta = \frac{a - u}{v - u} \in \,]0, 1[. $$ By convexity, $$ f(a) \le (1 - \theta) f(u) + \theta f(v). $$ Since $1 - \theta = (v - a)/(v - u)$ and $\theta = (a - u)/(v - u)$, multiplying by $v - u > 0$: $$ (v - u) f(a) \le (v - a) f(u) + (a - u) f(v). $$ Rearranging, $$ (v - a) (f(a) - f(u)) \le (a - u) (f(v) - f(a)). $$ Dividing by the strictly positive number $(a - u)(v - a) > 0$: $$ \frac{f(a) - f(u)}{a - u} \le \frac{f(v) - f(a)}{v - a}. $$ Since $(f(a) - f(u))/(a - u) = (f(u) - f(a))/(u - a)$, we conclude $$ \frac{f(u) - f(a)}{u - a} \le \frac{f(v) - f(a)}{v - a}. $$
$(ii) \Rightarrow (i)$. Take $x < y$ in $I$ and $\lambda \in {]}0, 1[$; set $t = (1 - \lambda) x + \lambda y$, so $x < t < y$. Applying (ii) with anchor $a = x$ on the pair $(t, y)$ (both in $I \setminus \{x\}$, with $t < y$): $$ \frac{f(t) - f(x)}{t - x} \le \frac{f(y) - f(x)}{y - x}. $$ Since $t - x = \lambda (y - x) > 0$, multiplying by $t - x$ gives $$ f(t) - f(x) \le \lambda (f(y) - f(x)), $$ i.e. $f(t) \le (1 - \lambda) f(x) + \lambda f(y)$. The boundary cases $\lambda \in \{0, 1\}$ are immediate. Hence $f$ is convex.
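The slope characterization lends itself to an exact check with rational arithmetic. In the sketch below (helper names are hypothetical), the slopes of $t \mapsto t^2$ from any anchor are nondecreasing, while those of $t \mapsto t^3$ from the anchor $1$ are not, which shows that the cube is not convex on $[-3, 3]$.

```python
from fractions import Fraction


def slopes_from_anchor(f, a, points):
    """Slopes (f(t) - f(a)) / (t - a) for the points t != a, in increasing order of t."""
    return [(f(t) - f(a)) / (t - a) for t in sorted(points) if t != a]


def is_nondecreasing(values):
    """True if each value is <= the next one."""
    return all(u <= v for u, v in zip(values, values[1:]))


points = [Fraction(k, 2) for k in range(-6, 7)]   # exact rationals in [-3, 3]
square = lambda t: t * t                           # convex on R
cube = lambda t: t ** 3                            # not convex on [-3, 3]

square_ok = all(is_nondecreasing(slopes_from_anchor(square, a, points)) for a in points)
cube_ok = is_nondecreasing(slopes_from_anchor(cube, Fraction(1), points))
print(square_ok, cube_ok)
```

For the square the slope from anchor $a$ is exactly $t + a$, so monotonicity is visible by hand as well.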

Proposition — Three-point slope inequality

Let $f : I \to \mathbb{R}$ be convex on $I$, and let $a, b, c \in I$ with $a < b < c$. Then $$ \textcolor{colorprop}{\frac{f(b) - f(a)}{b - a} \le \frac{f(c) - f(a)}{c - a} \le \frac{f(c) - f(b)}{c - b}}. $$

Proof

The left inequality is Proposition P1.2 (slope monotonicity) applied at anchor $a$ with $u = b < c = v$. The right inequality is P1.2 applied at anchor $c$ with $u = a < b = v$ (slopes oriented from $c$): $$ \frac{f(a) - f(c)}{a - c} \le \frac{f(b) - f(c)}{b - c}, $$ which is the same as the displayed inequality after multiplying numerator and denominator of each side by $-1$.

Remark --- continuity on an open interval

A convex function on an open interval is continuous. This useful theorem is stated here without proof.
(The proof uses the slope-monotonicity characterization P1.2 plus the monotone-limit theorem from the chapter Limits and continuity.)

Skills to practice

Establishing convexity from the chord definition
Using the slope characterization

II Convex differentiable functions

When $f$ is (twice) differentiable, convexity has clean derivative-level characterizations: $f$ convex $\Leftrightarrow$ $f'$ nondecreasing $\Leftrightarrow$ graph above all tangents. The $C^2$ version --- $f$ convex $\Leftrightarrow$ $f'' \ge 0$ --- gives the operational test used everywhere. The tangent inequality is the engine behind the classical bounds $e^x \ge 1 + x$, $\ln(1+x) \le x$, $\sin x \le x$, and $\sqrt{x} \le (x+1)/2$.
Derivative convention. In derivative statements below, either $I$ is open, or the assertions involving $f'(a)$ are read for interior points $a \in \mathring{I}$. When an endpoint is used in an example, the function is understood as the restriction of a differentiable function defined on a larger open interval.

Theorem — Characterization of convex differentiable functions

Let $f : I \to \mathbb{R}$ be differentiable on $I$. The following are equivalent:

(i) $f$ is convex on $I$;
(ii) $f'$ is nondecreasing on $I$;
(iii) For every $a \in I$ and every $x \in I$, $\textcolor{colorprop}{f(x) \ge f(a) + f'(a)(x - a)}$ (the graph of $f$ lies above all its tangents).

Proof

Cyclic implication $(i) \Rightarrow (ii) \Rightarrow (iii) \Rightarrow (i)$.

$(i) \Rightarrow (ii)$. Fix $u < v$ in $I$. For every $w \in \,]u, v[$, the three-point slope inequality P1.3 applied to $u < w < v$ gives $$ \frac{f(w) - f(u)}{w - u} \le \frac{f(v) - f(u)}{v - u} \le \frac{f(v) - f(w)}{v - w}. $$ Letting $w \to u^+$ in the left inequality gives $$ f'(u) \le \frac{f(v) - f(u)}{v - u}. $$ Letting $w \to v^-$ in the right inequality gives $$ \frac{f(v) - f(u)}{v - u} \le f'(v). $$ Chaining: $f'(u) \le f'(v)$. Hence $f'$ is nondecreasing.
$(ii) \Rightarrow (iii)$. Fix $a \in I$. Define $\varphi(t) := f(t) - f(a) - f'(a)(t - a)$ for $t \in I$. Then $\varphi$ is differentiable on $I$ with $\varphi'(t) = f'(t) - f'(a)$. Since $f'$ is nondecreasing (hypothesis ii), $\varphi'(t) \le 0$ for $t \le a$ and $\varphi'(t) \ge 0$ for $t \ge a$. By the sign-of-$f'$ $\Leftrightarrow$ monotonicity theorem from Differentiability P5.1, $\varphi$ is nonincreasing on $I \cap {]}-\infty, a]$ and nondecreasing on $I \cap [a, +\infty[$. Combined with $\varphi(a) = 0$, this gives $\varphi(t) \ge 0$ for all $t \in I$, i.e. $f(t) \ge f(a) + f'(a)(t - a)$ for every $t \in I$, which is (iii).
$(iii) \Rightarrow (i)$. Fix $x, y \in I$ and $\lambda \in [0, 1]$; set $a := (1 - \lambda) x + \lambda y \in I$. By (iii) at $a$ evaluated at $t = x$ and $t = y$: $$ f(x) \ge f(a) + f'(a)(x - a), \qquad f(y) \ge f(a) + f'(a)(y - a). $$ Multiply the first by $1 - \lambda \ge 0$, the second by $\lambda \ge 0$, and add. The coefficients of $f'(a)$ combine as $$ (1 - \lambda)(x - a) + \lambda (y - a) = (1 - \lambda) x + \lambda y - a = 0, $$ so $f'(a)$ drops out. We obtain $$ (1 - \lambda) f(x) + \lambda f(y) \ge f(a) = f((1 - \lambda) x + \lambda y), $$ which is the convexity inequality D1.1. Hence $f$ is convex.

Figure --- graph above its tangents

A convex differentiable function: at every point, the tangent lies below the curve.

Remark --- concave dual

For $f$ concave differentiable, the inequalities reverse. In particular, the graph of $f$ lies below every one of its tangents: $$ \forall a, x \in I, \qquad f(x) \le f(a) + f'(a)(x - a). $$ This is the engine behind the bounds $\ln(1+x) \le x$, $\sqrt{x} \le (x+1)/2$, and $\sin x \le x$ proved below.

Proposition — $C^2$ characterization of convexity

Let $f \in C^2(I, \mathbb{R})$. Then $f$ is convex on $I$ if and only if $$ \textcolor{colorprop}{\forall x \in I, \quad f''(x) \ge 0}. $$ By duality, $f$ is concave on $I$ if and only if $f''(x) \le 0$ for every $x \in I$.

Proof

By Theorem T2.1, $f$ convex $\Leftrightarrow$ $f'$ nondecreasing on $I$. By the sign-of-$f'$ $\Leftrightarrow$ monotonicity theorem applied to $g = f'$ (which is differentiable since $f \in C^2$): $g = f'$ nondecreasing on $I$ $\Leftrightarrow$ $g' = f'' \ge 0$ on $I$ (Differentiability, P5.1). Composing the two equivalences gives the claim. The concave case follows by replacing $f$ with $-f$.

Method — Show that $f$ is convex (or concave) from $f''$

Given $f$ at least $C^2$ on $I$:

Compute $f''$ explicitly.
Study the sign of $f''$ on $I$ (factor, use known signs, monotonicity).
Conclude with P2.1: $f''(x) \ge 0$ on $I$ $\Rightarrow$ $f$ convex; $f''(x) \le 0$ on $I$ $\Rightarrow$ $f$ concave.

Caveat. The conclusion needs a sign that holds on the whole interval $I$. If $f''$ changes sign, $f$ is convex on each subinterval where $f'' \ge 0$ and concave on each subinterval where $f'' \le 0$, but it is neither convex nor concave on $I$ as a whole.
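The method and its caveat can be illustrated on $f(x) = x^3 - 3x$, whose second derivative $f''(x) = 6x$ is computed by hand. The sketch below only samples the sign of $f''$, so it is a heuristic and not a proof; the function name `classify_by_second_derivative` and the sample size are assumptions.

```python
def classify_by_second_derivative(d2f, lo, hi, samples=200):
    """Classify f on [lo, hi] from the sign of f'' at sampled points.

    Heuristic only: a sign change between two sample points would be missed.
    An affine f (f'' = 0 everywhere) is reported as "convex", although it is
    also concave.
    """
    values = [d2f(lo + k * (hi - lo) / samples) for k in range(samples + 1)]
    if all(v >= 0 for v in values):
        return "convex"
    if all(v <= 0 for v in values):
        return "concave"
    return "neither"


d2f = lambda x: 6 * x   # second derivative of f(x) = x**3 - 3*x, computed by hand

print(classify_by_second_derivative(d2f, 0, 3))
print(classify_by_second_derivative(d2f, -3, 0))
print(classify_by_second_derivative(d2f, -3, 3))
```

The third call illustrates the caveat: on an interval where $f''$ changes sign, neither conclusion is available.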

Example

$\exp$ is convex on $\mathbb{R}$. Deduce that $e^x \ge 1 + x$ for every $x \in \mathbb{R}$.

Answer

$\exp$ is $C^2$ on $\mathbb{R}$ with $(\exp)''(x) = e^x > 0$, so by P2.1, $\exp$ is convex on $\mathbb{R}$. The tangent of $\exp$ at $a = 0$ is the line $y = 1 + x$. By T2.1(iii) applied at $a = 0$, $$ e^x \ge e^0 + e^0 (x - 0) = 1 + x \qquad (x \in \mathbb{R}). $$
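Nothing is special about the anchor $a = 0$ apart from the simplicity of the resulting bound: by T2.1(iii), every tangent of $\exp$ gives a valid lower bound. The following sketch checks this on a sample for $a = 0$ and $a = 1$ (the latter tangent is $y = e\,x$); the helper `exp_tangent` and the tolerance are illustrative assumptions.

```python
import math


def exp_tangent(a):
    """Tangent to exp at a: t -> e^a + e^a (t - a)."""
    return lambda t: math.exp(a) * (1 + (t - a))


xs = [k / 10 for k in range(-50, 51)]   # sample of [-5, 5]

# Anchor a = 0 gives the classical bound e^x >= 1 + x.
bound_at_0 = all(math.exp(x) >= 1 + x for x in xs)

# Anchor a = 1 gives another valid lower bound: e^x >= e * x.
tangent_at_1 = exp_tangent(1.0)
bound_at_1 = all(math.exp(x) >= tangent_at_1(x) - 1e-12 for x in xs)

print(bound_at_0, bound_at_1)
```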

Example

$\ln$ is concave on $\mathbb{R}_+^*$. Deduce $\ln(1 + x) \le x$ for $x > -1$ and $\ln x \le x - 1$ for $x > 0$.

Answer

$\ln$ is $C^2$ on $\mathbb{R}_+^*$ with $(\ln)''(x) = -1/x^2 < 0$, so by P2.1 (concave case), $\ln$ is concave on $\mathbb{R}_+^*$. Apply the concave-dual tangent inequality:

At $a = 1$: $(\ln)(1) = 0$, $(\ln)'(1) = 1$, so the tangent is $y = x - 1$. Hence $\ln x \le x - 1$ for $x > 0$.
Substituting $u = 1 + x$ (so $u > 0$ iff $x > -1$) into the previous inequality $\ln u \le u - 1$: $\ln(1 + x) \le (1 + x) - 1 = x$ for $x > -1$.

Example

$\sin$ is concave on $[0, \pi/2]$. Deduce the two inequalities $\sin x \ge \dfrac{2 x}{\pi}$ and $\sin x \le x$, valid for $x \in [0, \pi/2]$.

Answer

$\sin$ is $C^2$ on $[0, \pi/2]$ with $(\sin)''(x) = -\sin x \le 0$ on this interval, so by P2.1 (concave case), $\sin$ is concave on $[0, \pi/2]$. The two inequalities come from two distinct geometric facts:

Above-chord rule (concave version of P1.1). On $[0, \pi/2]$, the graph of $\sin$ lies above its chord between $(0, 0)$ and $(\pi/2, 1)$. That chord has equation $y = (2/\pi) x$, so $$ \sin x \ge \frac{2 x}{\pi} \qquad (x \in [0, \pi/2]). $$
Below-tangent rule (concave dual of T2.1(iii)). The tangent at $a = 0$ is $y = (\sin)(0) + (\sin)'(0) (x - 0) = x$. So $$ \sin x \le x \qquad (x \in [0, \pi/2]). $$

The inequality $\sin x \le x$ in fact holds for every $x \ge 0$: the function $h(x) = x - \sin x$ satisfies $h(0) = 0$ and $h'(x) = 1 - \cos x \ge 0$ on $\mathbb{R}$, hence $h$ is nondecreasing on $\mathbb{R}_+$, giving $h(x) \ge h(0) = 0$ for every $x \ge 0$.
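A quick numerical sketch makes the different ranges of validity visible: both bounds hold on $[0, \pi/2]$, but at $x = 3 > \pi/2$ only $\sin x \le x$ survives. The sample grid and the tolerance `tol` are assumptions of the illustration.

```python
import math

xs = [k * (math.pi / 2) / 100 for k in range(101)]   # sample of [0, pi/2]
tol = 1e-12                                           # rounding tolerance

chord_ok = all(math.sin(x) >= 2 * x / math.pi - tol for x in xs)
tangent_ok = all(math.sin(x) <= x + tol for x in xs)
print(chord_ok, tangent_ok)

# Beyond pi/2 the chord bound fails (2x/pi exceeds 1 there),
# while sin x <= x still holds for every x >= 0.
x = 3.0
chord_at_3 = math.sin(x) >= 2 * x / math.pi
tangent_at_3 = math.sin(x) <= x
print(chord_at_3, tangent_at_3)
```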

Method — Bound a numerical expression by a convexity inequality

To prove a bound of the form $f(x) \ge L(x)$ (or $\le$, for concave) where $L$ is affine:

Identify a function $f$ for which the bound is the tangent inequality (or the chord inequality) at a specific point $a$.
Verify that $f$ is convex (or concave) on the relevant interval, by sign of $f''$ via P2.1.
Compute $f(a)$ and $f'(a)$; check that $L(x) = f(a) + f'(a)(x - a)$ is indeed the target affine bound.
Conclude by T2.1(iii) for the tangent case, or by P1.1 / above-chord rule for the chord case.

Recognising the right anchor. The anchor $a$ is the point where the bound is sharpest (equality). For $e^x \ge 1 + x$ this is $a = 0$; for $\ln x \le x - 1$ this is $a = 1$; for $\sqrt{x} \le (x+1)/2$ this is $a = 1$.
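The four steps can be mirrored in a short script for the three anchors just listed. This is a sketch under stated assumptions: the derivatives are supplied by hand, the convex/concave nature of each function is taken from P2.1 rather than checked by the code, and `tangent_at` is a hypothetical helper.

```python
import math


def tangent_at(f, df, a):
    """Affine function t -> f(a) + f'(a) (t - a), given f and its derivative df."""
    fa, slope = f(a), df(a)
    return lambda t: fa + slope * (t - a)


# (f, f', anchor a, target affine bound, nature of f known from the sign of f'')
cases = [
    (math.exp, math.exp, 0.0, lambda x: 1 + x, "convex"),
    (math.log, lambda x: 1 / x, 1.0, lambda x: x - 1, "concave"),
    (math.sqrt, lambda x: 0.5 / math.sqrt(x), 1.0, lambda x: (x + 1) / 2, "concave"),
]

xs = [k / 8 for k in range(1, 41)]   # sample of ]0, 5]
results = []
for f, df, a, target, kind in cases:
    T = tangent_at(f, df, a)
    # Step 3: the tangent at the anchor coincides with the target affine bound.
    same_line = all(abs(T(x) - target(x)) < 1e-12 for x in xs)
    # Step 4: the graph is above (convex) or below (concave) that tangent.
    if kind == "convex":
        bound = all(f(x) >= T(x) - 1e-12 for x in xs)
    else:
        bound = all(f(x) <= T(x) + 1e-12 for x in xs)
    results.append((f.__name__, same_line, bound))

print(results)
```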

Example

Show that $\sqrt{x} \le \dfrac{x + 1}{2}$ for every $x \ge 0$.

Answer

The function $g(x) = \sqrt{x}$ is $C^2$ on $\mathbb{R}_+^*$ with $g''(x) = -\frac{1}{4} x^{-3/2} < 0$. By P2.1 (concave case), $g$ is concave on $\mathbb{R}_+^*$. Apply the concave-dual tangent inequality at $a = 1$: $g(1) = 1$, $g'(1) = 1/2$, so the tangent is $y = 1 + \frac{1}{2}(x - 1) = (x + 1)/2$. Hence $$ \sqrt{x} \le \frac{x + 1}{2} \qquad (x > 0). $$ The point $x = 0$ lies outside $\mathbb{R}_+^*$ and is checked directly: the inequality reads $0 \le 1/2$, which is true. Hence the bound holds for every $x \ge 0$.

Remark --- strict convexity

A function $f : I \to \mathbb{R}$ is strictly convex on $I$ if, for every $x \ne y$ in $I$ and every $\lambda \in {]}0, 1[$, $f((1 - \lambda) x + \lambda y) < (1 - \lambda) f(x) + \lambda f(y)$ (strict inequality). For differentiable functions, $f'$ strictly increasing on $I$ implies $f$ strictly convex. For $f \in C^2$, the condition $f'' > 0$ on $I$ is sufficient but not necessary --- the function $x \mapsto x^4$ is strictly convex on $\mathbb{R}$ yet $f''(0) = 0$.

Skills to practice

Showing convexity from $f''$
Bounding a numerical expression by convexity
Studying tangent position

III Inflection points

Inflection points are not strictly listed in the program, but they are conventional and follow naturally from the $C^2$ characterization of convexity in the preceding section. We restrict ourselves to a safe sufficient practical criterion: a sign change of $f''$ guarantees an inflection. The tempting shortcut "$f''(a) = 0$ alone suffices" is false, a pitfall flagged below.

Definition — Inflection point

Let $f : I \to \mathbb{R}$ and $a$ an interior point of $I$. We say that $f$ has an inflection point at $a$ if there exists $\eta > 0$ such that $[a - \eta, a + \eta] \subset I$, $f$ is not affine on any neighborhood of $a$, and one of the following holds:

$f$ is convex on $[a - \eta, a]$ and concave on $[a, a + \eta]$, or
$f$ is concave on $[a - \eta, a]$ and convex on $[a, a + \eta]$.

The non-affine condition forces the convex/concave character of $f$ to genuinely change at $a$: without it, every point of an affine function would qualify (since an affine function is both convex and concave on every interval).

Proposition — Sufficient practical criterion

Let $f \in C^2(I, \mathbb{R})$ and let $a$ be an interior point of $I$. If $f''$ changes sign at $a$ (i.e. there exists $\eta > 0$ such that $f''$ has one strict sign on $[a - \eta, a[$ and the opposite strict sign on $]a, a + \eta]$), then $f$ has an inflection point at $a$. In particular, by continuity of $f''$, $\textcolor{colorprop}{f''(a) = 0}$.

Proof

Continuity of $f''$ at $a$ (since $f \in C^2$) forces $f''(a) = 0$: the limit of $f''$ at $a$ is both $\ge 0$ and $\le 0$. Hence $f''$ has one weak sign on $[a - \eta, a]$ and the opposite weak sign on $[a, a + \eta]$. By P2.1, $f$ is convex on the one where $f'' \ge 0$ and concave on the one where $f'' \le 0$, which is one of the two scenarios of Definition D3.1. Finally, $f$ is not affine on any neighborhood of $a$: otherwise $f''$ would vanish identically there, whereas $f'' \ne 0$ at points arbitrarily close to $a$.

Pitfall --- $f''(a) = 0$ is not sufficient

For a $C^2$ function, $f''(a) = 0$ is necessary for an inflection point at $a$ (in the sense of the convex/concave-transition definition above), but it is not sufficient. The standard counterexample is $f(x) = x^4$ at $a = 0$: $f''(x) = 12 x^2$ vanishes at $0$ but is nonnegative everywhere, so $f$ is convex on all of $\mathbb{R}$. There is no convex/concave transition, hence no inflection point. The exercise sheet drills this pitfall explicitly.

Method — Find the inflection points of a function

For $f$ at least $C^2$ on $I$:

Compute $f''$ explicitly on $I$.
Solve $f''(x) = 0$ to obtain the candidate set $\{a_1, a_2, \dots\}$.
For each candidate $a_i$, study the sign of $f''$ on a neighborhood of $a_i$ and check whether it changes sign at $a_i$.
Conclude by P3.1: every candidate at which $f''$ changes sign gives an inflection point; candidates where the sign does not change are discarded in the usual $C^2$ setting.

Discard candidates where the sign does not change. A zero of $f''$ where the sign stays the same (e.g. a double zero of $f''$) is not an inflection point. The $x^4$ example above is the canonical illustration.
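The candidate-then-filter procedure can be sketched as follows, with $f''$ and its zeros computed by hand. The helper `changes_sign_at` and the step `eta` are assumptions: the test is only meaningful if `eta` is small enough that the candidate is the sole zero of $f''$ in $[a - \eta, a + \eta]$, which the code does not verify.

```python
def changes_sign_at(d2f, a, eta=1e-3):
    """True if f'' takes strictly opposite signs at a - eta and a + eta.

    Heuristic: assumes a is the only zero of f'' in [a - eta, a + eta].
    """
    left, right = d2f(a - eta), d2f(a + eta)
    return left * right < 0


# f(x) = x**4: f''(x) = 12 x**2 vanishes at 0 without changing sign.
print(changes_sign_at(lambda x: 12 * x**2, 0.0))

# f(x) = x**5 - 5*x**4: f''(x) = 20 x**2 (x - 3), candidates 0 and 3.
d2f = lambda x: 20 * x**2 * (x - 3)
candidates = [0.0, 3.0]
inflections = [a for a in candidates if changes_sign_at(d2f, a)]
print(inflections)
```

In the second example the double zero at $0$ is discarded and only the simple zero at $3$ is kept, in line with the remark on double zeros.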

Example

Find all inflection points of the cubic $f(x) = a x^3 + b x^2 + c x + d$ with $a \ne 0$.

Answer

$f$ is polynomial, hence $C^\infty(\mathbb{R})$. Compute: $$ \begin{aligned} f'(x) &= 3 a x^2 + 2 b x + c, \\ f''(x) &= 6 a x + 2 b. \end{aligned} $$ $f''$ is affine with leading coefficient $6 a \ne 0$, so it vanishes at exactly one point: $$ f''(x) = 0 \iff x = -\frac{b}{3 a}. $$ On either side of $-b/(3a)$, $f''$ takes opposite signs (an affine function with nonzero slope changes sign at its unique zero). By P3.1, $f$ has a unique inflection point at $x = -b/(3a)$.
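The formula $x = -b/(3a)$ translates directly into code. The sketch below assumes integer coefficients so that `Fraction` keeps the arithmetic exact; `cubic_inflection` is a hypothetical helper name, and the second cubic is an illustrative choice.

```python
from fractions import Fraction


def cubic_inflection(a, b):
    """Abscissa -b/(3a) of the inflection point of a*x**3 + b*x**2 + c*x + d (a != 0).

    Assumes integer coefficients a and b.
    """
    if a == 0:
        raise ValueError("not a cubic: the leading coefficient a must be nonzero")
    return Fraction(-b, 3 * a)


# f(x) = x**3 - 3*x  (a = 1, b = 0)
print(cubic_inflection(1, 0))

# f(x) = 2*x**3 - 3*x**2 + 1  (a = 2, b = -3)
x0 = cubic_inflection(2, -3)
d2f = lambda x: 12 * x - 6                      # f''(x) = 6*a*x + 2*b
sign_change = d2f(x0 - 1) < 0 < d2f(x0 + 1)     # f'' changes sign at x0
print(x0, d2f(x0) == 0, sign_change)
```

The coefficients $c$ and $d$ play no role: they disappear after two differentiations.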

Figure --- cubic with its inflection point

The cubic $f(x) = x^3 - 3 x$ has $f''(x) = 6 x$, so the unique inflection point is at $x = 0$. The graph is concave on $]-\infty, 0]$ and convex on $[0, +\infty[$.

Skills to practice

Finding inflection points
Sketching curves using $f''$
Counterexamples to over-strong sufficient conditions