CommeUnJeu · L1 MPSI

Real pre-Hilbert spaces

⌚ ~139 min ▢ 17 blocks ✓ 55 exercises ➣ Prerequisites: Vector spaces, Finite-dimensional vector spaces

At secondary school the scalar product was introduced through angles and lengths: $\vec{u} \cdot \vec{v} = \|\vec{u}\| \|\vec{v}\| \cos(\widehat{\vec{u}, \vec{v}})$. In this chapter we reverse that approach. We define the scalar product first as an abstract algebraic object on a real vector space, and we then derive from it the notions of norm, distance, orthogonality, and angle. The pay-off is that the same theory applies to function spaces $C^0([a, b], \mathbb{R})$, polynomial spaces $\mathbb{R}[X]$, and matrix spaces $\mathcal{M}_{n, p}(\mathbb{R})$, far beyond the geometric $\mathbb{R}^2$ and $\mathbb{R}^3$ setting of secondary school.
Angle formula: secondary-school recollection and preview. In $\mathbb{R}^2$ and $\mathbb{R}^3$ one remembers the formula $\langle u, v \rangle = \|u\| \|v\| \cos \theta$. Later in this chapter, the Cauchy-Schwarz inequality will give $-1 \le \frac{\langle u, v \rangle}{\|u\| \|v\|} \le 1$ when $u, v \ne 0$; the formula can then be inverted to define the angle by $$ \theta = \arccos\!\left( \frac{\langle u, v \rangle}{\|u\| \|v\|} \right). $$ At this stage this is only a preview, not yet a definition, since the quotient inside the arccosine is not yet known to lie in $[-1, 1]$. The figure below is a geometric recollection from secondary school, with $\theta$ as the visible angle between $\vec{u}$ and $\vec{v}$.
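The preview formula can be tried numerically on $\mathbb{R}^2$ with the secondary-school dot product. The sketch below is only an illustration: the names `dot` and `angle` are chosen here and do not come from the course, and the clamping step is a floating-point precaution, not part of the mathematical definition.

```python
import math

def dot(u, v):
    """Secondary-school dot product of two vectors given as equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

def angle(u, v):
    """Angle in radians between two nonzero vectors, via arccos of the normalized product."""
    norm_u = math.sqrt(dot(u, u))
    norm_v = math.sqrt(dot(v, v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("the angle is undefined when one vector is zero")
    ratio = dot(u, v) / (norm_u * norm_v)
    # Rounding errors can push the ratio slightly outside [-1, 1]; clamp before acos.
    ratio = max(-1.0, min(1.0, ratio))
    return math.acos(ratio)

right_angle = angle((1, 0), (0, 1))    # perpendicular vectors
flat_angle = angle((1, 0), (-2, 0))    # opposite directions
```

With these inputs the first angle is $\pi/2$ and the second is $\pi$, up to floating-point rounding.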

Conventions for the chapter.

Throughout, $E$ is an $\mathbb{R}$-vector space, finite- or infinite-dimensional unless stated otherwise.
The scalar product is denoted $\langle x, y \rangle$ throughout the chapter. The notations $(x | y)$ and $x \cdot y$ are alternatives; we mention them once in the Definition below and then keep $\langle x, y \rangle$.
The induced norm is $\|x\| = \sqrt{\langle x, x \rangle}$; the distance is $d(x, y) = \|x - y\|$.
Orthogonality: $x \perp y$ if and only if $\langle x, y \rangle = 0$; $X^\perp = \{ t \in E \mid \forall x \in X, \langle t, x \rangle = 0 \}$.
Kronecker symbol: $\delta_{i, j} = 1$ if $i = j$, $0$ otherwise.

I Scalar product

I.1 Definition and first examples

A scalar product is a map $E \times E \to \mathbb{R}$ with four algebraic properties: bilinearity (linearity in each variable), symmetry, positivity, and separation (or definiteness). These four properties are the bones of the theory; everything that follows (norms, orthogonality, projections) is derived from them. We start by isolating the four properties, then state two pedagogical Remarks (a verification shortcut and a Method for checking that a candidate is a scalar product), and finally test the definition on four stacked examples: one canonical, two counter-examples (one fails separation, one fails positivity), and one valid non-canonical product defined by a symmetric positive-definite matrix.

Definition — Scalar product

A scalar product on an $\mathbb{R}$-vector space $E$ is a map $\langle \cdot, \cdot \rangle : E \times E \to \mathbb{R}$ satisfying the four properties below.

Bilinear: for all $x, x', y, y' \in E$ and $\lambda, \mu \in \mathbb{R}$, $$ \langle \lambda x + \mu x', y \rangle = \lambda \langle x, y \rangle + \mu \langle x', y \rangle, \quad \langle x, \lambda y + \mu y' \rangle = \lambda \langle x, y \rangle + \mu \langle x, y' \rangle. $$
Symmetric: for all $x, y \in E$, $\langle y, x \rangle = \langle x, y \rangle$.
Positive: for all $x \in E$, $\langle x, x \rangle \ge 0$.
Separated (or definite): for all $x \in E$, $\langle x, x \rangle = 0$ implies $x = 0_E$.

Alternative notations: $(x | y)$ and $x \cdot y$ are sometimes used; in this chapter we keep $\langle x, y \rangle$.

Definition — Real pre-Hilbert space, Euclidean space

A real pre-Hilbert space is an $\mathbb{R}$-vector space $E$ equipped with a scalar product $\langle \cdot, \cdot \rangle$. A Euclidean space is a real pre-Hilbert space of finite dimension.

Pedagogical shortcut. The four properties look like four independent checks. In practice, once symmetry is established, linearity in the first variable implies linearity in the second variable. So one usually proves symmetry first, then linearity in one variable only, and bilinearity is obtained for free. Positivity is then a direct sign check on $\langle x, x \rangle$, and separation requires proving the implication "if $\langle x, x \rangle = 0$ then $x = 0_E$", often the most delicate step.
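Spelled out, the shortcut is a three-step computation: apply symmetry, then linearity in the first variable, then symmetry again. For all $x, y, y' \in E$ and $\lambda, \mu \in \mathbb{R}$, $$ \langle x, \lambda y + \mu y' \rangle = \langle \lambda y + \mu y', x \rangle = \lambda \langle y, x \rangle + \mu \langle y', x \rangle = \lambda \langle x, y \rangle + \mu \langle x, y' \rangle. $$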

Method — Verify that a map is a scalar product

Given a candidate $\varphi : E \times E \to \mathbb{R}$, check the four properties in the order below.

Step 1 --- Symmetry. Show $\varphi(y, x) = \varphi(x, y)$ for all $x, y \in E$.
Step 2 --- Linearity in the first variable only. Show $\varphi(\lambda x + \mu x', y) = \lambda \varphi(x, y) + \mu \varphi(x', y)$. By Step 1, linearity in the second variable follows automatically, hence bilinearity.
Step 3 --- Positivity. Show $\varphi(x, x) \ge 0$ for all $x \in E$.
Step 4 --- Separation. Show that $\varphi(x, x) = 0$ implies $x = 0_E$. This is often the only nontrivial step.

If any of the four steps fails, $\varphi$ is not a scalar product. We will see counter-examples below.
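The four steps can also be spot-checked numerically before attempting a proof. The sketch below is an illustration under stated assumptions, not a proof technique: the function `check_axioms` and its parameters are hypothetical names introduced here, the candidate is assumed to act on tuples of $\mathbb{R}^2$ (or $\mathbb{R}^{\texttt{dim}}$), and random sampling can only find a counter-example, never establish an axiom. Separation in particular is probed only on the canonical basis vectors, so a form that vanishes on some other nonzero vector would slip through.

```python
import random

def check_axioms(phi, dim=2, trials=1000, tol=1e-9, seed=0):
    """Spot-check the four steps of the Method on random vectors of R^dim.

    Returns the name of the first step for which a counter-example was found,
    or None if none was found (which is evidence, not a proof).
    """
    rng = random.Random(seed)

    def rand_vec():
        return tuple(rng.uniform(-1, 1) for _ in range(dim))

    for _ in range(trials):
        x, x_prime, y = rand_vec(), rand_vec(), rand_vec()
        lam, mu = rng.uniform(-1, 1), rng.uniform(-1, 1)
        # Step 1: symmetry.
        if abs(phi(x, y) - phi(y, x)) > tol:
            return "symmetry"
        # Step 2: linearity in the first variable only.
        combo = tuple(lam * a + mu * b for a, b in zip(x, x_prime))
        if abs(phi(combo, y) - (lam * phi(x, y) + mu * phi(x_prime, y))) > tol:
            return "linearity"
        # Step 3: positivity.
        if phi(x, x) < -tol:
            return "positivity"
    # Step 4: separation, probed only on the canonical basis vectors.
    for i in range(dim):
        e = tuple(1.0 if j == i else 0.0 for j in range(dim))
        if abs(phi(e, e)) <= tol:
            return "separation"
    return None

canonical = lambda x, y: x[0] * y[0] + x[1] * y[1]
first_coordinate_only = lambda x, y: x[0] * y[0]
indefinite = lambda x, y: x[0] * y[0] - x[1] * y[1]

results = [check_axioms(f) for f in (canonical, first_coordinate_only, indefinite)]
```

The three candidates are the canonical product on $\mathbb{R}^2$ and the two counter-examples discussed in this subsection; the check reports no violation for the first, a separation failure for the second, and a positivity failure for the third.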

Example — Canonical preview on $\mathbb{R}^2$

The map $\varphi((x_1, x_2), (y_1, y_2)) = x_1 y_1 + x_2 y_2$ is a scalar product on $\mathbb{R}^2$. The four-step check:

Symmetry: $\varphi(y, x) = y_1 x_1 + y_2 x_2 = x_1 y_1 + x_2 y_2 = \varphi(x, y)$.
Linearity in the first variable: direct from the formula.
Positivity: $\varphi(x, x) = x_1^2 + x_2^2 \ge 0$.
Separation: if $\varphi(x, x) = x_1^2 + x_2^2 = 0$, then $x_1 = x_2 = 0$, so $x = 0_{\mathbb{R}^2}$.

This will be re-derived in the next subsection as a special case of the canonical scalar product on $\mathbb{R}^n$.

Example — Counter-example: lack of separation

The map $\varphi((x_1, x_2), (y_1, y_2)) = x_1 y_1$ on $\mathbb{R}^2$ is bilinear, symmetric, and positive, but not separated: $\varphi((0, 1), (0, 1)) = 0 \cdot 0 = 0$ while $(0, 1) \ne 0_{\mathbb{R}^2}$. The bilinear form $\varphi$ is only positive semi-definite, so it is not a scalar product. The separation axiom is the one that fails: $\varphi$ "forgets" the second coordinate.

Example — Counter-example: lack of positivity

The map $\varphi((x_1, x_2), (y_1, y_2)) = x_1 y_1 - x_2 y_2$ on $\mathbb{R}^2$ is bilinear and symmetric, but not positive: $\varphi((0, 1), (0, 1)) = -1 < 0$. So $\varphi$ is not a scalar product. Such a symmetric bilinear form is indefinite, not positive definite.

Example — Valid non-canonical scalar product via an SPD matrix

The map $\varphi(X, Y) = X^\top \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix} Y$ is a scalar product on $\mathbb{R}^2$. The matrix is symmetric positive definite (SPD); bilinearity and symmetry are direct. For positivity and separation, expand for $X = (x_1, x_2)^\top$: $$ X^\top \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix} X = 2 x_1^2 + 2 x_1 x_2 + 2 x_2^2 = x_1^2 + x_2^2 + (x_1 + x_2)^2. $$ The right-hand side is a sum of three squares, so it is $\ge 0$ (positivity); it vanishes only when $x_1 = x_2 = x_1 + x_2 = 0$, hence $X = 0$ (separation). This example will reappear in the orthonormal-families subsection below to show that the canonical basis of $\mathbb{R}^2$ is not orthonormal for this $\varphi$.
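The sum-of-squares identity can be confirmed on integer inputs, where the arithmetic is exact. The sketch below is an illustration only: the Python names `phi` and `three_squares` are introduced here, and a grid check is a sanity test, not a replacement for the algebraic expansion.

```python
def phi(x, y):
    """phi(X, Y) = X^T [[2, 1], [1, 2]] Y, written out coordinate by coordinate."""
    x1, x2 = x
    y1, y2 = y
    return 2 * x1 * y1 + x1 * y2 + x2 * y1 + 2 * x2 * y2

def three_squares(x):
    """Right-hand side of the expansion: x1^2 + x2^2 + (x1 + x2)^2."""
    x1, x2 = x
    return x1 ** 2 + x2 ** 2 + (x1 + x2) ** 2

# Compare both sides on a small integer grid.
grid = [(a, b) for a in range(-3, 4) for b in range(-3, 4)]
identity_holds = all(phi(x, x) == three_squares(x) for x in grid)

# The canonical basis under phi.
e1, e2 = (1, 0), (0, 1)
cross_term = phi(e1, e2)
squared_length_e1 = phi(e1, e1)
```

Here `cross_term` equals 1 and `squared_length_e1` equals 2, so for this $\varphi$ the canonical basis vectors are neither orthogonal nor of norm 1.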

Skills to practice

Checking the four axioms

I.2 Canonical scalar products on $\mathbb{R}^n$ and $\mathcal{M}_{n, p}(\mathbb{R})$

We promote the $\mathbb{R}^2$ preview of the previous subsection to general $\mathbb{R}^n$ and to general matrix spaces $\mathcal{M}_{n, p}(\mathbb{R})$, with one canonical scalar product on each. These two products recover the secondary-school dot-product as a special case and are the working scalar products of the projection-and-distance section below.

Proposition — Canonical scalar product on $\mathbb{R}^n$

The map $$ (X, Y) \longmapsto X^\top Y = \sum_{i = 1}^n x_i y_i $$ is a scalar product on $\mathbb{R}^n$, called the canonical scalar product. Here $X = (x_1, \dots, x_n)^\top$ and $Y = (y_1, \dots, y_n)^\top$ are viewed as column vectors, so $X^\top Y$ is a $1 \times 1$ matrix identified with its single entry.
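The identification of the $1 \times 1$ matrix $X^\top Y$ with the sum $\sum x_i y_i$ can be made concrete on a small example. The sketch below stores matrices as lists of rows; the helper names `matmul`, `transpose`, and `canonical` are assumptions made for this illustration and are not part of the course.

```python
def transpose(a):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]

def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    inner = len(b)
    if any(len(row) != inner for row in a):
        raise ValueError("incompatible shapes")
    return [
        [sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(len(b[0]))]
        for i in range(len(a))
    ]

def canonical(x, y):
    """Canonical scalar product on R^n as the sum of coordinate products."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return sum(xi * yi for xi, yi in zip(x, y))

# Column vectors of R^3, stored as 3 x 1 matrices.
X = [[1], [2], [3]]
Y = [[4], [5], [6]]

as_matrix = matmul(transpose(X), Y)          # a 1 x 1 matrix
as_sum = canonical([1, 2, 3], [4, 5, 6])     # a plain number
```

On this example `as_matrix` is `[[32]]` and `as_sum` is `32`: the matrix product carries the same single number, wrapped in a $1 \times 1$ matrix.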