\( \definecolor{colordef}{RGB}{249,49,84} \definecolor{colorprop}{RGB}{18,102,241} \)

CommeUnJeu · L2 MP

Isometries of a Euclidean space

⌚ ~83 min ▢ 10 blocks ✓ 19 exercises ➣ Prerequisites: Real pre-Hilbert spaces, Further linear algebra

In the first year, Pre-Hilbert real spaces built the geometry of a Euclidean space: a scalar product \(\langle\cdot\,|\,\cdot\rangle\), the length \(\norme{x} = \sqrt{\langle x\,|\,x\rangle}\) it produces, orthogonality, and orthonormal bases. That chapter studied the space. This one studies the maps of the space into itself that respect this geometry --- the endomorphisms that preserve distance.
The chapter has five sections. Section~1 attaches to every endomorphism \(u\) a partner, its adjoint \(u^*\), the exact analogue of matrix transposition. Section~2 describes the orthogonal matrices --- those preserving the Euclidean structure --- and the groups \(\mathrm{O}_n(\mathbb{R})\) and \(\mathrm{SO}_n(\mathbb{R})\) they form, with the notion of orientation. Section~3 isolates the vector isometries: the endomorphisms preserving length, the group \(\mathrm{O}(E)\), and their four working characterisations. Section~4 classifies them completely in dimension~2 --- the positive ones are the rotations, the negative ones the reflections. Section~5 proves the reduction theorem: in a well-chosen orthonormal basis, the matrix of every isometry is block-diagonal, built from \(\pm 1\) entries and plane-rotation blocks; it then makes this explicit in dimension~3.
Standing notation. Throughout, \(E\) denotes a Euclidean space --- a finite-dimensional real inner product space --- with \(\dim E = n\); \(\langle\cdot\,|\,\cdot\rangle\) is its scalar product, \(\norme{\cdot}\) the associated norm, \(0_E\) the zero vector. "BON" abbreviates orthonormal basis (from the French "base orthonormée"). The notions of scalar product, norm, Cauchy--Schwarz inequality, polarisation identity, orthogonality, the orthogonal complement \(F^\perp\) of a subspace, the decomposition \(E = F \oplus F^\perp\), orthonormal bases, orthogonal projection and Pythagoras' theorem are those of Pre-Hilbert real spaces; the notions of sum and direct sum of subspaces, block matrices, stable subspace and induced endomorphism are those of Complements of linear algebra; \(\mathcal{L}(E)\) is the space of endomorphisms of \(E\), \(\mathrm{GL}(E)\) the group of automorphisms.

I The adjoint of an endomorphism

I.1 Representation of linear forms and the adjoint

A linear form on \(E\) is a linear map \(E \to \mathbb{R}\). The scalar product gives a stock of them: for a fixed vector \(a\), the map \(x \mapsto \langle a\,|\,x\rangle\) is a linear form. The first theorem says these are all of them --- in a Euclidean space, every linear form is "scalar product against a fixed vector". This single fact lets us attach to each endomorphism a partner, its adjoint.

Theorem — Representation of linear forms

Let \(E\) be a Euclidean space. For every linear form \(\varphi\) on \(E\), there exists a unique vector \(a \in E\) such that $$ \textcolor{colorprop}{\varphi(x) = \langle a\,|\,x\rangle \qquad \text{for all } x \in E.} $$

Consider the map \(\xi \colon E \to \mathcal{L}(E,\mathbb{R})\) sending a vector \(a\) to the linear form \(\langle a\,|\,\cdot\rangle\). It is linear: by bilinearity of the scalar product, \(\langle a + \lambda a'\,|\,\cdot\rangle = \langle a\,|\,\cdot\rangle + \lambda\langle a'\,|\,\cdot\rangle\), that is \(\xi(a + \lambda a') = \xi(a) + \lambda\,\xi(a')\). It is injective: if \(\xi(a) = 0\), then \(\langle a\,|\,x\rangle = 0\) for every \(x\), and taking \(x = a\) gives \(\norme{a}^2 = 0\), hence \(a = 0_E\) --- the scalar product is positive-definite. Finally \(\dim \mathcal{L}(E,\mathbb{R}) = \dim E\), so the injective linear map \(\xi\) between two spaces of the same finite dimension is an isomorphism. In particular \(\xi\) is surjective: every linear form \(\varphi\) is \(\xi(a)\) for a unique \(a \in E\), which is exactly the stated existence and uniqueness.
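Numerical illustration. In \(\mathbb{R}^n\) with its canonical scalar product, the representing vector can be read off from the values of the form on the canonical basis: \(a = (\varphi(e_1),\dots,\varphi(e_n))\), since the \(i\)-th coordinate of \(a\) is \(\langle e_i\,|\,a\rangle = \varphi(e_i)\). The sketch below checks this on one example; the form `phi` and the test vector `x` are hypothetical choices made for illustration, not taken from the course.

```python
# Sketch: representing a linear form on R^3 (canonical scalar product).
# `phi` and `x` are illustrative assumptions, not data from the text.

def dot(a, x):
    """Canonical scalar product <a | x> on R^n."""
    return sum(ai * xi for ai, xi in zip(a, x))

def phi(x):
    """An example linear form on R^3: phi(x) = 2*x1 - x2 + 5*x3."""
    return 2 * x[0] - x[1] + 5 * x[2]

# Canonical (orthonormal) basis of R^3.
basis = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]

# The i-th coordinate of the representing vector a is phi(e_i).
a = tuple(phi(e) for e in basis)

x = (4, -7, 3)
print(a)                    # (2, -1, 5)
print(phi(x), dot(a, x))    # 30 30
```

A single test vector proves nothing in general; the point is only to see the recipe \(a_i = \varphi(e_i)\) at work.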

Definition — Adjoint of an endomorphism

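Let \(u \in \mathcal{L}(E)\). The adjoint of \(u\) is the unique endomorphism \(u^* \in \mathcal{L}(E)\) such that $$ \textcolor{colordef}{\langle u(x)\,|\,y\rangle = \langle x\,|\,u^*(y)\rangle \qquad \text{for all } x,y \in E.} $$ Such a \(u^*\) exists and is unique: for fixed \(y\), the map \(x \mapsto \langle u(x)\,|\,y\rangle\) is a linear form, so the representation theorem provides a unique vector \(u^*(y)\) with \(\langle u(x)\,|\,y\rangle = \langle x\,|\,u^*(y)\rangle\) for all \(x\); the linearity of \(y \mapsto u^*(y)\) follows from this uniqueness and the bilinearity of the scalar product.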

Remark. Throughout this chapter \(E\) is a real Euclidean space, and the scalar product takes real values. The complex (hermitian) inner product is outside the syllabus: it is not treated here, and every result below is stated over \(\mathbb{R}\).

Example — Adjoint of a homothety

The adjoint of \(\mathrm{id}_E\) is \(\mathrm{id}_E\): indeed \(\langle \mathrm{id}_E(x)\,|\,y\rangle = \langle x\,|\,y\rangle = \langle x\,|\,\mathrm{id}_E(y)\rangle\). More generally, for the homothety \(\lambda\,\mathrm{id}_E\) (\(\lambda \in \mathbb{R}\)), \(\langle \lambda x\,|\,y\rangle = \lambda\langle x\,|\,y\rangle = \langle x\,|\,\lambda y\rangle\), so \((\lambda\,\mathrm{id}_E)^* = \lambda\,\mathrm{id}_E\) --- a homothety is its own adjoint.

Example — Adjoint of an orthogonal projection

Let \(p\) be the orthogonal projection onto a subspace \(F\) (parallel to \(F^\perp\), recalled from Pre-Hilbert real spaces). Decompose \(x = x_F + x_\perp\) and \(y = y_F + y_\perp\) along \(E = F \oplus F^\perp\). Since \(F \perp F^\perp\), $$ \langle p(x)\,|\,y\rangle = \langle x_F\,|\,y_F + y_\perp\rangle = \langle x_F\,|\,y_F\rangle = \langle x_F + x_\perp\,|\,y_F\rangle = \langle x\,|\,p(y)\rangle. $$ Hence \(p^* = p\): an orthogonal projection is its own adjoint. This foreshadows § III.2.

Method — Determine the adjoint of an endomorphism

To find the adjoint \(u^*\) of an endomorphism \(u\), two routes:

via the defining identity: guess a candidate endomorphism \(v\) and verify \(\langle u(x)\,|\,y\rangle = \langle x\,|\,v(y)\rangle\) for all \(x,y\); by uniqueness of the adjoint, \(u^* = v\);
via a matrix: fix an orthonormal basis, write the matrix \(M\) of \(u\) in it --- then, as § I.2 proves, the matrix of \(u^*\) in the same basis is the transpose \(M^{\mathsf{T}}\). Beware: this works only in an orthonormal basis.
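Illustration of the warning. The sketch below assumes a hypothetical scalar product on \(\mathbb{R}^2\), \(\langle x\,|\,y\rangle = x_1 y_1 + 2 x_2 y_2\), for which the canonical basis is orthogonal but not orthonormal. It tests two candidate matrices against the defining identity: the transpose, and a candidate found by hand. The matrix `M` and both candidates are illustrative choices, not taken from the course.

```python
from fractions import Fraction

# Hypothetical scalar product on R^2: <x | y> = x1*y1 + 2*x2*y2.
# The canonical basis is orthogonal for it but NOT orthonormal (||e2||^2 = 2).
def dot_g(x, y):
    return x[0] * y[0] + 2 * x[1] * y[1]

def apply(m, x):
    """Apply a 2x2 matrix (list of rows) to a vector."""
    return tuple(sum(m[i][j] * x[j] for j in range(2)) for i in range(2))

def transpose(m):
    return [list(col) for col in zip(*m)]

def is_adjoint_pair(m, cand):
    """Check <m x | y> = <x | cand y> on all pairs of canonical basis vectors.
    By bilinearity this gives the identity for all x, y."""
    basis = [(1, 0), (0, 1)]
    return all(dot_g(apply(m, x), y) == dot_g(x, apply(cand, y))
               for x in basis for y in basis)

M = [[0, 1], [0, 0]]                     # example matrix, chosen for illustration
naive = transpose(M)                     # the transpose, wrong in this basis
fixed = [[0, 0], [Fraction(1, 2), 0]]    # candidate found by hand

print(is_adjoint_pair(M, naive))   # False
print(is_adjoint_pair(M, fixed))   # True
```

The second route of the method therefore needs the orthonormality hypothesis; the first route (guess, then verify the defining identity) works in any setting.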

Skills to practice

Determining the adjoint of an endomorphism

I.2 Properties of the adjoint

The adjoint behaves exactly like matrix transposition: the map \(u \mapsto u^*\) is linear, it reverses composition, and it is involutive. The link is not a mere analogy --- in an orthonormal basis, taking the adjoint is transposing the matrix.

Proposition — Algebraic properties of the adjoint

For all \(u,v \in \mathcal{L}(E)\) and all \(\lambda \in \mathbb{R}\): $$ (u + \lambda v)^* = u^* + \lambda\,v^*, \qquad (u \circ v)^* = v^* \circ u^*, \qquad (u^*)^* = u. $$ The map \(u \mapsto u^*\) is thus a linear involution of \(\mathcal{L}(E)\), and it reverses composition.

Each identity is obtained by checking the defining relation and invoking uniqueness of the adjoint. For all \(x,y \in E\):

\(\langle x\,|\,(u + \lambda v)^*(y)\rangle = \langle (u + \lambda v)(x)\,|\,y\rangle = \langle u(x)\,|\,y\rangle + \lambda\langle v(x)\,|\,y\rangle = \langle x\,|\,(u^* + \lambda v^*)(y)\rangle\);
\(\langle x\,|\,(u \circ v)^*(y)\rangle = \langle u(v(x))\,|\,y\rangle = \langle v(x)\,|\,u^*(y)\rangle = \langle x\,|\,v^*(u^*(y))\rangle = \langle x\,|\,(v^* \circ u^*)(y)\rangle\);
\(\langle x\,|\,(u^*)^*(y)\rangle = \langle u^*(x)\,|\,y\rangle = \langle y\,|\,u^*(x)\rangle = \langle u(y)\,|\,x\rangle = \langle x\,|\,u(y)\rangle\), using the symmetry of the scalar product twice.

In each case the relation holds for all \(x\) and \(y\), so uniqueness of the adjoint gives the announced equality.

Proposition — Matrix of the adjoint in an orthonormal basis

Let \(\mathcal{B}\) be an orthonormal basis of \(E\) and \(u \in \mathcal{L}(E)\). Then $$ \textcolor{colorprop}{\mathrm{Mat}_{\mathcal{B}}(u^*) = \mathrm{Mat}_{\mathcal{B}}(u)^{\mathsf{T}}.} $$ In an orthonormal basis, passing to the adjoint is transposing the matrix.

Write \(\mathcal{B} = (e_1,\dots,e_n)\), orthonormal, and \(M = (m_{i,j}) = \mathrm{Mat}_{\mathcal{B}}(u)\). By definition the \(j\)-th column of \(M\) holds the coordinates of \(u(e_j)\) in \(\mathcal{B}\); since \(\mathcal{B}\) is orthonormal, the \(i\)-th coordinate of a vector \(z\) is \(\langle e_i\,|\,z\rangle\), so $$ m_{i,j} = \langle e_i\,|\,u(e_j)\rangle. $$ Likewise, writing \(M' = (m'_{i,j}) = \mathrm{Mat}_{\mathcal{B}}(u^*)\), one has \(m'_{i,j} = \langle e_i\,|\,u^*(e_j)\rangle\). Now $$ m'_{i,j} = \langle e_i\,|\,u^*(e_j)\rangle = \langle u(e_i)\,|\,e_j\rangle = \langle e_j\,|\,u(e_i)\rangle = m_{j,i}. $$ Thus \(m'_{i,j} = m_{j,i}\) for all \(i,j\), that is \(M' = M^{\mathsf{T}}\).

Example — Invariants shared by an endomorphism and its adjoint

In an orthonormal basis, \(u\) has matrix \(M\) and \(u^*\) has matrix \(M^{\mathsf{T}}\). A matrix and its transpose have the same rank, the same determinant (\(\det M^{\mathsf{T}} = \det M\)) and the same trace (\(\mathrm{Tr}\,M^{\mathsf{T}} = \mathrm{Tr}\,M\)). Hence \(u\) and \(u^*\) have the same rank, the same determinant and the same trace. (The characteristic polynomial is deliberately not invoked: it belongs to a later chapter.)

Example — Composition reversal on a concrete pair

On \(\mathbb{R}^2\) with its canonical (orthonormal) basis, let \(u\) and \(v\) be the endomorphisms with matrices $$ A = \begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix}, \qquad B = \begin{pmatrix} 1 & 0 \\ 3 & 1 \end{pmatrix}. $$ Then \(u \circ v\) has matrix \(AB = \begin{pmatrix} 7 & 2 \\ 3 & 1 \end{pmatrix}\), so by the Proposition above \((u \circ v)^*\) has matrix \((AB)^{\mathsf{T}} = \begin{pmatrix} 7 & 3 \\ 2 & 1 \end{pmatrix}\). On the other hand \(v^* \circ u^*\) has matrix $$ B^{\mathsf{T}} A^{\mathsf{T}} = \begin{pmatrix} 1 & 3 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} 1 & 0 \\ 2 & 1 \end{pmatrix} = \begin{pmatrix} 7 & 3 \\ 2 & 1 \end{pmatrix}. $$ The two agree --- the identity \((u \circ v)^* = v^* \circ u^*\) in action, with the order of the factors visibly reversed.
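The computation above can be replayed mechanically. The following sketch uses plain Python lists and the matrices \(A\) and \(B\) of the example; the helper names `matmul` and `transpose` are ours. It also computes the product in the wrong order, to show that the reversal matters.

```python
def matmul(p, q):
    """Product of two matrices given as lists of rows."""
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]

def transpose(m):
    return [list(col) for col in zip(*m)]

A = [[1, 2], [0, 1]]
B = [[1, 0], [3, 1]]

lhs = transpose(matmul(A, B))                 # matrix of (u o v)*
rhs = matmul(transpose(B), transpose(A))      # matrix of v* o u*
wrong = matmul(transpose(A), transpose(B))    # matrix of u* o v*, order not reversed

print(lhs)     # [[7, 3], [2, 1]]
print(rhs)     # [[7, 3], [2, 1]]
print(wrong)   # [[1, 3], [2, 7]]
```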

Proposition — Adjoint and stable subspaces

Let \(u \in \mathcal{L}(E)\) and \(F\) a subspace of \(E\). If \(F\) is stable by \(u\), then \(F^\perp\) is stable by \(u^*\).

Suppose \(F\) stable by \(u\), i.e. \(u(F) \subset F\). Let \(y \in F^\perp\); we show \(u^*(y) \in F^\perp\), that is \(\langle u^*(y)\,|\,x\rangle = 0\) for every \(x \in F\). For such an \(x\), $$ \langle u^*(y)\,|\,x\rangle = \langle x\,|\,u^*(y)\rangle = \langle u(x)\,|\,y\rangle. $$ Since \(x \in F\) and \(F\) is stable, \(u(x) \in F\); and \(y \in F^\perp\), so \(\langle u(x)\,|\,y\rangle = 0\). Hence \(u^*(y) \in F^\perp\), and \(F^\perp\) is stable by \(u^*\).
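A small check in \(\mathbb{R}^2\), as a sketch. The upper-triangular matrix `A` below is a hypothetical example: the line \(F = \mathrm{Vect}(e_1)\) is stable by \(u\), so the Proposition predicts that \(F^\perp = \mathrm{Vect}(e_2)\) is stable by \(u^*\), whose matrix in the canonical basis is the transpose of `A`. Nothing is claimed about stability of \(F^\perp\) by \(u\) itself, and here it fails.

```python
def apply(m, x):
    """Apply a matrix (list of rows) to a vector."""
    return tuple(sum(m[i][j] * x[j] for j in range(len(x)))
                 for i in range(len(m)))

def transpose(m):
    return [list(col) for col in zip(*m)]

def on_line(v, d):
    """True when v lies on the line spanned by d (2D: zero determinant)."""
    return v[0] * d[1] - v[1] * d[0] == 0

A = [[1, 2], [0, 3]]      # hypothetical example: F = Vect(e1) is stable by A
e1, e2 = (1, 0), (0, 1)

print(on_line(apply(A, e1), e1))              # True:  F stable by u
print(on_line(apply(transpose(A), e2), e2))   # True:  F^perp stable by u*
print(on_line(apply(A, e2), e2))              # False: F^perp not stable by u
```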

Example — Adjoint of a matrix in the canonical basis

Let \(u\) be the endomorphism of \(\mathbb{R}^3\) (with its canonical scalar product) whose matrix in the canonical basis is $$ A = \begin{pmatrix} 2 & 1 & 0 \\ 0 & 1 & -1 \\ 3 & 0 & 1 \end{pmatrix}. $$ Determine the matrix of the adjoint \(u^*\) in the canonical basis.

The canonical basis of \(\mathbb{R}^3\) is orthonormal for the canonical scalar product. By the Proposition on the matrix of the adjoint, the matrix of \(u^*\) in that basis is the transpose of \(A\): $$ \mathrm{Mat}(u^*) = A^{\mathsf{T}} = \begin{pmatrix} 2 & 0 & 3 \\ 1 & 1 & 0 \\ 0 & -1 & 1 \end{pmatrix}. $$

Skills to practice

Using the properties of the adjoint

II Orthogonal matrices

II.1 Orthogonal matrices and the orthogonal group

Among the square real matrices, some preserve the Euclidean structure of \(\mathbb{R}^n\). They are characterised by a single algebraic relation between a matrix and its transpose, and they form two nested groups --- the central objects of this chapter on the matrix side.

Definition — Orthogonal matrix

A matrix \(M \in \mathcal{M}_n(\mathbb{R})\) is orthogonal when $$ \textcolor{colordef}{M^{\mathsf{T}} M = M M^{\mathsf{T}} = I_n,} $$ equivalently when \(M\) is invertible with \(M^{-1} = M^{\mathsf{T}}\).

For a square matrix, either of the two equalities \(M^{\mathsf{T}} M = I_n\) and \(M M^{\mathsf{T}} = I_n\) forces the other: a one-sided inverse of a square matrix is automatically a two-sided inverse. Checking \(M^{\mathsf{T}} M = I_n\) alone is therefore enough.

Theorem — The orthogonal group and the special orthogonal group

The set of orthogonal matrices of size \(n\), written \(\mathrm{O}_n(\mathbb{R})\), is a subgroup of \(\mathrm{GL}_n(\mathbb{R})\) --- the orthogonal group. The determinant of an orthogonal matrix is \(+1\) or \(-1\). The subset $$ \mathrm{SO}_n(\mathbb{R}) = \{ M \in \mathrm{O}_n(\mathbb{R}) : \det M = 1 \} $$ is itself a subgroup, the special orthogonal group.

An orthogonal matrix is invertible (its inverse is \(M^{\mathsf{T}}\)), so \(\mathrm{O}_n(\mathbb{R}) \subset \mathrm{GL}_n(\mathbb{R})\). We check the subgroup axioms.

\(I_n \in \mathrm{O}_n(\mathbb{R})\), since \(I_n^{\mathsf{T}} I_n = I_n\).
Stable by product: if \(M,N\) are orthogonal, \((MN)^{\mathsf{T}}(MN) = N^{\mathsf{T}} M^{\mathsf{T}} M N = N^{\mathsf{T}} I_n N = N^{\mathsf{T}} N = I_n\).
Stable by inverse: if \(M\) is orthogonal, \(M^{-1} = M^{\mathsf{T}}\), and \((M^{-1})^{\mathsf{T}} M^{-1} = (M^{\mathsf{T}})^{\mathsf{T}} M^{\mathsf{T}} = M M^{\mathsf{T}} = I_n\).

So \(\mathrm{O}_n(\mathbb{R})\) is a subgroup of \(\mathrm{GL}_n(\mathbb{R})\). For the determinant, applying \(\det\) to \(M^{\mathsf{T}} M = I_n\) gives \(\det(M^{\mathsf{T}})\det(M) = 1\), and \(\det(M^{\mathsf{T}}) = \det(M)\), hence \(\det(M)^2 = 1\) and \(\det M \in \{-1,1\}\). Finally \(\det \colon \mathrm{O}_n(\mathbb{R}) \to (\{-1,1\},\times)\) is a group morphism, and \(\mathrm{SO}_n(\mathbb{R})\) is its kernel --- the kernel of a group morphism is a subgroup.

Example — Three orthogonal matrices

The identity \(I_n\) is orthogonal with \(\det = 1\), so \(I_n \in \mathrm{SO}_n(\mathbb{R})\). For \(\theta \in \mathbb{R}\), the matrix \(\begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}\) satisfies \(M^{\mathsf{T}} M = I_2\) and \(\det M = \cos^2\theta + \sin^2\theta = 1\), so it lies in \(\mathrm{SO}_2(\mathbb{R})\). The permutation matrix $$ P = \begin{pmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix} $$ satisfies \(P^{\mathsf{T}} P = I_3\) (a direct check) and \(\det P = 1\), so \(P \in \mathrm{SO}_3(\mathbb{R})\).

Example — Determinant one does not imply orthogonal

Having determinant \(\pm 1\) is necessary but not sufficient for orthogonality. The matrix \(M = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}\) has \(\det M = 1\), yet $$ M^{\mathsf{T}} M = \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 1 & 2 \end{pmatrix} \ne I_2, $$ so \(M\) is not orthogonal. Orthogonality is the relation \(M^{\mathsf{T}} M = I_n\), never the value of the determinant alone.

Skills to practice

Recognising orthogonal matrices

II.2 Orthonormal bases and orientation

Orthogonal matrices admit a transparent reading: they are exactly the matrices whose columns form an orthonormal family, and exactly the matrices that change one orthonormal basis into another. Splitting the orthogonal matrices by the sign of their determinant then divides the orthonormal bases into two families --- the two orientations of the space.

Proposition — Orthogonal matrices and orthonormal families

A matrix \(M \in \mathcal{M}_n(\mathbb{R})\) is orthogonal if and only if its columns form an orthonormal family of \(\mathbb{R}^n\) (for the canonical scalar product), if and only if its rows do.

Let \(C_1,\dots,C_n\) be the columns of \(M\). The entry of \(M^{\mathsf{T}} M\) in position \((i,j)\) is the product of row \(i\) of \(M^{\mathsf{T}}\) --- that is, column \(i\) of \(M\) --- with column \(j\) of \(M\): $$ (M^{\mathsf{T}} M)_{i,j} = C_i^{\mathsf{T}} C_j = \langle C_i\,|\,C_j\rangle, $$ the canonical scalar product of \(\mathbb{R}^n\). Hence \(M^{\mathsf{T}} M = I_n\) if and only if \(\langle C_i\,|\,C_j\rangle = \delta_{i,j}\) for all \(i,j\), i.e. the columns form an orthonormal family. Applying this to \(M^{\mathsf{T}}\), whose columns are the rows of \(M\), gives the row statement; and \(M\) is orthogonal iff \(M^{\mathsf{T}}M = I_n\) iff \(MM^{\mathsf{T}} = I_n\).

Proposition — Orthogonal matrices and changes of orthonormal basis

The change-of-basis matrix between two orthonormal bases of \(E\) is orthogonal; conversely, given an orthonormal basis \(\mathcal{B}\) and an orthogonal matrix \(M\), there is an orthonormal basis \(\mathcal{B}'\) whose change-of-basis matrix from \(\mathcal{B}\) is \(M\). Throughout, the change-of-basis matrix from \(\mathcal{B}\) to \(\mathcal{B}'\) has, for its columns, the vectors of \(\mathcal{B}'\) expressed in \(\mathcal{B}\).

Let \(\mathcal{B}, \mathcal{B}'\) be two orthonormal bases and \(P\) the change-of-basis matrix from \(\mathcal{B}\) to \(\mathcal{B}'\). Its columns are the coordinate vectors, in \(\mathcal{B}\), of the vectors of \(\mathcal{B}'\). Since \(\mathcal{B}\) is orthonormal, the scalar product of two vectors equals the canonical scalar product of their coordinate columns in \(\mathcal{B}\). So columns \(i\) and \(j\) of \(P\) have canonical scalar product \(\langle e'_i\,|\,e'_j\rangle = \delta_{i,j}\), the family \(\mathcal{B}'\) being orthonormal: the columns of \(P\) form an orthonormal family, so \(P\) is orthogonal by the previous Proposition.
Conversely, let \(M\) be orthogonal and \(\mathcal{B}\) an orthonormal basis. Let \(\mathcal{B}'\) be the family whose coordinate columns in \(\mathcal{B}\) are the columns of \(M\). Those columns form an orthonormal family of \(\mathbb{R}^n\); since \(\mathcal{B}\) is orthonormal, \(\mathcal{B}'\) is an orthonormal family of \(n\) vectors of \(E\), hence an orthonormal basis, and \(M\) is by construction the change-of-basis matrix from \(\mathcal{B}\) to \(\mathcal{B}'\).

Definition — Orientation of a Euclidean space

Two orthonormal bases of \(E\) define the same orientation when the change-of-basis matrix between them has determinant \(+1\). To orient \(E\) is to choose one of the resulting classes; its bases are then called direct, the others indirect.

Proposition — The two orientations

"Defining the same orientation" is an equivalence relation on the set of orthonormal bases of \(E\), and for \(n \ge 1\) it has exactly two classes.

Write \(P_{\mathcal{B} \to \mathcal{B}'}\) for the change-of-basis matrix. The relation is reflexive (\(P_{\mathcal{B} \to \mathcal{B}} = I_n\), \(\det = 1\)); symmetric (\(P_{\mathcal{B}' \to \mathcal{B}} = P_{\mathcal{B} \to \mathcal{B}'}^{-1}\), so its determinant is the inverse of \(\det P_{\mathcal{B} \to \mathcal{B}'}\), equal to \(1\) when the latter is); transitive (\(P_{\mathcal{B} \to \mathcal{B}''} = P_{\mathcal{B} \to \mathcal{B}'}\, P_{\mathcal{B}' \to \mathcal{B}''}\), so the determinants multiply). It is therefore an equivalence relation.
For the count: between two orthonormal bases the change-of-basis matrix is orthogonal, so its determinant is \(+1\) or \(-1\) --- two bases are in the same class iff this determinant is \(+1\). Both values occur: starting from an orthonormal basis \((e_1,\dots,e_n)\), the family \((-e_1,e_2,\dots,e_n)\) is again an orthonormal basis, and the change-of-basis matrix \(\mathrm{diag}(-1,1,\dots,1)\) has determinant \(-1\) --- this works for every \(n \ge 1\). Finally there are at most two classes: if \(\mathcal{B}_2\) and \(\mathcal{B}_3\) are both in a different class from \(\mathcal{B}_1\), then \(\det P_{\mathcal{B}_1 \to \mathcal{B}_2} = \det P_{\mathcal{B}_1 \to \mathcal{B}_3} = -1\), hence \(\det P_{\mathcal{B}_2 \to \mathcal{B}_3} = \det P_{\mathcal{B}_2 \to \mathcal{B}_1}\,\det P_{\mathcal{B}_1 \to \mathcal{B}_3} = (-1)(-1) = 1\), so \(\mathcal{B}_2\) and \(\mathcal{B}_3\) are in the same class. Exactly two classes.

Example — Orientations of the canonical basis

On \(\mathbb{R}^n\) one usually orients the space by declaring the canonical basis \((e_1,\dots,e_n)\) direct. Negating one vector --- replacing \(e_1\) by \(-e_1\) --- gives an orthonormal basis with change-of-basis determinant \(-1\), hence indirect; this is valid for every \(n \ge 1\). For \(n \ge 2\), swapping two vectors --- exchanging \(e_1\) and \(e_2\) --- also gives change-of-basis determinant \(-1\), hence an indirect basis.

Method — Check whether a matrix is orthogonal

To decide whether a matrix \(M \in \mathcal{M}_n(\mathbb{R})\) is orthogonal, and whether it lies in \(\mathrm{SO}_n(\mathbb{R})\):

orthogonality: either compute \(M^{\mathsf{T}} M\) and check it equals \(I_n\), or --- often faster --- check that the columns of \(M\) form an orthonormal family (each column of norm \(1\), distinct columns orthogonal);
membership in \(\mathrm{SO}_n\): once \(M\) is known orthogonal, compute \(\det M\); then \(M \in \mathrm{SO}_n(\mathbb{R})\) iff \(\det M = 1\), and \(M \in \mathrm{O}_n(\mathbb{R}) \setminus \mathrm{SO}_n(\mathbb{R})\) iff \(\det M = -1\).
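The two steps of the method can be sketched as a short function. It is written for matrices with exact entries (integers or fractions), so the comparison with \(I_n\) is exact; with floating-point entries a tolerance would be needed. The function name `classify` and its return labels are our own choices. The test matrices are the permutation matrix \(P\) and the matrix \(\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}\) of the examples above, plus \(\mathrm{diag}(-1,1,1)\).

```python
def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def det(m):
    """Determinant by cofactor expansion along the first row (small n only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, coeff in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * coeff * det(minor)
    return total

def classify(m):
    """Step 1: test M^T M = I_n. Step 2: read the sign of det M."""
    if matmul(transpose(m), m) != identity(len(m)):
        return "not orthogonal"
    return "SO_n" if det(m) == 1 else "O_n minus SO_n"

P = [[0, 0, 1], [1, 0, 0], [0, 1, 0]]
shear = [[1, 1], [0, 1]]
D = [[-1, 0, 0], [0, 1, 0], [0, 0, 1]]

print(classify(P))       # SO_n
print(classify(shear))   # not orthogonal
print(classify(D))       # O_n minus SO_n
```

The determinant is computed only after orthogonality is established, in line with the remark that \(\det M = \pm 1\) alone proves nothing.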

Skills to practice

Orienting a Euclidean space

III Vector isometries

III.1 Vector isometries and their characterisations

A vector isometry is an endomorphism that does not change the length of any vector. The definition uses the norm, but it has an immediate equivalent reading in terms of the scalar product, and three further characterisations --- via orthonormal bases, via the adjoint, via the matrix --- each useful in practice. They are gathered in the theorem below.

Definition — Vector isometry

A vector isometry of \(E\) is an endomorphism \(u \in \mathcal{L}(E)\) that preserves the norm: $$ \textcolor{colordef}{\norme{u(x)} = \norme{x} \qquad \text{for all } x \in E.} $$

Proposition — Preservation of the scalar product

An endomorphism \(u \in \mathcal{L}(E)\) preserves the norm if and only if it preserves the scalar product, that is $$ \langle u(x)\,|\,u(y)\rangle = \langle x\,|\,y\rangle \qquad \text{for all } x,y \in E. $$

If \(u\) preserves the scalar product, then for every \(x\), \(\norme{u(x)}^2 = \langle u(x)\,|\,u(x)\rangle = \langle x\,|\,x\rangle = \norme{x}^2\), so \(u\) preserves the norm. Conversely, suppose \(u\) preserves the norm. The polarisation identity, recalled from Pre-Hilbert real spaces, expresses the scalar product through the norm: $$ \langle a\,|\,b\rangle = \tfrac{1}{2}\bigl( \norme{a + b}^2 - \norme{a}^2 - \norme{b}^2 \bigr). $$ Applying it to \(a = u(x)\), \(b = u(y)\) and using \(u(x) + u(y) = u(x + y)\) (linearity of \(u\)), $$ \begin{aligned} \langle u(x)\,|\,u(y)\rangle &= \tfrac{1}{2}\bigl( \norme{u(x) + u(y)}^2 - \norme{u(x)}^2 - \norme{u(y)}^2 \bigr)\\ &= \tfrac{1}{2}\bigl( \norme{u(x + y)}^2 - \norme{u(x)}^2 - \norme{u(y)}^2 \bigr) && \text{(linearity)}\\ &= \tfrac{1}{2}\bigl( \norme{x + y}^2 - \norme{x}^2 - \norme{y}^2 \bigr) && \text{(}u\text{ preserves the norm)}\\ &= \langle x\,|\,y\rangle. && \text{(polarisation)} \end{aligned} $$
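A numerical sanity check of the polarisation step, as a sketch. The quarter-turn matrix `R` and the vectors `x`, `y` are hypothetical examples in \(\mathbb{R}^2\) with its canonical scalar product; the function `polarisation` evaluates the right-hand side of the identity used in the proof.

```python
from fractions import Fraction

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def norm_sq(x):
    return dot(x, x)

def add(x, y):
    return tuple(a + b for a, b in zip(x, y))

def polarisation(a, b):
    """(||a+b||^2 - ||a||^2 - ||b||^2) / 2, which should equal <a | b>."""
    return Fraction(norm_sq(add(a, b)) - norm_sq(a) - norm_sq(b), 2)

def apply(m, x):
    return tuple(dot(row, x) for row in m)

R = [[0, -1], [1, 0]]      # quarter turn, an example of a norm-preserving map
x, y = (3, -2), (1, 4)

print(polarisation(x, y), dot(x, y))        # -5 -5
print(norm_sq(apply(R, x)), norm_sq(x))     # 13 13
print(dot(apply(R, x), apply(R, y)))        # -5
```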

Proposition — A vector isometry is an automorphism

Every vector isometry of \(E\) is an automorphism of \(E\).

Let \(u\) be a vector isometry and \(x \in \mathrm{Ker}\,u\). Then \(\norme{x} = \norme{u(x)} = \norme{0_E} = 0\), so \(x = 0_E\): thus \(\mathrm{Ker}\,u = \{0_E\}\) and \(u\) is injective. As \(E\) is finite-dimensional, an injective endomorphism is bijective, so \(u \in \mathrm{GL}(E)\).

Definition — The orthogonal group

The set of vector isometries of \(E\) is written \(\mathrm{O}(E)\) and called the orthogonal group of \(E\).

Proposition — The orthogonal group is a subgroup

\(\mathrm{O}(E)\) is a subgroup of \(\mathrm{GL}(E)\).

By the previous Proposition \(\mathrm{O}(E) \subset \mathrm{GL}(E)\). The identity preserves the norm, so \(\mathrm{id}_E \in \mathrm{O}(E)\). If \(u,v \in \mathrm{O}(E)\), then \(\norme{(u \circ v)(x)} = \norme{u(v(x))} = \norme{v(x)} = \norme{x}\), so \(u \circ v \in \mathrm{O}(E)\). If \(u \in \mathrm{O}(E)\), then for every \(x\), applying norm-preservation of \(u\) to the vector \(u^{-1}(x)\) gives \(\norme{u^{-1}(x)} = \norme{u(u^{-1}(x))} = \norme{x}\), so \(u^{-1} \in \mathrm{O}(E)\). Hence \(\mathrm{O}(E)\) is a subgroup of \(\mathrm{GL}(E)\).

Theorem — Characterisations of a vector isometry

For \(u \in \mathcal{L}(E)\), the following are equivalent:

(i) \(u\) is a vector isometry;
(ii) the image by \(u\) of one orthonormal basis --- equivalently, of every orthonormal basis --- is an orthonormal basis;
(iii) \(u^* \circ u = \mathrm{id}_E\);
(iv) the matrix of \(u\) in one orthonormal basis --- equivalently, in every orthonormal basis --- is orthogonal.

Fix an orthonormal basis \(\mathcal{B} = (e_1,\dots,e_n)\).
(i)\(\Rightarrow\)(ii). If \(u\) is an isometry it preserves the scalar product, so \(\langle u(e_i)\,|\,u(e_j)\rangle = \langle e_i\,|\,e_j\rangle = \delta_{i,j}\): the family \(\bigl(u(e_1),\dots,u(e_n)\bigr)\) is orthonormal, and being made of \(n\) vectors it is an orthonormal basis.
(ii)\(\Rightarrow\)(i). Suppose \(\bigl(u(e_1),\dots,u(e_n)\bigr)\) is an orthonormal basis. For \(x = \sum_i x_i e_i\), linearity gives \(u(x) = \sum_i x_i\, u(e_i)\). The \(x_i\) are the coordinates of \(u(x)\) in the orthonormal basis \(\bigl(u(e_1),\dots,u(e_n)\bigr)\), so \(\norme{u(x)}^2 = \sum_i x_i^2\); and \(\norme{x}^2 = \sum_i x_i^2\) as well, \(\mathcal{B}\) being orthonormal. Hence \(\norme{u(x)} = \norme{x}\): \(u\) is an isometry. So (i)\(\Leftrightarrow\)(ii), and since (i) does not depend on the chosen basis, (ii) holds for one basis iff for every basis.
(i)\(\Rightarrow\)(iii). An isometry \(u\) is an automorphism. For all \(x,y\), preservation of the scalar product reads \(\langle x\,|\,y\rangle = \langle u(x)\,|\,u(y)\rangle = \langle x\,|\,(u^* \circ u)(y)\rangle\), so \(\langle x\,|\,(u^* \circ u)(y) - y\rangle = 0\) for every \(x\); taking \(x = (u^* \circ u)(y) - y\) gives \((u^* \circ u)(y) = y\). Thus \(u^* \circ u = \mathrm{id}_E\).
(iii)\(\Rightarrow\)(i). If \(u^* \circ u = \mathrm{id}_E\) then \(u\) is injective, hence bijective (\(E\) finite-dimensional). For all \(x,y\), \(\langle u(x)\,|\,u(y)\rangle = \langle x\,|\,(u^* \circ u)(y)\rangle = \langle x\,|\,y\rangle\), so \(u\) preserves the scalar product, hence the norm: \(u\) is an isometry.
(iii)\(\Leftrightarrow\)(iv). In the orthonormal basis \(\mathcal{B}\), write \(M = \mathrm{Mat}_{\mathcal{B}}(u)\); then \(\mathrm{Mat}_{\mathcal{B}}(u^*) = M^{\mathsf{T}}\), so \(\mathrm{Mat}_{\mathcal{B}}(u^* \circ u) = M^{\mathsf{T}} M\). Hence \(u^* \circ u = \mathrm{id}_E \iff M^{\mathsf{T}} M = I_n \iff M\) is orthogonal. As (iii) is basis-free, (iv) holds for one orthonormal basis iff for every one.

Definition — Positive and negative isometries

Let \(u \in \mathrm{O}(E)\). Its matrix in an orthonormal basis is orthogonal, so \(\det u \in \{-1,1\}\). The isometry \(u\) is positive (or direct) when \(\det u = +1\), negative (or indirect) when \(\det u = -1\). The positive isometries form the special orthogonal group \(\mathrm{SO}(E)\), a subgroup of \(\mathrm{O}(E)\).

Example — The identity and its opposite

The identity \(\mathrm{id}_E\) is an isometry with \(\det \mathrm{id}_E = 1\): it is positive. Its opposite \(-\mathrm{id}_E\) satisfies \(\norme{-x} = \norme{x}\), so it too is an isometry; its matrix in any orthonormal basis is \(-I_n\), of determinant \((-1)^n\). Hence \(-\mathrm{id}_E\) is positive when \(n\) is even, negative when \(n\) is odd.

Example — An orthogonal symmetry is an isometry

Let \(p\) be the orthogonal projection onto a subspace \(F\), and \(s = 2p - \mathrm{id}_E\) the associated orthogonal symmetry. For \(x = x_F + x_\perp\) along \(E = F \oplus F^\perp\), one has \(s(x) = x_F - x_\perp\), and by Pythagoras \(\norme{s(x)}^2 = \norme{x_F}^2 + \norme{x_\perp}^2 = \norme{x}^2\). So \(s\) is an isometry --- the next subsection studies these symmetries in detail.

Method — Show that an endomorphism is an isometry

To prove \(u \in \mathcal{L}(E)\) is a vector isometry, pick the cheapest characterisation for the data at hand:

if \(u\) is given by a formula, check \(\norme{u(x)} = \norme{x}\) directly, or \(\langle u(x)\,|\,u(y)\rangle = \langle x\,|\,y\rangle\);
if \(u\) is given by its action on an orthonormal basis, check that the images form an orthonormal basis;
if \(u\) is given by a matrix \(M\) in an orthonormal basis, check that \(M\) is orthogonal (\(M^{\mathsf{T}} M = I_n\));
if the adjoint is easy to compute, check \(u^* \circ u = \mathrm{id}_E\).
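As an illustration of the second item, the sketch below takes a hypothetical endomorphism of \(\mathbb{R}^3\), given by a matrix `Q` with rational entries in the canonical basis, and checks that the images of the canonical basis vectors form an orthonormal family. It then compares \(\norme{u(x)}^2\) and \(\norme{x}^2\) on one sample vector. The matrix and the vector are illustrative choices, not taken from the course.

```python
from fractions import Fraction

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def apply(m, x):
    return tuple(dot(row, x) for row in m)

third = Fraction(1, 3)
# Hypothetical example: Q = (1/3) * [[2, -1, 2], [2, 2, -1], [-1, 2, 2]].
Q = [[2 * third, -1 * third, 2 * third],
     [2 * third, 2 * third, -1 * third],
     [-1 * third, 2 * third, 2 * third]]

basis = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
images = [apply(Q, e) for e in basis]      # the columns of Q

# Characterisation (ii): <u(e_i) | u(e_j)> = delta_ij.
images_orthonormal = all(
    dot(images[i], images[j]) == (1 if i == j else 0)
    for i in range(3) for j in range(3))

x = (1, -2, 5)
ux = apply(Q, x)

print(images_orthonormal)       # True
print(dot(ux, ux), dot(x, x))   # 30 30
```

Exact fractions avoid rounding issues; the norm check on a single vector is only a sanity check, the orthonormality of the images is what proves the isometry.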

Skills to practice

Showing that an endomorphism is an isometry

III.2 Orthogonal symmetries

After \(\pm\mathrm{id}_E\), the simplest isometries are the orthogonal symmetries: fix a subspace \(F\), leave it untouched, and reverse the orthogonal complement \(F^\perp\). Recall from the first year that, for a direct sum \(E = F \oplus G\), the symmetry with respect to \(F\) parallel to \(G\) sends \(x = x_F + x_G\) to \(x_F - x_G\) and satisfies \(s \circ s = \mathrm{id}_E\); the orthogonal symmetry is the case \(G = F^\perp\). When \(F\) is a hyperplane the symmetry is called a reflection. Two characterisations make them recognisable --- one geometric (a symmetry is orthogonal exactly when it is an isometry), one matricial (exactly when its matrix in an orthonormal basis is symmetric).

Definition — Orthogonal symmetry and reflection

Let \(F\) be a subspace of \(E\), so that \(E = F \oplus F^\perp\). The orthogonal symmetry with respect to \(F\) is the symmetry with respect to \(F\) parallel to \(F^\perp\): it sends \(x = x_F + x_\perp\) to \(x_F - x_\perp\). When \(F\) is a hyperplane, this orthogonal symmetry is called a reflection.

Proposition — Orthogonal symmetries and isometries

A vector symmetry of \(E\) is orthogonal if and only if it is a vector isometry.

\((\Rightarrow)\) Let \(s\) be the orthogonal symmetry with respect to \(F\). For \(x = x_F + x_\perp\) along \(E = F \oplus F^\perp\), \(s(x) = x_F - x_\perp\). Since \(x_F \in F\) and \(x_\perp \in F^\perp\) are orthogonal, Pythagoras gives $$ \norme{s(x)}^2 = \norme{x_F}^2 + \norme{-x_\perp}^2 = \norme{x_F}^2 + \norme{x_\perp}^2 = \norme{x}^2, $$ so \(s\) is an isometry.
\((\Leftarrow)\) Let \(s\) be a symmetry with respect to \(F\) parallel to a direction \(G\), with \(E = F \oplus G\), and suppose \(s\) is an isometry. Take \(x \in F\) and \(y \in G\); then \(s(x + y) = x - y\), and since \(s\) is an isometry, $$ \norme{x + y}^2 = \norme{s(x + y)}^2 = \norme{x - y}^2. $$ Expanding both sides, \(\norme{x}^2 + 2\langle x\,|\,y\rangle + \norme{y}^2 = \norme{x}^2 - 2\langle x\,|\,y\rangle + \norme{y}^2\), hence \(4\langle x\,|\,y\rangle = 0\), that is \(\langle x\,|\,y\rangle = 0\). So every vector of \(G\) is orthogonal to every vector of \(F\): \(G \subset F^\perp\). As \(E = F \oplus G\) and \(E = F \oplus F^\perp\), the subspaces \(G\) and \(F^\perp\) have the same dimension \(n - \dim F\), so \(G = F^\perp\). Thus \(s\) is the orthogonal symmetry with respect to \(F\).

Proposition — Matrix characterisation of an orthogonal symmetry

A vector symmetry of \(E\) is orthogonal if and only if its matrix in one orthonormal basis --- equivalently, in every orthonormal basis --- is symmetric.

\((\Rightarrow)\) Let \(s\) be an orthogonal symmetry. By the previous Proposition \(s\) is an isometry, and a symmetry satisfies \(s \circ s = \mathrm{id}_E\). Characterisation (iii) gives \(s^* \circ s = \mathrm{id}_E\), so \(s^* = s^{-1} = s\): the symmetry is self-adjoint. In any orthonormal basis \(\mathcal{B}\), \(\mathrm{Mat}_{\mathcal{B}}(s^*) = \mathrm{Mat}_{\mathcal{B}}(s)^{\mathsf{T}}\), and \(s^* = s\) gives \(\mathrm{Mat}_{\mathcal{B}}(s)^{\mathsf{T}} = \mathrm{Mat}_{\mathcal{B}}(s)\) --- the matrix is symmetric, in every orthonormal basis.
\((\Leftarrow)\) Let \(s\) be a symmetry whose matrix \(M\) in an orthonormal basis is symmetric, \(M = M^{\mathsf{T}}\). A symmetry satisfies \(s \circ s = \mathrm{id}_E\), so \(M^2 = I_n\). Then \(M M^{\mathsf{T}} = M\,M = M^2 = I_n\), so \(M\) is orthogonal; by characterisation (iv), \(s\) is an isometry, hence an orthogonal symmetry by the previous Proposition.

Example — The reflection through a hyperplane

Let \(a\) be a unit vector of \(E\) and \(H = \{a\}^\perp\) the hyperplane orthogonal to \(a\). Express the reflection \(\sigma\) through \(H\) in terms of \(x\) and \(a\), and verify directly that it is an isometry.

Decompose \(x\) along \(E = H \oplus \mathrm{Vect}(a)\). The component along the line \(\mathrm{Vect}(a)\) is the orthogonal projection of \(x\) onto it; since \(a\) is a unit vector, that projection is \(\langle a\,|\,x\rangle\, a\), and the component in \(H\) is \(x - \langle a\,|\,x\rangle\, a\). The reflection through \(H\) keeps the \(H\)-component and reverses the other: $$ \sigma(x) = \bigl(x - \langle a\,|\,x\rangle\, a\bigr) - \langle a\,|\,x\rangle\, a = \textcolor{colorprop}{x - 2\langle a\,|\,x\rangle\, a}. $$ It is an isometry: using \(\norme{a} = 1\), $$ \norme{\sigma(x)}^2 = \norme{x}^2 - 4\langle a\,|\,x\rangle\langle a\,|\,x\rangle + 4\langle a\,|\,x\rangle^2\norme{a}^2 = \norme{x}^2. $$
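As a numerical illustration of the formula \(\sigma(x) = x - 2\langle a\,|\,x\rangle\, a\), here is a minimal sketch in \(\mathbb{R}^3\) with the canonical scalar product. The function names `dot` and `reflect` and the sample vectors are hypothetical choices made for this example.

```python
import math

def dot(x, y):
    """Canonical scalar product on R^n."""
    return sum(xi * yi for xi, yi in zip(x, y))

def reflect(x, a):
    """Image of x under the reflection through the hyperplane orthogonal
    to a, assuming a is a unit vector: x - 2 <a|x> a."""
    c = dot(a, x)
    return [xi - 2 * c * ai for xi, ai in zip(x, a)]

a = [1 / 3, 2 / 3, 2 / 3]   # unit vector: 1/9 + 4/9 + 4/9 = 1
x = [3.0, -1.0, 2.0]
sx = reflect(x, a)

# The norm is preserved, and applying the reflection twice returns x.
print(math.isclose(dot(sx, sx), dot(x, x)))
print(all(math.isclose(u, v, abs_tol=1e-12)
          for u, v in zip(reflect(sx, a), x)))
```

The second check reflects the fact that \(\sigma\) is a symmetry, \(\sigma \circ \sigma = \mathrm{id}_E\).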

The orthogonal symmetry across a line \(F\) of the plane: the component along \(F\) is kept, the component along \(F^\perp\) is reversed.

Skills to practice

Studying orthogonal symmetries

IV Isometries in dimension 2

IV.1 The orthogonal matrices of size 2

In dimension \(2\) the orthogonal matrices can be written down completely: each one depends on a single real parameter \(\theta\), and the sign of the determinant splits them into two families given by two explicit normal forms. This is the computational backbone of the whole classification of plane isometries.

Theorem — Normal form of an orthogonal matrix of size 2

A matrix \(M \in \mathcal{M}_2(\mathbb{R})\) is orthogonal if and only if there exists \(\theta \in \mathbb{R}\) such that $$ M = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \quad \text{or} \quad M = \begin{pmatrix} \cos\theta & \sin\theta \\ \sin\theta & -\cos\theta \end{pmatrix}. $$ The first form has determinant \(+1\), the second \(-1\); hence \(M \in \mathrm{SO}_2(\mathbb{R})\) if and only if \(M\) has the first form.

Each of the two displayed matrices is orthogonal: its columns have norm \(1\) and are orthogonal (their scalar product is \(\mp\cos\theta\sin\theta \pm \sin\theta\cos\theta = 0\)). Conversely, let \(M = \begin{pmatrix} a & b \\ c & d \end{pmatrix}\) be orthogonal. Its columns form an orthonormal family of \(\mathbb{R}^2\): $$ a^2 + c^2 = 1, \qquad b^2 + d^2 = 1, \qquad ab + cd = 0. $$ From \(a^2 + c^2 = 1\) there is \(\theta \in \mathbb{R}\) with \(a = \cos\theta\), \(c = \sin\theta\); from \(b^2 + d^2 = 1\) there is \(\varphi \in \mathbb{R}\) with \(b = \cos\varphi\), \(d = \sin\varphi\). The relation \(ab + cd = 0\) reads \(\cos\theta\cos\varphi + \sin\theta\sin\varphi = \cos(\theta - \varphi) = 0\), so \(\theta - \varphi \equiv \tfrac{\pi}{2} \pmod{\pi}\), that is \(\varphi \equiv \theta - \tfrac{\pi}{2}\) or \(\varphi \equiv \theta + \tfrac{\pi}{2} \pmod{2\pi}\).

If \(\varphi \equiv \theta + \tfrac{\pi}{2}\): \(b = \cos(\theta + \tfrac{\pi}{2}) = -\sin\theta\), \(d = \sin(\theta + \tfrac{\pi}{2}) = \cos\theta\) --- the first form, with \(\det M = \cos^2\theta + \sin^2\theta = 1\).
If \(\varphi \equiv \theta - \tfrac{\pi}{2}\): \(b = \cos(\theta - \tfrac{\pi}{2}) = \sin\theta\), \(d = \sin(\theta - \tfrac{\pi}{2}) = -\cos\theta\) --- the second form, with \(\det M = -\cos^2\theta - \sin^2\theta = -1\).

Example — The two shapes for a given angle

For \(\theta = \tfrac{\pi}{3}\), one has \(\cos\theta = \tfrac{1}{2}\) and \(\sin\theta = \tfrac{\sqrt{3}}{2}\). The two orthogonal matrices of size \(2\) attached to this angle are $$ \begin{pmatrix} \tfrac{1}{2} & -\tfrac{\sqrt{3}}{2} \\ \tfrac{\sqrt{3}}{2} & \tfrac{1}{2} \end{pmatrix} \in \mathrm{SO}_2(\mathbb{R}), \qquad \begin{pmatrix} \tfrac{1}{2} & \tfrac{\sqrt{3}}{2} \\ \tfrac{\sqrt{3}}{2} & -\tfrac{1}{2} \end{pmatrix} \in \mathrm{O}_2(\mathbb{R}) \setminus \mathrm{SO}_2(\mathbb{R}). $$ The first has determinant \(+1\), the second \(-1\).
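The two shapes can be checked numerically. The following sketch builds both matrices for a given angle and tests the column conditions \(a^2 + c^2 = 1\), \(b^2 + d^2 = 1\), \(ab + cd = 0\) used in the proof; the function names are illustrative assumptions, and the comparisons use a small floating-point tolerance.

```python
import math

def rotation_form(theta):
    """First normal form, determinant +1."""
    return [[math.cos(theta), -math.sin(theta)],
            [math.sin(theta),  math.cos(theta)]]

def reflection_form(theta):
    """Second normal form, determinant -1."""
    return [[math.cos(theta),  math.sin(theta)],
            [math.sin(theta), -math.cos(theta)]]

def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def is_orthogonal(m, tol=1e-12):
    """Columns have norm 1 and are orthogonal to each other."""
    (a, b), (c, d) = m
    return (abs(a * a + c * c - 1) < tol
            and abs(b * b + d * d - 1) < tol
            and abs(a * b + c * d) < tol)

theta = math.pi / 3
for m in (rotation_form(theta), reflection_form(theta)):
    print(is_orthogonal(m), round(det2(m)))
```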

Skills to practice

Describing the orthogonal matrices of size 2

IV.2 Rotations and the classification of plane isometries

Throughout this subsection \(E\) is an oriented Euclidean plane. The \(\mathrm{SO}_2(\mathbb{R})\) normal form depends on a single angle, and angles add when the matrices multiply: this turns \(\mathrm{SO}_2(\mathbb{R})\) into a transparent abelian group and lets us define the rotation of angle \(\theta\). The classification then follows: the positive isometries of the plane are the rotations, the negative ones the reflections.

Proposition — The morphism of plane rotations

For \(\theta \in \mathbb{R}\) write \(R(\theta) = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}\). The map \(R \colon \theta \mapsto R(\theta)\) is a surjective group morphism from \((\mathbb{R}, +)\) onto \((\mathrm{SO}_2(\mathbb{R}), \times)\), with kernel \(2\pi\mathbb{Z}\). Consequently \(\mathrm{SO}_2(\mathbb{R})\) is abelian, and the parameter \(\theta\) of a matrix of \(\mathrm{SO}_2(\mathbb{R})\) is determined modulo \(2\pi\).

Computing the product and using the addition formulas for cosine and sine, $$ R(\theta)R(\theta') = \begin{pmatrix} \cos\theta\cos\theta' - \sin\theta\sin\theta' & -\cos\theta\sin\theta' - \sin\theta\cos\theta' \\ \sin\theta\cos\theta' + \cos\theta\sin\theta' & \cos\theta\cos\theta' - \sin\theta\sin\theta' \end{pmatrix} = R(\theta + \theta'). $$ Each \(R(\theta)\) lies in \(\mathrm{SO}_2(\mathbb{R})\) by the § 4.1 Theorem, so \(R\) maps \(\mathbb{R}\) into the group \((\mathrm{SO}_2(\mathbb{R}), \times)\); the identity \(R(\theta)R(\theta') = R(\theta + \theta')\) then makes \(R\) a group morphism \((\mathbb{R}, +) \to (\mathrm{SO}_2(\mathbb{R}), \times)\). It is surjective onto \(\mathrm{SO}_2(\mathbb{R})\): the § 4.1 Theorem says every matrix of \(\mathrm{SO}_2(\mathbb{R})\) is some \(R(\theta)\). Its kernel: \(R(\theta) = I_2\) iff \(\cos\theta = 1\) and \(\sin\theta = 0\), iff \(\theta \in 2\pi\mathbb{Z}\). Finally \(\mathrm{SO}_2(\mathbb{R})\) is abelian because \(R(\theta)R(\theta') = R(\theta + \theta') = R(\theta' + \theta) = R(\theta')R(\theta)\), and \(R(\theta) = R(\theta')\) iff \(R(\theta - \theta') = I_2\) iff \(\theta - \theta' \in 2\pi\mathbb{Z}\) --- the parameter is determined modulo \(2\pi\).
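The three facts of the proof (the product rule, the kernel, commutativity) can be observed on sample angles. This is a sketch with assumed helper names (`R`, `matmul2`, `close`), comparing matrices up to a floating-point tolerance; it illustrates the statement and does not replace the proof.

```python
import math

def R(theta):
    """Rotation matrix of angle theta."""
    return [[math.cos(theta), -math.sin(theta)],
            [math.sin(theta),  math.cos(theta)]]

def matmul2(A, B):
    """Product of two 2x2 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def close(A, B, tol=1e-12):
    """Entrywise comparison up to tol."""
    return all(abs(A[i][j] - B[i][j]) < tol
               for i in range(2) for j in range(2))

theta, phi = 0.7, -2.1
I2 = [[1.0, 0.0], [0.0, 1.0]]

print(close(matmul2(R(theta), R(phi)), R(theta + phi)))              # morphism
print(close(matmul2(R(theta), R(phi)), matmul2(R(phi), R(theta))))   # abelian
print(close(R(2 * math.pi), I2), close(R(math.pi), I2))              # kernel
```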

Definition — Rotation of an oriented Euclidean plane

Let \(E\) be an oriented Euclidean plane and \(\theta \in \mathbb{R}\). The rotation of angle \(\theta\) is the endomorphism of \(E\) whose matrix in a direct orthonormal basis is \(R(\theta)\).

This definition does not depend on the chosen direct orthonormal basis. If \(\mathcal{B}\) and \(\mathcal{B}'\) are two direct orthonormal bases, the change-of-basis matrix \(P\) from \(\mathcal{B}\) to \(\mathcal{B}'\) is orthogonal with determinant \(+1\), so \(P \in \mathrm{SO}_2(\mathbb{R})\). Since \(\mathrm{SO}_2(\mathbb{R})\) is abelian, \(R(\theta)\) and \(P\) commute, and the matrix of the rotation in \(\mathcal{B}'\) is $$ P^{-1} R(\theta)\, P = P^{-1} P\, R(\theta) = R(\theta). $$ The matrix is \(R(\theta)\) in every direct orthonormal basis, and the rotation of angle \(\theta\) is well defined.


Example — Unit complex numbers and plane rotations

Let \(\mathbb{U}\) be the group \((\{z \in \mathbb{C} : |z| = 1\}, \times)\) of unit complex numbers, recalled from Complex numbers; every element of \(\mathbb{U}\) is \(\mathrm{e}^{\mathrm{i}\theta}\) for some \(\theta \in \mathbb{R}\), and \(\mathrm{e}^{\mathrm{i}\theta} = \mathrm{e}^{\mathrm{i}\theta'}\) exactly when \(\theta - \theta' \in 2\pi\mathbb{Z}\). Set \(\Phi(\mathrm{e}^{\mathrm{i}\theta}) = R(\theta)\). This is well defined: if \(\mathrm{e}^{\mathrm{i}\theta} = \mathrm{e}^{\mathrm{i}\theta'}\) then \(\theta - \theta' \in 2\pi\mathbb{Z}\), so \(R(\theta) = R(\theta')\). It is a morphism: \(\Phi(\mathrm{e}^{\mathrm{i}\theta}\,\mathrm{e}^{\mathrm{i}\theta'}) = \Phi(\mathrm{e}^{\mathrm{i}(\theta+\theta')}) = R(\theta+\theta') = R(\theta)R(\theta')\). It is bijective: surjective since every element of \(\mathrm{SO}_2(\mathbb{R})\) is some \(R(\theta) = \Phi(\mathrm{e}^{\mathrm{i}\theta})\), injective since \(\Phi(\mathrm{e}^{\mathrm{i}\theta}) = I_2\) forces \(\theta \in 2\pi\mathbb{Z}\), i.e. \(\mathrm{e}^{\mathrm{i}\theta} = 1\). So \(\Phi\) is a group isomorphism $$ \mathbb{U} \xrightarrow{\ \sim\ } \mathrm{SO}_2(\mathbb{R}), \qquad \mathrm{e}^{\mathrm{i}\theta} \longmapsto R(\theta), $$ the isomorphism \(\mathbb{U} \simeq \mathrm{SO}_2(\mathbb{R})\) of the program: a plane rotation « is » a unit complex number.
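The isomorphism can be seen at work on vectors. The sketch below assumes the usual identification of \(\mathbb{R}^2\) with \(\mathbb{C}\) by \((x, y) \mapsto x + \mathrm{i}y\), which is an added convention for this illustration; the function names are hypothetical. Multiplying by \(\mathrm{e}^{\mathrm{i}\theta}\) and applying \(R(\theta)\) then give the same coordinates.

```python
import cmath
import math

def rotate_matrix(theta, v):
    """Apply R(theta) to the vector v = (x, y)."""
    x, y = v
    return (math.cos(theta) * x - math.sin(theta) * y,
            math.sin(theta) * x + math.cos(theta) * y)

def rotate_complex(theta, v):
    """Multiply x + iy by e^{i theta} and read the coordinates back."""
    z = cmath.exp(1j * theta) * complex(v[0], v[1])
    return (z.real, z.imag)

theta, v = 0.9, (2.0, -1.0)
p, q = rotate_matrix(theta, v), rotate_complex(theta, v)
print(math.isclose(p[0], q[0]), math.isclose(p[1], q[1]))
```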

A rotation of angle \(\theta\) acts on a vector by turning it through the angle \(\theta\); the angle is read directly from the matrix \(R(\theta)\).

Theorem — Classification of plane isometries

Let \(E\) be an oriented Euclidean plane. A positive isometry of \(E\) is a rotation; a negative isometry of \(E\) is a reflection across a line.

Let \(u \in \mathrm{O}(E)\) and \(M\) its matrix in an orthonormal basis; \(M \in \mathrm{O}_2(\mathbb{R})\).
Positive case (\(\det u = +1\)). Take a direct orthonormal basis; the matrix of \(u\) in it is orthogonal with determinant \(+1\), hence lies in \(\mathrm{SO}_2(\mathbb{R})\), hence equals \(R(\theta)\) for some \(\theta\) by the § 4.1 Theorem. So \(u\) is the rotation of angle \(\theta\).
Negative case (\(\det u = -1\)). By the § 4.1 Theorem, \(M = \begin{pmatrix} \cos\theta & \sin\theta \\ \sin\theta & -\cos\theta \end{pmatrix}\) in an orthonormal basis. A direct computation gives $$ M^2 = \begin{pmatrix} \cos^2\theta + \sin^2\theta & \cos\theta\sin\theta - \sin\theta\cos\theta \\ \sin\theta\cos\theta - \cos\theta\sin\theta & \sin^2\theta + \cos^2\theta \end{pmatrix} = I_2. $$ Since \(M^2 = I_2\), \(u^2 = \mathrm{id}_E\), so \(u\) is a vector symmetry; and \(M\) is symmetric, so by the § 3.2 matrix characterisation \(u\) is an orthogonal symmetry with respect to some subspace \(F\). In an orthonormal basis adapted to \(E = F \oplus F^\perp\), that symmetry has the diagonal matrix carrying \(\dim F\) entries \(+1\) and \(\dim F^\perp\) entries \(-1\), so \(\det u = (-1)^{\dim F^\perp}\); here \(\det u = -1\) forces \(\dim F^\perp = 1\), so \(F\) is a line: \(u\) is the reflection across the line \(F\).

Example — The classification table of plane isometries

The isometries of a Euclidean plane fall into four types, sorted by determinant and by real eigenvalues:

\(\mathrm{id}_E\): positive, \(R(0) = I_2\), only real eigenvalue \(1\) (eigenspace \(E\));
\(-\mathrm{id}_E\): positive, \(R(\pi) = -I_2\), only real eigenvalue \(-1\) (eigenspace \(E\));
rotation of angle \(\theta \notin \pi\mathbb{Z}\): positive, \(R(\theta)\), no real eigenvalue --- the eigenspaces \(E_1\) and \(E_{-1}\) are both \(\{0_E\}\);
reflection across a line: negative, real eigenvalues \(1\) and \(-1\), eigenspaces the line and its orthogonal.

Among the four types, the rotation of angle \(\theta \notin \pi\mathbb{Z}\) is the only one with no real eigenvalue at all.

Method — Identify a plane isometry from its matrix

Given the matrix \(M \in \mathrm{O}_2(\mathbb{R})\) of a plane isometry \(u\) in a direct orthonormal basis of the oriented plane:

compute \(\det M\);
if \(\det M = +1\): then \(M = R(\theta)\), and \(u\) is the rotation of angle \(\theta\); read \(\cos\theta\) off the entry \((1,1)\) and \(\sin\theta\) off the entry \((2,1)\) (the basis being direct, the sign of this \(\sin\theta\) is the correct one for the oriented angle);
if \(\det M = -1\): then \(u\) is a reflection; its axis is the line of fixed vectors, the eigenspace \(\mathrm{Ker}(u - \mathrm{id}_E)\).
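The steps above can be sketched as a small function. It assumes that the input really is an orthogonal \(2 \times 2\) matrix written in a direct orthonormal basis; the name `identify_plane_isometry` and its return convention are illustrative. For the reflection case the sketch uses the fact that the matrix \(\begin{pmatrix} \cos\theta & \sin\theta \\ \sin\theta & -\cos\theta \end{pmatrix}\) fixes the vector \((\cos\tfrac{\theta}{2}, \sin\tfrac{\theta}{2})\), which follows from the subtraction formulas for cosine and sine.

```python
import math

def identify_plane_isometry(M, tol=1e-9):
    """Classify an orthogonal 2x2 matrix M (assumed orthogonal, written in
    a direct orthonormal basis).

    Returns ("rotation", theta) with theta in (-pi, pi], or
    ("reflection", axis) with axis a unit vector spanning the fixed line.
    """
    (a, b), (c, d) = M
    det = a * d - b * c
    theta = math.atan2(c, a)   # cos(theta) = entry (1,1), sin(theta) = entry (2,1)
    if abs(det - 1) < tol:
        return ("rotation", theta)
    if abs(det + 1) < tol:
        # Second normal form: the fixed line is spanned by the half-angle vector.
        return ("reflection", (math.cos(theta / 2), math.sin(theta / 2)))
    raise ValueError("determinant is not +1 or -1: M is not orthogonal")

s3 = math.sqrt(3) / 2
print(identify_plane_isometry([[0.5, -s3], [s3, 0.5]]))
print(identify_plane_isometry([[0.5, s3], [s3, -0.5]]))
```

Using `math.atan2` recovers the angle from both \(\cos\theta\) and \(\sin\theta\) at once, so the sign of the angle is not lost.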

Skills to practice

Identifying a plane isometry

V Reduction of isometries

V.1 The reduction theorem

Dimensions \(1\) and \(2\) are now fully understood: an isometry is \(\pm\mathrm{id}\), a rotation, or a reflection. The reduction theorem extends this to every dimension: in a well-chosen orthonormal basis, an isometry breaks into independent blocks of size \(1\) or \(2\) --- a \(+1\), a \(-1\), or a plane rotation \(R(\theta)\). The proof rests on two facts: the orthogonal of a stable subspace is stable, and every endomorphism admits a stable line or a stable plane.

Recall (polynomial of an endomorphism). For a polynomial \(P = \sum_k a_k X^k \in \mathbb{R}[X]\) and an endomorphism \(u \in \mathcal{L}(E)\), one writes \(P(u) = \sum_k a_k\, u^k\), with \(u^0 = \mathrm{id}_E\) --- the polynomial of the endomorphism \(u\), introduced in the first year. Two polynomials in the same \(u\) commute, and \((PQ)(u) = P(u) \circ Q(u)\). These facts are recalled from the first year and used in the next proof; nothing beyond them is needed.

Proposition — The orthogonal of a stable subspace

Let \(u \in \mathrm{O}(E)\) and \(F\) a subspace of \(E\) stable by \(u\). Then \(F^\perp\) is stable by \(u\).

Since \(u\) is an isometry it is injective, so its restriction \(u_{|F} \colon F \to F\) is injective; as \(F\) is finite-dimensional, \(u_{|F}\) is bijective, hence \(u(F) = F\). Now let \(y \in F^\perp\); we show \(u(y) \in F^\perp\). Take any \(x' \in F\). Since \(u(F) = F\), there is \(x \in F\) with \(x' = u(x)\), and then $$ \langle u(y)\,|\,x'\rangle = \langle u(y)\,|\,u(x)\rangle = \langle y\,|\,x\rangle = 0, $$ the scalar product being preserved by \(u\) and \(y\) orthogonal to \(x \in F\). So \(u(y) \in F^\perp\), and \(F^\perp\) is stable by \(u\).

Proposition — Existence of a stable line or plane

Every endomorphism of a real vector space of finite dimension at least \(1\) has a stable line or a stable plane.

Let \(u \in \mathcal{L}(E)\) with \(n = \dim E \ge 1\), and pick \(x \ne 0_E\). The \(n + 1\) vectors \(x, u(x), \dots, u^n(x)\) lie in the space \(E\) of dimension \(n\), so they are linearly dependent: there is a non-zero polynomial \(P \in \mathbb{R}[X]\) with \(P(u)(x) = 0_E\). As \(x \ne 0_E\), \(P\) cannot be a non-zero constant (a constant \(c \ne 0\) would give \(P(u)(x) = c\,x \ne 0_E\)), so \(\deg P \ge 1\). Factor \(P\) over \(\mathbb{R}\): as recalled from Arithmetic of polynomials, every non-constant real polynomial is the product of its leading coefficient and of monic irreducible factors of degree \(1\) or \(2\), $$ P = \alpha\, Q_1 \cdots Q_m, $$ with \(\alpha \ne 0\) the leading coefficient and each \(Q_j\) monic of degree \(1\) or \(2\). Since \(\alpha \ne 0\), the relation \(P(u)(x) = 0_E\) gives \((Q_1 \cdots Q_m)(u)(x) = 0_E\).
Set \(v_0 = x\) and \(v_j = \bigl(Q_{m-j+1}(u) \circ \dots \circ Q_m(u)\bigr)(x)\) for \(1 \le j \le m\), so \(v_0 = x \ne 0_E\) and \(v_m = (Q_1 \cdots Q_m)(u)(x) = 0_E\). Let \(j\) be the largest index with \(v_j \ne 0_E\) --- it exists, since \(v_0 \ne 0_E\), and \(j < m\) since \(v_m = 0_E\). Put \(y = v_j \ne 0_E\) and \(Q = Q_{m-j}\), of degree \(1\) or \(2\); then $$ Q(u)(y) = Q_{m-j}(u)(v_j) = v_{j+1} = 0_E. $$

If \(\deg Q = 1\): \(Q\) monic gives \(Q = X - \lambda\), so \((u - \lambda\,\mathrm{id}_E)(y) = 0_E\), i.e. \(u(y) = \lambda y\). Then \(\mathrm{Vect}(y)\) is a stable line.
If \(\deg Q = 2\): \(Q\) monic gives \(Q = X^2 + bX + c\), so \(u^2(y) = -b\,u(y) - c\,y\). The subspace \(\mathrm{Vect}(y, u(y))\) contains \(u(y)\) and \(u(u(y)) = u^2(y) = -b\,u(y) - c\,y\), so it is \(u\)-stable; it is a stable line or a stable plane.

The argument uses only the factorisation of \(\mathbb{R}[X]\) into degree-\(1\) and degree-\(2\) irreducibles --- no minimal and no characteristic polynomial.

Theorem — Reduction of a vector isometry

Let \(u \in \mathrm{O}(E)\). There exists an orthonormal basis of \(E\) in which the matrix of \(u\) is block-diagonal, $$ \mathrm{Mat}(u) = I_p \oplus (-I_q) \oplus R(\theta_1) \oplus \dots \oplus R(\theta_r), $$ with \(p + q + 2r = n\) and each \(\theta_i \in \mathbb{R} \setminus \pi\mathbb{Z}\). Here \(\oplus\) denotes block-diagonal juxtaposition: an \(I_p\) block, a \(-I_q\) block, and \(r\) rotation blocks \(R(\theta_i)\) of size \(2\); any of \(p\), \(q\), \(r\) may be \(0\).

Strong induction on \(n = \dim E\).
Base cases. For \(n = 0\), \(E = \{0_E\}\) and the statement holds vacuously, with the empty basis and \(p = q = r = 0\). For \(n = 1\), \(u \in \mathrm{O}(E)\) sends a unit vector \(e\) to \(u(e) = \lambda e\) with \(|\lambda| = \norme{u(e)} = 1\), so \(\lambda = \pm 1\): the matrix is \(I_1\) or \(-I_1\). For \(n = 2\), fix any orientation of the plane (the resulting block form does not depend on it) and apply the § 4.2 classification. Either \(u\) is a rotation, with matrix \(R(\theta)\) in a direct orthonormal basis, which is \(I_2\) if \(\theta \in 2\pi\mathbb{Z}\), \(-I_2\) if \(\theta \in \pi + 2\pi\mathbb{Z}\), and a genuine \(R(\theta)\) block with \(\theta \notin \pi\mathbb{Z}\) otherwise; or \(u\) is a reflection, with matrix \(\mathrm{diag}(1,-1) = I_1 \oplus (-I_1)\) in an orthonormal eigenbasis. Every case has the announced form.
Inductive step (\(n \ge 3\), the result assumed in every dimension \(< n\)). By the « stable line or plane » Proposition, \(u\) has a stable subspace \(F\) with \(\dim F \in \{1,2\}\). By the Proposition on the orthogonal of a stable subspace, \(F^\perp\) is stable as well, and \(E = F \oplus F^\perp\) with \(F \perp F^\perp\). The induced endomorphisms \(u_{|F}\) and \(u_{|F^\perp}\) preserve the norm, so they are isometries of \(F\) and of \(F^\perp\) respectively, two spaces of dimension \(< n\). The induction hypothesis gives an orthonormal basis of \(F\) and one of \(F^\perp\) in which the matrices have the announced block-diagonal form. Concatenating these two bases yields an orthonormal basis of \(E\) (since \(E = F \oplus F^\perp\) is an orthogonal direct sum), in which the matrix of \(u\) is the juxtaposition of the two block forms; after reordering the basis vectors to group the \(+1\) blocks, the \(-1\) blocks and the rotation blocks, it is again of the announced type.

Example — Determinant of a reduced isometry

From the reduced form, the determinant is read off at once. Each rotation block has \(\det R(\theta_i) = 1\), the \(I_p\) block contributes \(1\), the \(-I_q\) block contributes \((-1)^q\): $$ \det u = 1^p \cdot (-1)^q \cdot 1^r = (-1)^q. $$ So \(u\) is a positive isometry if and only if \(q\) --- the number of \(-1\) blocks --- is even.
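The formula \(\det u = (-1)^q\) can be checked on explicit reduced matrices. The sketch below assembles \(I_p \oplus (-I_q) \oplus R(\theta_1) \oplus \dots \oplus R(\theta_r)\) and computes its determinant by cofactor expansion; the function names `reduced_matrix` and `det` are assumptions for this illustration, and the expansion is only meant for small sizes.

```python
import math

def reduced_matrix(p, q, thetas):
    """Block-diagonal matrix I_p (+) (-I_q) (+) R(theta_1) (+) ... (+) R(theta_r)."""
    n = p + q + 2 * len(thetas)
    M = [[0.0] * n for _ in range(n)]
    for i in range(p):
        M[i][i] = 1.0
    for i in range(p, p + q):
        M[i][i] = -1.0
    for k, t in enumerate(thetas):
        i = p + q + 2 * k
        c, s = math.cos(t), math.sin(t)
        M[i][i], M[i][i + 1] = c, -s
        M[i + 1][i], M[i + 1][i + 1] = s, c
    return M

def det(M):
    """Determinant by cofactor expansion along the first row (small n only)."""
    if not M:
        return 1.0
    if len(M) == 1:
        return M[0][0]
    total = 0.0
    for j, coeff in enumerate(M[0]):
        if coeff == 0.0:
            continue   # zero entries contribute nothing
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * coeff * det(minor)
    return total

print(round(det(reduced_matrix(2, 3, [0.4, 1.9]))))   # q = 3 is odd
print(round(det(reduced_matrix(1, 2, [0.4]))))        # q = 2 is even
```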

Method — Reduce a vector isometry

To put an isometry \(u \in \mathrm{O}(E)\) into reduced form:

compute the eigenspaces \(E_1 = \mathrm{Ker}(u - \mathrm{id}_E)\) and \(E_{-1} = \mathrm{Ker}(u + \mathrm{id}_E)\) --- they carry the \(I_p\) and \(-I_q\) blocks;
the orthogonal \((E_1 \oplus E_{-1})^\perp\) is \(u\)-stable (the orthogonal of a \(u\)-stable subspace, by § 5.1); since a rotation block \(R(\theta_i)\) with \(\theta_i \notin \pi\mathbb{Z}\) has no eigenvector for \(1\) or \(-1\), the \(I_p\) and \(-I_q\) blocks of the reduced form span exactly \(E_1\) and \(E_{-1}\), so \(p = \dim E_1\), \(q = \dim E_{-1}\), and this orthogonal carries only the rotation blocks: it is a subspace of even dimension \(2r\);
in dimension \(2\) this is exactly § 4; in dimension \(3\), § 5.2 below makes the reduced form explicit (\(I_1 \oplus R(\theta)\) or \((-I_1) \oplus R(\theta)\)), with \(\cos\theta\) read off the trace;
in dimension \(> 3\) the method is only structural: \(E_1\) and \(E_{-1}\) give the \(\pm 1\) blocks, but splitting \((E_1 \oplus E_{-1})^\perp\) into rotation planes is guaranteed by the theorem, not computed by this method.

Skills to practice

Reducing a vector isometry

V.2 Isometries in dimension 3

In dimension \(3\) the reduction theorem becomes fully explicit. Throughout this subsection \(E\) is an oriented Euclidean space of dimension \(3\) --- an orientation fixed once and for all, in the sense of § 2.2, so that « direct orthonormal basis » is meaningful. A positive isometry is then a rotation about an axis; a negative one combines a rotation with a reflection.

Theorem — Positive isometries of a three-dimensional space

Let \(E\) be an oriented Euclidean space of dimension \(3\) and \(u \in \mathrm{SO}(E)\). There is a direct orthonormal basis in which $$ \mathrm{Mat}(u) = I_1 \oplus R(\theta), \qquad \text{and} \qquad \mathrm{Tr}\,u = 1 + 2\cos\theta. $$ The isometry \(u\) is a rotation. When \(u \ne \mathrm{id}_E\), its axis is the line \(\mathrm{Ker}(u - \mathrm{id}_E)\).

Apply the reduction Theorem to \(u\): in some orthonormal basis, \(\mathrm{Mat}(u) = I_p \oplus (-I_q) \oplus R(\theta_1) \oplus \dots \oplus R(\theta_r)\), with \(p + q + 2r = 3\). Since \(u \in \mathrm{SO}(E)\), \(\det u = (-1)^q = +1\), so \(q\) is even. A pair of \(+1\) blocks reads as \(R(0)\) and a pair of \(-1\) blocks as \(R(\pi)\); regrouping, and using \(p + q + 2r = 3\) with \(q\) even, the only possible shape is \(I_1 \oplus R(\theta)\), where \(\theta = 0\) when \(u = \mathrm{id}_E\), \(\theta = \pi\) when the matrix is \(\mathrm{diag}(1,-1,-1)\), and \(\theta \notin \pi\mathbb{Z}\) otherwise. The trace of \(I_1 \oplus R(\theta)\) is \(1 + 2\cos\theta\).
The basis can be taken direct: if the one obtained is indirect, negate its first vector --- the one spanning the \(I_1\) block. That vector is fixed by \(u\), so the matrix \(I_1 \oplus R(\theta)\) is unchanged, while the orientation of the basis is reversed. Finally, when \(u \ne \mathrm{id}_E\) (so \(\theta \notin 2\pi\mathbb{Z}\)), the rotation block \(R(\theta)\) fixes no non-zero vector, so \(\mathrm{Ker}(u - \mathrm{id}_E)\) is exactly the line of the \(I_1\) block --- the axis.

Theorem — Negative isometries of a three-dimensional space

Let \(E\) be an oriented Euclidean space of dimension \(3\) and \(u \in \mathrm{O}(E) \setminus \mathrm{SO}(E)\). There is a direct orthonormal basis in which $$ \mathrm{Mat}(u) = (-I_1) \oplus R(\theta). $$ Such a \(u\) is the composite of the rotation of angle \(\theta\) in the plane of the \(R(\theta)\) block, extended by the identity on the orthogonal line, with the reflection across that plane. In particular \(u = -\mathrm{id}_E\) when \(\theta = \pi\), and \(u\) is a plane reflection when \(\theta = 0\).

Apply the reduction Theorem: \(\mathrm{Mat}(u) = I_p \oplus (-I_q) \oplus R(\theta_1) \oplus \dots \oplus R(\theta_r)\) with \(p + q + 2r = 3\) and \(\det u = (-1)^q = -1\), so \(q\) is odd. Regrouping pairs of equal \(\pm 1\) blocks as \(R(0)\) or \(R(\pi)\) leaves exactly one \(-1\) block, hence the shape \((-I_1) \oplus R(\theta)\). As in the positive case the basis can be taken direct: negating the vector of the \(-I_1\) block reverses the orientation and leaves the matrix unchanged. The matrix \((-I_1) \oplus R(\theta)\) visibly factors as the commuting product of the rotation \(I_1 \oplus R(\theta)\) and the reflection \(\mathrm{diag}(-1,1,1)\) across the \(R(\theta)\)-plane.
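The factorisation at the end of the proof can be verified numerically. In this sketch the names `rot3`, `neg3`, `S`, `matmul` and `close` are hypothetical helpers; `rot3(theta)` stands for \(I_1 \oplus R(\theta)\), `neg3(theta)` for \((-I_1) \oplus R(\theta)\), and `S` for \(\mathrm{diag}(-1,1,1)\).

```python
import math

def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def close(A, B, tol=1e-12):
    """Entrywise comparison up to tol."""
    return all(abs(x - y) < tol for ra, rb in zip(A, B) for x, y in zip(ra, rb))

def rot3(theta):
    """I_1 (+) R(theta): rotation about the first basis vector."""
    c, s = math.cos(theta), math.sin(theta)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def neg3(theta):
    """(-I_1) (+) R(theta)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[-1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

# Reflection across the plane of the R(theta) block.
S = [[-1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
MINUS_I3 = [[-1.0, 0.0, 0.0], [0.0, -1.0, 0.0], [0.0, 0.0, -1.0]]

theta = 1.1
print(close(matmul(rot3(theta), S), neg3(theta)))   # product in one order
print(close(matmul(S, rot3(theta)), neg3(theta)))   # and in the other
print(close(neg3(math.pi), MINUS_I3), close(neg3(0.0), S))
```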

Remark (hors attendu, that is, beyond what the program expects). The program states, verbatim: « La pratique du calcul des éléments géométriques d'un élément de \(\mathrm{SO}_3(\mathbb{R})\) n'est pas un attendu du programme. » In English: the practical computation of the geometric elements of an element of \(\mathrm{SO}_3(\mathbb{R})\) is not an expected skill of the program. The chapter does give the axis \(\mathrm{Ker}(u - \mathrm{id}_E)\) and the value \(\cos\theta = \tfrac{1}{2}(\mathrm{Tr}\,u - 1)\), but the full determination of the geometric elements, in particular the sign of the rotation angle, is not drilled here as an examinable method.

Example — A rotation of three-dimensional space

Take \(\mathbb{R}^3\) with its canonical orientation and the matrix $$ A = \begin{pmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix}. $$ Its columns form a direct orthonormal family, so \(A \in \mathrm{SO}_3(\mathbb{R})\): by the Theorem, \(A\) is a rotation. Its trace is \(\mathrm{Tr}\,A = 0\), so \(1 + 2\cos\theta = 0\), that is \(\cos\theta = -\tfrac{1}{2}\). The axis is \(\mathrm{Ker}(A - I_3)\): solving \((A - I_3)v = 0_E\) gives \(v_1 = v_2 = v_3\), so the axis is the line \(\mathrm{Vect}\bigl((1,1,1)\bigr)\). Thus \(A\) is the rotation about \(\mathrm{Vect}\bigl((1,1,1)\bigr)\) with \(\cos\theta = -\tfrac{1}{2}\) --- the precise signed angle being, as noted above, hors attendu.
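The computations of this example can be replayed in a few lines. The helpers below (`matmul`, `transpose`, `det3`, `trace`, `apply`) are illustrative names, not notation from the chapter. The last check, \(A^3 = I_3\), is an added observation consistent with \(\cos\theta = -\tfrac{1}{2}\), that is \(\theta \equiv \pm\tfrac{2\pi}{3} \pmod{2\pi}\); it does not decide the sign of the angle.

```python
def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(A):
    return [list(col) for col in zip(*A)]

def det3(M):
    """Determinant of a 3x3 matrix, expanded along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def trace(M):
    return sum(M[i][i] for i in range(len(M)))

def apply(M, v):
    """Matrix-vector product."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

A = [[0, 0, 1], [1, 0, 0], [0, 1, 0]]
I3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

print(matmul(transpose(A), A) == I3, det3(A))   # A is in SO_3(R)
cos_theta = (trace(A) - 1) / 2
print(cos_theta)
print(apply(A, [1, 1, 1]))                      # (1, 1, 1) is fixed by A
print(matmul(A, matmul(A, A)) == I3)            # A^3 = I_3
```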

A rotation of three-dimensional space fixes its axis \(D\) pointwise and turns the orthogonal plane \(D^\perp\) through the angle \(\theta\).

Going further

This chapter classified the endomorphisms that preserve the geometry of a Euclidean space: the isometries, completely reduced into \(\pm 1\) blocks and plane rotations. Chapter Self-adjoint endomorphisms treats the other distinguished family --- the endomorphisms equal to their own adjoint --- and proves the spectral theorem: a self-adjoint endomorphism is diagonalisable in an orthonormal basis. Isometries and self-adjoint endomorphisms are the two halves of the geometry of a Euclidean space, and the reduction techniques of this chapter and the next are their two complementary faces.

Skills to practice

Studying isometries in dimension 3
