CommeUnJeu · L2 MP
Polynomials of an endomorphism
The chapter Reduction: eigen-elements and diagonalisation reduced an endomorphism through its geometry --- eigenvalues, eigenspaces, the characteristic polynomial. This chapter adds the algebraic tool that makes reduction systematic: a polynomial can be evaluated not only at a number but at an endomorphism \(u\), producing \(P(u)\), and a single relation \(P(u) = 0\) turns out to govern the whole reduction.
The chapter has four sections. Section 1 builds \(P(u)\) and the subalgebra \(\mathbb{K}[u]\), then isolates the annihilating polynomials --- those with \(P(u) = 0\) --- and the smallest of them, the minimal polynomial \(\pi_u\). Section 2 proves the kernel-decomposition lemma: a polynomial in \(u\) that factors into coprime pieces splits the space along those pieces. Section 3 turns this into the cleanest diagonalisability test --- \(u\) is diagonalisable exactly when \(\pi_u\) is split with simple roots. Section 4 proves the Cayley--Hamilton theorem, which provides one annihilating polynomial for free, and decomposes the space into characteristic subspaces.
Standing notation. Throughout, \(\mathbb{K}\) is a subfield of \(\mathbb{C}\); \(E\) is a non-zero \(\mathbb{K}\)-vector space of finite dimension \(\dim E = n \ge 1\); \(u \in \mathcal{L}(E)\) is an endomorphism, \(M \in \mathcal{M}_n(\mathbb{K})\) a square matrix, \(\mathrm{Id}_E\) the identity. The eigenvalue, eigenvector, eigenspace \(E_\lambda(u) = \mathrm{Ker}(u - \lambda\,\mathrm{Id}_E)\), spectrum \(\mathrm{Sp}(u)\), characteristic polynomial \(\chi_u\) (monic of degree \(n\)), multiplicity \(m(\lambda)\), diagonalisable, trigonalisable, nilpotent endomorphism, induced endomorphism on a stable subspace --- with \(\chi_{u_F} \mid \chi_u\) --- are those of Reduction: eigen-elements and diagonalisation; the divisibility, gcd and Bézout's theorem in \(\mathbb{K}[X]\), and the split polynomials and irreducible factorisation, are those of Arithmetic of polynomials and Polynomials. The symbol \(A \sim B\) denotes similar matrices.
I
Polynomials of an endomorphism
I.1
The algebra of polynomials in an endomorphism
A polynomial \(P = \sum_k a_k X^k\) can be evaluated at a number. It can equally be evaluated at an endomorphism \(u\): replace each power \(X^k\) by the iterate \(u^k\) and each constant by the corresponding homothety. The result \(P(u)\) is again an endomorphism, and --- as we now check --- the rules of polynomial algebra survive the substitution.
Definition — Polynomial of an endomorphism
Let \(u \in \mathcal{L}(E)\) and \(P = \sum_{k=0}^d a_k X^k \in \mathbb{K}[X]\). The polynomial \(P\) of the endomorphism \(u\) is the endomorphism $$ \textcolor{colordef}{P(u) = \sum_{k=0}^d a_k\, u^k,} $$ where \(u^k = u \circ \dots \circ u\) (\(k\) factors) and \(u^0 = \mathrm{Id}_E\). For a matrix \(M \in \mathcal{M}_n(\mathbb{K})\), \(P(M) = \sum_k a_k M^k\) with \(M^0 = I_n\).
Example — Reading off a polynomial of an endomorphism
For \(P = X^2 - 3X + 2\), one has \(P(u) = u^2 - 3u + 2\,\mathrm{Id}_E\). For a homothety \(u = \lambda\,\mathrm{Id}_E\), every power is \(u^k = \lambda^k\,\mathrm{Id}_E\), so \(P(\lambda\,\mathrm{Id}_E) = \bigl(\sum_k a_k \lambda^k\bigr)\mathrm{Id}_E = P(\lambda)\,\mathrm{Id}_E\): evaluating at a homothety is evaluating at the scalar.
Example — A polynomial of a matrix
Take \(M = \begin{pmatrix} 1 & 1 \\ 0 & 2 \end{pmatrix}\) and \(P = X^2 - X\). Then \(M^2 = \begin{pmatrix} 1 & 3 \\ 0 & 4 \end{pmatrix}\), so $$ P(M) = M^2 - M = \begin{pmatrix} 1 & 3 \\ 0 & 4 \end{pmatrix} - \begin{pmatrix} 1 & 1 \\ 0 & 2 \end{pmatrix} = \begin{pmatrix} 0 & 2 \\ 0 & 2 \end{pmatrix}. $$
Theorem — The morphism of polynomials in an endomorphism
Fix \(u \in \mathcal{L}(E)\). The map \(\Phi_u \colon P \mapsto P(u)\) is an algebra morphism from \(\mathbb{K}[X]\) to \(\mathcal{L}(E)\): $$ \textcolor{colorprop}{(P + \lambda Q)(u) = P(u) + \lambda\, Q(u), \qquad (PQ)(u) = P(u) \circ Q(u), \qquad 1(u) = \mathrm{Id}_E.} $$ Its image \(\mathbb{K}[u] = \{ P(u) : P \in \mathbb{K}[X]\}\) is a commutative subalgebra of \(\mathcal{L}(E)\).
Linearity is read off the definition: if \(P = \sum a_k X^k\) and \(Q = \sum b_k X^k\), then \((P + \lambda Q)(u) = \sum (a_k + \lambda b_k) u^k = P(u) + \lambda\, Q(u)\). The image of \(1\) is \(1 \cdot u^0 = \mathrm{Id}_E\). For the product, write \(P = \sum_i a_i X^i\) and \(Q = \sum_j b_j X^j\), so \(PQ = \sum_k \bigl(\sum_{i+j=k} a_i b_j\bigr) X^k\). Since \(u^i \circ u^j = u^{i+j}\), $$ \begin{aligned} P(u) \circ Q(u) &= \Bigl(\sum_i a_i u^i\Bigr) \circ \Bigl(\sum_j b_j u^j\Bigr) = \sum_{i,j} a_i b_j\, u^i \circ u^j\\
&= \sum_{i,j} a_i b_j\, u^{i+j} = \sum_k \Bigl(\sum_{i+j=k} a_i b_j\Bigr) u^k = (PQ)(u). \end{aligned} $$ So \(\Phi_u\) is an algebra morphism. Its image \(\mathbb{K}[u]\) is therefore a subalgebra of \(\mathcal{L}(E)\), and it is commutative: for any \(P, Q\), \(P(u) \circ Q(u) = (PQ)(u) = (QP)(u) = Q(u) \circ P(u)\), the product of \(\mathbb{K}[X]\) being commutative.
Notation. The product \((PQ)(u) = P(u) \circ Q(u)\) shows that composition of polynomials in \(u\) is just multiplication of polynomials read through \(\Phi_u\). Composition of two elements of \(\mathbb{K}[u]\) is therefore commonly written as a product, \(P(u)\,Q(u)\), and powers \(u^k\) as such. Beware the one trap: \(P(u)\) is the endomorphism \(\sum a_k u^k\), applied to a vector \(x\) as \(P(u)(x)\); this is not \(P(u(x))\) --- one evaluates the endomorphism \(P(u)\) at the vector \(x\), never a polynomial at a vector.
The morphism \(\Phi_u\) carries the polynomial algebra \(\mathbb{K}[X]\) into \(\mathcal{L}(E)\); its image \(\mathbb{K}[u]\) is the subalgebra of all polynomials evaluated at \(u\).
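The morphism property can be checked numerically on a small case. The sketch below is only an illustration: the helper names `mat_mul`, `poly_at_matrix` and `poly_mul` are hypothetical, polynomials are stored as coefficient lists `[a_0, a_1, ..., a_d]` (an assumed convention), and the second polynomial \(Q = X + 1\) is chosen here for the test. It evaluates \(P(M)\) for \(P = X^2 - X\) and \(M = \begin{pmatrix} 1 & 1 \\ 0 & 2 \end{pmatrix}\), then compares \((PQ)(M)\) with \(P(M)\,Q(M)\).

```python
def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def poly_at_matrix(coeffs, M):
    """Evaluate P(M) by Horner's rule; coeffs = [a_0, a_1, ..., a_d]."""
    n = len(M)
    result = [[0] * n for _ in range(n)]
    for a in reversed(coeffs):
        result = mat_mul(result, M)
        # Add the homothety a * I_n (the constant a becomes a * Id).
        result = [[result[i][j] + (a if i == j else 0) for j in range(n)]
                  for i in range(n)]
    return result

def poly_mul(p, q):
    """Coefficient list of the product of two polynomials."""
    r = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            r[i + j] += a * b
    return r

M = [[1, 1], [0, 2]]
P = [0, -1, 1]   # X^2 - X
Q = [1, 1]       # X + 1 (chosen for this illustration)

P_of_M = poly_at_matrix(P, M)
Q_of_M = poly_at_matrix(Q, M)
PQ_of_M = poly_at_matrix(poly_mul(P, Q), M)
```

Here `P_of_M` reproduces the matrix of the example above, and `PQ_of_M` coincides with both `mat_mul(P_of_M, Q_of_M)` and `mat_mul(Q_of_M, P_of_M)`, as the commutativity of \(\mathbb{K}[u]\) predicts.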
Proposition — Kernel and image of a polynomial in u
For every \(P \in \mathbb{K}[X]\), the subspaces \(\mathrm{Ker}\bigl(P(u)\bigr)\) and \(\mathrm{Im}\bigl(P(u)\bigr)\) are stable by \(u\).
The endomorphisms \(u\) and \(P(u)\) commute: \(u \circ P(u) = \sum_k a_k\, u^{k+1} = P(u) \circ u\). Let \(x \in \mathrm{Ker}\bigl(P(u)\bigr)\); then \(P(u)\bigl(u(x)\bigr) = u\bigl(P(u)(x)\bigr) = u(0_E) = 0_E\), so \(u(x) \in \mathrm{Ker}\bigl(P(u)\bigr)\). Let \(y \in \mathrm{Im}\bigl(P(u)\bigr)\), say \(y = P(u)(z)\); then \(u(y) = u\bigl(P(u)(z)\bigr) = P(u)\bigl(u(z)\bigr) \in \mathrm{Im}\bigl(P(u)\bigr)\). Both subspaces are stable by \(u\).
Example — The eigenspace as a polynomial kernel
For \(P = X - \lambda\), \(P(u) = u - \lambda\,\mathrm{Id}_E\) and \(\mathrm{Ker}\bigl(P(u)\bigr) = \mathrm{Ker}(u - \lambda\,\mathrm{Id}_E) = E_\lambda(u)\), the \(\lambda\)-eigenspace recalled from Reduction: eigen-elements and diagonalisation. The eigenspaces are the simplest kernels of polynomials in \(u\) --- the kernel-decomposition lemma of § 2 will produce the general ones.
Method — Exploit a polynomial relation satisfied by an endomorphism
When \(u\) satisfies a relation \(P(u) = 0\), rewrite it to read off information about \(u\):
- from \(u^2 = a\,u + b\,\mathrm{Id}_E\), the powers \(u^k\) all lie in \(\mathrm{Vect}(\mathrm{Id}_E, u)\) --- compute them by the recurrence \(u^{k+1} = a\,u^k + b\,u^{k-1}\);
- if moreover \(b \ne 0\), then \(u \circ \tfrac{1}{b}(u - a\,\mathrm{Id}_E) = \mathrm{Id}_E\), so \(u\) is invertible with \(u^{-1} = \tfrac{1}{b}(u - a\,\mathrm{Id}_E)\).
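Both bullets can be tried on a concrete matrix. The sketch below assumes the matrix \(M = \begin{pmatrix} 1 & 1 \\ 0 & 2 \end{pmatrix}\), for which a direct computation gives \(M^2 = 3M - 2I_2\) (so \(a = 3\), \(b = -2\) in the notation of the method); the helpers `mat_mul` and `combine` are hypothetical names introduced for the illustration. Exact rational arithmetic is used so that the inverse is not affected by rounding.

```python
from fractions import Fraction

def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def combine(a, A, b, B):
    """Entrywise linear combination a*A + b*B."""
    n = len(A)
    return [[a * A[i][j] + b * B[i][j] for j in range(n)] for i in range(n)]

M = [[Fraction(1), Fraction(1)], [Fraction(0), Fraction(2)]]
I2 = [[Fraction(1), Fraction(0)], [Fraction(0), Fraction(1)]]
a, b = Fraction(3), Fraction(-2)   # assumed relation: M^2 = 3 M - 2 I_2

# Check the relation itself before exploiting it.
relation_holds = mat_mul(M, M) == combine(a, M, b, I2)

# First bullet: powers by the recurrence M^(k+1) = a M^k + b M^(k-1).
powers = [I2, M]
for _ in range(4):
    powers.append(combine(a, powers[-1], b, powers[-2]))

# Same power computed by repeated multiplication, for comparison.
direct = I2
for _ in range(5):
    direct = mat_mul(direct, M)

# Second bullet: b != 0, so M^(-1) = (1/b)(M - a I_2).
M_inv = combine(1 / b, M, -a / b, I2)
```

The recurrence needs only linear combinations of the two previous powers, whereas the direct computation multiplies matrices at every step.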
Skills to practice
- Computing polynomials of an endomorphism
I.2
Annihilating polynomials and the minimal polynomial
Some polynomials send \(u\) to the zero endomorphism. In finite dimension there is always at least one such non-zero polynomial, and among them a smallest one --- the minimal polynomial. It divides every other, so it carries all the algebraic information a polynomial relation can give about \(u\).
Definition — Annihilating polynomial
A polynomial \(P \in \mathbb{K}[X]\) is an annihilating polynomial of \(u\) (one says \(P\) annihilates \(u\)) when \(P(u) = 0_{\mathcal{L}(E)}\). Likewise \(P\) annihilates a matrix \(M\) when \(P(M) = 0\).
Example — Annihilating polynomials of familiar endomorphisms
\(X - \lambda\) annihilates the homothety \(\lambda\,\mathrm{Id}_E\). A projector \(p\) satisfies \(p^2 = p\), so \(X^2 - X\) annihilates it. A symmetry \(s\) satisfies \(s^2 = \mathrm{Id}_E\), so \(X^2 - 1\) annihilates it. A nilpotent endomorphism of index \(p\) satisfies \(u^p = 0\), so \(X^p\) annihilates it.
Proposition — Existence of an annihilating polynomial
In finite dimension, every \(u \in \mathcal{L}(E)\) admits a non-zero annihilating polynomial.
The space \(\mathcal{L}(E)\) has dimension \(n^2\). The family \(\bigl(\mathrm{Id}_E, u, u^2, \dots, u^{n^2}\bigr)\) has \(n^2 + 1\) vectors of \(\mathcal{L}(E)\), hence is linearly dependent: there are scalars \(a_0, \dots, a_{n^2}\), not all zero, with \(\sum_{k=0}^{n^2} a_k\, u^k = 0_{\mathcal{L}(E)}\). The polynomial \(P = \sum_{k=0}^{n^2} a_k X^k\) is non-zero and \(P(u) = 0\).
Theorem — The minimal polynomial
There is a unique monic polynomial of minimal degree among the non-zero annihilating polynomials of \(u\): the minimal polynomial \(\pi_u\). A polynomial \(P\) annihilates \(u\) if and only if $$ \textcolor{colorprop}{\pi_u \mid P.} $$
A non-zero constant \(c\) has \(c(u) = c\,\mathrm{Id}_E \ne 0\), so a non-zero annihilating polynomial has degree at least \(1\). The set of degrees of the non-zero annihilating polynomials is a non-empty subset of \(\mathbb{N}\) (Proposition above), so it has a least element \(d\); pick a non-zero annihilator of degree \(d\) and divide it by its leading coefficient to obtain a monic annihilator \(\pi_u\) of degree \(d\).
Let \(P\) be any annihilating polynomial. Euclidean division by \(\pi_u\) gives \(P = Q\,\pi_u + R\) with \(\deg R < d\). Then \(R(u) = P(u) - Q(u) \circ \pi_u(u) = 0 - 0 = 0\), so \(R\) annihilates \(u\); if \(R \ne 0\) it would be a non-zero annihilator of degree \(< d\), against the minimality of \(d\) --- hence \(R = 0\) and \(\pi_u \mid P\). Conversely, if \(\pi_u \mid P\), say \(P = Q\,\pi_u\), then \(P(u) = Q(u) \circ \pi_u(u) = 0\).
Uniqueness: if \(\pi\) and \(\pi'\) are both monic annihilators of minimal degree \(d\), each divides the other (both annihilate \(u\)), so they are equal up to a scalar; both monic forces \(\pi = \pi'\).
Example — Minimal polynomials of familiar endomorphisms
The homothety \(\lambda\,\mathrm{Id}_E\) has \(\pi_u = X - \lambda\). A projector \(p\) has \(\pi_p = X^2 - X\), unless \(p = 0\) (\(\pi_p = X\)) or \(p = \mathrm{Id}_E\) (\(\pi_p = X - 1\)). A symmetry \(s\) has \(\pi_s = X^2 - 1\), unless \(s = \mathrm{Id}_E\) (\(\pi_s = X - 1\)) or \(s = -\mathrm{Id}_E\) (\(\pi_s = X + 1\)). In the non-degenerate cases the degree is \(2\), since no degree-\(1\) polynomial \(X - \mu\) annihilates a \(p\) or \(s\) that is not a homothety.
Proposition — Similarity invariance of the minimal polynomial
Two similar matrices have the same minimal polynomial. Consequently the minimal polynomial of an endomorphism does not depend on the basis chosen to represent it.
Let \(M' = Q^{-1} M Q\) with \(Q\) invertible. For every \(k\), \((M')^k = Q^{-1} M^k Q\), so by linearity \(P(M') = Q^{-1} P(M)\, Q\) for every \(P \in \mathbb{K}[X]\). Hence \(P(M') = 0 \iff P(M) = 0\): \(M\) and \(M'\) have exactly the same annihilating polynomials, therefore the same minimal polynomial. Since the matrices of an endomorphism in two bases are similar, \(\pi_u\) is well defined.
Proposition — A basis of the algebra of polynomials in u
Let \(d = \deg \pi_u\). The family \(\bigl(\mathrm{Id}_E, u, \dots, u^{d-1}\bigr)\) is a basis of \(\mathbb{K}[u]\); in particular \(\dim \mathbb{K}[u] = \deg \pi_u\).
Spanning. An element of \(\mathbb{K}[u]\) is \(P(u)\) for some \(P\). Euclidean division by \(\pi_u\) gives \(P = Q\,\pi_u + R\) with \(\deg R < d\), so \(P(u) = Q(u) \circ \pi_u(u) + R(u) = R(u)\), a linear combination of \(\mathrm{Id}_E, u, \dots, u^{d-1}\).
Freeness. A relation \(\sum_{k=0}^{d-1} c_k\, u^k = 0\) means the polynomial \(R = \sum_{k=0}^{d-1} c_k X^k\) annihilates \(u\); as \(\deg R < d = \deg \pi_u\), \(R\) cannot be a non-zero annihilator, so \(R = 0\) and every \(c_k\) is zero.
The family is therefore a basis of \(\mathbb{K}[u]\), of cardinality \(d\).
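The equality \(\dim \mathbb{K}[u] = \deg \pi_u\) can be observed on the familiar examples above. The sketch below is an illustration under stated assumptions: the helpers `rank` and `dim_of_polynomial_algebra` are hypothetical names, the three matrices are chosen here as a projector, the identity and a symmetry, and the dimension is computed as the rank of the flattened powers \(I_n, M, \dots, M^{n^2}\), which span \(\mathbb{K}[M]\) since \(\deg \pi_M \le n^2\).

```python
from fractions import Fraction

def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def rank(rows):
    """Rank of a list of vectors, by Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in row] for row in rows]
    rk = 0
    for col in range(len(rows[0])):
        pivot = next((r for r in range(rk, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for r in range(rk + 1, len(rows)):
            factor = rows[r][col] / rows[rk][col]
            rows[r] = [x - factor * y for x, y in zip(rows[r], rows[rk])]
        rk += 1
        if rk == len(rows):
            break
    return rk

def dim_of_polynomial_algebra(M):
    """Dimension of K[M], as the rank of the family (I, M, ..., M^(n^2))."""
    n = len(M)
    power = [[int(i == j) for j in range(n)] for i in range(n)]
    flattened = []
    for _ in range(n * n + 1):
        flattened.append([x for row in power for x in row])
        power = mat_mul(power, M)
    return rank(flattened)

projector = [[1, 1], [0, 0]]   # p^2 = p, minimal polynomial X^2 - X
identity = [[1, 0], [0, 1]]    # minimal polynomial X - 1
symmetry = [[0, 1], [1, 0]]    # s^2 = I_2, minimal polynomial X^2 - 1

dims = [dim_of_polynomial_algebra(A) for A in (projector, identity, symmetry)]
```

The three dimensions obtained are the degrees \(2\), \(1\), \(2\) of the corresponding minimal polynomials.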
Example — Finding a minimal polynomial
Determine the minimal polynomial of \(M = \begin{pmatrix} 3 & 1 \\ 0 & 3 \end{pmatrix}\).
Write \(M = 3I_2 + N\) with \(N = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}\). Then \(N^2 = 0\), so \((M - 3I_2)^2 = N^2 = 0\): the polynomial \((X - 3)^2\) annihilates \(M\). No degree-\(1\) polynomial annihilates \(M\), for \(M - \mu I_2 = 0\) would force \(M\) to be a homothety, which it is not. Hence the minimal polynomial, monic of least degree, is $$ \pi_M = (X - 3)^2. $$
Method — Determine the minimal polynomial
To find \(\pi_u\):
- exhibit a non-zero annihilating polynomial \(P\) --- often by spotting a relation on \(u\) (a projector, a symmetry, a nilpotent, a given identity);
- then \(\pi_u \mid P\), so \(\pi_u\) is one of the monic divisors of \(P\); test them by increasing degree --- the one of least degree that annihilates \(u\) is \(\pi_u\).
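The two steps of the method can be sketched on the matrix \(M = \begin{pmatrix} 3 & 1 \\ 0 & 3 \end{pmatrix}\) treated above, with the annihilator \((X - 3)^2\). The helper names `poly_at_matrix` and `is_zero` are hypothetical, and the list of monic divisors is written by hand rather than computed; polynomials are coefficient lists `[a_0, ..., a_d]`.

```python
def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def poly_at_matrix(coeffs, M):
    """Evaluate P(M) by Horner's rule; coeffs = [a_0, a_1, ..., a_d]."""
    n = len(M)
    result = [[0] * n for _ in range(n)]
    for a in reversed(coeffs):
        result = mat_mul(result, M)
        result = [[result[i][j] + (a if i == j else 0) for j in range(n)]
                  for i in range(n)]
    return result

def is_zero(A):
    """True when every entry of the matrix is zero."""
    return all(x == 0 for row in A for x in row)

M = [[3, 1], [0, 3]]

# Monic divisors of (X - 3)^2, listed by increasing degree (entered by hand).
candidates = [
    [1],          # 1
    [-3, 1],      # X - 3
    [9, -6, 1],   # (X - 3)^2 = X^2 - 6X + 9
]

# The first candidate that annihilates M is the minimal polynomial.
minimal = next(c for c in candidates if is_zero(poly_at_matrix(c, M)))
```

The search stops at the last candidate, in agreement with \(\pi_M = (X - 3)^2\).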
Skills to practice
- Determining a minimal polynomial
II
The kernel-decomposition lemma
II.1
The lemma of kernels
The kernel of a polynomial in \(u\) can be hard to grasp directly. But when the polynomial factors into coprime pieces, the kernel splits as the direct sum of the kernels of the pieces. This single lemma --- proved from Bézout's theorem --- is the engine that drives the rest of the chapter.
Theorem — The lemma of kernels
Let \(P_1, P_2 \in \mathbb{K}[X]\) be coprime. For every \(u \in \mathcal{L}(E)\), $$ \textcolor{colorprop}{\mathrm{Ker}\bigl((P_1 P_2)(u)\bigr) = \mathrm{Ker}\bigl(P_1(u)\bigr) \oplus \mathrm{Ker}\bigl(P_2(u)\bigr).} $$
Since \(P_1\) and \(P_2\) are coprime, Bézout's theorem (recalled from Arithmetic of polynomials) gives \(A, B \in \mathbb{K}[X]\) with \(A P_1 + B P_2 = 1\). Reading this through the morphism \(\Phi_u\): $$ A(u) \circ P_1(u) + B(u) \circ P_2(u) = \mathrm{Id}_E. \tag{\(\star\)} $$ The two kernels lie in the left-hand side. As \((P_1 P_2)(u) = P_2(u) \circ P_1(u) = P_1(u) \circ P_2(u)\), a vector killed by \(P_1(u)\) or by \(P_2(u)\) is killed by \((P_1 P_2)(u)\).
The sum is direct. Let \(x \in \mathrm{Ker}\bigl(P_1(u)\bigr) \cap \mathrm{Ker}\bigl(P_2(u)\bigr)\). Applying \((\star)\) to \(x\): \(x = A(u)\bigl(P_1(u)(x)\bigr) + B(u)\bigl(P_2(u)(x)\bigr) = A(u)(0_E) + B(u)(0_E) = 0_E\).
The sum spans the left-hand side. Let \(x \in \mathrm{Ker}\bigl((P_1 P_2)(u)\bigr)\). Set \(x_1 = B(u)\bigl(P_2(u)(x)\bigr)\) and \(x_2 = A(u)\bigl(P_1(u)(x)\bigr)\); by \((\star)\), \(x = x_1 + x_2\). Now, all these polynomials in \(u\) commuting, $$ P_1(u)(x_1) = B(u)\bigl((P_1 P_2)(u)(x)\bigr) = B(u)(0_E) = 0_E, $$ so \(x_1 \in \mathrm{Ker}\bigl(P_1(u)\bigr)\); likewise \(P_2(u)(x_2) = A(u)\bigl((P_1 P_2)(u)(x)\bigr) = 0_E\), so \(x_2 \in \mathrm{Ker}\bigl(P_2(u)\bigr)\).
The three points together give the announced direct sum.
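The construction of \(x_1\) and \(x_2\) in the proof is explicit enough to be run. The sketch below is an illustration on assumed data: the matrix \(M = \begin{pmatrix} 1 & 1 \\ 0 & 2 \end{pmatrix}\), the coprime factors \(P_1 = X - 1\) and \(P_2 = X - 2\), and the Bézout relation \(1 \cdot P_1 + (-1) \cdot P_2 = 1\), so \(A = 1\) and \(B = -1\). The helper names `mat_vec` and `shifted` are hypothetical.

```python
def mat_vec(A, x):
    """Image of the vector x by the matrix A."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def shifted(M, mu):
    """The matrix M - mu * I_n."""
    n = len(M)
    return [[M[i][j] - (mu if i == j else 0) for j in range(n)] for i in range(n)]

M = [[1, 1], [0, 2]]
P1_of_M = shifted(M, 1)   # P1 = X - 1
P2_of_M = shifted(M, 2)   # P2 = X - 2

x = [2, 3]
# x lies in Ker((P1 P2)(M)): here (M - I_2)(M - 2 I_2) is the zero matrix.
in_big_kernel = mat_vec(P1_of_M, mat_vec(P2_of_M, x)) == [0, 0]

# Components from the proof, with A = 1 and B = -1.
x1 = [-c for c in mat_vec(P2_of_M, x)]   # x1 = B(M) P2(M) x
x2 = mat_vec(P1_of_M, x)                 # x2 = A(M) P1(M) x

recombined = [s + t for s, t in zip(x1, x2)]
```

One finds `x1 == [-1, 0]` in \(\mathrm{Ker}(M - I_2)\) and `x2 == [3, 3]` in \(\mathrm{Ker}(M - 2I_2)\), with `recombined == x`.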
Example — The lemma of kernels on a projector
A projector \(p\) satisfies \(p^2 - p = 0\), so \(X^2 - X = X(X-1)\) annihilates it and \(\mathrm{Ker}\bigl((X^2-X)(p)\bigr) = E\). The factors \(X\) and \(X - 1\) are coprime (\(1 \cdot X + (-1)(X-1) = 1\)), so the lemma gives $$ E = \mathrm{Ker}(p) \oplus \mathrm{Ker}(p - \mathrm{Id}_E), $$ the decomposition of \(E\) into the kernel and the image of the projector --- recovered here as an instance of the kernel lemma.
Theorem — The lemma of kernels for several factors
Let \(P_1, \dots, P_r \in \mathbb{K}[X]\) be pairwise coprime and \(P = P_1 \cdots P_r\). For every \(u \in \mathcal{L}(E)\), $$ \mathrm{Ker}\bigl(P(u)\bigr) = \bigoplus_{i=1}^r \mathrm{Ker}\bigl(P_i(u)\bigr). $$
Induction on \(r\). The case \(r = 1\) is trivial and \(r = 2\) is the lemma of kernels. Suppose the result holds for \(r - 1\) factors (\(r \ge 3\)). The polynomials \(P_1, \dots, P_{r-1}\) are each coprime to \(P_r\), and a product of polynomials each coprime to \(P_r\) is coprime to \(P_r\) (recalled from Arithmetic of polynomials), so \(P_1 \cdots P_{r-1}\) is coprime to \(P_r\). The two-factor lemma applied to \((P_1 \cdots P_{r-1},\, P_r)\) gives $$ \mathrm{Ker}\bigl(P(u)\bigr) = \mathrm{Ker}\bigl((P_1 \cdots P_{r-1})(u)\bigr) \oplus \mathrm{Ker}\bigl(P_r(u)\bigr), $$ and the induction hypothesis decomposes \(\mathrm{Ker}\bigl((P_1 \cdots P_{r-1})(u)\bigr) = \bigoplus_{i=1}^{r-1} \mathrm{Ker}\bigl(P_i(u)\bigr)\). Concatenating the two gives the announced decomposition.
Example — A kernel split by a quadratic factor
Suppose \(u \in \mathcal{L}(E)\), with \(E\) a real vector space, satisfies \((u^2 + \mathrm{Id}_E) \circ (u - \mathrm{Id}_E) = 0\). The polynomials \(X^2 + 1\) and \(X - 1\) are coprime in \(\mathbb{R}[X]\) (the real number \(1\) is not a root of \(X^2 + 1\)), so the lemma gives $$ E = \mathrm{Ker}(u^2 + \mathrm{Id}_E) \oplus \mathrm{Ker}(u - \mathrm{Id}_E). $$ The kernel lemma applies just as well to a non-linear factor.
Skills to practice
- Applying the lemma of kernels
II.2
Decomposition of the space by an annihilating polynomial
When the polynomial fed to the kernel lemma annihilates \(u\), its kernel is the whole space. The lemma then decomposes \(E\) itself into a direct sum of \(u\)-stable subspaces --- the bridge from a polynomial relation to the geometric reduction of \(u\).
Proposition — Decomposition by an annihilating polynomial
Let \(P\) be a non-zero annihilating polynomial of \(u\) that factors as \(P = P_1 \cdots P_r\) with the \(P_i\) pairwise coprime. Then $$ E = \bigoplus_{i=1}^r \mathrm{Ker}\bigl(P_i(u)\bigr), $$ and each \(\mathrm{Ker}\bigl(P_i(u)\bigr)\) is stable by \(u\).
Since \(P\) annihilates \(u\), \(P(u) = 0\) and \(\mathrm{Ker}\bigl(P(u)\bigr) = E\). The lemma of kernels for several factors gives \(E = \bigoplus_i \mathrm{Ker}\bigl(P_i(u)\bigr)\). Each \(\mathrm{Ker}\bigl(P_i(u)\bigr)\) is stable by \(u\) by the § 1.1 Proposition.
Proposition — The induced endomorphism is annihilated by the factor
In the decomposition \(E = \bigoplus_i \mathrm{Ker}\bigl(P_i(u)\bigr)\) above, the endomorphism induced by \(u\) on \(\mathrm{Ker}\bigl(P_i(u)\bigr)\) is annihilated by \(P_i\).
Write \(F_i = \mathrm{Ker}\bigl(P_i(u)\bigr)\) and let \(u_i\) be the endomorphism induced by \(u\) on \(F_i\) (well defined, \(F_i\) being \(u\)-stable). Since \(F_i\) is \(u\)-stable, \(u_i^k = (u^k)_{|F_i}\) for every \(k\), hence \(P_i(u_i) = \bigl(P_i(u)\bigr)_{|F_i}\). For \(x \in F_i\), \(P_i(u)(x) = 0_E\) by definition of \(F_i\), so \(P_i(u_i)(x) = 0_E\); this holds for every \(x \in F_i\), so \(P_i(u_i) = 0\).
Example — A space split by a cubic relation
Suppose \(u^3 = u\), that is \((X^3 - X)(u) = 0\). The factorisation \(X^3 - X = X(X-1)(X+1)\) has pairwise-coprime factors, so $$ E = \mathrm{Ker}(u) \oplus \mathrm{Ker}(u - \mathrm{Id}_E) \oplus \mathrm{Ker}(u + \mathrm{Id}_E). $$ A single relation \(u^3 = u\) thus splits \(E\) into three \(u\)-stable subspaces, on which \(u\) acts respectively as the zero map, as the identity and as the opposite of the identity of that subspace.
The kernel decomposition \(E = \bigoplus_i \mathrm{Ker}\bigl(P_i(u)\bigr)\) presents the space as a row of independent \(u\)-stable boxes, one per coprime factor of an annihilating polynomial.
Method — Decompose the space from an annihilating polynomial
Given a non-zero annihilating polynomial \(P\) of \(u\):
- factor \(P\) into pairwise-coprime pieces \(P = P_1 \cdots P_r\) --- for instance its distinct monic irreducible factors raised to their powers, a non-zero scalar factor being irrelevant to the kernels;
- the space then splits as \(E = \bigoplus_i \mathrm{Ker}\bigl(P_i(u)\bigr)\), a direct sum of \(u\)-stable subspaces;
- study \(u\) on each \(\mathrm{Ker}\bigl(P_i(u)\bigr)\) separately --- the induced endomorphism there is annihilated by the single factor \(P_i\).
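The method can be followed on a matrix satisfying \(M^3 = M\). The sketch below rests on assumptions made for the illustration: the matrix `M` is chosen here, and the three components of a vector are obtained from the identity \(1 = (1 - X^2) + \tfrac{1}{2}(X^2 + X) + \tfrac{1}{2}(X^2 - X)\), a Bézout-type relation for the factors \(X\), \(X - 1\), \(X + 1\) that is easily checked by hand. The helper name `mat_vec` is hypothetical.

```python
from fractions import Fraction

def mat_vec(A, x):
    """Image of the vector x by the matrix A."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

M = [[1, 0, 0],
     [1, 0, 0],
     [0, 0, -1]]   # chosen so that M^3 = M

x = [Fraction(2), Fraction(5), Fraction(4)]
Mx = mat_vec(M, x)
M2x = mat_vec(M, Mx)
M3x = mat_vec(M, M2x)

# Components along Ker(M), Ker(M - I_3), Ker(M + I_3).
x_zero = [xi - m2 for xi, m2 in zip(x, M2x)]          # (I - M^2) x
x_plus = [(m2 + m1) / 2 for m2, m1 in zip(M2x, Mx)]   # (M^2 + M) x / 2
x_minus = [(m2 - m1) / 2 for m2, m1 in zip(M2x, Mx)]  # (M^2 - M) x / 2

total = [a + b + c for a, b, c in zip(x_zero, x_plus, x_minus)]
```

On each component \(M\) acts as predicted: it kills `x_zero`, fixes `x_plus` and sends `x_minus` to its opposite.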
Skills to practice
- Decomposing a space by an annihilating polynomial
III
Annihilating polynomials and diagonalisability
III.1
Eigenvalues and the roots of the minimal polynomial
An annihilating polynomial says something about the eigenvalues: every eigenvalue must be one of its roots. The minimal polynomial says it exactly --- its roots in \(\mathbb{K}\) are precisely the spectrum. This pins the eigenvalues without computing a single determinant.
Proposition — Eigenvalues and annihilating polynomials
If \(u(x) = \lambda x\), then \(P(u)(x) = P(\lambda)\,x\) for every \(P \in \mathbb{K}[X]\). Consequently every eigenvalue of \(u\) is a root of every annihilating polynomial of \(u\).
From \(u(x) = \lambda x\) an immediate induction gives \(u^k(x) = \lambda^k x\) for every \(k\). Hence, for \(P = \sum_k a_k X^k\), $$ P(u)(x) = \sum_k a_k\, u^k(x) = \sum_k a_k\, \lambda^k x = P(\lambda)\, x. $$ Now let \(\lambda\) be an eigenvalue, with an eigenvector \(x \ne 0_E\), and \(P\) an annihilating polynomial. Then \(P(\lambda)\,x = P(u)(x) = 0_E\), and \(x \ne 0_E\) forces \(P(\lambda) = 0\): \(\lambda\) is a root of \(P\).
Example — Bounding the spectrum
If \(u^2 = u\), then \(X^2 - X\) annihilates \(u\), so every eigenvalue is a root of \(X^2 - X\): \(\mathrm{Sp}(u) \subset \{0, 1\}\). If \(u^3 = \mathrm{Id}_E\) with \(E\) a real vector space, then \(X^3 - 1\) annihilates \(u\); its only real root is \(1\), so \(\mathrm{Sp}(u) \subset \{1\}\) --- possibly empty, if \(u\) has no real eigenvalue. An annihilating polynomial caps the spectrum before any eigenspace is computed.
Theorem — The roots of the minimal polynomial
The roots of \(\pi_u\) lying in \(\mathbb{K}\) are exactly the eigenvalues of \(u\): $$ \textcolor{colorprop}{\{\text{roots of }\pi_u\text{ in }\mathbb{K}\} = \mathrm{Sp}(u).} $$
\(\pi_u\) annihilates \(u\), so by the Proposition above every eigenvalue of \(u\) is a root of \(\pi_u\): \(\mathrm{Sp}(u) \subset \{\text{roots of }\pi_u\text{ in }\mathbb{K}\}\).
Conversely, let \(\lambda \in \mathbb{K}\) be a root of \(\pi_u\). Write \(\pi_u = (X - \lambda)\,Q\) with \(Q \in \mathbb{K}[X]\) and \(\deg Q = \deg \pi_u - 1 < \deg \pi_u\). By minimality of \(\pi_u\), \(Q\) does not annihilate \(u\), so \(Q(u) \ne 0\): there is a vector \(y\) with \(Q(u)(y) \ne 0_E\). Then $$ (u - \lambda\,\mathrm{Id}_E)\bigl(Q(u)(y)\bigr) = \bigl((X - \lambda)\,Q\bigr)(u)(y) = \pi_u(u)(y) = 0_E, $$ so \(Q(u)(y)\) is a non-zero vector with \(u\bigl(Q(u)(y)\bigr) = \lambda\, Q(u)(y)\): \(\lambda\) is an eigenvalue of \(u\).
Example — Spectrum read off the minimal polynomial
If \(\pi_u = (X - 2)^3 (X + 1)\) over \(\mathbb{K} = \mathbb{R}\), then \(\mathrm{Sp}(u) = \{2, -1\}\) exactly --- the multiplicities in \(\pi_u\) play no role here. Beware that a non-minimal annihilator gives only an inclusion: \(X(X-1)(X-2)\) annihilates every projector \(p\) (since \(X^2 - X\) divides it), yet \(\mathrm{Sp}(p) \subset \{0, 1\}\), strictly inside the root set \(\{0, 1, 2\}\). Only \(\pi_u\) pins the spectrum.
Method — Bound the spectrum using an annihilating polynomial
To locate the eigenvalues of \(u\) without computing eigenspaces:
- exhibit any annihilating polynomial \(P\) --- then \(\mathrm{Sp}(u)\) is contained in the set of roots of \(P\) lying in \(\mathbb{K}\);
- for the exact spectrum, use the minimal polynomial: \(\mathrm{Sp}(u)\) is exactly the set of roots of \(\pi_u\) in \(\mathbb{K}\).
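The gap between the two bullets shows up on a concrete projector. The sketch below assumes the matrix `p` chosen here (it satisfies \(p^2 = p\)) and the non-minimal annihilator \(X(X-1)(X-2) = X^3 - 3X^2 + 2X\). The method itself needs no determinant; one is used here only as an independent check, through the criterion \(\det(p - \lambda I_2) = 0\) recalled from Reduction: eigen-elements and diagonalisation. The helper names are hypothetical.

```python
def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def poly_at_matrix(coeffs, M):
    """Evaluate P(M) by Horner's rule; coeffs = [a_0, a_1, ..., a_d]."""
    n = len(M)
    result = [[0] * n for _ in range(n)]
    for a in reversed(coeffs):
        result = mat_mul(result, M)
        result = [[result[i][j] + (a if i == j else 0) for j in range(n)]
                  for i in range(n)]
    return result

def det2_shifted(A, lam):
    """Determinant of A - lam * I_2 for a 2x2 matrix A."""
    return (A[0][0] - lam) * (A[1][1] - lam) - A[0][1] * A[1][0]

p = [[1, 1], [0, 0]]           # a projector: p^2 = p
annihilator = [0, 2, -3, 1]    # X^3 - 3X^2 + 2X = X(X - 1)(X - 2)
candidate_roots = [0, 1, 2]

annihilates = poly_at_matrix(annihilator, p) == [[0, 0], [0, 0]]

# Independent check: which candidate roots are actual eigenvalues of p?
eigenvalues = [lam for lam in candidate_roots if det2_shifted(p, lam) == 0]
```

The root \(2\) of the annihilator is not an eigenvalue, while the roots \(0\) and \(1\) of \(\pi_p = X^2 - X\) both are.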
Skills to practice
- Locating the spectrum through an annihilating polynomial
III.2
The polynomial criterion for diagonalisability
Diagonalisability was characterised in Reduction: eigen-elements and diagonalisation by counting eigenspace dimensions. The minimal polynomial gives a far cheaper test: \(u\) is diagonalisable exactly when \(\pi_u\) is split with simple roots --- a single polynomial to inspect.
Theorem — The polynomial criterion for diagonalisability
For \(u \in \mathcal{L}(E)\), the following are equivalent:
- (i) \(u\) is diagonalisable;
- (ii) some non-zero split polynomial with simple roots annihilates \(u\);
- (iii) \(\pi_u\) is split with simple roots.
(i)\(\Rightarrow\)(ii). If \(u\) is diagonalisable, \(E = \bigoplus_{\lambda \in \mathrm{Sp}(u)} E_\lambda(u)\). Set \(P = \prod_{\lambda \in \mathrm{Sp}(u)}(X - \lambda)\), split with simple roots (the eigenvalues are distinct). For \(x \in E_\mu(u)\), the factor \((X - \mu)\) of \(P\) gives \(P(u)(x) = 0_E\); as every vector of \(E\) is a sum of such, \(P(u) = 0\). So \(P\) is a non-zero split simple-root annihilator.
(ii)\(\Rightarrow\)(iii). Let \(P\) be a non-zero split simple-root annihilator. Then \(\pi_u \mid P\), so \(\pi_u\) is a monic divisor of \(P\); the roots of \(\pi_u\) form a sub-multiset of the (distinct) roots of \(P\), so \(\pi_u\) is split with simple roots.
(iii)\(\Rightarrow\)(i). Write \(\pi_u = \prod_{i=1}^r (X - \lambda_i)\) with the \(\lambda_i\) distinct. The factors \((X - \lambda_i)\) are pairwise coprime, and \(\pi_u\) annihilates \(u\), so the § 2.2 Proposition gives $$ E = \bigoplus_{i=1}^r \mathrm{Ker}(u - \lambda_i\,\mathrm{Id}_E) = \bigoplus_{i=1}^r E_{\lambda_i}(u). $$ \(E\) is a direct sum of eigenspaces of \(u\), so \(u\) is diagonalisable.
Example — Projectors and symmetries are diagonalisable
A projector satisfies \(p^2 - p = 0\), so \(X^2 - X = X(X-1)\) annihilates it --- split with the simple roots \(0\) and \(1\); by the criterion, every projector is diagonalisable. A symmetry satisfies \(s^2 - \mathrm{Id}_E = 0\), so \(X^2 - 1 = (X-1)(X+1)\) annihilates it --- split with simple roots; every symmetry is diagonalisable. No eigenspace computation is needed.
Example — Diagonalisability depends on the field
The matrix \(R = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}\) satisfies \(R^2 = -I_2\), so \(X^2 + 1\) annihilates it; and \(R\) is not a homothety, so no degree-\(1\) polynomial annihilates it --- hence \(\pi_R = X^2 + 1\). Over \(\mathbb{R}\), \(X^2 + 1\) is not split, so \(R\) is not diagonalisable in \(\mathcal{M}_2(\mathbb{R})\). Over \(\mathbb{C}\), \(X^2 + 1 = (X - \mathrm{i})(X + \mathrm{i})\) is split with simple roots, so the same matrix, viewed in \(\mathcal{M}_2(\mathbb{C})\), is diagonalisable.
An arbitrary annihilator that fails to be split with simple roots proves nothing about diagonalisability: for instance \((X - 1)^2\) annihilates \(\mathrm{Id}_E\), which is diagonalisable. To establish that \(u\) is not diagonalisable, use the minimal polynomial --- \(u\) is diagonalisable if and only if \(\pi_u\) is split with simple roots --- or a criterion of Reduction: eigen-elements and diagonalisation.
Proposition — Diagonalisability of an induced endomorphism
If \(u\) is diagonalisable and \(F\) is a non-zero \(u\)-stable subspace, the endomorphism induced by \(u\) on \(F\) is diagonalisable.
Let \(u_F\) be the induced endomorphism. As \(F\) is \(u\)-stable, \(\pi_u(u_F) = \bigl(\pi_u(u)\bigr)_{|F} = 0\), so \(\pi_u\) annihilates \(u_F\). Since \(u\) is diagonalisable, \(\pi_u\) is split with simple roots; a non-zero split simple-root polynomial annihilates \(u_F\), so \(u_F\) is diagonalisable by the criterion.
Example — Diagonalisability without eigenspaces
Show that \(M = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}\) is diagonalisable, without computing its eigenspaces.
A direct computation gives \(M^2 = I_2\), so the polynomial \(X^2 - 1\) annihilates \(M\). Now \(X^2 - 1 = (X - 1)(X + 1)\) is split over \(\mathbb{R}\) with the two simple roots \(1\) and \(-1\). A non-zero split simple-root polynomial annihilates \(M\), so by the polynomial criterion \(M\) is diagonalisable.
Method — Prove diagonalisability through an annihilating polynomial
To show \(u\) is diagonalisable without computing eigenspaces:
- find a non-zero polynomial annihilating \(u\) --- typically from a relation such as \(u^2 = \mathrm{Id}_E\), \(u^2 = u\), or a given identity;
- check that it is split with simple roots; if so, \(u\) is diagonalisable;
- to prove \(u\) is not diagonalisable, an arbitrary annihilator does not suffice: invoke the minimal polynomial (\(u\) is diagonalisable if and only if \(\pi_u\) is split with simple roots), or a criterion of Reduction: eigen-elements and diagonalisation.
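The first two steps of the method can be sketched in code. This is a minimal illustration, assuming the candidate annihilator is already known in split form through its list of roots; the helper names (`mat_mul`, `poly_from_roots`, `poly_at_matrix`, `certifies_diagonalisable`) are hypothetical and chosen here for the example. A result of `False` only means that this particular polynomial fails to certify diagonalisability; as the third item of the method says, it proves nothing about non-diagonalisability.

```python
def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def poly_from_roots(roots):
    """Coefficients [c_0, ..., c_d] of the monic polynomial prod (X - r)."""
    coeffs = [1]
    for r in roots:
        shifted = [0] + coeffs                    # X * P
        scaled = [-r * c for c in coeffs] + [0]   # -r * P
        coeffs = [s + t for s, t in zip(shifted, scaled)]
    return coeffs


def poly_at_matrix(coeffs, M):
    """Evaluate P(M) by Horner's scheme, with coeffs = [c_0, ..., c_d]."""
    n = len(M)
    result = [[0] * n for _ in range(n)]
    for c in reversed(coeffs):
        result = mat_mul(result, M)
        for i in range(n):
            result[i][i] += c                     # add c * I_n
    return result


def certifies_diagonalisable(M, roots):
    """True when prod (X - r) has pairwise distinct roots and annihilates M."""
    n = len(M)
    zero = [[0] * n for _ in range(n)]
    simple = len(set(roots)) == len(roots)
    return simple and poly_at_matrix(poly_from_roots(roots), M) == zero


# The swap matrix satisfies M^2 = I_2, so X^2 - 1 = (X - 1)(X + 1) works.
print(certifies_diagonalisable([[0, 1], [1, 0]], [1, -1]))
# A projector is annihilated by X(X - 1).
print(certifies_diagonalisable([[1, 0], [0, 0]], [0, 1]))
```

The roots are passed explicitly so that the "split with simple roots" check reduces to testing that the list has no repetition; factoring an arbitrary polynomial over \(\mathbb{K}\) is deliberately left out of this sketch.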
Skills to practice
- Proving diagonalisability through a polynomial
IV
Cayley-Hamilton and characteristic subspaces
IV.1
The Cayley-Hamilton theorem
So far an annihilating polynomial had to be found by hand. The Cayley--Hamilton theorem provides one for free: the characteristic polynomial always annihilates the endomorphism. Every endomorphism thus carries a canonical annihilating polynomial of degree \(n\).
Theorem — The Cayley-Hamilton theorem
For every \(u \in \mathcal{L}(E)\), $$ \textcolor{colorprop}{\chi_u(u) = 0_{\mathcal{L}(E)}.} $$ The characteristic polynomial of \(u\) annihilates \(u\).
For \(x = 0_E\), \(\chi_u(u)(0_E) = 0_E\) trivially; fix \(x \ne 0_E\). The integers \(j \ge 1\) for which \(\bigl(x, u(x), \dots, u^{j-1}(x)\bigr)\) is free are bounded by \(n = \dim E\), so there is a largest such \(p\); let \(F\) be the span of \(\bigl(x, u(x), \dots, u^{p-1}(x)\bigr)\), a subspace of dimension \(p\).
By maximality of \(p\), the vector \(u^p(x)\) is a linear combination of the earlier ones: $$ u^p(x) = \sum_{k=0}^{p-1} a_k\, u^k(x). $$ Hence \(F\) is stable by \(u\), and in the basis \(\bigl(x, u(x), \dots, u^{p-1}(x)\bigr)\) --- where \(u\) sends each vector to the next, and the last to \(u^p(x) = \sum_k a_k u^k(x)\) --- the matrix of the induced endomorphism \(u_F\) is the companion matrix $$ C = \begin{pmatrix} 0 & 0 & \cdots & 0 & a_0 \\ 1 & 0 & \cdots & 0 & a_1 \\ 0 & 1 & \cdots & 0 & a_2 \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & \cdots & 1 & a_{p-1} \end{pmatrix}. $$ Expand \(\det(X I_p - C)\) along its first row \((X, 0, \dots, 0, -a_0)\): the entry \(X\) contributes \(X\,\det(X I_{p-1} - C')\), where \(C'\) is the companion matrix of \(X^{p-1} - \sum_{k=1}^{p-1} a_k X^{k-1}\); the entry \(-a_0\) contributes \((-a_0)\,(-1)^{1+p}\,(-1)^{p-1} = -a_0\), its minor being upper-triangular with diagonal entries \(-1\). By induction on \(p\) --- the base case \(p = 1\) being \(\det(X - a_0) = X - a_0\) --- \(\det(X I_{p-1} - C') = X^{p-1} - \sum_{k=1}^{p-1} a_k X^{k-1}\), hence $$ \chi_{u_F} = \det(X I_p - C) = X\Bigl(X^{p-1} - \sum_{k=1}^{p-1} a_k X^{k-1}\Bigr) - a_0 = X^p - \sum_{k=0}^{p-1} a_k X^k. $$ Therefore \(\chi_{u_F}(u)(x) = u^p(x) - \sum_{k=0}^{p-1} a_k\, u^k(x) = 0_E\). Since \(F\) is a non-zero \(u\)-stable subspace, \(\chi_{u_F} \mid \chi_u\) (recalled from Reduction: eigen-elements and diagonalisation); writing \(\chi_u = \chi_{u_F}\, R\), $$ \chi_u(u)(x) = R(u)\bigl(\chi_{u_F}(u)(x)\bigr) = R(u)(0_E) = 0_E. $$ This holds for every \(x \ne 0_E\), and also for \(x = 0_E\), so \(\chi_u(u) = 0\).
Remark. The statement of the Cayley--Hamilton theorem is fully in the program; its proof is « démonstration non exigible » (proof not required) at the concours. It is given above, the sole auxiliary step --- the characteristic polynomial of a companion matrix --- being a standard determinant computation (cofactor expansion, induction on the size).
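The auxiliary step can be checked numerically on small cases. The sketch below builds the companion matrix \(C\) of \(X^p - \sum_k a_k X^k\) and compares \(\det(t I_p - C)\) with \(t^p - \sum_k a_k t^k\) at the \(p + 1\) integers \(t = 0, \dots, p\); two monic polynomials of degree \(p\) that agree at \(p + 1\) points are equal. The function names are hypothetical, and the determinant is computed by naive cofactor expansion along the first row, which is only reasonable for small sizes.

```python
def companion(a):
    """Companion matrix of X^p - sum_k a[k] X^k, with a = [a_0, ..., a_{p-1}]."""
    p = len(a)
    C = [[0] * p for _ in range(p)]
    for i in range(1, p):
        C[i][i - 1] = 1        # sub-diagonal of ones
    for i in range(p):
        C[i][p - 1] = a[i]     # last column holds a_0, ..., a_{p-1}
    return C


def det(A):
    """Determinant by cofactor expansion along the first row."""
    n = len(A)
    if n == 1:
        return A[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in A[1:]]
        total += (-1) ** j * A[0][j] * det(minor)
    return total


def char_poly_value(C, t):
    """Value of det(t I - C) at the number t."""
    p = len(C)
    return det([[(t if i == j else 0) - C[i][j] for j in range(p)]
                for i in range(p)])


def matches_expected_polynomial(a):
    """Compare det(t I - C) with t^p - sum_k a[k] t^k at t = 0, ..., p."""
    p = len(a)
    C = companion(a)
    return all(
        char_poly_value(C, t) == t ** p - sum(a[k] * t ** k for k in range(p))
        for t in range(p + 1)
    )


print(matches_expected_polynomial([2, 3]))       # p = 2
print(matches_expected_polynomial([2, -1, 3]))   # p = 3
```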
Example — Cayley-Hamilton for a 2 by 2 matrix
For \(M = \begin{pmatrix} a & b \\ c & d \end{pmatrix}\), \(\chi_M = X^2 - (a + d)X + (ad - bc)\). From \(M^2 = \begin{pmatrix} a^2 + bc & b(a+d) \\ c(a+d) & d^2 + bc \end{pmatrix}\), the off-diagonal entries of \(M^2 - (a+d)M\) vanish and the diagonal ones collapse: $$ M^2 - (a+d)M = \begin{pmatrix} bc - ad & 0 \\ 0 & bc - ad \end{pmatrix} = (bc - ad)\,I_2, $$ so $$ \chi_M(M) = M^2 - (a+d)M + (ad - bc)I_2 = (bc - ad)I_2 + (ad - bc)I_2 = 0. $$
Proposition — The minimal polynomial divides the characteristic polynomial
For every \(u \in \mathcal{L}(E)\), \(\pi_u \mid \chi_u\); in particular \(\deg \pi_u \le n\).
By Cayley--Hamilton, \(\chi_u\) annihilates \(u\); and \(\pi_u\) divides every annihilating polynomial, so \(\pi_u \mid \chi_u\). Hence \(\deg \pi_u \le \deg \chi_u = n\).
Proposition — Roots shared by the minimal and characteristic polynomials
The polynomials \(\pi_u\) and \(\chi_u\) have the same set of roots in \(\mathbb{K}\), namely \(\mathrm{Sp}(u)\) (their multiplicities may differ). Consequently, if \(\chi_u\) is split, then \(\pi_u\) is split.
The roots of \(\pi_u\) in \(\mathbb{K}\) are exactly \(\mathrm{Sp}(u)\) (§ 3.1), and the roots of \(\chi_u\) in \(\mathbb{K}\) are exactly \(\mathrm{Sp}(u)\) (Reduction: eigen-elements and diagonalisation): the two sets of roots in \(\mathbb{K}\) coincide. Moreover \(\pi_u \mid \chi_u\) by the previous Proposition, and a monic divisor of a split polynomial is split (its roots, counted with multiplicity, form a sub-multiset of the dividend's), so \(\chi_u\) split forces \(\pi_u\) split.
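To see that the multiplicities can indeed differ, here is a small illustration on a diagonal matrix chosen for convenience (the matrix \(D\) is an example picked here): $$ D = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{pmatrix}, \qquad \chi_D = (X - 1)^2 (X - 2). $$ Since \((D - I_3)(D - 2I_3) = \mathrm{diag}(0, 0, 1)\,\mathrm{diag}(-1, -1, 0) = 0\) and \(D\) is not a homothety, \(\pi_D = (X - 1)(X - 2)\). Both polynomials have the roots \(1\) and \(2\), but the root \(1\) is double in \(\chi_D\) and simple in \(\pi_D\).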
Method — Compute an inverse or a power through Cayley-Hamilton
The relation \(\chi_u(u) = 0\) is a degree-\(n\) polynomial identity in \(u\):
- writing \(\chi_u = X^n + \dots + c_1 X + c_0\), if \(c_0 \ne 0\) (equivalently \(u\) invertible) then \(u^{-1}\) is a polynomial in \(u\), read off \(\chi_u(u) = 0\);
- every high power \(u^k\) reduces to a combination of \(\mathrm{Id}_E, u, \dots, u^{n-1}\) by taking the remainder of \(X^k\) in the Euclidean division by \(\chi_u\).
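Both items can be sketched in a few lines. The code assumes the coefficients of \(\chi_u\) are already known and passed as a list `chi = [c_0, ..., c_{n-1}, 1]`; the function names are hypothetical. The inverse uses \(u\,(u^{n-1} + c_{n-1} u^{n-2} + \dots + c_1 \mathrm{Id}_E) = -c_0\,\mathrm{Id}_E\), and the power uses the remainder of \(X^k\) modulo \(\chi_u\), built one multiplication by \(X\) at a time.

```python
from fractions import Fraction


def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def poly_at_matrix(coeffs, M):
    """Evaluate P(M) by Horner's scheme, with coeffs = [c_0, ..., c_d]."""
    n = len(M)
    result = [[0] * n for _ in range(n)]
    for c in reversed(coeffs):
        result = mat_mul(result, M)
        for i in range(n):
            result[i][i] += c
    return result


def inverse_from_char_poly(M, chi):
    """M^{-1} = -(1/c_0) * Q(M), where chi = X * Q + c_0 and c_0 != 0."""
    c0 = chi[0]
    if c0 == 0:
        raise ValueError("constant term is zero: the matrix is not invertible")
    Q = poly_at_matrix(chi[1:], M)
    return [[Fraction(-q, c0) for q in row] for row in Q]


def remainder_of_power(k, chi):
    """Coefficients of the remainder of X^k modulo the monic polynomial chi."""
    n = len(chi) - 1
    rem = [1] + [0] * (n - 1)          # the constant polynomial 1
    for _ in range(k):
        top = rem[-1]                  # coefficient that becomes X^n
        rem = [0] + rem[:-1]           # multiply by X
        # replace X^n by -(c_0 + c_1 X + ... + c_{n-1} X^{n-1})
        rem = [r - top * c for r, c in zip(rem, chi[:n])]
    return rem


M = [[1, 1, 0], [0, 1, 1], [0, 0, 1]]
chi = [-1, 3, -3, 1]                   # (X - 1)^3 = X^3 - 3X^2 + 3X - 1

M_inv = inverse_from_char_poly(M, chi)
M_pow5 = poly_at_matrix(remainder_of_power(5, chi), M)
```

With these values, `M_inv` is \(M^2 - 3M + 3I_3\) and `M_pow5` is obtained from a polynomial of degree at most \(2\) in \(M\), with no product beyond \(M^2\) needed.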
Example — An inverse through Cayley-Hamilton
For \(M = \begin{pmatrix} 1 & 1 & 0 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{pmatrix}\), \(\chi_M = (X - 1)^3 = X^3 - 3X^2 + 3X - 1\). Cayley--Hamilton gives \(M^3 - 3M^2 + 3M - I_3 = 0\), hence \(M(M^2 - 3M + 3I_3) = I_3\) and $$ M^{-1} = M^2 - 3M + 3I_3 = \begin{pmatrix} 1 & -1 & 1 \\ 0 & 1 & -1 \\ 0 & 0 & 1 \end{pmatrix}. $$
Skills to practice
- Using the Cayley-Hamilton theorem
IV.2
Characteristic subspaces
When \(\chi_u\) is split, Cayley--Hamilton offers a split annihilating polynomial, and the kernel lemma applied to it decomposes \(E\) into one \(u\)-stable subspace per eigenvalue --- the characteristic subspaces. They contain the eigenspaces, and measure exactly how far \(u\) is from being diagonalisable.
Definition — Characteristic subspace
Let \(\lambda\) be an eigenvalue of \(u\), of multiplicity \(m(\lambda)\). The characteristic subspace of \(u\) associated with \(\lambda\) is $$ \textcolor{colordef}{F_\lambda(u) = \mathrm{Ker}\bigl((u - \lambda\,\mathrm{Id}_E)^{m(\lambda)}\bigr).} $$
The kernel of \(u - \lambda\,\mathrm{Id}_E\) is contained in the kernel of any of its powers, so the eigenspace sits inside the characteristic subspace: $$ E_\lambda(u) = \mathrm{Ker}(u - \lambda\,\mathrm{Id}_E) \subset F_\lambda(u). $$ In particular \(F_\lambda(u) \ne \{0_E\}\), since \(\lambda\) is an eigenvalue.
Theorem — Decomposition into characteristic subspaces
If \(\chi_u\) is split, then $$ E = \bigoplus_{\lambda \in \mathrm{Sp}(u)} F_\lambda(u), $$ a direct sum of \(u\)-stable subspaces.
As \(\chi_u\) is split, \(\chi_u = \prod_{\lambda \in \mathrm{Sp}(u)} (X - \lambda)^{m(\lambda)}\). The factors \((X - \lambda)^{m(\lambda)}\), attached to distinct eigenvalues, are pairwise coprime. By Cayley--Hamilton, \(\chi_u\) annihilates \(u\). The § 2.2 Proposition then gives $$ E = \bigoplus_{\lambda \in \mathrm{Sp}(u)} \mathrm{Ker}\bigl((u - \lambda\,\mathrm{Id}_E)^{m(\lambda)}\bigr) = \bigoplus_{\lambda \in \mathrm{Sp}(u)} F_\lambda(u), $$ each summand \(u\)-stable.
Proposition — Dimension of a characteristic subspace
If \(\chi_u\) is split, then \(\dim F_\lambda(u) = m(\lambda)\) for every eigenvalue \(\lambda\).
Let \(u_\lambda\) be the endomorphism induced by \(u\) on \(F_\lambda(u)\). By definition of \(F_\lambda(u)\), \((u_\lambda - \lambda\,\mathrm{Id})^{m(\lambda)} = 0\), so \(u_\lambda - \lambda\,\mathrm{Id}\) is nilpotent. A nilpotent endomorphism of a space of dimension \(q\) has characteristic polynomial \(X^q\) (Reduction: eigen-elements and diagonalisation), and \(\chi_{u_\lambda}(X) = \chi_{u_\lambda - \lambda\mathrm{Id}}(X - \lambda)\) (the shift \(\det(XI - M) = \det((X-\lambda)I - (M - \lambda I))\)); hence \(\chi_{u_\lambda} = (X - \lambda)^{\dim F_\lambda(u)}\).
As \(F_\lambda(u)\) is a non-zero \(u\)-stable subspace, \(\chi_{u_\lambda} \mid \chi_u\) (Reduction: eigen-elements and diagonalisation), so \((X - \lambda)^{\dim F_\lambda(u)} \mid \chi_u\) and \(\dim F_\lambda(u) \le m(\lambda)\). Summing over \(\lambda \in \mathrm{Sp}(u)\): \(\sum_\lambda \dim F_\lambda(u) = \dim E = n\) (the direct sum of the previous Theorem), while \(\sum_\lambda m(\lambda) = n\) (\(\chi_u\) split of degree \(n\)). Inequalities \(\dim F_\lambda(u) \le m(\lambda)\) whose two sides have the same sum \(n\) are all equalities, so \(\dim F_\lambda(u) = m(\lambda)\) for every \(\lambda\).
Example — The eigenspace strictly inside the characteristic subspace
For \(M = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}\), \(\chi_M = (X - 1)^2\), so \(1\) is the only eigenvalue, with \(m(1) = 2\). The eigenspace is \(E_1(M) = \mathrm{Ker}(M - I_2)\), of dimension \(1\). The characteristic subspace is \(F_1(M) = \mathrm{Ker}\bigl((M - I_2)^2\bigr) = \mathrm{Ker}(0) = \mathbb{R}^2\), of dimension \(2 = m(1)\). Here \(E_1(M) \subsetneq F_1(M)\) --- the gap is exactly the failure of \(M\) to be diagonalisable.
Theorem — Diagonalisability through the characteristic subspaces
Suppose \(\chi_u\) is split. Then \(u\) is diagonalisable if and only if \(F_\lambda(u) = E_\lambda(u)\) for every eigenvalue \(\lambda\).
\(u\) is diagonalisable if and only if \(\sum_{\lambda} \dim E_\lambda(u) = n\) (Reduction: eigen-elements and diagonalisation). Now \(E_\lambda(u) \subset F_\lambda(u)\) with \(\dim F_\lambda(u) = m(\lambda)\), so \(\dim E_\lambda(u) \le m(\lambda)\), and \(E_\lambda(u) = F_\lambda(u)\) exactly when \(\dim E_\lambda(u) = m(\lambda)\). Since \(\sum_\lambda m(\lambda) = n\), the equality \(\sum_\lambda \dim E_\lambda(u) = n\) holds if and only if \(\dim E_\lambda(u) = m(\lambda)\) for every \(\lambda\), that is if and only if \(E_\lambda(u) = F_\lambda(u)\) for every \(\lambda\).
Example — A characteristic-subspace decomposition
For \(M = \begin{pmatrix} 2 & 1 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 3 \end{pmatrix}\), \(\chi_M = (X - 2)^2 (X - 3)\), so \(m(2) = 2\) and \(m(3) = 1\). Here \((M - 2I_3)^2 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}\), so \(F_2(M) = \mathrm{Ker}\bigl((M - 2I_3)^2\bigr) = \{(x, y, 0)\}\), of dimension \(2\), and \(F_3(M) = \mathrm{Ker}(M - 3I_3) = \{(0, 0, z)\}\), of dimension \(1\). The decomposition is \(\mathbb{R}^3 = F_2(M) \oplus F_3(M)\). As \(E_2(M) = \mathrm{Ker}(M - 2I_3) = \{(x, 0, 0)\}\) has dimension \(1 < 2 = \dim F_2(M)\), the matrix \(M\) is not diagonalisable.
When \(\chi_u\) is split, the decomposition \(E = \bigoplus_\lambda F_\lambda(u)\) puts \(u\) in block-diagonal form, one block per eigenvalue; on \(F_\lambda(u)\) the induced endomorphism is \(\lambda\,\mathrm{Id} + \text{(nilpotent)}\), the nilpotent parts on all the characteristic subspaces vanishing simultaneously exactly when \(u\) is diagonalisable.
Method — Compute the characteristic subspaces
When \(\chi_u\) is split:
- compute \(\chi_u\) and read its roots \(\lambda\) with their multiplicities \(m(\lambda)\);
- for each \(\lambda\), compute \(F_\lambda(u) = \mathrm{Ker}\bigl((u - \lambda\,\mathrm{Id}_E)^{m(\lambda)}\bigr)\) --- of dimension \(m(\lambda)\);
- then \(E = \bigoplus_\lambda F_\lambda(u)\); a basis adapted to this decomposition puts \(u\) in block-diagonal form, one block per eigenvalue, and choosing the basis of each \(F_\lambda(u)\) suitably makes that block triangular with the single diagonal entry \(\lambda\).
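The second step of the method is a kernel computation, which can be sketched with exact rational arithmetic. The code below assumes the eigenvalue \(\lambda\) and its multiplicity \(m(\lambda)\) have already been read off \(\chi_u\) and are rational numbers; `kernel_basis` and `characteristic_subspace` are hypothetical names, and the kernel is obtained by a plain Gauss-Jordan elimination.

```python
from fractions import Fraction


def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def kernel_basis(A):
    """Basis of Ker(A) by Gauss-Jordan elimination over the rationals."""
    rows = [[Fraction(x) for x in row] for row in A]
    n_cols = len(rows[0])
    pivots = []                              # pivot column of each pivot row
    r = 0
    for c in range(n_cols):
        if r == len(rows):
            break
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue                         # column c is free
        rows[r], rows[pivot] = rows[pivot], rows[r]
        lead = rows[r][c]
        rows[r] = [x / lead for x in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c]
                rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
    basis = []
    for f in (c for c in range(n_cols) if c not in pivots):
        v = [Fraction(0)] * n_cols
        v[f] = Fraction(1)                   # one basis vector per free column
        for row_index, p in enumerate(pivots):
            v[p] = -rows[row_index][f]
        basis.append(v)
    return basis


def characteristic_subspace(M, lam, mult):
    """Basis of Ker((M - lam I)^mult); lam and mult are supplied by the caller."""
    n = len(M)
    shifted = [[M[i][j] - (lam if i == j else 0) for j in range(n)]
               for i in range(n)]
    power = shifted
    for _ in range(mult - 1):
        power = mat_mul(power, shifted)
    return kernel_basis(power)


M = [[2, 1, 0], [0, 2, 0], [0, 0, 3]]       # chi_M = (X - 2)^2 (X - 3)
F2 = characteristic_subspace(M, 2, 2)       # characteristic subspace for 2
F3 = characteristic_subspace(M, 3, 1)       # characteristic subspace for 3
E2 = characteristic_subspace(M, 2, 1)       # eigenspace for 2 (exponent 1)
```

Comparing `len(E2)` with `len(F2)` for each eigenvalue then gives the diagonalisability test through the characteristic subspaces: here the eigenspace for \(2\) has dimension \(1\) while the characteristic subspace has dimension \(2\).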
Going further
This chapter turned a single polynomial relation \(P(u) = 0\) into the whole machinery of reduction: the minimal polynomial, the kernel lemma, the diagonalisability criterion, Cayley--Hamilton, the characteristic subspaces. When \(\chi_u\) is split, the decomposition \(E = \bigoplus_\lambda F_\lambda(u)\) presents \(u\), on each block, as a homothety plus a nilpotent endomorphism --- the starting point of the Dunford decomposition and Jordan reduction, both beyond the program and named here only as the horizon. The Euclidean counterpart, chapter Self-adjoint endomorphisms, reduces another distinguished family through the spectral theorem.
Skills to practice
- Computing characteristic subspaces