CommeUnJeu · L2 MP
Reduction: eigen-elements and diagonalisation
Given an endomorphism \(u\) of a vector space \(E\), reduction asks for a basis of \(E\) in which the matrix of \(u\) is as simple as possible --- ideally diagonal, so that \(u\) merely scales each basis vector. This chapter builds the two tools that decide whether such a basis exists. The first is geometric: the eigen-elements of \(u\) --- the directions \(u\) only stretches. The second is algebraic: the characteristic polynomial, which turns the question « is \(\lambda\) an eigenvalue? » into « is \(\lambda\) a root? ».
The headline results are three. An endomorphism is diagonalisable exactly when \(E\) is the direct sum of its eigenspaces --- equivalently, when its characteristic polynomial is split and each eigenspace has the expected dimension. It is trigonalisable (a triangular matrix) exactly when its characteristic polynomial is split. And a nilpotent endomorphism is precisely one that is trigonalisable with \(0\) as its only eigenvalue.
The headline results are three. An endomorphism is diagonalisable exactly when \(E\) is the direct sum of its eigenspaces --- equivalently, when its characteristic polynomial is split and each eigenspace has the expected dimension. It is trigonalisable (a triangular matrix) exactly when its characteristic polynomial is split. And a nilpotent endomorphism is precisely one that is trigonalisable with \(0\) as its only eigenvalue.
Conventions
Throughout this chapter, \(\mathbb{K}\) is a subfield of \(\mathbb{C}\) and \(E\) is a finite-dimensional, non-zero \(\mathbb{K}\)-vector space, with \(n = \dim E \ge 1\). We write \(u \in \mathcal{L}(E)\) for an endomorphism, \(M \in \mathcal{M}_n(\mathbb{K})\) for a square matrix, \(\operatorname{Id}_E\) for the identity and \(I_n\) for the identity matrix. Determinant, trace, rank, similar matrices and the matrix of an endomorphism in a basis are those of Determinants and Change of basis; polynomials, roots, multiplicity and split polynomials those of Polynomials; direct sums, stable subspaces and the induced endomorphism those of Further linear algebra. The characteristic polynomial is defined as \(\det(XI_n - M)\); its degree, monicity and coefficients are established in § 2.
I
Eigen-elements
I.1
Eigenvalues\(\virgule\) eigenvectors\(\virgule\) eigenspaces
The reduction problem starts with a single observation: some vectors are merely stretched by \(u\). A non-zero vector \(x\) with \(u(x) = \lambda x\) keeps its direction --- \(u\) acts on the line it spans as the homothety of ratio \(\lambda\). Such directions are the raw material of reduction: a basis made entirely of them turns the matrix of \(u\) diagonal. This subsection names them and gathers their first properties.
Definition — Eigenvalue and eigenvector
A scalar \(\lambda \in \mathbb{K}\) is an eigenvalue of \(u\) when there exists a vector \(x \neq 0_E\) such that \(u(x) = \lambda x\). Such a vector \(x\) is then an eigenvector of \(u\) associated with \(\lambda\). An eigenvector is, by definition, never the zero vector. Example — Eigen-elements of three familiar endomorphisms
A homothety \(\lambda_0 \operatorname{Id}_E\) admits every non-zero vector as an eigenvector, all for the single eigenvalue \(\lambda_0\). A projector \(p\) has eigenvalues among \(0\) and \(1\): every vector of \(\operatorname{Im} p\) satisfies \(p(x) = x\) and every vector of \(\operatorname{Ker} p\) satisfies \(p(x) = 0\) --- both \(0\) and \(1\) occur as soon as \(\operatorname{Ker} p\) and \(\operatorname{Im} p\) are both non-zero. A vector symmetry \(s\) has eigenvalues among \(-1\) and \(1\), both occurring when its fixed and anti-fixed subspaces are both non-zero. Definition — Eigenspace
Let \(\lambda \in \mathbb{K}\). The eigenspace of \(u\) associated with \(\lambda\) is $$ E_\lambda(u) = \operatorname{Ker}(u - \lambda \operatorname{Id}_E). $$ It is a subspace of \(E\), being the kernel of an endomorphism. The scalar \(\lambda\) is an eigenvalue of \(u\) if and only if \(E_\lambda(u) \neq \{0_E\}\), and \(E_\lambda(u)\) is then exactly the set of eigenvectors associated with \(\lambda\), together with \(0_E\). Example — An eigenspace as a kernel
For \(M = \begin{psmallmatrix} 1 & 2 \\ 2 & 1 \end{psmallmatrix}\), take \(\lambda = 3\): the eigenspace is \(E_3 = \operatorname{Ker}(M - 3I_2) = \operatorname{Ker}\begin{psmallmatrix} -2 & 2 \\ 2 & -2 \end{psmallmatrix}\). The system \(-2x + 2y = 0\) has solution line \(\operatorname{Vect}(1, 1)\), so \(E_3 = \operatorname{Vect}(1, 1)\), a line; \(3\) is an eigenvalue since \(E_3 \neq \{0\}\). For \(\lambda = 0\), \(\operatorname{Ker} M = \{0\}\) (as \(M\) is invertible), so \(0\) is not an eigenvalue. Definition — Spectrum
The spectrum of \(u\), written \(\operatorname{Sp}(u)\), is the set of eigenvalues of \(u\) in \(\mathbb{K}\). For a matrix \(M \in \mathcal{M}_n(\mathbb{K})\) one defines likewise \(\operatorname{Sp}(M)\), the set of \(\lambda \in \mathbb{K}\) for which \(MX = \lambda X\) has a non-zero column solution \(X\) --- such an \(X\) being a matrix eigenvector, tied to the eigenvectors of \(u\) by the matrix-translation Proposition below. Example — Eigenspaces and spectrum of a diagonal matrix
Let \(u\) be the endomorphism of \(\mathbb{R}^3\) with matrix \(D = \operatorname{diag}(5, 5, -2)\) in the canonical basis \((e_1, e_2, e_3)\). Then \(u(e_1) = 5e_1\), \(u(e_2) = 5e_2\) and \(u(e_3) = -2e_3\), so the spectrum is \(\operatorname{Sp}(u) = \{5, -2\}\). The eigenspaces are \(E_5(u) = \operatorname{Vect}(e_1, e_2)\), a plane, and \(E_{-2}(u) = \operatorname{Vect}(e_3)\), a line; here \(E_5(u) \oplus E_{-2}(u) = \mathbb{R}^3\).
The notion of spectral value is out of the program scope. In infinite dimension one distinguishes eigenvalues from a wider set of « spectral values »; here, in finite dimension, the spectrum is simply the set of eigenvalues and nothing more is needed.
Proposition — Matrix translation
Let \(\mathcal{B}\) be a basis of \(E\) and \(M = \operatorname{Mat}_\mathcal{B}(u)\). A vector \(x\), with coordinate column \(X\) in \(\mathcal{B}\), is an eigenvector of \(u\) associated with \(\lambda\) if and only if \(X \neq 0\) and \(MX = \lambda X\). Consequently \(\operatorname{Sp}(u) = \operatorname{Sp}(M)\).
The coordinate map in \(\mathcal{B}\) is a linear isomorphism \(E \to \mathcal{M}_{n,1}(\mathbb{K})\) sending \(x\) to \(X\) and \(u(x)\) to \(MX\). Hence \(u(x) = \lambda x\) holds if and only if \(MX = \lambda X\), and \(x \neq 0_E\) if and only if \(X \neq 0\). The two notions of eigenvector --- for \(u\) and for \(M\) --- therefore correspond, and so do the two spectra.
Proposition — The eigenvalue zero
The scalar \(0\) is an eigenvalue of \(u\) if and only if \(u\) is not injective, equivalently --- \(E\) being finite-dimensional --- if and only if \(u\) is not invertible.
By definition \(0\) is an eigenvalue of \(u\) iff \(E_0(u) = \operatorname{Ker}(u - 0 \cdot \operatorname{Id}_E) = \operatorname{Ker} u \neq \{0_E\}\), that is, iff \(u\) is not injective. In finite dimension an endomorphism is injective if and only if it is bijective, so this is also the negation of « \(u\) invertible ».
Geometric meaning
Geometrically, an eigenvector marks a direction that \(u\) leaves invariant: on the eigen-line it spans, \(u\) acts as a pure dilation. Off such a line, \(u\) generally tilts a vector into a new direction. The figure shows an eigen-line, on which \(x\) and \(u(x) = \lambda x\) stay collinear, beside an ordinary direction, where \(y\) and \(u(y)\) point differently.
Example — Determine the eigen-elements of a matrix
For the endomorphism \(u\) of \(\mathbb{R}^3\) whose matrix in the canonical basis is $$ M = \begin{pmatrix} 0 & 2 & -1 \\
3 & -2 & 0 \\
-2 & 2 & 1 \end{pmatrix}, $$ it will be shown in § 2 that the spectrum is \(\operatorname{Sp}(M) = \{1, 2, -4\}\) --- the systematic determination of the eigenvalues is the work of the characteristic polynomial. Taking these eigenvalues as given, determine the three eigenspaces.
Each eigenspace is the kernel \(E_\lambda(u) = \operatorname{Ker}(M - \lambda I_3)\). Resolving the system \((M - \lambda I_3)X = 0\) for each given eigenvalue:
- \(\lambda = 1\): \(E_1(u) = \operatorname{Vect}(1, 1, 1)\);
- \(\lambda = 2\): \(E_2(u) = \operatorname{Vect}(4, 3, -2)\);
- \(\lambda = -4\): \(E_{-4}(u) = \operatorname{Vect}(2, -3, 2)\).
Method — Test whether a scalar is an eigenvalue
To decide whether \(\lambda \in \mathbb{K}\) is an eigenvalue of \(u\) (matrix \(M\)): \(\lambda \in \operatorname{Sp}(u)\) if and only if \(u - \lambda \operatorname{Id}_E\) is not injective, equivalently if and only if \(M - \lambda I_n\) is not invertible, equivalently if and only if \(\operatorname{rg}(M - \lambda I_n) < n\). The eigenspace \(E_\lambda(u)\) is then the solution set of the homogeneous system \((M - \lambda I_n)X = 0\), of dimension \(n - \operatorname{rg}(M - \lambda I_n)\) by the rank theorem. Skills to practice
- Computing eigenvalues and eigenvectors
I.2
The spectrum in finite dimension
Distinct eigenvalues pull in independent directions. This is the structural fact behind all of reduction: eigenspaces attached to different eigenvalues never overlap beyond \(0_E\), they sit in direct sum. Two consequences follow at once --- eigenvectors for distinct eigenvalues form a free family, and an endomorphism of an \(n\)-dimensional space has at most \(n\) eigenvalues.
Theorem — Eigenspaces of distinct eigenvalues are in direct sum
Let \(\lambda_1, \dots, \lambda_p\) be pairwise distinct eigenvalues of \(u\). Then the sum \(E_{\lambda_1}(u) + \dots + E_{\lambda_p}(u)\) is direct.
We argue by induction on \(p \ge 1\).
- Initialization. For \(p = 1\) there is a single subspace and nothing to prove.
- Heredity. Assume the result for \(p\) pairwise distinct eigenvalues. Let \(\lambda_1, \dots, \lambda_{p+1}\) be pairwise distinct, and take \(x_i \in E_{\lambda_i}(u)\) with $$ x_1 + \dots + x_p + x_{p+1} = 0_E. \tag{\(*\)} $$ Applying \(u\) and using \(u(x_i) = \lambda_i x_i\) gives \(\lambda_1 x_1 + \dots + \lambda_{p+1} x_{p+1} = 0_E\). Subtracting \(\lambda_{p+1}\) times \((*)\) from this: $$ (\lambda_1 - \lambda_{p+1})x_1 + \dots + (\lambda_p - \lambda_{p+1})x_p = 0_E. $$ By the induction hypothesis the sum \(E_{\lambda_1} + \dots + E_{\lambda_p}\) is direct, so each term vanishes: \((\lambda_i - \lambda_{p+1})x_i = 0_E\) for \(i \le p\). As \(\lambda_i \neq \lambda_{p+1}\), this forces \(x_i = 0_E\) for \(i \le p\). Reporting into \((*)\) gives \(x_{p+1} = 0_E\) as well. The decomposition of \(0_E\) is trivial, so the sum of the \(p+1\) eigenspaces is direct.
Proposition — Eigenvectors of distinct eigenvalues are free
A family of eigenvectors of \(u\) associated with pairwise distinct eigenvalues is free.
Let \(x_1, \dots, x_p\) be eigenvectors associated with pairwise distinct eigenvalues \(\lambda_1, \dots, \lambda_p\), and suppose \(\alpha_1 x_1 + \dots + \alpha_p x_p = 0_E\). Each \(\alpha_i x_i\) lies in \(E_{\lambda_i}(u)\), and the sum of these eigenspaces is direct by the previous Theorem, so \(\alpha_i x_i = 0_E\) for every \(i\). Since \(x_i \neq 0_E\), this gives \(\alpha_i = 0\) for every \(i\). The family is free.
Proposition — The spectrum is finite
The spectrum \(\operatorname{Sp}(u)\) is finite, with \(\#\operatorname{Sp}(u) \le n = \dim E\).
Suppose \(u\) had \(n + 1\) pairwise distinct eigenvalues \(\lambda_1, \dots, \lambda_{n+1}\). Choosing an eigenvector \(x_i\) for each, the family \((x_1, \dots, x_{n+1})\) is free by the previous Proposition. But a free family of \(E\) has at most \(\dim E = n\) vectors --- a contradiction. So \(u\) has at most \(n\) distinct eigenvalues: \(\operatorname{Sp}(u)\) is finite, with \(\#\operatorname{Sp}(u) \le n\).
Example — An endomorphism with the maximum number of eigenvalues
The endomorphism of the previous answer has spectrum \(\{1, 2, -4\}\), three eigenvalues for \(n = 3\) --- the bound \(\#\operatorname{Sp}(u) \le n\) is attained. Its three eigenvectors \((1,1,1)\), \((4,3,-2)\), \((2,-3,2)\) form a free family, hence a basis of \(\mathbb{R}^3\).
Why finite dimension matters
A brief aside, stepping outside the chapter's finite-dimension convention. On the space \(\mathcal{C}^\infty(\mathbb{R})\) of smooth functions, the derivation \(f \mapsto f'\) admits every real \(\lambda\) as an eigenvalue: the function \(x \mapsto e^{\lambda x}\) satisfies \(f' = \lambda f\). Its spectrum is the whole of \(\mathbb{R}\), infinite. So the finiteness of the spectrum is genuinely a finite-dimension phenomenon, resting on the bound « a free family has at most \(n\) vectors ».
Proposition — Commuting endomorphisms
Let \(u, v \in \mathcal{L}(E)\) with \(u \circ v = v \circ u\). Then every eigenspace \(E_\lambda(u)\) is stable by \(v\).
From \(u \circ v = v \circ u\) one gets \((u - \lambda \operatorname{Id}_E) \circ v = v \circ (u - \lambda \operatorname{Id}_E)\), so \(u - \lambda \operatorname{Id}_E\) and \(v\) commute. By the kernel-stability of commuting endomorphisms --- recalled from Further linear algebra --- the kernel \(E_\lambda(u) = \operatorname{Ker}(u - \lambda \operatorname{Id}_E)\) is stable by \(v\).
Proposition — Similar matrices share the spectrum
Two similar matrices have the same spectrum: if \(A \sim B\) in \(\mathcal{M}_n(\mathbb{K})\), then \(\operatorname{Sp}(A) = \operatorname{Sp}(B)\).
Write \(B = P^{-1}AP\) with \(P \in \operatorname{GL}_n(\mathbb{K})\). If \(\lambda \in \operatorname{Sp}(B)\), take \(X \neq 0\) with \(BX = \lambda X\). Then \(A(PX) = P(BX) = \lambda(PX)\), and \(PX \neq 0\) since \(P\) is invertible, so \(\lambda \in \operatorname{Sp}(A)\). Thus \(\operatorname{Sp}(B) \subseteq \operatorname{Sp}(A)\); exchanging the roles of \(A\) and \(B\) (via \(A = PBP^{-1}\)) gives the reverse inclusion.
Example — An eigenspace stable under a commuting endomorphism
Let \(u\) have \(n\) distinct eigenvalues and let \(v\) commute with \(u\). Since \(u\) has \(n\) distinct eigenvalues, its \(n\) eigenvectors form a basis and each eigenspace \(E_\lambda(u)\) is an eigen-line. Each such line is stable by \(v\), so \(v\) maps every eigenvector of \(u\) to a multiple of itself: \(v\) is itself diagonal in the eigenbasis of \(u\). Commuting with a fixed endomorphism with \(n\) distinct eigenvalues thus forces a shared eigenbasis --- a first glimpse of simultaneous reduction. Skills to practice
- Using the direct sum of eigenspaces
II
The characteristic polynomial
II.1
Definition and first properties
Searching for eigenvalues by hand --- testing each scalar \(\lambda\) for the singularity of \(u - \lambda \operatorname{Id}_E\) --- is hopeless: there are infinitely many candidates. The way out is to make \(\lambda\) a variable. The determinant \(\det(XI_n - M)\) is a polynomial in \(X\), and its roots lying in \(\mathbb{K}\) are exactly the eigenvalues. This characteristic polynomial converts an infinite search into a finite computation.
Proposition — Eigenvalue and vanishing determinant
A scalar \(\lambda \in \mathbb{K}\) is an eigenvalue of \(u\) if and only if \(\det(u - \lambda \operatorname{Id}_E) = 0\).
\(\lambda\) is an eigenvalue iff \(E_\lambda(u) = \operatorname{Ker}(u - \lambda \operatorname{Id}_E) \neq \{0_E\}\), i.e. iff \(u - \lambda \operatorname{Id}_E\) is not injective. In finite dimension an endomorphism is injective iff it is invertible, and an endomorphism is invertible iff its determinant is non-zero. So « \(u - \lambda \operatorname{Id}_E\) not injective » is « \(\det(u - \lambda \operatorname{Id}_E) = 0\) ».
Definition — Characteristic polynomial of a matrix
The characteristic polynomial of \(M \in \mathcal{M}_n(\mathbb{K})\) is $$ \chi_M = \det(XI_n - M). $$ It is an element of \(\mathbb{K}[X]\) obtained by expanding the determinant of the matrix \(XI_n - M\), whose entries are polynomials of degree at most \(1\). Example — Characteristic polynomial of a small matrix
For \(M = \begin{psmallmatrix} 1 & 2 \\ 3 & 2 \end{psmallmatrix}\), $$ \chi_M = \det\begin{pmatrix} X-1 & -2 \\
-3 & X-2 \end{pmatrix} = (X-1)(X-2) - 6 = X^2 - 3X - 4 = (X-4)(X+1), $$ so \(\operatorname{Sp}(M) = \{4, -1\}\). The coefficient of \(X\) is \(-3 = -\operatorname{tr}(M)\) and the constant term is \(-4 = \det(M)\) --- anticipating the degree-and-coefficients theorem below. Proposition — Invariance under similarity
If \(A \sim B\) in \(\mathcal{M}_n(\mathbb{K})\), then \(\chi_A = \chi_B\).
Write \(B = P^{-1}AP\) with \(P \in \operatorname{GL}_n(\mathbb{K})\). We argue by evaluation, to stay inside the determinant theory over the field \(\mathbb{K}\). For every scalar \(x \in \mathbb{K}\), $$ \begin{aligned} \chi_B(x) &= \det(xI_n - P^{-1}AP) && \text{(definition of \(\chi_B\))} \\
&= \det\big(P^{-1}(xI_n - A)P\big) && \text{(\(xI_n = P^{-1}(xI_n)P\))} \\
&= \det(P^{-1})\det(xI_n - A)\det(P) && \text{(multiplicativity of \(\det\))} \\
&= \det(xI_n - A) = \chi_A(x) && \text{(\(\det(P^{-1})\det(P) = 1\)).} \end{aligned} $$ The polynomials \(\chi_A\) and \(\chi_B\) thus agree at every point of the infinite set \(\mathbb{K}\), hence are equal.
Definition — Characteristic polynomial of an endomorphism
The characteristic polynomial of \(u \in \mathcal{L}(E)\) is \(\chi_u = \chi_M\), where \(M\) is the matrix of \(u\) in any basis of \(E\). This does not depend on the basis: two such matrices are similar, and similar matrices have the same characteristic polynomial. Example — The characteristic polynomial does not see the basis
Let \(u\) be the endomorphism of \(\mathbb{R}^2\) with matrix \(A = \begin{psmallmatrix} 1 & 2 \\ 3 & 2 \end{psmallmatrix}\) in the canonical basis. The vectors \((2, 3)\) and \((1, -1)\) satisfy \(A\begin{psmallmatrix}2\\3\end{psmallmatrix} = \begin{psmallmatrix}8\\12\end{psmallmatrix} = 4\begin{psmallmatrix}2\\3\end{psmallmatrix}\) and \(A\begin{psmallmatrix}1\\-1\end{psmallmatrix} = \begin{psmallmatrix}-1\\1\end{psmallmatrix} = -\begin{psmallmatrix}1\\-1\end{psmallmatrix}\), so in the basis \(\big((2,3),\,(1,-1)\big)\) the matrix of \(u\) is \(D = \operatorname{diag}(4, -1)\), similar to \(A\). Both give \(\chi_u = X^2 - 3X - 4\) --- read from \(A\) as \((X-1)(X-2)-6\), and from \(D\) at once as \((X-4)(X+1)\). The characteristic polynomial of \(u\) is one object, computable through any matrix of \(u\). Theorem — Eigenvalues are the roots of the characteristic polynomial
A scalar \(\lambda \in \mathbb{K}\) is an eigenvalue of \(u\) if and only if \(\chi_u(\lambda) = 0\). The spectrum of \(u\) is the set of roots of \(\chi_u\) lying in \(\mathbb{K}\).
Fix a basis and let \(M\) be the matrix of \(u\). Evaluating \(\chi_u = \det(XI_n - M)\) at \(\lambda\) gives \(\chi_u(\lambda) = \det(\lambda I_n - M)\). Now \(\lambda I_n - M = -(M - \lambda I_n)\), and the determinant is \(n\)-linear in the columns, so \(\det(\lambda I_n - M) = (-1)^n \det(M - \lambda I_n)\) --- the two vanish together. By the Proposition on the vanishing determinant, \(\det(M - \lambda I_n) = 0\) exactly when \(\lambda\) is an eigenvalue of \(u\). Hence \(\chi_u(\lambda) = 0 \iff \lambda \in \operatorname{Sp}(u)\).
Theorem — Degree and coefficients of the characteristic polynomial
For \(M \in \mathcal{M}_n(\mathbb{K})\), the polynomial \(\chi_M\) is monic of degree \(n\), and $$ \chi_M = X^n - \operatorname{tr}(M)\, X^{n-1} + \dots + (-1)^n \det(M). $$
Expand \(\chi_M = \det(XI_n - M)\) by the permutation formula: $$ \chi_M = \sum_{\sigma \in \mathfrak{S}_n} \varepsilon(\sigma) \prod_{i=1}^n \big(X\delta_{\sigma(i),i} - m_{\sigma(i),i}\big), $$ where \(\delta\) is the Kronecker symbol.
- Degree \(n\) and monic. A factor contributes an \(X\) only when \(\sigma(i) = i\). A term of degree \(n\) thus needs \(\sigma(i) = i\) for every \(i\), i.e. \(\sigma = \operatorname{id}\). That term is \(\prod_{i=1}^n (X - m_{i,i})\), of degree \(n\) with leading coefficient \(1\). So \(\chi_M\) is monic of degree \(n\).
- Coefficient of \(X^{n-1}\). A permutation fixing exactly \(n-1\) indices fixes them all, so again only \(\sigma = \operatorname{id}\) contributes degree \(n-1\). Expanding \(\prod_{i=1}^n (X - m_{i,i})\) gives \(X^n - (m_{1,1} + \dots + m_{n,n})X^{n-1} + \dots = X^n - \operatorname{tr}(M)X^{n-1} + \dots\)
- Constant term. It is \(\chi_M(0) = \det(-M) = (-1)^n \det(M)\).
Example — Characteristic polynomial of a triangular matrix
If \(T\) is upper-triangular with diagonal entries \(d_1, \dots, d_n\), then \(XI_n - T\) is upper-triangular with diagonal \(X - d_1, \dots, X - d_n\), so $$ \chi_T = \prod_{i=1}^n (X - d_i). $$ The spectrum of a triangular matrix is read straight off its diagonal: \(\operatorname{Sp}(T) = \{d_1, \dots, d_n\}\). Method — Compute the characteristic polynomial and the spectrum
To find the eigenvalues of \(u\) given by a matrix \(M\): form \(XI_n - M\), compute \(\chi_M = \det(XI_n - M)\) --- by cofactor expansion, or by adding a \(\mathbb{K}[X]\)-multiple of one row (or column) to another, both valid over the ring \(\mathbb{K}[X]\); avoid dividing a line by a non-constant polynomial --- then factor \(\chi_M\). The eigenvalues are the roots of \(\chi_M\) in \(\mathbb{K}\). The degree-\((n-1)\) and constant coefficients, \(-\operatorname{tr}(M)\) and \((-1)^n\det(M)\), give a quick check of the computation. Skills to practice
- Computing the characteristic polynomial
II.2
Multiplicity of an eigenvalue
An eigenvalue can be a simple root of \(\chi_u\) or a repeated one. The number of repetitions --- its multiplicity --- is the second number attached to an eigenvalue, beside the dimension of its eigenspace. The two are linked by a fundamental inequality: the eigenspace can never be larger than the multiplicity permits. That inequality, \(\dim E_\lambda \le m(\lambda)\), is the gate to diagonalisability.
Definition — Multiplicity of an eigenvalue
Let \(\lambda\) be an eigenvalue of \(u\). Its multiplicity, written \(m(\lambda)\), is the multiplicity of \(\lambda\) as a root of the characteristic polynomial \(\chi_u\). An eigenvalue of multiplicity \(1\) is called simple. Example — Reading multiplicities off a factored characteristic polynomial
If \(\chi_u = (X - 1)(X - 2)^2\), then \(u\) has eigenvalues \(1\) and \(2\), with \(m(1) = 1\) (simple) and \(m(2) = 2\) (double). The multiplicities sum to \(\deg \chi_u = 3 = n\), as they always do when \(\chi_u\) is split. Example — A triple eigenvalue
For the upper-triangular matrix \(T = \begin{psmallmatrix} 3 & 1 & 0 \\ 0 & 3 & 1 \\ 0 & 0 & 3 \end{psmallmatrix}\), the characteristic polynomial is \(\chi_T = (X - 3)^3\), read off the diagonal. So \(3\) is the only eigenvalue, of multiplicity \(m(3) = 3\) --- a triple eigenvalue. Yet \(T - 3I_3\) has rank \(2\), so \(\dim E_3(T) = 3 - 2 = 1\), far below \(m(3)\): multiplicity counts roots of \(\chi_T\), not dimensions of eigenspaces. Proposition — Number of roots of the characteristic polynomial
\(\chi_u\) is monic of degree \(n\), so it has at most \(n\) roots in \(\mathbb{K}\); viewed as an element of \(\mathbb{C}[X]\), it has exactly \(n\) roots counted with multiplicity. In particular \(u\) has at most \(n\) distinct eigenvalues.
A non-zero polynomial of degree \(n\) has at most \(n\) roots in any field, counted with multiplicity --- recalled from Polynomials. Over \(\mathbb{C}\), d'Alembert-Gauss makes \(\chi_u\) split, so it has exactly \(n\) roots counted with multiplicity. The eigenvalues of \(u\) are among the roots lying in \(\mathbb{K}\), hence at most \(n\) in number. This re-derives, through \(\chi_u\), the finiteness of the spectrum already obtained in § 1.2 by the free-family bound, and sharpens it: the count is tied to \(\deg \chi_u = n\).
Proposition — Characteristic polynomial of an induced endomorphism
Let \(F\) be a subspace of \(E\) stable by \(u\), with \(F \neq \{0_E\}\), and let \(u_F\) be the induced endomorphism. Then \(\chi_{u_F}\) divides \(\chi_u\).
If \(F = E\) then \(u_F = u\) and \(\chi_{u_F} = \chi_u\) divides \(\chi_u\); assume now \(F \subsetneq E\). Set \(p = \dim F\), so \(1 \le p < n\). Complete a basis of \(F\) into a basis \(\mathcal{B}\) of \(E\). Since \(F\) is stable by \(u\), the matrix of \(u\) in \(\mathcal{B}\) is block upper-triangular, $$ M = \begin{pmatrix} A & B \\
0 & C \end{pmatrix}, \qquad A = \operatorname{Mat}(u_F), $$ recalled from Further linear algebra. Then \(XI_n - M\) is block upper-triangular with diagonal blocks \(XI_p - A\) and \(XI_{n-p} - C\), and the determinant of a block-triangular matrix is the product of the determinants of the diagonal blocks: $$ \chi_u = \det(XI_n - M) = \det(XI_p - A)\cdot\det(XI_{n-p} - C) = \chi_{u_F}\cdot\chi_C. $$ Hence \(\chi_{u_F}\) divides \(\chi_u\).
Theorem — Dimension of an eigenspace
Let \(\lambda\) be an eigenvalue of \(u\) of multiplicity \(m(\lambda)\). Then $$ 1 \le \dim E_\lambda(u) \le m(\lambda). $$
Write \(d = \dim E_\lambda(u)\). Since \(\lambda\) is an eigenvalue, \(E_\lambda(u) \neq \{0_E\}\), so \(d \ge 1\).
The subspace \(E_\lambda(u)\) is stable by \(u\), and the induced endomorphism is \(u_{E_\lambda} = \lambda \operatorname{Id}_{E_\lambda}\) --- a homothety, since \(u(x) = \lambda x\) for every \(x \in E_\lambda(u)\). Its characteristic polynomial is therefore \(\chi_{u_{E_\lambda}} = (X - \lambda)^d\). By the previous Proposition, \((X - \lambda)^d\) divides \(\chi_u\). The multiplicity of \(\lambda\) as a root of \(\chi_u\) is thus at least \(d\), that is \(d \le m(\lambda)\).
The subspace \(E_\lambda(u)\) is stable by \(u\), and the induced endomorphism is \(u_{E_\lambda} = \lambda \operatorname{Id}_{E_\lambda}\) --- a homothety, since \(u(x) = \lambda x\) for every \(x \in E_\lambda(u)\). Its characteristic polynomial is therefore \(\chi_{u_{E_\lambda}} = (X - \lambda)^d\). By the previous Proposition, \((X - \lambda)^d\) divides \(\chi_u\). The multiplicity of \(\lambda\) as a root of \(\chi_u\) is thus at least \(d\), that is \(d \le m(\lambda)\).
Proposition — A simple eigenvalue has a one-dimensional eigenspace
If \(\lambda\) is a simple eigenvalue of \(u\), then \(\dim E_\lambda(u) = 1\).
A simple eigenvalue has \(m(\lambda) = 1\). The Theorem gives \(1 \le \dim E_\lambda(u) \le 1\), hence \(\dim E_\lambda(u) = 1\).
Example — An eigenspace strictly smaller than the multiplicity
For \(M = \begin{psmallmatrix} 2 & 1 \\ 0 & 2 \end{psmallmatrix}\), \(\chi_M = (X - 2)^2\), so \(2\) is an eigenvalue of multiplicity \(m(2) = 2\). But \(M - 2I_2 = \begin{psmallmatrix} 0 & 1 \\ 0 & 0 \end{psmallmatrix}\) has rank \(1\), so \(\dim E_2(M) = 2 - 1 = 1 < m(2)\). The inequality \(\dim E_\lambda \le m(\lambda)\) can be strict --- and when it is, diagonalisability fails, as § 3 will show. Method — Compute the dimension of an eigenspace
To find \(\dim E_\lambda(u)\) for an eigenvalue \(\lambda\) (matrix \(M\)): compute the rank of \(M - \lambda I_n\) by row reduction, then apply the rank theorem, $$ \dim E_\lambda(u) = n - \operatorname{rg}(M - \lambda I_n). $$ This number is to be compared with \(m(\lambda)\): the verdict \(\dim E_\lambda = m(\lambda)\) or \(\dim E_\lambda < m(\lambda)\) decides diagonalisability eigenvalue by eigenvalue. Skills to practice
- Exploiting the multiplicity of an eigenvalue
III
Diagonalisability
III.1
Diagonalisable endomorphisms and matrices
The favourable case of reduction is the one where a basis of eigenvectors exists. In such a basis \(u\) acts coordinate by coordinate as a dilation, and its matrix is diagonal --- the simplest shape a matrix can take. This subsection fixes the vocabulary, for endomorphisms and for matrices, and reads the geometric content: \(E\) splits into eigen-lines on each of which \(u\) is a homothety.
Definition — Diagonalisable endomorphism and matrix
The endomorphism \(u\) is diagonalisable when there exists a basis of \(E\) in which the matrix of \(u\) is diagonal. A matrix \(M \in \mathcal{M}_n(\mathbb{K})\) is diagonalisable when it is similar to a diagonal matrix, that is, \(P^{-1}MP = D\) for some \(P \in \operatorname{GL}_n(\mathbb{K})\) and some diagonal \(D\). Example — A diagonalisation written out
The matrix \(M = \begin{psmallmatrix} 3 & 1 \\ 2 & 2 \end{psmallmatrix}\) is diagonalisable. Its characteristic polynomial is \(\chi_M = (X-3)(X-2) - 2 = X^2 - 5X + 4 = (X-1)(X-4)\), with two distinct eigenvalues \(1\) and \(4\). Solving \((M - I_2)X = 0\) and \((M - 4I_2)X = 0\) gives eigenvectors \((1, -2)\) and \((1, 1)\). With \(P = \begin{psmallmatrix} 1 & 1 \\ -2 & 1 \end{psmallmatrix}\), $$ P^{-1}MP = \begin{pmatrix} 1 & 0 \\
0 & 4 \end{pmatrix}, $$ the columns of \(P\) being a basis of eigenvectors. Proposition — Endomorphism and matrix versions agree
An endomorphism \(u\) is diagonalisable if and only if its matrix in one --- equivalently every --- basis is a diagonalisable matrix. More generally, the same holds for any matrix property invariant under similarity (in particular trigonalisability and nilpotence, met in §§ 4--5).
The matrices of \(u\) in two bases are similar, and similarity is an equivalence relation. So if the matrix of \(u\) in one basis is similar to a diagonal matrix, the matrix in any other basis --- similar to the first --- is also similar to that diagonal matrix, by transitivity. Hence « the matrix is diagonalisable » does not depend on the basis. And \(u\) is diagonalisable, by definition, exactly when some basis gives a diagonal matrix, i.e. exactly when the matrix of \(u\) is similar to a diagonal one. The argument used only that similarity is transitive, so it applies verbatim to every similarity-invariant matrix property --- in particular « similar to an upper-triangular matrix » and « some power is zero », the trigonalisable and nilpotent cases of §§ 4--5.
Proposition — A diagonalising basis is a basis of eigenvectors
A basis \(\mathcal{B} = (e_1, \dots, e_n)\) diagonalises \(u\) if and only if every \(e_i\) is an eigenvector of \(u\). In that basis the diagonal entries of the matrix are the associated eigenvalues, each eigenvalue \(\lambda\) appearing exactly \(\dim E_\lambda(u)\) times.
The matrix of \(u\) in \(\mathcal{B}\) is diagonal, with diagonal entries \(d_1, \dots, d_n\), if and only if \(u(e_i) = d_i e_i\) for every \(i\) --- that is, if and only if each \(e_i\) is an eigenvector for the eigenvalue \(d_i\). For a fixed eigenvalue \(\lambda\), the basis vectors \(e_i\) with \(d_i = \lambda\) are eigenvectors in \(E_\lambda(u)\); being part of a basis they are free, and they span \(E_\lambda(u)\) (any eigenvector for \(\lambda\) is a combination of basis vectors, necessarily of those with \(d_i = \lambda\)). So they form a basis of \(E_\lambda(u)\), and \(\lambda\) occurs \(\dim E_\lambda(u)\) times on the diagonal --- as the worked \(3 \times 3\) of § 3.2 will show, where \(0\) fills two of the three diagonal slots.
Example — Diagonalisable endomorphisms one already knows
The identity and every homothety are diagonalisable --- they are already diagonal in every basis. A projector \(p\) is diagonalisable: \(E = \operatorname{Ker} p \oplus \operatorname{Im} p\), and a basis adapted to this decomposition is a basis of eigenvectors (eigenvalues among \(0\) and \(1\)). A vector symmetry is diagonalisable likewise, with eigenvalues among \(-1\) and \(1\). Example — An endomorphism that is not diagonalisable
The matrix \(M = \begin{psmallmatrix} 0 & 1 \\ 0 & 0 \end{psmallmatrix}\) is not diagonalisable. Were it \(PDP^{-1}\) with \(D\) diagonal, \(D\) would have the same spectrum as \(M\), namely \(\{0\}\) (since \(\chi_M = X^2\)), so \(D = 0\), hence \(M = P\,0\,P^{-1} = 0\) --- false. This non-zero matrix with a single eigenvalue is a nilpotent matrix, the obstruction studied in § 5.
The geometry of diagonalisation
When \(u\) is diagonalisable, \(E\) breaks into a direct sum of eigen-lines (or eigenspaces): \(E = E_{\lambda_1}(u) \oplus E_{\lambda_2}(u) \oplus \cdots\). A vector \(x\) decomposes uniquely as \(x = x_1 + x_2 + \cdots\) with \(x_i \in E_{\lambda_i}(u)\), and \(u\) stretches each piece by its own ratio: \(u(x) = \lambda_1 x_1 + \lambda_2 x_2 + \cdots\). The figure shows the two-eigenspace case.
Skills to practice
- Identifying a diagonalisable matrix
III.2
Characterisations of diagonalisability
Whether \(u\) is diagonalisable is now decided by three equivalent tests, of decreasing abstraction. The first is structural: \(E\) is the direct sum of the eigenspaces. The second counts dimensions. The third is the practical one: \(\chi_u\) must be split, and each eigenspace must reach the dimension its multiplicity allows. A sufficient shortcut closes the subsection --- \(n\) distinct eigenvalues are enough.
Theorem — Diagonalisable through the direct sum of eigenspaces
The endomorphism \(u\) is diagonalisable if and only if \(E\) is the direct sum of its eigenspaces: $$ E = \bigoplus_{\lambda \in \operatorname{Sp}(u)} E_\lambda(u). $$
The sum of the eigenspaces is always direct (§ 1.2), so the condition is really « the eigenspaces span \(E\) ».
- \((\Rightarrow)\) If \(u\) is diagonalisable, a diagonalising basis is a basis of eigenvectors; each basis vector lies in some \(E_\lambda(u)\), so the eigenspaces span \(E\), and their direct sum is \(E\).
- \((\Leftarrow)\) If \(E = \bigoplus_\lambda E_\lambda(u)\), concatenating a basis of each \(E_\lambda(u)\) yields a basis of \(E\) adapted to the decomposition. Every vector of this basis is an eigenvector, so the basis diagonalises \(u\).
Proposition — Dimension criterion
The endomorphism \(u\) is diagonalisable if and only if $$ \sum_{\lambda \in \operatorname{Sp}(u)} \dim E_\lambda(u) = n. $$
The sum \(\sum_\lambda E_\lambda(u)\) is direct, so its dimension is \(\sum_\lambda \dim E_\lambda(u)\). This sum equals the whole of \(E\) exactly when that dimension equals \(\dim E = n\). By the previous Theorem, that is exactly diagonalisability.
Proposition — An endomorphism with n distinct eigenvalues
If \(u\) has \(n = \dim E\) distinct eigenvalues, then \(u\) is diagonalisable.
Pick one eigenvector for each of the \(n\) distinct eigenvalues. By § 1.2 a family of eigenvectors associated with distinct eigenvalues is free, so these \(n\) vectors form a free family of \(E\); having \(n = \dim E\) elements, they form a basis. It is a basis of eigenvectors, so \(u\) is diagonalisable.
Theorem — Diagonalisable through the characteristic polynomial
The endomorphism \(u\) is diagonalisable over \(\mathbb{K}\) if and only if its characteristic polynomial \(\chi_u\) is split over \(\mathbb{K}\) and, for every eigenvalue \(\lambda \in \operatorname{Sp}(u)\), \(\dim E_\lambda(u) = m(\lambda)\). - \((\Rightarrow)\) Suppose \(u\) diagonalisable. In a diagonalising basis the matrix is diagonal, with each eigenvalue \(\lambda\) repeated \(\dim E_\lambda(u)\) times; its characteristic polynomial is then \(\chi_u = \prod_{\lambda} (X - \lambda)^{\dim E_\lambda(u)}\). This is split over \(\mathbb{K}\), and the exponent of \((X - \lambda)\), namely \(m(\lambda)\), equals \(\dim E_\lambda(u)\).
- \((\Leftarrow)\) Suppose \(\chi_u\) split with \(\dim E_\lambda(u) = m(\lambda)\) for every \(\lambda\). Since \(\chi_u\) is split of degree \(n\), the multiplicities sum to \(n\): \(\sum_\lambda m(\lambda) = n\). Hence \(\sum_\lambda \dim E_\lambda(u) = \sum_\lambda m(\lambda) = n\), and the dimension criterion gives diagonalisability.
Method — Diagonalise a matrix
Given \(M \in \mathcal{M}_n(\mathbb{K})\), proceed in two stages. - Decide. Compute \(\chi_M\) and factor it. If \(\chi_M\) is not split over \(\mathbb{K}\), \(M\) is not diagonalisable over \(\mathbb{K}\). If it is split, for each eigenvalue \(\lambda\) of multiplicity \(m(\lambda) \ge 2\) compute \(\dim E_\lambda = n - \operatorname{rg}(M - \lambda I_n)\) and compare with \(m(\lambda)\); \(M\) is diagonalisable if and only if equality holds for every \(\lambda\) (simple eigenvalues need no check, \(\dim E_\lambda = 1 = m(\lambda)\) automatically).
- Diagonalise. When \(M\) is diagonalisable, solve \((M - \lambda I_n)X = 0\) for each \(\lambda\) to get a basis of \(E_\lambda\), concatenate these into the columns of \(P\); then \(P^{-1}MP = D\) is diagonal, with the eigenvalues on the diagonal in the matching order.
Example — A non-diagonalisable verdict
The matrix \(M = \begin{psmallmatrix} 2 & 1 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 2 \end{psmallmatrix}\) has \(\chi_M = (X - 2)^3\), split over \(\mathbb{R}\), with the single eigenvalue \(2\) of multiplicity \(m(2) = 3\). Applying the Method: \(M - 2I_3\) has rank \(1\), so \(\dim E_2 = 3 - 1 = 2 < 3 = m(2)\). The criterion fails for the eigenvalue \(2\), so \(M\) is not diagonalisable --- even though \(\chi_M\) is split. The split characteristic polynomial is necessary but not sufficient; the dimension test is what decides. Example — Diagonalise a 3 by 3 matrix
Diagonalise the matrix $$ M = \begin{pmatrix} 1 & 1 & 1 \\
1 & 1 & 1 \\
1 & 1 & 1 \end{pmatrix} \in \mathcal{M}_3(\mathbb{R}). $$
\(M\) has rank \(1\), so \(0\) is an eigenvalue with \(\dim E_0 = 3 - \operatorname{rg}(M) = 2\); hence \(m(0) \ge \dim E_0 = 2\) and \(X^2\) divides the monic degree-\(3\) polynomial \(\chi_M\), giving \(\chi_M = X^2(X - a)\) for some \(a\). The coefficient of \(X^2\) in \(\chi_M\) is \(-\operatorname{tr}(M) = -3\) by the degree-and-coefficients theorem of § 2.1, and in \(X^2(X - a)\) it is \(-a\); so \(a = 3\) and \(\chi_M = X^2(X - 3)\), split over \(\mathbb{R}\), with \(m(0) = 2\) and \(m(3) = 1\).
- \(\dim E_0 = 2 = m(0)\) and \(\dim E_3 = 1 = m(3)\) (simple), so \(M\) is diagonalisable.
- \(E_0 = \operatorname{Ker} M\): the system \(x + y + z = 0\) gives the basis \((1, -1, 0)\), \((1, 0, -1)\).
- \(E_3 = \operatorname{Ker}(M - 3I_3)\): the basis \((1, 1, 1)\).
Skills to practice
- Diagonalising a matrix
IV
Trigonalisability
IV.1
Trigonalisable endomorphisms and matrices
Diagonalisation can fail --- the matrix \(\begin{psmallmatrix} 2 & 1 \\ 0 & 2 \end{psmallmatrix}\) already resists it. When it does, the next best shape is triangular: a basis in which the matrix of \(u\) is upper-triangular still exposes the spectrum on the diagonal and tames the powers of \(u\). This subsection names trigonalisable endomorphisms and matrices; the next characterises them.
Definition — Trigonalisable endomorphism and matrix
The endomorphism \(u\) is trigonalisable when there exists a basis of \(E\) in which the matrix of \(u\) is upper-triangular. A matrix \(M \in \mathcal{M}_n(\mathbb{K})\) is trigonalisable when it is similar to an upper-triangular matrix. As for diagonalisability (§ 3.1), an endomorphism is trigonalisable if and only if its matrix in any basis is a trigonalisable matrix. Example — Diagonalisable implies trigonalisable
Every upper-triangular matrix is trigonalisable, being similar to itself. Every diagonalisable endomorphism is trigonalisable, a diagonal matrix being in particular upper-triangular. Trigonalisability is thus a weaker demand than diagonalisability --- strictly weaker, as the next example shows. Example — Trigonalisable without being diagonalisable
The matrix \(\begin{psmallmatrix} 1 & 1 \\ 0 & 1 \end{psmallmatrix}\) is upper-triangular, hence trigonalisable, yet it is not diagonalisable: \(\chi = (X-1)^2\) and \(\dim E_1 = 2 - \operatorname{rg}\begin{psmallmatrix} 0 & 1 \\ 0 & 0 \end{psmallmatrix} = 1 < 2 = m(1)\). Trigonalisability is genuinely the broader notion. Example — A trigonalisation exhibited
For \(M = \begin{psmallmatrix} 2 & -1 \\ 1 & 0 \end{psmallmatrix}\), take \(P = \begin{psmallmatrix} 1 & 1 \\ 1 & 0 \end{psmallmatrix}\); a direct computation gives \(P^{-1}MP = \begin{psmallmatrix} 1 & 1 \\ 0 & 1 \end{psmallmatrix}\), upper-triangular. So \(M\) is trigonalisable, with sole eigenvalue \(1\). Here the triangularising \(P\) is given and the similarity merely checked --- finding such a \(P\) is not a program objective (see § 4.2). Skills to practice
- Identifying a trigonalisable endomorphism
IV.2
Characterisation by the characteristic polynomial
Trigonalisability has a single, clean criterion: it holds exactly when the characteristic polynomial is split. One direction is immediate --- a triangular matrix has a split characteristic polynomial. The other is an induction that peels off one eigenvector at a time. The criterion has an immediate consequence over \(\mathbb{C}\), and it pins down the trace and the determinant in terms of the eigenvalues.
Theorem — Trigonalisable through a split characteristic polynomial
The endomorphism \(u\) is trigonalisable over \(\mathbb{K}\) if and only if its characteristic polynomial \(\chi_u\) is split over \(\mathbb{K}\). - \((\Rightarrow)\) If \(u\) is trigonalisable, in a suitable basis its matrix \(T\) is upper-triangular with diagonal \(d_1, \dots, d_n\), so \(\chi_u = \chi_T = \prod_{i=1}^n (X - d_i)\) is split over \(\mathbb{K}\).
- \((\Leftarrow)\) Suppose \(\chi_u\) split; we trigonalise by induction on \(n = \dim E\).
- Initialization. For \(n = 1\) every matrix is already upper-triangular.
- Heredity. Assume the result in dimension \(n - 1\). As \(\chi_u\) is split it has a root \(\lambda \in \mathbb{K}\), an eigenvalue of \(u\); let \(e_1\) be an associated eigenvector and complete it into a basis \((e_1, \dots, e_n)\). The matrix of \(u\) in this basis is $$ M = \begin{pmatrix} \lambda & L \\ 0 & M' \end{pmatrix}, \qquad M' \in \mathcal{M}_{n-1}(\mathbb{K}). $$ Then \(\chi_u = (X - \lambda)\chi_{M'}\), and since \(\chi_u\) is split so is \(\chi_{M'}\). By the induction hypothesis \(M'\) is trigonalisable: \(M' = Q^{-1}T'Q\) with \(T'\) upper-triangular. Setting \(P = \begin{psmallmatrix} 1 & 0 \\ 0 & Q^{-1} \end{psmallmatrix}\), a block computation gives $$ P^{-1}MP = \begin{pmatrix} \lambda & LQ^{-1} \\ 0 & QM'Q^{-1} \end{pmatrix} = \begin{pmatrix} \lambda & LQ^{-1} \\ 0 & T' \end{pmatrix}, $$ which is upper-triangular. So \(u\) is trigonalisable.
Proposition — Trigonalisability over the complex numbers
Every matrix \(M \in \mathcal{M}_n(\mathbb{C})\) is trigonalisable in \(\mathcal{M}_n(\mathbb{C})\), since \(\chi_M\) is always split over \(\mathbb{C}\). A real matrix \(M\) is trigonalisable over \(\mathbb{R}\) exactly when \(\chi_M\) splits over \(\mathbb{R}\); when it does not --- for instance the matrix of a planar rotation of angle \(\theta \notin \pi\mathbb{Z}\), whose \(\chi\) has no real root --- \(M\) is still trigonalisable once read in \(\mathcal{M}_n(\mathbb{C})\).
For \(M \in \mathcal{M}_n(\mathbb{C})\), the polynomial \(\chi_M\) has degree \(n\) and, by d'Alembert-Gauss (recalled from Polynomials), splits over \(\mathbb{C}\); the Theorem then makes \(M\) trigonalisable over \(\mathbb{C}\). For a real matrix \(M\), the Theorem applied with \(\mathbb{K} = \mathbb{R}\) gives trigonalisability over \(\mathbb{R}\) exactly when \(\chi_M\) splits over \(\mathbb{R}\); failing that, reading \(M\) in \(\mathcal{M}_n(\mathbb{C})\) brings back the complex case.
Carrying out a trigonalisation in practice is not a program objective. The theorem is used as a structural tool --- it guarantees that a triangular form exists --- but the explicit search for a triangularising basis is not required. The exercises only ever verify a triangular form that is handed to them.
Proposition — Trace and determinant of a trigonalisable endomorphism
If \(u\) is trigonalisable over \(\mathbb{K}\), then \(\operatorname{tr}(u)\) is the sum and \(\det(u)\) the product of the eigenvalues of \(u\), each counted with its multiplicity.
Trigonalise \(u\): in a suitable basis its matrix is upper-triangular, \(T\), with diagonal entries \(d_1, \dots, d_n\). The diagonal of a triangular matrix carries its eigenvalues, each \(\lambda\) appearing \(m(\lambda)\) times since \(\chi_T = \prod_i (X - d_i)\). Trace and determinant are similarity invariants, so \(\operatorname{tr}(u) = \operatorname{tr}(T) = \sum_i d_i\) and \(\det(u) = \det(T) = \prod_i d_i\) --- the sum and product of the eigenvalues counted with multiplicity.
Extension to any endomorphism
More generally, for any \(u \in \mathcal{L}(E)\), the trace and determinant are the sum and product of the roots of \(\chi_u\) in \(\mathbb{C}\), counted with multiplicity. This needs no new theorem: \(\chi_u = X^n - \operatorname{tr}(u)X^{n-1} + \dots + (-1)^n\det(u)\) from § 2, and \(\chi_u\) factors over \(\mathbb{C}\) as \(\prod_{i=1}^n (X - \lambda_i)\) by d'Alembert-Gauss; identifying the coefficients of \(X^{n-1}\) and the constant term reads off \(\operatorname{tr}(u) = \sum_i \lambda_i\) and \(\det(u) = \prod_i \lambda_i\). When \(\mathbb{K} = \mathbb{C}\) these roots are exactly the eigenvalues; over a proper subfield, the non-\(\mathbb{K}\) roots are not eigenvalues of \(u\).
Example — Reading trace and determinant off the eigenvalues
Let \(M\) be trigonalisable over \(\mathbb{K}\), similar to an upper-triangular \(T\) of diagonal \((\lambda_i)\) --- the eigenvalues counted with multiplicity. Then \(\operatorname{tr}(M) = \sum_i \lambda_i\) and \(\det(M) = \prod_i \lambda_i\). Moreover \(M^k = P^{-1}T^kP\) with \(T^k\) upper-triangular of diagonal \((\lambda_i^k)\), so for every integer \(k \ge 1\), $$ \operatorname{tr}(M^k) = \operatorname{tr}(T^k) = \sum_i \lambda_i^k, $$ by similarity invariance of the trace. A matrix with eigenvalues \(1, 2, 2\) thus has \(\operatorname{tr}(M) = 5\), \(\det(M) = 4\) and \(\operatorname{tr}(M^2) = 1 + 4 + 4 = 9\). Method — Exploit a split characteristic polynomial
When \(\chi_u\) is split --- always so over \(\mathbb{C}\) --- \(u\) is trigonalisable, and one may reason as though its matrix were upper-triangular even without computing a triangularising basis. Use this to read \(\operatorname{tr}(u)\) and \(\det(u)\) as the sum and product of the eigenvalues counted with multiplicity, to compute \(\operatorname{tr}(u^k) = \sum \lambda_i^k\) for \(k \ge 1\) (same multiplicity convention), and to bound or locate the spectrum. The triangular form is invoked for what it guarantees, never searched for explicitly. Skills to practice
- Exploiting a split characteristic polynomial
V
Nilpotent endomorphisms
V.1
Nilpotent endomorphisms and matrices
At the opposite extreme from diagonalisability sits a family of endomorphisms that no basis can diagonalise unless they are zero: the nilpotent ones, those a high enough power annihilates. They are the pure obstruction to reduction --- a non-zero nilpotent endomorphism has a single eigenvalue, \(0\), and yet is not the zero map. This subsection defines them and bounds how soon they vanish.
Definition — Nilpotent endomorphism and matrix
The endomorphism \(u\) is nilpotent when \(u^k = 0_{\mathcal{L}(E)}\) for some \(k \in \mathbb{N}^*\). The least such \(k\) is the index of nilpotence of \(u\): writing it \(p\), one has \(u^p = 0\) and \(u^{p-1} \neq 0\). A matrix \(M \in \mathcal{M}_n(\mathbb{K})\) is nilpotent when \(M^k = 0\) for some \(k \in \mathbb{N}^*\). As for diagonalisability (§ 3.1), an endomorphism is nilpotent if and only if its matrix in any basis is a nilpotent matrix. Example — The shift on a basis
On a space with basis \((e_1, \dots, e_p)\), the endomorphism \(u\) defined by \(u(e_i) = e_{i-1}\) for \(i \ge 2\) and \(u(e_1) = 0_E\) is nilpotent of index exactly \(p\): each application of \(u\) shifts the basis one step down the chain, and \(u^p\) kills every \(e_i\) while \(u^{p-1}(e_p) = e_1 \neq 0_E\). Its matrix is strictly upper-triangular with \(1\)'s just above the diagonal.
Example — The derivation on polynomials of bounded degree
On \(\mathbb{K}_d[X]\), the space of polynomials of degree at most \(d\) (of dimension \(d + 1\)), the derivation \(D : P \mapsto P'\) is nilpotent: differentiating \(d + 1\) times annihilates every polynomial of degree \(\le d\). Its index is exactly \(d + 1\), since \(D^d(X^d) = d!\,\neq 0\) while \(D^{d+1}(X^d) = 0\) --- the index equals the dimension. Example — A strictly triangular matrix and its index
A strictly upper-triangular matrix is nilpotent. For the strictly triangular \(N = \begin{psmallmatrix} 0 & 1 & 2 \\ 0 & 0 & 3 \\ 0 & 0 & 0 \end{psmallmatrix}\), one computes \(N^2 = \begin{psmallmatrix} 0 & 0 & 3 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{psmallmatrix} \neq 0\) and \(N^3 = 0\): the index of nilpotence is \(3\), equal to the size of the matrix. Proposition — The index is bounded by the dimension
Let \(u\) be a nilpotent endomorphism of \(E\). Its index of nilpotence is at most \(n = \dim E\).
Let \(p\) be the index, so \(u^{p-1} \neq 0\): there is a vector \(x\) with \(u^{p-1}(x) \neq 0_E\). We show the family \((x, u(x), \dots, u^{p-1}(x))\) is free, which forces \(p \le n\) since a free family of \(E\) has at most \(n\) vectors.
Suppose \(\alpha_0 x + \alpha_1 u(x) + \dots + \alpha_{p-1} u^{p-1}(x) = 0_E\). Apply \(u^{p-1}\): every term \(u^{p-1+j}(x)\) with \(j \ge 1\) vanishes (as \(u^p = 0\)), leaving \(\alpha_0 u^{p-1}(x) = 0_E\), hence \(\alpha_0 = 0\). Apply \(u^{p-2}\) to what remains: likewise \(\alpha_1 = 0\). Continuing, every \(\alpha_j = 0\). The family is free.
Suppose \(\alpha_0 x + \alpha_1 u(x) + \dots + \alpha_{p-1} u^{p-1}(x) = 0_E\). Apply \(u^{p-1}\): every term \(u^{p-1+j}(x)\) with \(j \ge 1\) vanishes (as \(u^p = 0\)), leaving \(\alpha_0 u^{p-1}(x) = 0_E\), hence \(\alpha_0 = 0\). Apply \(u^{p-2}\) to what remains: likewise \(\alpha_1 = 0\). Continuing, every \(\alpha_j = 0\). The family is free.
Skills to practice
- Identifying a nilpotent endomorphism
V.2
Characterisation of nilpotence
Nilpotence has a clean spectral signature. A nilpotent endomorphism is exactly one whose characteristic polynomial is \(X^n\) --- equivalently, one that is trigonalisable with \(0\) as its only eigenvalue. The proof stays entirely inside \(\mathbb{K}\): it reads the strictly triangular shape off the chain of kernels \(\operatorname{Ker} u \subseteq \operatorname{Ker} u^2 \subseteq \cdots\).
Theorem — Characterisation of nilpotence
For \(u \in \mathcal{L}(E)\), the following are equivalent: - \(u\) is nilpotent;
- \(\chi_u = X^n\);
- \(u\) is trigonalisable over \(\mathbb{K}\) with \(0\) as its only eigenvalue.
The argument runs entirely over \(\mathbb{K}\), with no complexification.
- \((1 \Rightarrow 2)\) Let \(u\) be nilpotent of index \(p\). The chain of kernels \(\{0_E\} \subseteq \operatorname{Ker} u \subseteq \operatorname{Ker} u^2 \subseteq \dots \subseteq \operatorname{Ker} u^p = E\) has each term stable by \(u\), and \(u(\operatorname{Ker} u^k) \subseteq \operatorname{Ker} u^{k-1}\): indeed \(x \in \operatorname{Ker} u^k\) gives \(u^{k-1}(u(x)) = u^k(x) = 0_E\). Build a basis of \(E\) adapted to this chain, completing a basis of \(\operatorname{Ker} u^{k-1}\) into one of \(\operatorname{Ker} u^k\) at each step. In that basis \(u\) sends each block strictly down the chain, so its matrix is strictly upper-triangular, with a zero diagonal. Hence \(\chi_u = \prod (X - 0) = X^n\).
- \((2 \Rightarrow 3)\) If \(\chi_u = X^n\), it is split over \(\mathbb{K}\), so \(u\) is trigonalisable over \(\mathbb{K}\); the eigenvalues are the roots of \(\chi_u\), namely \(0\) alone.
- \((3 \Rightarrow 1)\) Trigonalise \(u\): its matrix is upper-triangular, and its diagonal carries the eigenvalues, all equal to \(0\). So the matrix \(T\) is strictly upper-triangular. Such a \(T\) satisfies \(T^n = 0\): writing \((e_1, \dots, e_n)\) for the basis, \(T\) sends \(\operatorname{Vect}(e_1, \dots, e_i)\) into \(\operatorname{Vect}(e_1, \dots, e_{i-1})\) --- lowering the index by at least one --- so \(T^n\) annihilates every basis vector. Hence \(u^n = 0\) and \(u\) is nilpotent.
Example — Recognising nilpotence from the characteristic polynomial
A nilpotent matrix \(N\) has \(\chi_N = X^n\), so all its coefficients of degree \(< n\) vanish: in particular \(\operatorname{tr}(N) = 0\) and \(\det(N) = 0\). Conversely, the matrix \(N = \begin{psmallmatrix} 1 & 1 \\ -1 & -1 \end{psmallmatrix}\) has \(\operatorname{tr}(N) = 0\), \(\det(N) = 0\) and \(\chi_N = X^2\), hence is nilpotent --- and indeed \(N^2 = 0\). In dimension at least \(2\), zero trace alone never suffices, and in dimension \(\ge 3\) even zero trace together with zero determinant do not force nilpotence; the full identity \(\chi_N = X^n\) is the test. Method — Recognise a nilpotent endomorphism
To decide whether \(u\) (matrix \(M\)) is nilpotent, three routes are open. Compute \(\chi_u\) and check \(\chi_u = X^n\). Or trigonalise and check that the only eigenvalue is \(0\). Or, most directly for an explicit matrix, compute the successive powers \(M, M^2, \dots\) until one vanishes (the Theorem guarantees that if \(M\) is nilpotent then \(M^n = 0\), so it suffices to test up to the exponent \(n\)).
Going further
This chapter has followed the geometric road to reduction: stable subspaces, eigenspaces, direct sums. A second, algebraic road --- annihilating polynomials, the minimal polynomial \(\pi_u\), the Cayley-Hamilton theorem, the kernel-decomposition lemma --- is taken up in Polynomials of an endomorphism, where diagonalisability is recast as « the minimal polynomial is split with simple roots ». Reduction is also the engine behind several later chapters: it computes matrix powers \(M^k\), solves linear recurrent sequences, and integrates linear differential systems \(X' = AX\).
Skills to practice
- Applying the nilpotence characterisation
Jump to section