\( \definecolor{colordef}{RGB}{249,49,84} \definecolor{colorprop}{RGB}{18,102,241} \)
CommeUnJeu · L2 MP

Reduction: eigen-elements and diagonalisation

⌚ ~83 min ▢ 10 blocks ✓ 18 exercises Prerequisites : Determinants, Change of basis\(\virgule\) equivalence\(\virgule\) similarity, Polynomials, Further linear algebra
Given an endomorphism \(u\) of a vector space \(E\), reduction asks for a basis of \(E\) in which the matrix of \(u\) is as simple as possible --- ideally diagonal, so that \(u\) merely scales each basis vector. This chapter builds the two tools that decide whether such a basis exists. The first is geometric: the eigen-elements of \(u\) --- the directions \(u\) only stretches. The second is algebraic: the characteristic polynomial, which turns the question « is \(\lambda\) an eigenvalue? » into « is \(\lambda\) a root? ».
The headline results are three. An endomorphism is diagonalisable exactly when \(E\) is the direct sum of its eigenspaces --- equivalently, when its characteristic polynomial is split and each eigenspace has the expected dimension. It is trigonalisable (a triangular matrix) exactly when its characteristic polynomial is split. And a nilpotent endomorphism is precisely one that is trigonalisable with \(0\) as its only eigenvalue.
Conventions
Throughout this chapter, \(\mathbb{K}\) is a subfield of \(\mathbb{C}\) and \(E\) is a finite-dimensional, non-zero \(\mathbb{K}\)-vector space, with \(n = \dim E \ge 1\). We write \(u \in \mathcal{L}(E)\) for an endomorphism, \(M \in \mathcal{M}_n(\mathbb{K})\) for a square matrix, \(\operatorname{Id}_E\) for the identity and \(I_n\) for the identity matrix. Determinant, trace, rank, similar matrices and the matrix of an endomorphism in a basis are those of Determinants and Change of basis; polynomials, roots, multiplicity and split polynomials those of Polynomials; direct sums, stable subspaces and the induced endomorphism those of Further linear algebra. The characteristic polynomial is defined as \(\det(XI_n - M)\); its degree, monicity and coefficients are established in § 2.
I Eigen-elements
I.1 Eigenvalues\(\virgule\) eigenvectors\(\virgule\) eigenspaces
The reduction problem starts with a single observation: some vectors are merely stretched by \(u\). A non-zero vector \(x\) with \(u(x) = \lambda x\) keeps its direction --- \(u\) acts on the line it spans as the homothety of ratio \(\lambda\). Such directions are the raw material of reduction: a basis made entirely of them turns the matrix of \(u\) diagonal. This subsection names them and gathers their first properties.
Definition — Eigenvalue and eigenvector
A scalar \(\lambda \in \mathbb{K}\) is an eigenvalue of \(u\) when there exists a vector \(x \neq 0_E\) such that \(u(x) = \lambda x\). Such a vector \(x\) is then an eigenvector of \(u\) associated with \(\lambda\). An eigenvector is, by definition, never the zero vector.
Example — Eigen-elements of three familiar endomorphisms
A homothety \(\lambda_0 \operatorname{Id}_E\) admits every non-zero vector as an eigenvector, all for the single eigenvalue \(\lambda_0\). A projector \(p\) has eigenvalues among \(0\) and \(1\): every vector of \(\operatorname{Im} p\) satisfies \(p(x) = x\) and every vector of \(\operatorname{Ker} p\) satisfies \(p(x) = 0\) --- both \(0\) and \(1\) occur as soon as \(\operatorname{Ker} p\) and \(\operatorname{Im} p\) are both non-zero. A vector symmetry \(s\) has eigenvalues among \(-1\) and \(1\), both occurring when its fixed and anti-fixed subspaces are both non-zero.
Definition — Eigenspace
Let \(\lambda \in \mathbb{K}\). The eigenspace of \(u\) associated with \(\lambda\) is $$ E_\lambda(u) = \operatorname{Ker}(u - \lambda \operatorname{Id}_E). $$ It is a subspace of \(E\), being the kernel of an endomorphism. The scalar \(\lambda\) is an eigenvalue of \(u\) if and only if \(E_\lambda(u) \neq \{0_E\}\), and \(E_\lambda(u)\) is then exactly the set of eigenvectors associated with \(\lambda\), together with \(0_E\).
Example — An eigenspace as a kernel
For \(M = \begin{psmallmatrix} 1 & 2 \\ 2 & 1 \end{psmallmatrix}\), take \(\lambda = 3\): the eigenspace is \(E_3 = \operatorname{Ker}(M - 3I_2) = \operatorname{Ker}\begin{psmallmatrix} -2 & 2 \\ 2 & -2 \end{psmallmatrix}\). The system \(-2x + 2y = 0\) has solution line \(\operatorname{Vect}(1, 1)\), so \(E_3 = \operatorname{Vect}(1, 1)\), a line; \(3\) is an eigenvalue since \(E_3 \neq \{0\}\). For \(\lambda = 0\), \(\operatorname{Ker} M = \{0\}\) (as \(M\) is invertible), so \(0\) is not an eigenvalue.
Definition — Spectrum
The spectrum of \(u\), written \(\operatorname{Sp}(u)\), is the set of eigenvalues of \(u\) in \(\mathbb{K}\). For a matrix \(M \in \mathcal{M}_n(\mathbb{K})\) one defines likewise \(\operatorname{Sp}(M)\), the set of \(\lambda \in \mathbb{K}\) for which \(MX = \lambda X\) has a non-zero column solution \(X\) --- such an \(X\) being a matrix eigenvector, tied to the eigenvectors of \(u\) by the matrix-translation Proposition below.
Example — Eigenspaces and spectrum of a diagonal matrix
Let \(u\) be the endomorphism of \(\mathbb{R}^3\) with matrix \(D = \operatorname{diag}(5, 5, -2)\) in the canonical basis \((e_1, e_2, e_3)\). Then \(u(e_1) = 5e_1\), \(u(e_2) = 5e_2\) and \(u(e_3) = -2e_3\), so the spectrum is \(\operatorname{Sp}(u) = \{5, -2\}\). The eigenspaces are \(E_5(u) = \operatorname{Vect}(e_1, e_2)\), a plane, and \(E_{-2}(u) = \operatorname{Vect}(e_3)\), a line; here \(E_5(u) \oplus E_{-2}(u) = \mathbb{R}^3\).
The notion of spectral value is out of the program scope. In infinite dimension one distinguishes eigenvalues from a wider set of « spectral values »; here, in finite dimension, the spectrum is simply the set of eigenvalues and nothing more is needed.
Proposition — Matrix translation
Let \(\mathcal{B}\) be a basis of \(E\) and \(M = \operatorname{Mat}_\mathcal{B}(u)\). A vector \(x\), with coordinate column \(X\) in \(\mathcal{B}\), is an eigenvector of \(u\) associated with \(\lambda\) if and only if \(X \neq 0\) and \(MX = \lambda X\). Consequently \(\operatorname{Sp}(u) = \operatorname{Sp}(M)\).

The coordinate map in \(\mathcal{B}\) is a linear isomorphism \(E \to \mathcal{M}_{n,1}(\mathbb{K})\) sending \(x\) to \(X\) and \(u(x)\) to \(MX\). Hence \(u(x) = \lambda x\) holds if and only if \(MX = \lambda X\), and \(x \neq 0_E\) if and only if \(X \neq 0\). The two notions of eigenvector --- for \(u\) and for \(M\) --- therefore correspond, and so do the two spectra.

Proposition — The eigenvalue zero
The scalar \(0\) is an eigenvalue of \(u\) if and only if \(u\) is not injective, equivalently --- \(E\) being finite-dimensional --- if and only if \(u\) is not invertible.

By definition \(0\) is an eigenvalue of \(u\) iff \(E_0(u) = \operatorname{Ker}(u - 0 \cdot \operatorname{Id}_E) = \operatorname{Ker} u \neq \{0_E\}\), that is, iff \(u\) is not injective. In finite dimension an endomorphism is injective if and only if it is bijective, so this is also the negation of « \(u\) invertible ».

Geometric meaning
Geometrically, an eigenvector marks a direction that \(u\) leaves invariant: on the eigen-line it spans, \(u\) acts as a pure dilation. Off such a line, \(u\) generally tilts a vector into a new direction. The figure shows an eigen-line, on which \(x\) and \(u(x) = \lambda x\) stay collinear, beside an ordinary direction, where \(y\) and \(u(y)\) point differently.
Example — Determine the eigen-elements of a matrix
For the endomorphism \(u\) of \(\mathbb{R}^3\) whose matrix in the canonical basis is $$ M = \begin{pmatrix} 0 & 2 & -1 \\ 3 & -2 & 0 \\ -2 & 2 & 1 \end{pmatrix}, $$ it will be shown in § 2 that the spectrum is \(\operatorname{Sp}(M) = \{1, 2, -4\}\) --- the systematic determination of the eigenvalues is the work of the characteristic polynomial. Taking these eigenvalues as given, determine the three eigenspaces.

Each eigenspace is the kernel \(E_\lambda(u) = \operatorname{Ker}(M - \lambda I_3)\). Resolving the system \((M - \lambda I_3)X = 0\) for each given eigenvalue:
  • \(\lambda = 1\): \(E_1(u) = \operatorname{Vect}(1, 1, 1)\);
  • \(\lambda = 2\): \(E_2(u) = \operatorname{Vect}(4, 3, -2)\);
  • \(\lambda = -4\): \(E_{-4}(u) = \operatorname{Vect}(2, -3, 2)\).
Each eigenspace is a line; concatenating one eigenvector from each gives three vectors of \(\mathbb{R}^3\) --- a free family, as we shall see in § 1.2.

Method — Test whether a scalar is an eigenvalue
To decide whether \(\lambda \in \mathbb{K}\) is an eigenvalue of \(u\) (matrix \(M\)): \(\lambda \in \operatorname{Sp}(u)\) if and only if \(u - \lambda \operatorname{Id}_E\) is not injective, equivalently if and only if \(M - \lambda I_n\) is not invertible, equivalently if and only if \(\operatorname{rg}(M - \lambda I_n) < n\). The eigenspace \(E_\lambda(u)\) is then the solution set of the homogeneous system \((M - \lambda I_n)X = 0\), of dimension \(n - \operatorname{rg}(M - \lambda I_n)\) by the rank theorem.
Skills to practice
  • Computing eigenvalues and eigenvectors
I.2 The spectrum in finite dimension
Distinct eigenvalues pull in independent directions. This is the structural fact behind all of reduction: eigenspaces attached to different eigenvalues never overlap beyond \(0_E\), they sit in direct sum. Two consequences follow at once --- eigenvectors for distinct eigenvalues form a free family, and an endomorphism of an \(n\)-dimensional space has at most \(n\) eigenvalues.
Theorem — Eigenspaces of distinct eigenvalues are in direct sum
Let \(\lambda_1, \dots, \lambda_p\) be pairwise distinct eigenvalues of \(u\). Then the sum \(E_{\lambda_1}(u) + \dots + E_{\lambda_p}(u)\) is direct.

We argue by induction on \(p \ge 1\).
  • Initialization. For \(p = 1\) there is a single subspace and nothing to prove.
  • Heredity. Assume the result for \(p\) pairwise distinct eigenvalues. Let \(\lambda_1, \dots, \lambda_{p+1}\) be pairwise distinct, and take \(x_i \in E_{\lambda_i}(u)\) with $$ x_1 + \dots + x_p + x_{p+1} = 0_E. \tag{\(*\)} $$ Applying \(u\) and using \(u(x_i) = \lambda_i x_i\) gives \(\lambda_1 x_1 + \dots + \lambda_{p+1} x_{p+1} = 0_E\). Subtracting \(\lambda_{p+1}\) times \((*)\) from this: $$ (\lambda_1 - \lambda_{p+1})x_1 + \dots + (\lambda_p - \lambda_{p+1})x_p = 0_E. $$ By the induction hypothesis the sum \(E_{\lambda_1} + \dots + E_{\lambda_p}\) is direct, so each term vanishes: \((\lambda_i - \lambda_{p+1})x_i = 0_E\) for \(i \le p\). As \(\lambda_i \neq \lambda_{p+1}\), this forces \(x_i = 0_E\) for \(i \le p\). Reporting into \((*)\) gives \(x_{p+1} = 0_E\) as well. The decomposition of \(0_E\) is trivial, so the sum of the \(p+1\) eigenspaces is direct.

Proposition — Eigenvectors of distinct eigenvalues are free
A family of eigenvectors of \(u\) associated with pairwise distinct eigenvalues is free.

Let \(x_1, \dots, x_p\) be eigenvectors associated with pairwise distinct eigenvalues \(\lambda_1, \dots, \lambda_p\), and suppose \(\alpha_1 x_1 + \dots + \alpha_p x_p = 0_E\). Each \(\alpha_i x_i\) lies in \(E_{\lambda_i}(u)\), and the sum of these eigenspaces is direct by the previous Theorem, so \(\alpha_i x_i = 0_E\) for every \(i\). Since \(x_i \neq 0_E\), this gives \(\alpha_i = 0\) for every \(i\). The family is free.

Proposition — The spectrum is finite
The spectrum \(\operatorname{Sp}(u)\) is finite, with \(\#\operatorname{Sp}(u) \le n = \dim E\).

Suppose \(u\) had \(n + 1\) pairwise distinct eigenvalues \(\lambda_1, \dots, \lambda_{n+1}\). Choosing an eigenvector \(x_i\) for each, the family \((x_1, \dots, x_{n+1})\) is free by the previous Proposition. But a free family of \(E\) has at most \(\dim E = n\) vectors --- a contradiction. So \(u\) has at most \(n\) distinct eigenvalues: \(\operatorname{Sp}(u)\) is finite, with \(\#\operatorname{Sp}(u) \le n\).

Example — An endomorphism with the maximum number of eigenvalues
The endomorphism of the previous answer has spectrum \(\{1, 2, -4\}\), three eigenvalues for \(n = 3\) --- the bound \(\#\operatorname{Sp}(u) \le n\) is attained. Its three eigenvectors \((1,1,1)\), \((4,3,-2)\), \((2,-3,2)\) form a free family, hence a basis of \(\mathbb{R}^3\).
Why finite dimension matters
A brief aside, stepping outside the chapter's finite-dimension convention. On the space \(\mathcal{C}^\infty(\mathbb{R})\) of smooth functions, the derivation \(f \mapsto f'\) admits every real \(\lambda\) as an eigenvalue: the function \(x \mapsto e^{\lambda x}\) satisfies \(f' = \lambda f\). Its spectrum is the whole of \(\mathbb{R}\), infinite. So the finiteness of the spectrum is genuinely a finite-dimension phenomenon, resting on the bound « a free family has at most \(n\) vectors ».
Proposition — Commuting endomorphisms
Let \(u, v \in \mathcal{L}(E)\) with \(u \circ v = v \circ u\). Then every eigenspace \(E_\lambda(u)\) is stable by \(v\).

From \(u \circ v = v \circ u\) one gets \((u - \lambda \operatorname{Id}_E) \circ v = v \circ (u - \lambda \operatorname{Id}_E)\), so \(u - \lambda \operatorname{Id}_E\) and \(v\) commute. By the kernel-stability of commuting endomorphisms --- recalled from Further linear algebra --- the kernel \(E_\lambda(u) = \operatorname{Ker}(u - \lambda \operatorname{Id}_E)\) is stable by \(v\).

Proposition — Similar matrices share the spectrum
Two similar matrices have the same spectrum: if \(A \sim B\) in \(\mathcal{M}_n(\mathbb{K})\), then \(\operatorname{Sp}(A) = \operatorname{Sp}(B)\).

Write \(B = P^{-1}AP\) with \(P \in \operatorname{GL}_n(\mathbb{K})\). If \(\lambda \in \operatorname{Sp}(B)\), take \(X \neq 0\) with \(BX = \lambda X\). Then \(A(PX) = P(BX) = \lambda(PX)\), and \(PX \neq 0\) since \(P\) is invertible, so \(\lambda \in \operatorname{Sp}(A)\). Thus \(\operatorname{Sp}(B) \subseteq \operatorname{Sp}(A)\); exchanging the roles of \(A\) and \(B\) (via \(A = PBP^{-1}\)) gives the reverse inclusion.

Example — An eigenspace stable under a commuting endomorphism
Let \(u\) have \(n\) distinct eigenvalues and let \(v\) commute with \(u\). Since \(u\) has \(n\) distinct eigenvalues, its \(n\) eigenvectors form a basis and each eigenspace \(E_\lambda(u)\) is an eigen-line. Each such line is stable by \(v\), so \(v\) maps every eigenvector of \(u\) to a multiple of itself: \(v\) is itself diagonal in the eigenbasis of \(u\). Commuting with a fixed endomorphism with \(n\) distinct eigenvalues thus forces a shared eigenbasis --- a first glimpse of simultaneous reduction.
Skills to practice
  • Using the direct sum of eigenspaces
II The characteristic polynomial
II.1 Definition and first properties
Searching for eigenvalues by hand --- testing each scalar \(\lambda\) for the singularity of \(u - \lambda \operatorname{Id}_E\) --- is hopeless: there are infinitely many candidates. The way out is to make \(\lambda\) a variable. The determinant \(\det(XI_n - M)\) is a polynomial in \(X\), and its roots lying in \(\mathbb{K}\) are exactly the eigenvalues. This characteristic polynomial converts an infinite search into a finite computation.
Proposition — Eigenvalue and vanishing determinant
A scalar \(\lambda \in \mathbb{K}\) is an eigenvalue of \(u\) if and only if \(\det(u - \lambda \operatorname{Id}_E) = 0\).

\(\lambda\) is an eigenvalue iff \(E_\lambda(u) = \operatorname{Ker}(u - \lambda \operatorname{Id}_E) \neq \{0_E\}\), i.e. iff \(u - \lambda \operatorname{Id}_E\) is not injective. In finite dimension an endomorphism is injective iff it is invertible, and an endomorphism is invertible iff its determinant is non-zero. So « \(u - \lambda \operatorname{Id}_E\) not injective » is « \(\det(u - \lambda \operatorname{Id}_E) = 0\) ».

Definition — Characteristic polynomial of a matrix
The characteristic polynomial of \(M \in \mathcal{M}_n(\mathbb{K})\) is $$ \chi_M = \det(XI_n - M). $$ It is an element of \(\mathbb{K}[X]\) obtained by expanding the determinant of the matrix \(XI_n - M\), whose entries are polynomials of degree at most \(1\).
Example — Characteristic polynomial of a small matrix
For \(M = \begin{psmallmatrix} 1 & 2 \\ 3 & 2 \end{psmallmatrix}\), $$ \chi_M = \det\begin{pmatrix} X-1 & -2 \\ -3 & X-2 \end{pmatrix} = (X-1)(X-2) - 6 = X^2 - 3X - 4 = (X-4)(X+1), $$ so \(\operatorname{Sp}(M) = \{4, -1\}\). The coefficient of \(X\) is \(-3 = -\operatorname{tr}(M)\) and the constant term is \(-4 = \det(M)\) --- anticipating the degree-and-coefficients theorem below.
Proposition — Invariance under similarity
If \(A \sim B\) in \(\mathcal{M}_n(\mathbb{K})\), then \(\chi_A = \chi_B\).

Write \(B = P^{-1}AP\) with \(P \in \operatorname{GL}_n(\mathbb{K})\). We argue by evaluation, to stay inside the determinant theory over the field \(\mathbb{K}\). For every scalar \(x \in \mathbb{K}\), $$ \begin{aligned} \chi_B(x) &= \det(xI_n - P^{-1}AP) && \text{(definition of \(\chi_B\))} \\ &= \det\big(P^{-1}(xI_n - A)P\big) && \text{(\(xI_n = P^{-1}(xI_n)P\))} \\ &= \det(P^{-1})\det(xI_n - A)\det(P) && \text{(multiplicativity of \(\det\))} \\ &= \det(xI_n - A) = \chi_A(x) && \text{(\(\det(P^{-1})\det(P) = 1\)).} \end{aligned} $$ The polynomials \(\chi_A\) and \(\chi_B\) thus agree at every point of the infinite set \(\mathbb{K}\), hence are equal.

Definition — Characteristic polynomial of an endomorphism
The characteristic polynomial of \(u \in \mathcal{L}(E)\) is \(\chi_u = \chi_M\), where \(M\) is the matrix of \(u\) in any basis of \(E\). This does not depend on the basis: two such matrices are similar, and similar matrices have the same characteristic polynomial.
Example — The characteristic polynomial does not see the basis
Let \(u\) be the endomorphism of \(\mathbb{R}^2\) with matrix \(A = \begin{psmallmatrix} 1 & 2 \\ 3 & 2 \end{psmallmatrix}\) in the canonical basis. The vectors \((2, 3)\) and \((1, -1)\) satisfy \(A\begin{psmallmatrix}2\\3\end{psmallmatrix} = \begin{psmallmatrix}8\\12\end{psmallmatrix} = 4\begin{psmallmatrix}2\\3\end{psmallmatrix}\) and \(A\begin{psmallmatrix}1\\-1\end{psmallmatrix} = \begin{psmallmatrix}-1\\1\end{psmallmatrix} = -\begin{psmallmatrix}1\\-1\end{psmallmatrix}\), so in the basis \(\big((2,3),\,(1,-1)\big)\) the matrix of \(u\) is \(D = \operatorname{diag}(4, -1)\), similar to \(A\). Both give \(\chi_u = X^2 - 3X - 4\) --- read from \(A\) as \((X-1)(X-2)-6\), and from \(D\) at once as \((X-4)(X+1)\). The characteristic polynomial of \(u\) is one object, computable through any matrix of \(u\).
Theorem — Eigenvalues are the roots of the characteristic polynomial
A scalar \(\lambda \in \mathbb{K}\) is an eigenvalue of \(u\) if and only if \(\chi_u(\lambda) = 0\). The spectrum of \(u\) is the set of roots of \(\chi_u\) lying in \(\mathbb{K}\).

Fix a basis and let \(M\) be the matrix of \(u\). Evaluating \(\chi_u = \det(XI_n - M)\) at \(\lambda\) gives \(\chi_u(\lambda) = \det(\lambda I_n - M)\). Now \(\lambda I_n - M = -(M - \lambda I_n)\), and the determinant is \(n\)-linear in the columns, so \(\det(\lambda I_n - M) = (-1)^n \det(M - \lambda I_n)\) --- the two vanish together. By the Proposition on the vanishing determinant, \(\det(M - \lambda I_n) = 0\) exactly when \(\lambda\) is an eigenvalue of \(u\). Hence \(\chi_u(\lambda) = 0 \iff \lambda \in \operatorname{Sp}(u)\).

Theorem — Degree and coefficients of the characteristic polynomial
For \(M \in \mathcal{M}_n(\mathbb{K})\), the polynomial \(\chi_M\) is monic of degree \(n\), and $$ \chi_M = X^n - \operatorname{tr}(M)\, X^{n-1} + \dots + (-1)^n \det(M). $$

Expand \(\chi_M = \det(XI_n - M)\) by the permutation formula: $$ \chi_M = \sum_{\sigma \in \mathfrak{S}_n} \varepsilon(\sigma) \prod_{i=1}^n \big(X\delta_{\sigma(i),i} - m_{\sigma(i),i}\big), $$ where \(\delta\) is the Kronecker symbol.
  • Degree \(n\) and monic. A factor contributes an \(X\) only when \(\sigma(i) = i\). A term of degree \(n\) thus needs \(\sigma(i) = i\) for every \(i\), i.e. \(\sigma = \operatorname{id}\). That term is \(\prod_{i=1}^n (X - m_{i,i})\), of degree \(n\) with leading coefficient \(1\). So \(\chi_M\) is monic of degree \(n\).
  • Coefficient of \(X^{n-1}\). A permutation fixing exactly \(n-1\) indices fixes them all, so again only \(\sigma = \operatorname{id}\) contributes degree \(n-1\). Expanding \(\prod_{i=1}^n (X - m_{i,i})\) gives \(X^n - (m_{1,1} + \dots + m_{n,n})X^{n-1} + \dots = X^n - \operatorname{tr}(M)X^{n-1} + \dots\)
  • Constant term. It is \(\chi_M(0) = \det(-M) = (-1)^n \det(M)\).

Example — Characteristic polynomial of a triangular matrix
If \(T\) is upper-triangular with diagonal entries \(d_1, \dots, d_n\), then \(XI_n - T\) is upper-triangular with diagonal \(X - d_1, \dots, X - d_n\), so $$ \chi_T = \prod_{i=1}^n (X - d_i). $$ The spectrum of a triangular matrix is read straight off its diagonal: \(\operatorname{Sp}(T) = \{d_1, \dots, d_n\}\).
Method — Compute the characteristic polynomial and the spectrum
To find the eigenvalues of \(u\) given by a matrix \(M\): form \(XI_n - M\), compute \(\chi_M = \det(XI_n - M)\) --- by cofactor expansion, or by adding a \(\mathbb{K}[X]\)-multiple of one row (or column) to another, both valid over the ring \(\mathbb{K}[X]\); avoid dividing a line by a non-constant polynomial --- then factor \(\chi_M\). The eigenvalues are the roots of \(\chi_M\) in \(\mathbb{K}\). The degree-\((n-1)\) and constant coefficients, \(-\operatorname{tr}(M)\) and \((-1)^n\det(M)\), give a quick check of the computation.
Skills to practice
  • Computing the characteristic polynomial
II.2 Multiplicity of an eigenvalue
An eigenvalue can be a simple root of \(\chi_u\) or a repeated one. The number of repetitions --- its multiplicity --- is the second number attached to an eigenvalue, beside the dimension of its eigenspace. The two are linked by a fundamental inequality: the eigenspace can never be larger than the multiplicity permits. That inequality, \(\dim E_\lambda \le m(\lambda)\), is the gate to diagonalisability.
Definition — Multiplicity of an eigenvalue
Let \(\lambda\) be an eigenvalue of \(u\). Its multiplicity, written \(m(\lambda)\), is the multiplicity of \(\lambda\) as a root of the characteristic polynomial \(\chi_u\). An eigenvalue of multiplicity \(1\) is called simple.
Example — Reading multiplicities off a factored characteristic polynomial
If \(\chi_u = (X - 1)(X - 2)^2\), then \(u\) has eigenvalues \(1\) and \(2\), with \(m(1) = 1\) (simple) and \(m(2) = 2\) (double). The multiplicities sum to \(\deg \chi_u = 3 = n\), as they always do when \(\chi_u\) is split.
Example — A triple eigenvalue
For the upper-triangular matrix \(T = \begin{psmallmatrix} 3 & 1 & 0 \\ 0 & 3 & 1 \\ 0 & 0 & 3 \end{psmallmatrix}\), the characteristic polynomial is \(\chi_T = (X - 3)^3\), read off the diagonal. So \(3\) is the only eigenvalue, of multiplicity \(m(3) = 3\) --- a triple eigenvalue. Yet \(T - 3I_3\) has rank \(2\), so \(\dim E_3(T) = 3 - 2 = 1\), far below \(m(3)\): multiplicity counts roots of \(\chi_T\), not dimensions of eigenspaces.
Proposition — Number of roots of the characteristic polynomial
\(\chi_u\) is monic of degree \(n\), so it has at most \(n\) roots in \(\mathbb{K}\); viewed as an element of \(\mathbb{C}[X]\), it has exactly \(n\) roots counted with multiplicity. In particular \(u\) has at most \(n\) distinct eigenvalues.

A non-zero polynomial of degree \(n\) has at most \(n\) roots in any field, counted with multiplicity --- recalled from Polynomials. Over \(\mathbb{C}\), d'Alembert-Gauss makes \(\chi_u\) split, so it has exactly \(n\) roots counted with multiplicity. The eigenvalues of \(u\) are among the roots lying in \(\mathbb{K}\), hence at most \(n\) in number. This re-derives, through \(\chi_u\), the finiteness of the spectrum already obtained in § 1.2 by the free-family bound, and sharpens it: the count is tied to \(\deg \chi_u = n\).

Proposition — Characteristic polynomial of an induced endomorphism
Let \(F\) be a subspace of \(E\) stable by \(u\), with \(F \neq \{0_E\}\), and let \(u_F\) be the induced endomorphism. Then \(\chi_{u_F}\) divides \(\chi_u\).

If \(F = E\) then \(u_F = u\) and \(\chi_{u_F} = \chi_u\) divides \(\chi_u\); assume now \(F \subsetneq E\). Set \(p = \dim F\), so \(1 \le p < n\). Complete a basis of \(F\) into a basis \(\mathcal{B}\) of \(E\). Since \(F\) is stable by \(u\), the matrix of \(u\) in \(\mathcal{B}\) is block upper-triangular, $$ M = \begin{pmatrix} A & B \\ 0 & C \end{pmatrix}, \qquad A = \operatorname{Mat}(u_F), $$ recalled from Further linear algebra. Then \(XI_n - M\) is block upper-triangular with diagonal blocks \(XI_p - A\) and \(XI_{n-p} - C\), and the determinant of a block-triangular matrix is the product of the determinants of the diagonal blocks: $$ \chi_u = \det(XI_n - M) = \det(XI_p - A)\cdot\det(XI_{n-p} - C) = \chi_{u_F}\cdot\chi_C. $$ Hence \(\chi_{u_F}\) divides \(\chi_u\).

Theorem — Dimension of an eigenspace
Let \(\lambda\) be an eigenvalue of \(u\) of multiplicity \(m(\lambda)\). Then $$ 1 \le \dim E_\lambda(u) \le m(\lambda). $$

Write \(d = \dim E_\lambda(u)\). Since \(\lambda\) is an eigenvalue, \(E_\lambda(u) \neq \{0_E\}\), so \(d \ge 1\).
The subspace \(E_\lambda(u)\) is stable by \(u\), and the induced endomorphism is \(u_{E_\lambda} = \lambda \operatorname{Id}_{E_\lambda}\) --- a homothety, since \(u(x) = \lambda x\) for every \(x \in E_\lambda(u)\). Its characteristic polynomial is therefore \(\chi_{u_{E_\lambda}} = (X - \lambda)^d\). By the previous Proposition, \((X - \lambda)^d\) divides \(\chi_u\). The multiplicity of \(\lambda\) as a root of \(\chi_u\) is thus at least \(d\), that is \(d \le m(\lambda)\).

Proposition — A simple eigenvalue has a one-dimensional eigenspace
If \(\lambda\) is a simple eigenvalue of \(u\), then \(\dim E_\lambda(u) = 1\).

A simple eigenvalue has \(m(\lambda) = 1\). The Theorem gives \(1 \le \dim E_\lambda(u) \le 1\), hence \(\dim E_\lambda(u) = 1\).

Example — An eigenspace strictly smaller than the multiplicity
For \(M = \begin{psmallmatrix} 2 & 1 \\ 0 & 2 \end{psmallmatrix}\), \(\chi_M = (X - 2)^2\), so \(2\) is an eigenvalue of multiplicity \(m(2) = 2\). But \(M - 2I_2 = \begin{psmallmatrix} 0 & 1 \\ 0 & 0 \end{psmallmatrix}\) has rank \(1\), so \(\dim E_2(M) = 2 - 1 = 1 < m(2)\). The inequality \(\dim E_\lambda \le m(\lambda)\) can be strict --- and when it is, diagonalisability fails, as § 3 will show.
Method — Compute the dimension of an eigenspace
To find \(\dim E_\lambda(u)\) for an eigenvalue \(\lambda\) (matrix \(M\)): compute the rank of \(M - \lambda I_n\) by row reduction, then apply the rank theorem, $$ \dim E_\lambda(u) = n - \operatorname{rg}(M - \lambda I_n). $$ This number is to be compared with \(m(\lambda)\): the verdict \(\dim E_\lambda = m(\lambda)\) or \(\dim E_\lambda < m(\lambda)\) decides diagonalisability eigenvalue by eigenvalue.
Skills to practice
  • Exploiting the multiplicity of an eigenvalue
III Diagonalisability
III.1 Diagonalisable endomorphisms and matrices
The favourable case of reduction is the one where a basis of eigenvectors exists. In such a basis \(u\) acts coordinate by coordinate as a dilation, and its matrix is diagonal --- the simplest shape a matrix can take. This subsection fixes the vocabulary, for endomorphisms and for matrices, and reads the geometric content: \(E\) splits into eigen-lines on each of which \(u\) is a homothety.
Definition — Diagonalisable endomorphism and matrix
The endomorphism \(u\) is diagonalisable when there exists a basis of \(E\) in which the matrix of \(u\) is diagonal. A matrix \(M \in \mathcal{M}_n(\mathbb{K})\) is diagonalisable when it is similar to a diagonal matrix, that is, \(P^{-1}MP = D\) for some \(P \in \operatorname{GL}_n(\mathbb{K})\) and some diagonal \(D\).
Example — A diagonalisation written out
The matrix \(M = \begin{psmallmatrix} 3 & 1 \\ 2 & 2 \end{psmallmatrix}\) is diagonalisable. Its characteristic polynomial is \(\chi_M = (X-3)(X-2) - 2 = X^2 - 5X + 4 = (X-1)(X-4)\), with two distinct eigenvalues \(1\) and \(4\). Solving \((M - I_2)X = 0\) and \((M - 4I_2)X = 0\) gives eigenvectors \((1, -2)\) and \((1, 1)\). With \(P = \begin{psmallmatrix} 1 & 1 \\ -2 & 1 \end{psmallmatrix}\), $$ P^{-1}MP = \begin{pmatrix} 1 & 0 \\ 0 & 4 \end{pmatrix}, $$ the columns of \(P\) being a basis of eigenvectors.
Proposition — Endomorphism and matrix versions agree
An endomorphism \(u\) is diagonalisable if and only if its matrix in one --- equivalently every --- basis is a diagonalisable matrix. More generally, the same holds for any matrix property invariant under similarity (in particular trigonalisability and nilpotence, met in §§ 4--5).

The matrices of \(u\) in two bases are similar, and similarity is an equivalence relation. So if the matrix of \(u\) in one basis is similar to a diagonal matrix, the matrix in any other basis --- similar to the first --- is also similar to that diagonal matrix, by transitivity. Hence « the matrix is diagonalisable » does not depend on the basis. And \(u\) is diagonalisable, by definition, exactly when some basis gives a diagonal matrix, i.e. exactly when the matrix of \(u\) is similar to a diagonal one. The argument used only that similarity is transitive, so it applies verbatim to every similarity-invariant matrix property --- in particular « similar to an upper-triangular matrix » and « some power is zero », the trigonalisable and nilpotent cases of §§ 4--5.

Proposition — A diagonalising basis is a basis of eigenvectors
A basis \(\mathcal{B} = (e_1, \dots, e_n)\) diagonalises \(u\) if and only if every \(e_i\) is an eigenvector of \(u\). In that basis the diagonal entries of the matrix are the associated eigenvalues, each eigenvalue \(\lambda\) appearing exactly \(\dim E_\lambda(u)\) times.

The matrix of \(u\) in \(\mathcal{B}\) is diagonal, with diagonal entries \(d_1, \dots, d_n\), if and only if \(u(e_i) = d_i e_i\) for every \(i\) --- that is, if and only if each \(e_i\) is an eigenvector for the eigenvalue \(d_i\). For a fixed eigenvalue \(\lambda\), the basis vectors \(e_i\) with \(d_i = \lambda\) are eigenvectors in \(E_\lambda(u)\); being part of a basis they are free, and they span \(E_\lambda(u)\) (any eigenvector for \(\lambda\) is a combination of basis vectors, necessarily of those with \(d_i = \lambda\)). So they form a basis of \(E_\lambda(u)\), and \(\lambda\) occurs \(\dim E_\lambda(u)\) times on the diagonal --- as the worked \(3 \times 3\) of § 3.2 will show, where \(0\) fills two of the three diagonal slots.

Example — Diagonalisable endomorphisms one already knows
The identity and every homothety are diagonalisable --- they are already diagonal in every basis. A projector \(p\) is diagonalisable: \(E = \operatorname{Ker} p \oplus \operatorname{Im} p\), and a basis adapted to this decomposition is a basis of eigenvectors (eigenvalues among \(0\) and \(1\)). A vector symmetry is diagonalisable likewise, with eigenvalues among \(-1\) and \(1\).
Example — An endomorphism that is not diagonalisable
The matrix \(M = \begin{psmallmatrix} 0 & 1 \\ 0 & 0 \end{psmallmatrix}\) is not diagonalisable. Were it \(PDP^{-1}\) with \(D\) diagonal, \(D\) would have the same spectrum as \(M\), namely \(\{0\}\) (since \(\chi_M = X^2\)), so \(D = 0\), hence \(M = P\,0\,P^{-1} = 0\) --- false. This non-zero matrix with a single eigenvalue is a nilpotent matrix, the obstruction studied in § 5.
The geometry of diagonalisation
When \(u\) is diagonalisable, \(E\) breaks into a direct sum of eigen-lines (or eigenspaces): \(E = E_{\lambda_1}(u) \oplus E_{\lambda_2}(u) \oplus \cdots\). A vector \(x\) decomposes uniquely as \(x = x_1 + x_2 + \cdots\) with \(x_i \in E_{\lambda_i}(u)\), and \(u\) stretches each piece by its own ratio: \(u(x) = \lambda_1 x_1 + \lambda_2 x_2 + \cdots\). The figure shows the two-eigenspace case.
Skills to practice
  • Identifying a diagonalisable matrix
III.2 Characterisations of diagonalisability
Whether \(u\) is diagonalisable is now decided by three equivalent tests, of decreasing abstraction. The first is structural: \(E\) is the direct sum of the eigenspaces. The second counts dimensions. The third is the practical one: \(\chi_u\) must be split, and each eigenspace must reach the dimension its multiplicity allows. A sufficient shortcut closes the subsection --- \(n\) distinct eigenvalues are enough.
Theorem — Diagonalisable through the direct sum of eigenspaces
The endomorphism \(u\) is diagonalisable if and only if \(E\) is the direct sum of its eigenspaces: $$ E = \bigoplus_{\lambda \in \operatorname{Sp}(u)} E_\lambda(u). $$

The sum of the eigenspaces is always direct (§ 1.2), so the condition is really « the eigenspaces span \(E\) ».
  • \((\Rightarrow)\) If \(u\) is diagonalisable, a diagonalising basis is a basis of eigenvectors; each basis vector lies in some \(E_\lambda(u)\), so the eigenspaces span \(E\), and their direct sum is \(E\).
  • \((\Leftarrow)\) If \(E = \bigoplus_\lambda E_\lambda(u)\), concatenating a basis of each \(E_\lambda(u)\) yields a basis of \(E\) adapted to the decomposition. Every vector of this basis is an eigenvector, so the basis diagonalises \(u\).

Proposition — Dimension criterion
The endomorphism \(u\) is diagonalisable if and only if $$ \sum_{\lambda \in \operatorname{Sp}(u)} \dim E_\lambda(u) = n. $$

The sum \(\sum_\lambda E_\lambda(u)\) is direct, so its dimension is \(\sum_\lambda \dim E_\lambda(u)\). This sum equals the whole of \(E\) exactly when that dimension equals \(\dim E = n\). By the previous Theorem, that is exactly diagonalisability.

Proposition — An endomorphism with n distinct eigenvalues
If \(u\) has \(n = \dim E\) distinct eigenvalues, then \(u\) is diagonalisable.

Pick one eigenvector for each of the \(n\) distinct eigenvalues. By § 1.2 a family of eigenvectors associated with distinct eigenvalues is free, so these \(n\) vectors form a free family of \(E\); having \(n = \dim E\) elements, they form a basis. It is a basis of eigenvectors, so \(u\) is diagonalisable.

Theorem — Diagonalisable through the characteristic polynomial
The endomorphism \(u\) is diagonalisable over \(\mathbb{K}\) if and only if its characteristic polynomial \(\chi_u\) is split over \(\mathbb{K}\) and, for every eigenvalue \(\lambda \in \operatorname{Sp}(u)\), \(\dim E_\lambda(u) = m(\lambda)\).

  • \((\Rightarrow)\) Suppose \(u\) diagonalisable. In a diagonalising basis the matrix is diagonal, with each eigenvalue \(\lambda\) repeated \(\dim E_\lambda(u)\) times; its characteristic polynomial is then \(\chi_u = \prod_{\lambda} (X - \lambda)^{\dim E_\lambda(u)}\). This is split over \(\mathbb{K}\), and the exponent of \((X - \lambda)\), namely \(m(\lambda)\), equals \(\dim E_\lambda(u)\).
  • \((\Leftarrow)\) Suppose \(\chi_u\) split with \(\dim E_\lambda(u) = m(\lambda)\) for every \(\lambda\). Since \(\chi_u\) is split of degree \(n\), the multiplicities sum to \(n\): \(\sum_\lambda m(\lambda) = n\). Hence \(\sum_\lambda \dim E_\lambda(u) = \sum_\lambda m(\lambda) = n\), and the dimension criterion gives diagonalisability.

Method — Diagonalise a matrix
Given \(M \in \mathcal{M}_n(\mathbb{K})\), proceed in two stages.
  • Decide. Compute \(\chi_M\) and factor it. If \(\chi_M\) is not split over \(\mathbb{K}\), \(M\) is not diagonalisable over \(\mathbb{K}\). If it is split, for each eigenvalue \(\lambda\) of multiplicity \(m(\lambda) \ge 2\) compute \(\dim E_\lambda = n - \operatorname{rg}(M - \lambda I_n)\) and compare with \(m(\lambda)\); \(M\) is diagonalisable if and only if equality holds for every \(\lambda\) (simple eigenvalues need no check, \(\dim E_\lambda = 1 = m(\lambda)\) automatically).
  • Diagonalise. When \(M\) is diagonalisable, solve \((M - \lambda I_n)X = 0\) for each \(\lambda\) to get a basis of \(E_\lambda\), concatenate these into the columns of \(P\); then \(P^{-1}MP = D\) is diagonal, with the eigenvalues on the diagonal in the matching order.
Example — A non-diagonalisable verdict
The matrix \(M = \begin{psmallmatrix} 2 & 1 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 2 \end{psmallmatrix}\) has \(\chi_M = (X - 2)^3\), split over \(\mathbb{R}\), with the single eigenvalue \(2\) of multiplicity \(m(2) = 3\). Applying the Method: \(M - 2I_3\) has rank \(1\), so \(\dim E_2 = 3 - 1 = 2 < 3 = m(2)\). The criterion fails for the eigenvalue \(2\), so \(M\) is not diagonalisable --- even though \(\chi_M\) is split. The split characteristic polynomial is necessary but not sufficient; the dimension test is what decides.
Example — Diagonalise a 3 by 3 matrix
Diagonalise the matrix $$ M = \begin{pmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{pmatrix} \in \mathcal{M}_3(\mathbb{R}). $$

\(M\) has rank \(1\), so \(0\) is an eigenvalue with \(\dim E_0 = 3 - \operatorname{rg}(M) = 2\); hence \(m(0) \ge \dim E_0 = 2\) and \(X^2\) divides the monic degree-\(3\) polynomial \(\chi_M\), giving \(\chi_M = X^2(X - a)\) for some \(a\). The coefficient of \(X^2\) in \(\chi_M\) is \(-\operatorname{tr}(M) = -3\) by the degree-and-coefficients theorem of § 2.1, and in \(X^2(X - a)\) it is \(-a\); so \(a = 3\) and \(\chi_M = X^2(X - 3)\), split over \(\mathbb{R}\), with \(m(0) = 2\) and \(m(3) = 1\).
  • \(\dim E_0 = 2 = m(0)\) and \(\dim E_3 = 1 = m(3)\) (simple), so \(M\) is diagonalisable.
  • \(E_0 = \operatorname{Ker} M\): the system \(x + y + z = 0\) gives the basis \((1, -1, 0)\), \((1, 0, -1)\).
  • \(E_3 = \operatorname{Ker}(M - 3I_3)\): the basis \((1, 1, 1)\).
With \(P = \begin{psmallmatrix} 1 & 1 & 1 \\ -1 & 0 & 1 \\ 0 & -1 & 1 \end{psmallmatrix}\) one gets \(P^{-1}MP = \operatorname{diag}(0, 0, 3)\).

Skills to practice
  • Diagonalising a matrix
IV Trigonalisability
IV.1 Trigonalisable endomorphisms and matrices
Diagonalisation can fail --- the matrix \(\begin{psmallmatrix} 2 & 1 \\ 0 & 2 \end{psmallmatrix}\) already resists it. When it does, the next best shape is triangular: a basis in which the matrix of \(u\) is upper-triangular still exposes the spectrum on the diagonal and tames the powers of \(u\). This subsection names trigonalisable endomorphisms and matrices; the next characterises them.
Definition — Trigonalisable endomorphism and matrix
The endomorphism \(u\) is trigonalisable when there exists a basis of \(E\) in which the matrix of \(u\) is upper-triangular. A matrix \(M \in \mathcal{M}_n(\mathbb{K})\) is trigonalisable when it is similar to an upper-triangular matrix. As for diagonalisability (§ 3.1), an endomorphism is trigonalisable if and only if its matrix in any basis is a trigonalisable matrix.
Example — Diagonalisable implies trigonalisable
Every upper-triangular matrix is trigonalisable, being similar to itself. Every diagonalisable endomorphism is trigonalisable, a diagonal matrix being in particular upper-triangular. Trigonalisability is thus a weaker demand than diagonalisability --- strictly weaker, as the next example shows.
Example — Trigonalisable without being diagonalisable
The matrix \(\begin{psmallmatrix} 1 & 1 \\ 0 & 1 \end{psmallmatrix}\) is upper-triangular, hence trigonalisable, yet it is not diagonalisable: \(\chi = (X-1)^2\) and \(\dim E_1 = 2 - \operatorname{rg}\begin{psmallmatrix} 0 & 1 \\ 0 & 0 \end{psmallmatrix} = 1 < 2 = m(1)\). Trigonalisability is genuinely the broader notion.
Example — A trigonalisation exhibited
For \(M = \begin{psmallmatrix} 2 & -1 \\ 1 & 0 \end{psmallmatrix}\), take \(P = \begin{psmallmatrix} 1 & 1 \\ 1 & 0 \end{psmallmatrix}\); a direct computation gives \(P^{-1}MP = \begin{psmallmatrix} 1 & 1 \\ 0 & 1 \end{psmallmatrix}\), upper-triangular. So \(M\) is trigonalisable, with sole eigenvalue \(1\). Here the triangularising \(P\) is given and the similarity merely checked --- finding such a \(P\) is not a program objective (see § 4.2).
Skills to practice
  • Identifying a trigonalisable endomorphism
IV.2 Characterisation by the characteristic polynomial
Trigonalisability has a single, clean criterion: it holds exactly when the characteristic polynomial is split. One direction is immediate --- a triangular matrix has a split characteristic polynomial. The other is an induction that peels off one eigenvector at a time. The criterion has an immediate consequence over \(\mathbb{C}\), and it pins down the trace and the determinant in terms of the eigenvalues.
Theorem — Trigonalisable through a split characteristic polynomial
The endomorphism \(u\) is trigonalisable over \(\mathbb{K}\) if and only if its characteristic polynomial \(\chi_u\) is split over \(\mathbb{K}\).

  • \((\Rightarrow)\) If \(u\) is trigonalisable, in a suitable basis its matrix \(T\) is upper-triangular with diagonal \(d_1, \dots, d_n\), so \(\chi_u = \chi_T = \prod_{i=1}^n (X - d_i)\) is split over \(\mathbb{K}\).
  • \((\Leftarrow)\) Suppose \(\chi_u\) split; we trigonalise by induction on \(n = \dim E\).
    • Initialization. For \(n = 1\) every matrix is already upper-triangular.
    • Heredity. Assume the result in dimension \(n - 1\). As \(\chi_u\) is split it has a root \(\lambda \in \mathbb{K}\), an eigenvalue of \(u\); let \(e_1\) be an associated eigenvector and complete it into a basis \((e_1, \dots, e_n)\). The matrix of \(u\) in this basis is $$ M = \begin{pmatrix} \lambda & L \\ 0 & M' \end{pmatrix}, \qquad M' \in \mathcal{M}_{n-1}(\mathbb{K}). $$ Then \(\chi_u = (X - \lambda)\chi_{M'}\), and since \(\chi_u\) is split so is \(\chi_{M'}\). By the induction hypothesis \(M'\) is trigonalisable: \(M' = Q^{-1}T'Q\) with \(T'\) upper-triangular. Setting \(P = \begin{psmallmatrix} 1 & 0 \\ 0 & Q^{-1} \end{psmallmatrix}\), a block computation gives $$ P^{-1}MP = \begin{pmatrix} \lambda & LQ^{-1} \\ 0 & QM'Q^{-1} \end{pmatrix} = \begin{pmatrix} \lambda & LQ^{-1} \\ 0 & T' \end{pmatrix}, $$ which is upper-triangular. So \(u\) is trigonalisable.

Proposition — Trigonalisability over the complex numbers
Every matrix \(M \in \mathcal{M}_n(\mathbb{C})\) is trigonalisable in \(\mathcal{M}_n(\mathbb{C})\), since \(\chi_M\) is always split over \(\mathbb{C}\). A real matrix \(M\) is trigonalisable over \(\mathbb{R}\) exactly when \(\chi_M\) splits over \(\mathbb{R}\); when it does not --- for instance the matrix of a planar rotation of angle \(\theta \notin \pi\mathbb{Z}\), whose \(\chi\) has no real root --- \(M\) is still trigonalisable once read in \(\mathcal{M}_n(\mathbb{C})\).

For \(M \in \mathcal{M}_n(\mathbb{C})\), the polynomial \(\chi_M\) has degree \(n\) and, by d'Alembert-Gauss (recalled from Polynomials), splits over \(\mathbb{C}\); the Theorem then makes \(M\) trigonalisable over \(\mathbb{C}\). For a real matrix \(M\), the Theorem applied with \(\mathbb{K} = \mathbb{R}\) gives trigonalisability over \(\mathbb{R}\) exactly when \(\chi_M\) splits over \(\mathbb{R}\); failing that, reading \(M\) in \(\mathcal{M}_n(\mathbb{C})\) brings back the complex case.

Carrying out a trigonalisation in practice is not a program objective. The theorem is used as a structural tool --- it guarantees that a triangular form exists --- but the explicit search for a triangularising basis is not required. The exercises only ever verify a triangular form that is handed to them.
Proposition — Trace and determinant of a trigonalisable endomorphism
If \(u\) is trigonalisable over \(\mathbb{K}\), then \(\operatorname{tr}(u)\) is the sum and \(\det(u)\) the product of the eigenvalues of \(u\), each counted with its multiplicity.

Trigonalise \(u\): in a suitable basis its matrix is upper-triangular, \(T\), with diagonal entries \(d_1, \dots, d_n\). The diagonal of a triangular matrix carries its eigenvalues, each \(\lambda\) appearing \(m(\lambda)\) times since \(\chi_T = \prod_i (X - d_i)\). Trace and determinant are similarity invariants, so \(\operatorname{tr}(u) = \operatorname{tr}(T) = \sum_i d_i\) and \(\det(u) = \det(T) = \prod_i d_i\) --- the sum and product of the eigenvalues counted with multiplicity.

Extension to any endomorphism
More generally, for any \(u \in \mathcal{L}(E)\), the trace and determinant are the sum and product of the roots of \(\chi_u\) in \(\mathbb{C}\), counted with multiplicity. This needs no new theorem: \(\chi_u = X^n - \operatorname{tr}(u)X^{n-1} + \dots + (-1)^n\det(u)\) from § 2, and \(\chi_u\) factors over \(\mathbb{C}\) as \(\prod_{i=1}^n (X - \lambda_i)\) by d'Alembert-Gauss; identifying the coefficients of \(X^{n-1}\) and the constant term reads off \(\operatorname{tr}(u) = \sum_i \lambda_i\) and \(\det(u) = \prod_i \lambda_i\). When \(\mathbb{K} = \mathbb{C}\) these roots are exactly the eigenvalues; over a proper subfield, the non-\(\mathbb{K}\) roots are not eigenvalues of \(u\).
Example — Reading trace and determinant off the eigenvalues
Let \(M\) be trigonalisable over \(\mathbb{K}\), similar to an upper-triangular \(T\) of diagonal \((\lambda_i)\) --- the eigenvalues counted with multiplicity. Then \(\operatorname{tr}(M) = \sum_i \lambda_i\) and \(\det(M) = \prod_i \lambda_i\). Moreover \(M^k = P^{-1}T^kP\) with \(T^k\) upper-triangular of diagonal \((\lambda_i^k)\), so for every integer \(k \ge 1\), $$ \operatorname{tr}(M^k) = \operatorname{tr}(T^k) = \sum_i \lambda_i^k, $$ by similarity invariance of the trace. A matrix with eigenvalues \(1, 2, 2\) thus has \(\operatorname{tr}(M) = 5\), \(\det(M) = 4\) and \(\operatorname{tr}(M^2) = 1 + 4 + 4 = 9\).
Method — Exploit a split characteristic polynomial
When \(\chi_u\) is split --- always so over \(\mathbb{C}\) --- \(u\) is trigonalisable, and one may reason as though its matrix were upper-triangular even without computing a triangularising basis. Use this to read \(\operatorname{tr}(u)\) and \(\det(u)\) as the sum and product of the eigenvalues counted with multiplicity, to compute \(\operatorname{tr}(u^k) = \sum \lambda_i^k\) for \(k \ge 1\) (same multiplicity convention), and to bound or locate the spectrum. The triangular form is invoked for what it guarantees, never searched for explicitly.
Skills to practice
  • Exploiting a split characteristic polynomial
V Nilpotent endomorphisms
V.1 Nilpotent endomorphisms and matrices
At the opposite extreme from diagonalisability sits a family of endomorphisms that no basis can diagonalise unless they are zero: the nilpotent ones, those a high enough power annihilates. They are the pure obstruction to reduction --- a non-zero nilpotent endomorphism has a single eigenvalue, \(0\), and yet is not the zero map. This subsection defines them and bounds how soon they vanish.
Definition — Nilpotent endomorphism and matrix
The endomorphism \(u\) is nilpotent when \(u^k = 0_{\mathcal{L}(E)}\) for some \(k \in \mathbb{N}^*\). The least such \(k\) is the index of nilpotence of \(u\): writing it \(p\), one has \(u^p = 0\) and \(u^{p-1} \neq 0\). A matrix \(M \in \mathcal{M}_n(\mathbb{K})\) is nilpotent when \(M^k = 0\) for some \(k \in \mathbb{N}^*\). As for diagonalisability (§ 3.1), an endomorphism is nilpotent if and only if its matrix in any basis is a nilpotent matrix.
Example — The shift on a basis
On a space with basis \((e_1, \dots, e_p)\), the endomorphism \(u\) defined by \(u(e_i) = e_{i-1}\) for \(i \ge 2\) and \(u(e_1) = 0_E\) is nilpotent of index exactly \(p\): each application of \(u\) shifts the basis one step down the chain, and \(u^p\) kills every \(e_i\) while \(u^{p-1}(e_p) = e_1 \neq 0_E\). Its matrix is strictly upper-triangular with \(1\)'s just above the diagonal.
Example — The derivation on polynomials of bounded degree
On \(\mathbb{K}_d[X]\), the space of polynomials of degree at most \(d\) (of dimension \(d + 1\)), the derivation \(D : P \mapsto P'\) is nilpotent: differentiating \(d + 1\) times annihilates every polynomial of degree \(\le d\). Its index is exactly \(d + 1\), since \(D^d(X^d) = d!\,\neq 0\) while \(D^{d+1}(X^d) = 0\) --- the index equals the dimension.
Example — A strictly triangular matrix and its index
A strictly upper-triangular matrix is nilpotent. For the strictly triangular \(N = \begin{psmallmatrix} 0 & 1 & 2 \\ 0 & 0 & 3 \\ 0 & 0 & 0 \end{psmallmatrix}\), one computes \(N^2 = \begin{psmallmatrix} 0 & 0 & 3 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{psmallmatrix} \neq 0\) and \(N^3 = 0\): the index of nilpotence is \(3\), equal to the size of the matrix.
Proposition — The index is bounded by the dimension
Let \(u\) be a nilpotent endomorphism of \(E\). Its index of nilpotence is at most \(n = \dim E\).

Let \(p\) be the index, so \(u^{p-1} \neq 0\): there is a vector \(x\) with \(u^{p-1}(x) \neq 0_E\). We show the family \((x, u(x), \dots, u^{p-1}(x))\) is free, which forces \(p \le n\) since a free family of \(E\) has at most \(n\) vectors.
Suppose \(\alpha_0 x + \alpha_1 u(x) + \dots + \alpha_{p-1} u^{p-1}(x) = 0_E\). Apply \(u^{p-1}\): every term \(u^{p-1+j}(x)\) with \(j \ge 1\) vanishes (as \(u^p = 0\)), leaving \(\alpha_0 u^{p-1}(x) = 0_E\), hence \(\alpha_0 = 0\). Apply \(u^{p-2}\) to what remains: likewise \(\alpha_1 = 0\). Continuing, every \(\alpha_j = 0\). The family is free.

Skills to practice
  • Identifying a nilpotent endomorphism
V.2 Characterisation of nilpotence
Nilpotence has a clean spectral signature. A nilpotent endomorphism is exactly one whose characteristic polynomial is \(X^n\) --- equivalently, one that is trigonalisable with \(0\) as its only eigenvalue. The proof stays entirely inside \(\mathbb{K}\): it reads the strictly triangular shape off the chain of kernels \(\operatorname{Ker} u \subseteq \operatorname{Ker} u^2 \subseteq \cdots\).
Theorem — Characterisation of nilpotence
For \(u \in \mathcal{L}(E)\), the following are equivalent:
  1. \(u\) is nilpotent;
  2. \(\chi_u = X^n\);
  3. \(u\) is trigonalisable over \(\mathbb{K}\) with \(0\) as its only eigenvalue.

The argument runs entirely over \(\mathbb{K}\), with no complexification.
  • \((1 \Rightarrow 2)\) Let \(u\) be nilpotent of index \(p\). The chain of kernels \(\{0_E\} \subseteq \operatorname{Ker} u \subseteq \operatorname{Ker} u^2 \subseteq \dots \subseteq \operatorname{Ker} u^p = E\) has each term stable by \(u\), and \(u(\operatorname{Ker} u^k) \subseteq \operatorname{Ker} u^{k-1}\): indeed \(x \in \operatorname{Ker} u^k\) gives \(u^{k-1}(u(x)) = u^k(x) = 0_E\). Build a basis of \(E\) adapted to this chain, completing a basis of \(\operatorname{Ker} u^{k-1}\) into one of \(\operatorname{Ker} u^k\) at each step. In that basis \(u\) sends each block strictly down the chain, so its matrix is strictly upper-triangular, with a zero diagonal. Hence \(\chi_u = \prod (X - 0) = X^n\).
  • \((2 \Rightarrow 3)\) If \(\chi_u = X^n\), it is split over \(\mathbb{K}\), so \(u\) is trigonalisable over \(\mathbb{K}\); the eigenvalues are the roots of \(\chi_u\), namely \(0\) alone.
  • \((3 \Rightarrow 1)\) Trigonalise \(u\): its matrix is upper-triangular, and its diagonal carries the eigenvalues, all equal to \(0\). So the matrix \(T\) is strictly upper-triangular. Such a \(T\) satisfies \(T^n = 0\): writing \((e_1, \dots, e_n)\) for the basis, \(T\) sends \(\operatorname{Vect}(e_1, \dots, e_i)\) into \(\operatorname{Vect}(e_1, \dots, e_{i-1})\) --- lowering the index by at least one --- so \(T^n\) annihilates every basis vector. Hence \(u^n = 0\) and \(u\) is nilpotent.

Example — Recognising nilpotence from the characteristic polynomial
A nilpotent matrix \(N\) has \(\chi_N = X^n\), so all its coefficients of degree \(< n\) vanish: in particular \(\operatorname{tr}(N) = 0\) and \(\det(N) = 0\). Conversely, the matrix \(N = \begin{psmallmatrix} 1 & 1 \\ -1 & -1 \end{psmallmatrix}\) has \(\operatorname{tr}(N) = 0\), \(\det(N) = 0\) and \(\chi_N = X^2\), hence is nilpotent --- and indeed \(N^2 = 0\). In dimension at least \(2\), zero trace alone never suffices, and in dimension \(\ge 3\) even zero trace together with zero determinant do not force nilpotence; the full identity \(\chi_N = X^n\) is the test.
Method — Recognise a nilpotent endomorphism
To decide whether \(u\) (matrix \(M\)) is nilpotent, three routes are open. Compute \(\chi_u\) and check \(\chi_u = X^n\). Or trigonalise and check that the only eigenvalue is \(0\). Or, most directly for an explicit matrix, compute the successive powers \(M, M^2, \dots\) until one vanishes (the Theorem guarantees that if \(M\) is nilpotent then \(M^n = 0\), so it suffices to test up to the exponent \(n\)).
Going further
This chapter has followed the geometric road to reduction: stable subspaces, eigenspaces, direct sums. A second, algebraic road --- annihilating polynomials, the minimal polynomial \(\pi_u\), the Cayley-Hamilton theorem, the kernel-decomposition lemma --- is taken up in Polynomials of an endomorphism, where diagonalisability is recast as « the minimal polynomial is split with simple roots ». Reduction is also the engine behind several later chapters: it computes matrix powers \(M^k\), solves linear recurrent sequences, and integrates linear differential systems \(X' = AX\).
Skills to practice
  • Applying the nilpotence characterisation