CommeUnJeu · L1 MPSI

Change of basis, equivalence, similarity

⌚ ~35 min ▢ 4 blocks ✓ 28 exercises ➣ Prerequisites : Matrix representation of linear maps, Finite-dimensional vector spaces

The chapter Matrix representation of linear maps set up the dictionary $u \leftrightarrow \mathrm{Mat}_{\mathcal{B}, \mathcal{B}'}(u)$ between a linear map and its matrix once a basis is fixed. Two natural questions remained open: how does the matrix transform when the basis is changed, and what is the right way to say that two matrices encode « the same » linear map up to a change of viewpoint? This chapter answers both.
The plan has four sections. Change of basis introduces the passage matrix $P_{\mathcal{B}}^{\mathcal{B}'}$, proves the two key formulas $X = P X'$ (for a vector) and $A' = Q^{-1} A P$ (for a linear map), and specialises the second formula to endomorphisms ($A' = P^{-1} A P$). Equivalent matrices defines the equivalence relation « matrices équivalentes » on rectangular matrices and proves that two matrices are equivalent if and only if they have the same rank --- a complete classification. Similar matrices introduces the stricter relation « matrices semblables » on square matrices; rank and trace are similarity invariants but, importantly, do not classify similarity classes. Trace of an endomorphism capitalises on similarity invariance of trace to define the trace of an endomorphism --- the first basis-free numerical invariant of an endomorphism we encounter.
Throughout the chapter, $\mathbb{K} \in \{\mathbb{R}, \mathbb{C}\}$, and all vector spaces $E, F, G$ are $\mathbb{K}$-vector spaces of finite dimension, equipped with bases denoted by $\mathcal{B}, \mathcal{B}', \mathcal{B}''$ etc. For a linear map $u \in \mathcal{L}(E, F)$ with $\dim E = p$ and $\dim F = n$, the matrix $\mathrm{Mat}_{\mathcal{B}, \mathcal{B}'}(u) \in M_{n, p}(\mathbb{K})$ is read in the source basis $\mathcal{B}$ and the target basis $\mathcal{B}'$; for an endomorphism $u \in \mathcal{L}(E)$, $\mathrm{Mat}_{\mathcal{B}}(u) \in M_n(\mathbb{K})$ uses the same basis on both sides. The notations $\mathrm{rg}, \mathrm{Ker}, \mathrm{Im}, \mathrm{tr}$ keep their usual meaning. $\mathrm{GL}_n(\mathbb{K})$ is the group of invertible $n \times n$ matrices and $\mathrm{GL}(E)$ the group of automorphisms of $E$.

I Change of basis

The change-of-basis story rests on a single object: the passage matrix $P_{\mathcal{B}}^{\mathcal{B}'}$, whose columns store the new basis vectors expressed in the old basis. Two formulas then follow mechanically. A vector has two coordinate columns $X$ (old basis) and $X'$ (new basis); they are related by $X = P X'$. A linear map has two matrix representations $A$ (old basis pair) and $A'$ (new basis pair); they are related by $A' = Q^{-1} A P$, where $P$ is the source change and $Q$ is the target change. For an endomorphism, source and target collapse: $A' = P^{-1} A P$.

Definition — Passage matrix

Let $E$ be a $\mathbb{K}$-vector space of finite dimension and let $\mathcal{B}, \mathcal{B}'$ be two bases of $E$. The passage matrix from $\mathcal{B}$ to $\mathcal{B}'$ is $$ P_{\mathcal{B}}^{\mathcal{B}'} = \mathrm{Mat}_{\mathcal{B}', \mathcal{B}}(\mathrm{Id}_E). $$ More explicitly, write $\mathcal{B} = (e_1, \dots, e_n)$, $\mathcal{B}' = (e'_1, \dots, e'_n)$, and $P_{\mathcal{B}}^{\mathcal{B}'} = (a_{ij})_{1 \le i \le n,\, 1 \le j \le n}$. Then $$ \forall j \in \{1, \dots, n\}: \quad e'_j = \sum_{i=1}^{n} a_{ij}\, e_i. $$ Equivalently, the $j$-th column of $P_{\mathcal{B}}^{\mathcal{B}'}$ contains the coordinates of $e'_j$ in the basis $\mathcal{B}$:

Proposition — Properties of the passage matrix

Let $\mathcal{B}, \mathcal{B}', \mathcal{B}''$ be three bases of the same $\mathbb{K}$-vector space $E$ of finite dimension. Then:

Invertibility: $P_{\mathcal{B}}^{\mathcal{B}'} \in \mathrm{GL}_n(\mathbb{K})$ where $n = \dim E$, with inverse $\bigl(P_{\mathcal{B}}^{\mathcal{B}'}\bigr)^{-1} = P_{\mathcal{B}'}^{\mathcal{B}}$.
Composition: $P_{\mathcal{B}}^{\mathcal{B}'} \cdot P_{\mathcal{B}'}^{\mathcal{B}''} = P_{\mathcal{B}}^{\mathcal{B}''}$.

Proof

Both properties follow from the dictionary isomorphism of chapter Matrix representation of linear maps.

Invertibility. The identity $\mathrm{Id}_E$ is an automorphism of $E$, so by the « iso $\Leftrightarrow$ invertible matrix » theorem, $\mathrm{Mat}_{\mathcal{B}', \mathcal{B}}(\mathrm{Id}_E)$ is invertible and its inverse is $\mathrm{Mat}_{\mathcal{B}, \mathcal{B}'}(\mathrm{Id}_E^{-1}) = \mathrm{Mat}_{\mathcal{B}, \mathcal{B}'}(\mathrm{Id}_E) = P_{\mathcal{B}'}^{\mathcal{B}}$.
Composition. Apply the « composition becomes matrix product » theorem to $\mathrm{Id}_E \circ \mathrm{Id}_E = \mathrm{Id}_E$ read in the three bases $\mathcal{B}'', \mathcal{B}', \mathcal{B}$: $$ \mathrm{Mat}_{\mathcal{B}'', \mathcal{B}}(\mathrm{Id}_E) = \mathrm{Mat}_{\mathcal{B}', \mathcal{B}}(\mathrm{Id}_E) \cdot \mathrm{Mat}_{\mathcal{B}'', \mathcal{B}'}(\mathrm{Id}_E), $$ which reads $P_{\mathcal{B}}^{\mathcal{B}''} = P_{\mathcal{B}}^{\mathcal{B}'} \cdot P_{\mathcal{B}'}^{\mathcal{B}''}$.

Example

In $\mathbb{R}^2$, take the canonical basis $\mathcal{B} = ((1, 0), (0, 1))$ and the basis $\mathcal{B}' = ((1, 1), (1, -1))$. The columns of $P_{\mathcal{B}}^{\mathcal{B}'}$ are the coordinates of $(1, 1)$ and $(1, -1)$ in $\mathcal{B}$, read directly: $$ P_{\mathcal{B}}^{\mathcal{B}'} = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}. $$ The inverse $P_{\mathcal{B}'}^{\mathcal{B}}$ is obtained by Gauss-Jordan or by direct $2 \times 2$ inversion: $P_{\mathcal{B}'}^{\mathcal{B}} = \frac{1}{2}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}$.

Example

In $\mathbb{R}_2[X]$, take the canonical basis $\mathcal{B} = (1, X, X^2)$ and the Taylor basis at $a \in \mathbb{R}$: $\mathcal{B}' = (1, X - a, (X - a)^2)$. Expanding gives $(X - a)^2 = X^2 - 2 a X + a^2$, so the columns of $P_{\mathcal{B}}^{\mathcal{B}'}$ are $$ P_{\mathcal{B}}^{\mathcal{B}'} = \begin{pmatrix} 1 & -a & a^2 \\ 0 & 1 & -2 a \\ 0 & 0 & 1 \end{pmatrix}. $$ The matrix is upper triangular with $1$s on the diagonal, hence invertible --- confirming that $\mathcal{B}'$ is indeed a basis.

Method — Compute the passage matrix

To compute $P_{\mathcal{B}}^{\mathcal{B}'}$:

Read off each vector $e'_j$ of $\mathcal{B}'$.
Decompose $e'_j$ in $\mathcal{B}$: $e'_j = a_{1j} e_1 + \dots + a_{nj} e_n$.
The $j$-th column of $P_{\mathcal{B}}^{\mathcal{B}'}$ is $(a_{1j}, \dots, a_{nj})^\top$.

Columns of $P_{\mathcal{B}}^{\mathcal{B}'}$ are vectors of $\mathcal{B}'$ expressed in $\mathcal{B}$. Reverse the roles to obtain $P_{\mathcal{B}'}^{\mathcal{B}}$ --- or invert the matrix.

Theorem — Change of basis for a vector

Let $\mathcal{B}, \mathcal{B}'$ be two bases of a finite-dimensional $\mathbb{K}$-vector space $E$. For every $x \in E$, with $X = \mathrm{Mat}_{\mathcal{B}}(x)$ and $X' = \mathrm{Mat}_{\mathcal{B}'}(x)$: $$ X = P_{\mathcal{B}}^{\mathcal{B}'} \, X'. $$

Proof

Apply the matrix-vector formula from chapter Matrix representation of linear maps to the identity $\mathrm{Id}_E$ and the vector $x$, read in source basis $\mathcal{B}'$ and target basis $\mathcal{B}$: $$ \mathrm{Mat}_{\mathcal{B}}(\mathrm{Id}_E(x)) = \mathrm{Mat}_{\mathcal{B}', \mathcal{B}}(\mathrm{Id}_E) \cdot \mathrm{Mat}_{\mathcal{B}'}(x). $$ Since $\mathrm{Id}_E(x) = x$, the left-hand side is $X$; the right-hand side is $P_{\mathcal{B}}^{\mathcal{B}'} \cdot X'$, giving $X = P_{\mathcal{B}}^{\mathcal{B}'} X'$.

Convention warning

The passage matrix converts NEW coordinates to OLD coordinates, not the other way around: $$ X = P_{\mathcal{B}}^{\mathcal{B}'} \, X' \qquad \text{(new $X'$ on the right, old $X$ on the left).} $$ To go from old to new, invert: $X' = \bigl(P_{\mathcal{B}}^{\mathcal{B}'}\bigr)^{-1} X = P_{\mathcal{B}'}^{\mathcal{B}} X$.
This is the main trap of the chapter --- students often expect the opposite direction. The convention is forced by the definition $P_{\mathcal{B}}^{\mathcal{B}'} = \mathrm{Mat}_{\mathcal{B}', \mathcal{B}}(\mathrm{Id}_E)$: the columns are vectors of $\mathcal{B}'$ in $\mathcal{B}$, so multiplying by $P_{\mathcal{B}}^{\mathcal{B}'}$ « converts $\mathcal{B}'$-coordinates back to $\mathcal{B}$-coordinates ».

Example

With the bases $\mathcal{B} = ((1, 0), (0, 1))$ and $\mathcal{B}' = ((1, 1), (1, -1))$ of $\mathbb{R}^2$ as before, take $x = (3, 5)$. In the canonical basis, $X = (3, 5)^\top$ directly. The coordinates $X' = (a, b)^\top$ in $\mathcal{B}'$ satisfy $X = P_{\mathcal{B}}^{\mathcal{B}'} X'$, that is $a + b = 3$ and $a - b = 5$. Adding gives $a = 4$, then $b = -1$. So $X' = (4, -1)^\top$.

Method — Switch between coordinates in two bases

Given $P = P_{\mathcal{B}}^{\mathcal{B}'}$ and a vector $x$ with known coordinates in one basis:

New $\to$ old: if $X'$ is known, then $X = P X'$ directly.
Old $\to$ new: if $X$ is known, then $X' = P^{-1} X$. Either invert $P$, or solve the linear system $P X' = X$ for $X'$.

Theorem — Change of basis for a linear map

Let $u \in \mathcal{L}(E, F)$ with $E$ and $F$ finite-dimensional. Let $\mathcal{B}, \mathcal{B}'$ be two bases of $E$ and $\mathcal{C}, \mathcal{C}'$ two bases of $F$. Set $P = P_{\mathcal{B}}^{\mathcal{B}'}$, $Q = P_{\mathcal{C}}^{\mathcal{C}'}$, $A = \mathrm{Mat}_{\mathcal{B}, \mathcal{C}}(u)$, $A' = \mathrm{Mat}_{\mathcal{B}', \mathcal{C}'}(u)$. Then $$ A' = Q^{-1} \, A \, P. $$

Proof

Apply the « composition becomes matrix product » theorem to $u \circ \mathrm{Id}_E = \mathrm{Id}_F \circ u$, read in source basis $\mathcal{B}'$ and target basis $\mathcal{C}$: $$ \mathrm{Mat}_{\mathcal{B}', \mathcal{C}}(u \circ \mathrm{Id}_E) = \mathrm{Mat}_{\mathcal{B}', \mathcal{C}}(\mathrm{Id}_F \circ u). $$ The left-hand side factors as $\mathrm{Mat}_{\mathcal{B}, \mathcal{C}}(u) \cdot \mathrm{Mat}_{\mathcal{B}', \mathcal{B}}(\mathrm{Id}_E) = A \cdot P$. The right-hand side factors as $\mathrm{Mat}_{\mathcal{C}', \mathcal{C}}(\mathrm{Id}_F) \cdot \mathrm{Mat}_{\mathcal{B}', \mathcal{C}'}(u) = Q \cdot A'$. Hence $A P = Q A'$, and since $Q$ is invertible, $A' = Q^{-1} A P$.

The commutative diagram below visualises the formula. Each vertex is a coordinate space, each horizontal arrow encodes $u$ in the corresponding basis pair, and each vertical arrow encodes the change of basis (new $\to$ old). Commutativity of the square reads $A P = Q A'$, equivalently $A' = Q^{-1} A P$.

Corollary — Endomorphism case

Let $u \in \mathcal{L}(E)$ with $E$ finite-dimensional. Let $\mathcal{B}, \mathcal{B}'$ be two bases of $E$. Set $P = P_{\mathcal{B}}^{\mathcal{B}'}$, $A = \mathrm{Mat}_{\mathcal{B}}(u)$, $A' = \mathrm{Mat}_{\mathcal{B}'}(u)$. Then $$ A' = P^{-1} \, A \, P. $$

Proof

Special case of the previous theorem with $E = F$, $\mathcal{C} = \mathcal{B}$, $\mathcal{C}' = \mathcal{B}'$, hence $Q = P_{\mathcal{B}}^{\mathcal{B}'} = P$.

Example

Take the endomorphism $u \colon \mathbb{R}^2 \to \mathbb{R}^2$, $u(x, y) = (x + y, y)$ (a shear). In the canonical basis $\mathcal{B} = ((1, 0), (0, 1))$, $u(1, 0) = (1, 0)$ and $u(0, 1) = (1, 1)$, so $$ A = \mathrm{Mat}_{\mathcal{B}}(u) = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}. $$ Switch to the basis $\mathcal{B}' = ((1, 1), (0, 1))$. Read the columns of $P_{\mathcal{B}}^{\mathcal{B}'}$ as the new basis vectors expressed in the canonical basis: $P = \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}$, and $P^{-1} = \begin{pmatrix} 1 & 0 \\ -1 & 1 \end{pmatrix}$. Compute step by step: $$ A P = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix} = \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix}, \qquad A' = P^{-1} (A P) = \begin{pmatrix} 1 & 0 \\ -1 & 1 \end{pmatrix} \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix} = \begin{pmatrix} 2 & 1 \\ -1 & 0 \end{pmatrix}. $$ The new matrix $A'$ is genuinely different from $A$ --- entries have moved --- yet $\mathrm{tr}(A') = 2 + 0 = 2 = 1 + 1 = \mathrm{tr}(A)$ and $\det(A') = 2 \cdot 0 - 1 \cdot (-1) = 1 = \det(A)$, confirming that the two matrices encode the same shear endomorphism in two different bases.

Method — Apply the change-of-basis formula

To compute $A' = \mathrm{Mat}_{\mathcal{B}', \mathcal{C}'}(u)$ from $A = \mathrm{Mat}_{\mathcal{B}, \mathcal{C}}(u)$:

Identify the source passage matrix $P = P_{\mathcal{B}}^{\mathcal{B}'}$.
Identify the target passage matrix $Q = P_{\mathcal{C}}^{\mathcal{C}'}$.
Compute $A' = Q^{-1} A P$ (matrix product).

For an endomorphism with single basis change, $P = Q$ and the formula collapses to $A' = P^{-1} A P$.

Skills to practice

Change of basis

II Equivalent matrices

The second « fundamental example » of the change-of-basis theorem reads: two matrices of the same linear map in different basis pairs satisfy $A' = Q^{-1} A P$ for some invertible $P, Q$. Reading this backwards, we can declare two matrices to be « the same up to a choice of basis » whenever they are related by such a formula. This gives the named relation « matrices équivalentes » on $M_{n, p}(\mathbb{K})$, which we now study. The headline result is striking: equivalence is completely classified by rank.
We recall from chapter Matrix representation of linear maps the canonical normal form: for $r \in \llbracket 0, \min(n, p) \rrbracket$, $$ J_{n, p, r} = \begin{pmatrix} I_r & 0 \\ 0 & 0 \end{pmatrix} \in M_{n, p}(\mathbb{K}). $$

Definition — Equivalent matrices

Two matrices $A, B \in M_{n, p}(\mathbb{K})$ are equivalent if there exist $P \in \mathrm{GL}_p(\mathbb{K})$ and $Q \in \mathrm{GL}_n(\mathbb{K})$ such that $$ B = Q^{-1} \, A \, P. $$ We denote this $A \sim_{\mathrm{eq}} B$.

Proposition — Two fundamental examples of equivalence

Two natural ways to produce equivalent matrices:

Elementary operations. If $B$ is obtained from $A$ by a sequence of elementary row and column operations, then $A \sim_{\mathrm{eq}} B$.
Same map, different basis pairs. For $u \in \mathcal{L}(E, F)$ with $\dim E = p, \dim F = n$, and for any two basis pairs $(\mathcal{B}, \mathcal{C})$ and $(\mathcal{B}', \mathcal{C}')$ of $E, F$: $$ \mathrm{Mat}_{\mathcal{B}, \mathcal{C}}(u) \sim_{\mathrm{eq}} \mathrm{Mat}_{\mathcal{B}', \mathcal{C}'}(u). $$

Proof

Elementary operations. Each elementary row operation is left-multiplication by an invertible matrix; each elementary column operation is right-multiplication by an invertible matrix (chapter Matrix calculus). After a sequence of such operations, $B = Q^{-1} A P$ with $Q^{-1}$ a product of row operation matrices and $P$ a product of column operation matrices, both invertible.
Same map, different basis pairs. This is the change-of-basis theorem of Change of basis: $\mathrm{Mat}_{\mathcal{B}', \mathcal{C}'}(u) = Q^{-1} \mathrm{Mat}_{\mathcal{B}, \mathcal{C}}(u) P$ where $P = P_{\mathcal{B}}^{\mathcal{B}'}, Q = P_{\mathcal{C}}^{\mathcal{C}'}$ are invertible passage matrices.

Theorem — Properties of equivalence

The relation $\sim_{\mathrm{eq}}$ on $M_{n, p}(\mathbb{K})$ satisfies:

Equivalence relation: $\sim_{\mathrm{eq}}$ is reflexive, symmetric, transitive.
Classification by rank: $A \sim_{\mathrm{eq}} B \iff \mathrm{rg}(A) = \mathrm{rg}(B)$.
Normal form: every matrix $A \in M_{n, p}(\mathbb{K})$ of rank $r$ satisfies $A \sim_{\mathrm{eq}} J_{n, p, r}$.

Proof

Equivalence relation. Reflexivity: $A = I_n^{-1} A I_p$. Symmetry: $B = Q^{-1} A P \Rightarrow A = Q B P^{-1} = (Q^{-1})^{-1} B (P^{-1})$, both factors invertible. Transitivity: $B = Q_1^{-1} A P_1$ and $C = Q_2^{-1} B P_2$ give $C = (Q_1 Q_2)^{-1} A (P_1 P_2)$.
Normal form (every rank-$r$ matrix is equivalent to $J_{n,p,r}$). This is exactly the rank-normal-form theorem of chapter Matrix representation of linear maps: there exist invertible $P, Q$ with $J_{n, p, r} = Q^{-1} A P$, hence $A \sim_{\mathrm{eq}} J_{n, p, r}$.
Classification by rank. $(\Rightarrow)$ Multiplication on the left or right by an invertible matrix preserves rank (chapter Matrix representation of linear maps, Proposition « Invertible multiplication preserves rank »), so $B = Q^{-1} A P \Rightarrow \mathrm{rg}(B) = \mathrm{rg}(A)$. $(\Leftarrow)$ If $\mathrm{rg}(A) = \mathrm{rg}(B) = r$, then by the normal-form bullet both $A$ and $B$ are equivalent to $J_{n, p, r}$, and by transitivity $A \sim_{\mathrm{eq}} B$.

Example

Show that $A = \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{pmatrix}$ and $B = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix} = J_{2, 3, 2}$ are equivalent.

Answer

Compute $\mathrm{rg}(A)$. Row-reduce: $L_2 \leftarrow L_2 - 4 L_1 = (0, -3, -6)$, two pivots, so $\mathrm{rg}(A) = 2$. Also $\mathrm{rg}(B) = 2$. By the classification theorem, $A \sim_{\mathrm{eq}} B$.

Example

All nonzero rank-1 matrices in $M_2(\mathbb{R})$ are equivalent to $J_{2, 2, 1} = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}$. For instance, $\begin{pmatrix} 1 & 2 \\ 3 & 6 \end{pmatrix}$, $\begin{pmatrix} 2 & 4 \\ 1 & 2 \end{pmatrix}$, and $\begin{pmatrix} 0 & 0 \\ 5 & 5 \end{pmatrix}$ all have rank 1, so are pairwise equivalent.

Corollary — Rank by transposition

For every $A \in M_{n, p}(\mathbb{K})$, $\mathrm{rg}(A) = \mathrm{rg}(A^\top)$. This was established directly in chapter Matrix representation of linear maps (the row-rank $=$ column-rank Corollary established there). The equivalence framework now provides a reformulation: $A \sim_{\mathrm{eq}} J_{n, p, r}$ in $M_{n, p}(\mathbb{K})$, and transposing both sides gives $A^\top \sim_{\mathrm{eq}} J_{p, n, r}$ in $M_{p, n}(\mathbb{K})$, both of rank $r$.

Theorem — Rank by extracted submatrix

For every $A \in M_{n, p}(\mathbb{K})$, the rank $\mathrm{rg}(A)$ is the maximum size of an invertible square submatrix that can be extracted from $A$.

Proof

Set $r = \mathrm{rg}(A)$.

Existence of an invertible $r \times r$ extracted submatrix. Since $\mathrm{rg}(A) = r$, $A$ has $r$ linearly independent columns. Let $C \in M_{n, r}(\mathbb{K})$ be the matrix formed by these $r$ columns; $C$ has rank $r$. By the row-rank = column-rank theorem (Corollary above), $C$ also has $r$ linearly independent rows. The $r \times r$ submatrix obtained by intersecting those $r$ rows with the $r$ chosen columns of $A$ has column rank $r$, hence is invertible.
No larger invertible extracted submatrix exists. Conversely, suppose an $s \times s$ invertible submatrix $M$ is extracted from $A$ (so $s$ rows and $s$ columns of $A$ have been selected). The $s$ chosen columns of $A$ have a restriction to those $s$ rows that is exactly $M$, hence has rank $s$. So the chosen columns are themselves linearly independent in $\mathbb{K}^n$, which forces $s \le \mathrm{rg}(A) = r$.

Both inequalities combine to $r = $ max size of an invertible extracted submatrix.

Method — Show two matrices are equivalent

For $A, B \in M_{n, p}(\mathbb{K})$:

Compute $\mathrm{rg}(A)$ and $\mathrm{rg}(B)$ (e.g.\ by Gauss elimination on rows; or by the rank-by-extracted-submatrix theorem, looking for a large invertible square submatrix).
If $\mathrm{rg}(A) = \mathrm{rg}(B)$, conclude $A \sim_{\mathrm{eq}} B$ by the classification theorem.
If explicit $P, Q$ are required, reduce both $A$ and $B$ to the same normal form $J_{n, p, r}$ using elementary operations: row operations on $A$ accumulate into $Q^{-1}$ (left factor), column operations accumulate into $P$ (right factor), until $A$ reaches $J_{n, p, r}$; do the same for $B$; compose the two factorisations to express $B = Q^{-1} A P$.

Skills to practice

Equivalent matrices

III Similar matrices

Equivalence allows independent changes of basis in the source and target: $B = Q^{-1} A P$ with two unrelated invertible matrices $P, Q$. For an endomorphism, source and target are the same space $E$, and changing the basis of $E$ forces $P = Q$. The same passage matrix appears on both sides: $B = P^{-1} A P$. This stricter relation is called similarity; it preserves more structure than equivalence --- notably, the trace.

Definition — Similar matrices

Two square matrices $A, B \in M_n(\mathbb{K})$ are similar if there exists $P \in \mathrm{GL}_n(\mathbb{K})$ such that $$ B = P^{-1} \, A \, P. $$ We denote this $A \sim_{\mathrm{sim}} B$. Similarity is defined only between square matrices of the same size.

Proposition — Fundamental example of similarity

For $u \in \mathcal{L}(E)$ with $E$ finite-dimensional, the matrices $\mathrm{Mat}_{\mathcal{B}}(u)$ and $\mathrm{Mat}_{\mathcal{B}'}(u)$ in two bases of $E$ are similar.

Proof

Direct from the endomorphism corollary of Change of basis: $\mathrm{Mat}_{\mathcal{B}'}(u) = P^{-1} \mathrm{Mat}_{\mathcal{B}}(u) P$ where $P = P_{\mathcal{B}}^{\mathcal{B}'}$.

Theorem — Properties of similarity

The relation $\sim_{\mathrm{sim}}$ on $M_n(\mathbb{K})$ satisfies:

Equivalence relation: $\sim_{\mathrm{sim}}$ is reflexive, symmetric, transitive.
Invariants: similar matrices have the same rank AND the same trace.

Proof

Equivalence relation. Same template as for $\sim_{\mathrm{eq}}$, special case $Q = P$.
Rank invariance. Similar matrices are equivalent (with $Q = P$), so they have the same rank by the equivalence-classification theorem.
Trace invariance. For $B = P^{-1} A P$, use the cyclicity $\mathrm{tr}(X Y) = \mathrm{tr}(Y X)$ from chapter Matrix calculus: $$ \begin{aligned} \mathrm{tr}(B) &= \mathrm{tr}(P^{-1} (A P)) && \text{(definition of $B$)} \\ &= \mathrm{tr}((A P) P^{-1}) && \text{(cyclicity of $\mathrm{tr}$)} \\ &= \mathrm{tr}(A (P P^{-1})) && \text{(associativity)} \\ &= \mathrm{tr}(A \, I_n) = \mathrm{tr}(A). \end{aligned} $$

Corollary — Similarity is strictly finer than equivalence

Two similar matrices are equivalent, but the converse is false. Even rank and trace together do not classify similarity classes.

Example

Same rank, different trace $\Rightarrow$ equivalent but not similar. Take $A = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}$ and $B = \begin{pmatrix} 2 & 0 \\ 0 & 0 \end{pmatrix}$. Both have rank 1, so $A \sim_{\mathrm{eq}} B$. But $\mathrm{tr}(A) = 1 \ne 2 = \mathrm{tr}(B)$, so $A \not\sim_{\mathrm{sim}} B$.

Example

Same rank AND same trace, but still not similar. Take $A = I_2$ and $B = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$ (a shear). Both have rank 2 and trace 2.
Suppose for contradiction $B = P^{-1} I_2 P$ for some $P \in \mathrm{GL}_2(\mathbb{R})$. Then $B = P^{-1} P = I_2$. But $B \ne I_2$: contradiction. So $A \not\sim_{\mathrm{sim}} B$, even though rank and trace coincide.

Example

Same rank, same trace, similar (explicit $P$). Take $A = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$ (swap) and $B = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$ (diagonal). Both have rank 2 and trace 0. Set $$ P = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}, \qquad P^{-1} = \frac{1}{2} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}. $$ Compute $P^{-1} A P = \frac{1}{2} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} = \frac{1}{2} \begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} = \frac{1}{2} \begin{pmatrix} 2 & 0 \\ 0 & -2 \end{pmatrix} = B$. So $A \sim_{\mathrm{sim}} B$.

Method — Show two square matrices are NOT similar

For $A, B \in M_n(\mathbb{K})$, exhibit any invariant that differs:

Rank: if $\mathrm{rg}(A) \ne \mathrm{rg}(B)$, not similar.
Trace: if $\mathrm{tr}(A) \ne \mathrm{tr}(B)$, not similar.
Preserved algebraic identities: if one of $A, B$ satisfies a polynomial identity that the other does not (idempotency $X^2 = X$, involution $X^2 = I_n$, nilpotency $X^k = 0$, etc.), then not similar. Reason: similarity preserves all such identities, since $P^{-1} X^k P = (P^{-1} X P)^k$ and $P^{-1} I_n P = I_n$.

Skills to practice

Similar matrices

IV Trace of an endomorphism

The invariance of trace by similarity transforms the matrix-side notion $\mathrm{tr}(A)$ into an intrinsic scalar associated to an endomorphism $u$: pick any basis $\mathcal{B}$, write $\mathrm{Mat}_{\mathcal{B}}(u)$, take its matrix trace; the result does not depend on the basis. This gives the trace of an endomorphism --- the first basis-independent numerical invariant of an endomorphism we encounter in this chapter.

Definition — Trace of an endomorphism

Let $E$ be a finite-dimensional $\mathbb{K}$-vector space and $u \in \mathcal{L}(E)$. The trace of $u$, denoted $\mathrm{tr}(u)$, is defined as $$ \mathrm{tr}(u) = \mathrm{tr}\bigl(\mathrm{Mat}_{\mathcal{B}}(u)\bigr) \quad \text{for any basis $\mathcal{B}$ of $E$}. $$ The value is independent of the basis $\mathcal{B}$ by similarity invariance of the matrix trace established in Similar matrices.

Theorem — Properties of the trace

Let $E$ be a finite-dimensional $\mathbb{K}$-vector space. The map $\mathrm{tr} \colon \mathcal{L}(E) \to \mathbb{K}$ satisfies:

Linearity: $\mathrm{tr}(\lambda u + \mu v) = \lambda \mathrm{tr}(u) + \mu \mathrm{tr}(v)$ for $u, v \in \mathcal{L}(E)$, $\lambda, \mu \in \mathbb{K}$.
Cyclicity: $\mathrm{tr}(u \circ v) = \mathrm{tr}(v \circ u)$ for $u, v \in \mathcal{L}(E)$.
Conjugation invariance: $\mathrm{tr}(g \circ u \circ g^{-1}) = \mathrm{tr}(u)$ for $u \in \mathcal{L}(E)$ and $g \in \mathrm{GL}(E)$.

Proof

Fix a basis $\mathcal{B}$ of $E$ and let $A = \mathrm{Mat}_{\mathcal{B}}(u)$, $B = \mathrm{Mat}_{\mathcal{B}}(v)$, $G = \mathrm{Mat}_{\mathcal{B}}(g)$. All three properties transfer from matrices to endomorphisms via the dictionary isomorphism $\mathcal{L}(E) \to M_n(\mathbb{K})$.

Linearity. $$ \begin{aligned} \mathrm{tr}(\lambda u + \mu v) &= \mathrm{tr}(\mathrm{Mat}_{\mathcal{B}}(\lambda u + \mu v)) && \text{(definition of endo trace)} \\ &= \mathrm{tr}(\lambda A + \mu B) && \text{(linearity of $\mathrm{Mat}$)} \\ &= \lambda \, \mathrm{tr}(A) + \mu \, \mathrm{tr}(B) && \text{(linearity of matrix trace)} \\ &= \lambda \, \mathrm{tr}(u) + \mu \, \mathrm{tr}(v) && \text{(definition of endo trace).} \end{aligned} $$
Cyclicity. $\mathrm{tr}(u \circ v) = \mathrm{tr}(A B) = \mathrm{tr}(B A) = \mathrm{tr}(v \circ u)$ by matrix-trace cyclicity.
Conjugation invariance. $\mathrm{tr}(g \circ u \circ g^{-1}) = \mathrm{tr}(G A G^{-1}) = \mathrm{tr}(A)$ by similarity invariance of the matrix trace (cf.\ Similar matrices).

Proposition — Trace of a projector

For a projector $p \in \mathcal{L}(E)$ ($p \circ p = p$) with $E$ finite-dimensional: $$ \mathrm{tr}(p) = \mathrm{rg}(p). $$

Proof

We saw in chapter Matrix representation of linear maps that for a projector $p$ on $\mathrm{Im}\, p$ parallel to $\mathrm{Ker}\, p$, the matrix in a basis adapted to $E = \mathrm{Im}\, p \oplus \mathrm{Ker}\, p$ is $$ \mathrm{Mat}_{\mathcal{B}}(p) = \begin{pmatrix} I_r & 0 \\ 0 & 0_{n - r} \end{pmatrix}, \qquad r = \mathrm{rg}(p). $$ Taking the matrix trace, $\mathrm{tr}(p) = r = \mathrm{rg}(p)$.

Example

The derivation $\Function{\Phi}{\mathbb{R}_n[X]}{\mathbb{R}_n[X]}{P}{P'}$ has trace $0$. In the canonical basis, $\mathrm{Mat}(\Phi)$ is strictly upper triangular (diagonal entries are zero), so $\mathrm{tr}(\Phi) = 0$.

Example

A symmetry $s$ on $E$ with respect to $F$ parallel to $G$, with $\dim F = r$ and $\dim G = n - r$, has matrix $\mathrm{diag}(I_r, -I_{n - r})$ in an adapted basis. So $$ \mathrm{tr}(s) = r + (-(n - r)) = 2 r - n. $$ In particular, when $r = n / 2$ (only possible if $n$ is even), $\mathrm{tr}(s) = 0$.

Method — Compute the trace of an endomorphism

To compute $\mathrm{tr}(u)$ for $u \in \mathcal{L}(E)$:

Pick the most convenient basis $\mathcal{B}$ of $E$ --- usually the canonical basis, or a basis adapted to a known direct-sum decomposition (e.g.\ projector, symmetry).
Compute $\mathrm{Mat}_{\mathcal{B}}(u)$.
Read $\mathrm{tr}(u)$ as the sum of diagonal entries.

The basis-independence of $\mathrm{tr}(u)$ means any basis works --- pick the one that minimises calculation.

Example

Compute $\mathrm{tr}(u)$ for $u \in \mathcal{L}(\mathbb{R}^2)$, $u(x, y) = (3 x + 2 y, x + 4 y)$.

Answer

In the canonical basis, $u(1, 0) = (3, 1)$ and $u(0, 1) = (2, 4)$, so $$ \mathrm{Mat}(u) = \begin{pmatrix} 3 & 2 \\ 1 & 4 \end{pmatrix}. $$ Hence $\mathrm{tr}(u) = 3 + 4 = 7$.

This closes the linear-algebra block at this level. Equivalence is completely classified by rank. Similarity is subtler: rank and trace are useful invariants for disproving similarity, but they do not classify similarity classes --- refining similarity further is the subject of eigenvalue theory, addressed in L2. The next chapters move to the symmetric group (preparing determinants), then determinants themselves.

Skills to practice

Trace of an endomorphism