CommeUnJeu · L2 MP
Matrix exponential and differential systems
The scalar exponential \(\exp(x) = \sum_{n \geq 0} x^n/n!\) is the unique function on \(\mathbb{R}\) that equals its own derivative with value \(1\) at \(0\); it solves \(y' = y\), \(y(0) = 1\). This chapter extends the construction to matrices and endomorphisms in finite dimension. For \(A \in \mathcal{M}_n(K)\), the same series \(\sum_{n \geq 0} A^n / n!\) converges (in finite dim, all norms are equivalent and a sub-multiplicative one exists, so the absolute convergence test of Numerical and vector series applies). The matrix exponential then takes over the role of the scalar exponential: \(t \mapsto \exp(tA)\, X_0\) solves the vector differential equation \(X' = AX\), \(X(0) = X_0\) (the matrix curve \(t \mapsto \exp(tA)\) itself solves the matrix equation with \(X(0) = I_n\)), and the solution is unique by the linear Cauchy theorem recalled from Linear differential equations.
The chapter has three sections. The first builds the exponential as a function \(\mathcal{M}_n(K) \to \mathcal{M}_n(K)\): definition by series, convergence and bound \(\|\exp(A)\| \leq e^{\|A\|}\), continuity, derivative of \(t \mapsto \exp(tA)\), the morphism property \(\exp(A+B) = \exp(A)\exp(B)\) on commuting pairs, and invertibility of \(\exp(A)\). The second turns to practical computation: the diagonalisable case via similarity, the spectrum of \(\exp(A)\) over \(\mathbb{C}\), the closed form \(\det(\exp(A)) = e^{\operatorname{tr}(A)}\). The third applies the machine to the constant-coefficient linear differential system \(X' = AX\) --- the existence-uniqueness theorem and an eigenvalue-decoupling route to the explicit solution.
Standing notation. \(K = \mathbb{R}\) or \(\mathbb{C}\), fixed throughout. \(E\) a finite-dimensional \(K\)-vector space; \(\mathcal{L}(E)\) its endomorphism algebra; \(\mathcal{M}_n(K)\) the \(n \times n\) matrices over \(K\). We equip \(\mathcal{M}_n(K)\) with the operator norm \(\|M\| = \sup_{\|x\| \leq 1} \|Mx\|\) subordinate to a fixed norm \(\|\cdot\|\) on \(K^n\); recalled from Limits and continuity in a normed space, this norm is sub-multiplicative (\(\|MN\| \leq \|M\|\,\|N\|\)). By Compactness, connectedness, finite dimension, all norms on \(\mathcal{M}_n(K)\) are equivalent, so convergence statements below do not depend on the chosen norm. \(\operatorname{Sp}(A)\) is the spectrum of \(A\) over the base field \(K\); \(\operatorname{Sp}_{\mathbb{C}}(A)\) is the complex spectrum (the roots in \(\mathbb{C}\) of \(\chi_A\), counted with multiplicity). \(\chi_A\) is the characteristic polynomial. \(E_\lambda(A) = \ker(A - \lambda I)\). \(\mathrm{GL}_n(K)\) the invertible matrices. The eigen-elements, characteristic polynomial, diagonalisability and trigonalisability over \(\mathbb{C}\) are those of Reduction: eigen-elements and diagonalisation; the linear differential equation, the Cauchy problem and the linear Cauchy theorem are those of Linear differential equations; the vector-valued series and the absolute-convergence theorem are those of Numerical and vector series; the differentiability and integration of vector-valued maps and the FTC are those of Vector-valued functions of a real variable.
I
The matrix and endomorphism exponential
I.1
Definition and convergence
The scalar series \(\sum x^n/n!\) converges everywhere on \(\mathbb{R}\) (or \(\mathbb{C}\)) with sum \(e^x\). The natural extension to matrices reads "replace \(x \in K\) by \(A \in \mathcal{M}_n(K)\)". The crucial difference: matrix multiplication is non-commutative, and the partial sums \(\sum_{k \leq n} A^k/k!\) now live in the finite-dimensional normed space \(\mathcal{M}_n(K)\), where we must control their size to ensure convergence. The operator norm, together with norm equivalence, gives a clean route.
Definition — Matrix and endomorphism exponential
For \(A \in \mathcal{M}_n(K)\), the matrix exponential of \(A\) is $$ \exp(A) \;=\; \sum_{n \geq 0} \frac{A^n}{n!} \;\in\; \mathcal{M}_n(K). $$ For \(a \in \mathcal{L}(E)\), the endomorphism exponential of \(a\) is $$ \exp(a) \;=\; \sum_{n \geq 0} \frac{a^n}{n!} \;\in\; \mathcal{L}(E). $$ Both series converge in their finite-dim ambient space (Proposition below). By convention, the zeroth power is the identity: \(A^0 = I_n\), \(a^0 = \operatorname{Id}_E\).
Proposition — Convergence and uniform bound
For every \(A \in \mathcal{M}_n(K)\), the series \(\sum A^n/n!\) converges absolutely in \(\mathcal{M}_n(K)\), hence converges. Moreover, for the operator norm, $$ \textcolor{colorprop}{\|\exp(A)\| \;\leq\; e^{\|A\|}.} $$ The analogous statement holds for \(\exp(a)\) with \(a \in \mathcal{L}(E)\).
The operator norm subordinate to a norm on \(K^n\) is sub-multiplicative (recalled from Limits and continuity in a normed space): for all \(M, N \in \mathcal{M}_n(K)\), \(\|MN\| \leq \|M\|\,\|N\|\). By induction, \(\|A^n\| \leq \|A\|^n\) for every \(n\). Hence $$ \begin{aligned} \left\| \frac{A^n}{n!} \right\| & = \frac{\|A^n\|}{n!} && \text{(positive homogeneity)}\\
& \leq \frac{\|A\|^n}{n!} && \text{(sub-multiplicativity, induction).} \end{aligned} $$ The numerical series \(\sum \|A\|^n / n! = e^{\|A\|}\) converges. By the absolute-convergence-implies-convergence theorem for vector-valued series (recalled from Numerical and vector series), \(\sum A^n/n!\) converges in \(\mathcal{M}_n(K)\). The bound on \(\|\exp(A)\|\) follows from the triangle inequality for an absolutely convergent series: $$ \|\exp(A)\| \;=\; \left\| \sum_{n \geq 0} \frac{A^n}{n!} \right\| \;\leq\; \sum_{n \geq 0} \frac{\|A\|^n}{n!} \;=\; e^{\|A\|}. $$ The endomorphism case is identical via the matrix of \(a\) in any basis (the choice of basis does not affect the convergence, by norm equivalence).
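The bound can be checked numerically on a small case. The following is a minimal Python sketch, assuming the sup-norm on \(K^n\) (whose subordinate operator norm is the maximum absolute row sum); the helper names `matmul`, `exp_series`, `op_norm_inf` and the sample matrix are illustrative choices, not part of the course.

```python
import math

def matmul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def exp_series(A, terms=60):
    """Partial sum sum_{k <= terms} A^k / k! of the exponential series."""
    n = len(A)
    total = [[float(i == j) for j in range(n)] for i in range(n)]  # A^0 = I_n
    term = [row[:] for row in total]                               # holds A^k / k!
    for k in range(1, terms + 1):
        term = [[x / k for x in row] for row in matmul(term, A)]
        total = [[total[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return total

def op_norm_inf(M):
    """Operator norm subordinate to the sup-norm: maximum absolute row sum."""
    return max(sum(abs(x) for x in row) for row in M)

A = [[1.0, 2.0], [0.0, -1.0]]      # hypothetical sample matrix
lhs = op_norm_inf(exp_series(A))   # approximates ||exp(A)||
rhs = math.exp(op_norm_inf(A))     # e^{||A||}
print(lhs, "<=", rhs, ":", lhs <= rhs)
```

A truncated series only approximates \(\exp(A)\); with 60 terms and \(\|A\| = 3\) the neglected tail is far below floating-point precision.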
Example — The zero matrix
For \(A = 0_n\), \(A^k = 0\) for every \(k \geq 1\), so \(\exp(0_n) = I_n + 0 + 0 + \dots = I_n\). The zero matrix maps to the identity, just as \(e^0 = 1\).
Example — Scalar matrices
For \(\lambda \in K\), \((\lambda I_n)^k = \lambda^k I_n\), so $$ \exp(\lambda I_n) \;=\; \sum_{k \geq 0} \frac{\lambda^k I_n}{k!} \;=\; \left( \sum_{k \geq 0} \frac{\lambda^k}{k!} \right) I_n \;=\; e^\lambda I_n. $$ The scalar exponential extends to scalar matrices in the most natural way: factor the identity out.
Example — A nilpotent matrix
Let \(N = \begin{pmatrix}0 & 1\\ 0 & 0\end{pmatrix}\). Then \(N^2 = 0\), so the series terminates at \(k = 1\): $$ \exp(N) \;=\; I_2 + N \;=\; \begin{pmatrix}1 & 1\\
0 & 1\end{pmatrix}. $$ More generally, if \(N \in \mathcal{M}_n(K)\) is nilpotent of index \(p\) (i.e.\ \(N^p = 0\), \(N^{p-1} \neq 0\)), then \(\exp(N)\) is a finite sum: \(\exp(N) = \sum_{k=0}^{p-1} N^k/k!\).
Example — An involution
If \(S \in \mathcal{M}_n(K)\) satisfies \(S^2 = I_n\) (involution), then \(S^{2k} = I_n\) and \(S^{2k+1} = S\) for every \(k\). Splitting the series into even and odd terms, $$ \exp(tS) \;=\; \left( \sum_{k \geq 0} \frac{t^{2k}}{(2k)!} \right) I_n \,+\, \left( \sum_{k \geq 0} \frac{t^{2k+1}}{(2k+1)!} \right) S \;=\; \cosh(t)\, I_n + \sinh(t)\, S. $$ The even powers contribute the \(\cosh(t)\) coefficient on \(I_n\) and the odd powers the \(\sinh(t)\) coefficient on \(S\).
Method — Computing \(\exp(A)\) directly from the series
When the powers \(A^k\) stabilise or repeat with a known pattern, the series can be summed directly without invoking diagonalisation. Three standard cases:
- Nilpotent. If \(A^p = 0\), \(\exp(A) = \sum_{k = 0}^{p-1} A^k/k!\), a finite sum.
- Involution. If \(A^2 = I_n\), \(\exp(tA) = \cosh(t) I_n + \sinh(t) A\).
- Projector. If \(A^2 = A\), \(A^k = A\) for every \(k \geq 1\), so \(\exp(tA) = I_n + (e^t - 1) A\).
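The three closed forms can be compared against a truncated series. This is a minimal Python sketch; the sample matrices `N`, `S`, `Q`, the value `t = 0.7` and all helper names are illustrative assumptions.

```python
import math

def matmul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def exp_series(A, terms=60):
    """Partial sum sum_{k <= terms} A^k / k! of the exponential series."""
    n = len(A)
    total = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in total]
    for k in range(1, terms + 1):
        term = [[x / k for x in row] for row in matmul(term, A)]
        total = [[total[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return total

def combo(a, X, b, Y):
    """Entrywise linear combination a*X + b*Y."""
    return [[a * x + b * y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def close(X, Y, tol=1e-12):
    """True when all entries of X and Y agree up to tol."""
    return all(abs(x - y) <= tol for rx, ry in zip(X, Y) for x, y in zip(rx, ry))

I2 = [[1.0, 0.0], [0.0, 1.0]]
N = [[0.0, 1.0], [0.0, 0.0]]   # nilpotent: N^2 = 0
S = [[0.0, 1.0], [1.0, 0.0]]   # involution: S^2 = I_2
Q = [[1.0, 0.0], [0.0, 0.0]]   # projector: Q^2 = Q
t = 0.7

nilpotent_ok = close(exp_series(N), combo(1.0, I2, 1.0, N))
involution_ok = close(exp_series(combo(t, S, 0.0, S)),
                      combo(math.cosh(t), I2, math.sinh(t), S))
projector_ok = close(exp_series(combo(t, Q, 0.0, Q)),
                     combo(1.0, I2, math.exp(t) - 1.0, Q))
print(nilpotent_ok, involution_ok, projector_ok)
```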
Skills to practice
- Computing the exponential by series
I.2
Continuity and derivative
With \(\exp\) defined as a function \(\mathcal{M}_n(K) \to \mathcal{M}_n(K)\), two regularity questions arise. Is \(M \mapsto \exp(M)\) continuous? Yes, because the partial sums are polynomial maps, hence continuous, and the convergence is normal on every closed ball. Is \(t \mapsto \exp(tA)\) differentiable, and what is its derivative? Yes, of class \(\mathcal{C}^1\) on \(\mathbb{R}\), with derivative \(A \exp(tA) = \exp(tA) A\). The second equality uses a commutation lemma we establish first.
Proposition — Commutation lemma
Let \(A, B \in \mathcal{M}_n(K)\) with \(AB = BA\). Then \(A\) commutes with \(\exp(B)\): \(A \exp(B) = \exp(B) A\). In particular, \(A\) commutes with \(\exp(tB)\) for every \(t \in K\).
First, an induction shows that \(A\) commutes with every power \(B^k\):
- Base \(k = 0\). \(A B^0 = A I_n = A = I_n A = B^0 A\). \(\checkmark\)
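- Step. Assume \(A B^k = B^k A\). Then \(A B^{k+1} = (A B^k) B = (B^k A) B = B^k (A B) = B^k (BA) = B^{k+1} A\).
By linearity, \(A\) then commutes with every partial sum \(S_n(B) = \sum_{k \leq n} B^k/k!\): \(A\, S_n(B) = S_n(B)\, A\). The maps \(M \mapsto AM\) and \(M \mapsto MA\) are linear on the finite-dim space \(\mathcal{M}_n(K)\), hence continuous, so letting \(n \to +\infty\) gives \(A \exp(B) = \exp(B) A\). Finally, \(A\) commutes with \(tB\) for every \(t \in K\), so the same argument applies to \(\exp(tB)\).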
Proposition — Continuity of the exponential
The map \(M \mapsto \exp(M)\) is continuous on \(\mathcal{M}_n(K)\).
Fix \(M_0 \in \mathcal{M}_n(K)\) and \(\varepsilon > 0\); we prove \(\exp\) continuous at \(M_0\). Set \(R = \|M_0\| + 1\) and work on the closed ball \(B(0, R) = \{M \in \mathcal{M}_n(K) : \|M\| \leq R\}\); then \(M_0 \in B(0, R)\).
- Normal convergence on \(B(0, R)\). The partial sum \(S_n(M) = \sum_{k \leq n} M^k/k!\) is a polynomial in the entries of \(M\), hence continuous on \(\mathcal{M}_n(K)\) (polynomial maps on a finite-dim space are continuous, recalled from Compactness, connectedness, finite dimension). The bound \(\|M^k/k!\|_{\infty, B(0, R)} \leq R^k/k!\) and the convergent dominating series \(\sum R^k/k! = e^R\) give $$ \|\exp(M) - S_n(M)\| \;\leq\; \sum_{k \geq n+1} \frac{R^k}{k!} \;\xrightarrow[n \to +\infty]{}\; 0 \quad \text{uniformly in } M \in B(0, R). $$ Pick \(N\) such that this tail is \(\leq \varepsilon/3\) on \(B(0, R)\).
- Continuity of \(S_N\) at \(M_0\). \(S_N\) is continuous on \(\mathcal{M}_n(K)\) (polynomial). Choose \(\eta \in (0, 1]\) such that \(\|M - M_0\| \leq \eta\) implies \(\|S_N(M) - S_N(M_0)\| \leq \varepsilon/3\); then \(M \in B(M_0, 1) \subset B(0, R)\) automatically.
- \(\varepsilon/3\)-triangle. For \(\|M - M_0\| \leq \eta\), $$ \|\exp(M) - \exp(M_0)\| \,\leq\, \|\exp(M) - S_N(M)\| + \|S_N(M) - S_N(M_0)\| + \|S_N(M_0) - \exp(M_0)\| \,\leq\, \tfrac{\varepsilon}{3} + \tfrac{\varepsilon}{3} + \tfrac{\varepsilon}{3} = \varepsilon. $$
Theorem — Derivative of \(t \mapsto \exp(tA)\)
For every \(A \in \mathcal{M}_n(K)\), the map \(t \mapsto \exp(tA)\) is of class \textcolor{colorprop}{\(\mathcal{C}^1\) on \(\mathbb{R}\)}, and its derivative at \(t\) is $$ \textcolor{colorprop}{\frac{d}{dt} \exp(tA) \;=\; A \exp(tA) \;=\; \exp(tA)\, A.} $$ The second equality follows from the commutation lemma applied to \(A\) and \(tA\) (which trivially commute).
We prove the derivative formula directly via the difference quotient. The variable \(t\) is real throughout the proof (even when \(K = \mathbb{C}\), the parameter \(t\) of the curve \(t \mapsto \exp(tA)\) lives in \(\mathbb{R}\)), so the scalar MVT on \(\mathbb{R}\) applies below. Fix \(t \in \mathbb{R}\) and \(h \in \mathbb{R}\) with \(0 < |h| \leq 1\). Define \(\Delta(h) = \frac{\exp((t+h)A) - \exp(tA)}{h} - A \exp(tA)\). Expand the exponentials as series: $$ \Delta(h) \;=\; \sum_{n \geq 0} \frac{1}{n!}\, \frac{(t+h)^n - t^n}{h}\, A^n \,-\, A \sum_{n \geq 0} \frac{t^n}{n!} A^n. $$ For each \(n\), \(\frac{(t+h)^n - t^n}{h} \to n t^{n-1}\) as \(h \to 0\) (scalar derivative), so the formal limit term-by-term is \(\sum_{n \geq 1} \frac{n t^{n-1}}{n!} A^n = A \sum_{n \geq 1} \frac{t^{n-1}}{(n-1)!} A^{n-1} = A \exp(tA)\), which cancels the second sum. To justify exchanging the limit and the sum, we bound the general term uniformly in \(h\) for \(|h| \leq 1\): $$ \begin{aligned} |(t+h)^n - t^n| & \leq n (|t| + 1)^{n-1} |h| && \text{(scalar MVT applied to } x \mapsto x^n \text{ on } [t, t+h]),\\
\left\| \frac{1}{n!} \frac{(t+h)^n - t^n}{h} A^n \right\| & \leq \frac{(|t|+1)^{n-1}}{(n-1)!} \|A\|^n && \text{(for } n \geq 1\text{).} \end{aligned} $$ The dominating numerical series \(\sum_{n \geq 1} \frac{(|t|+1)^{n-1}}{(n-1)!} \|A\|^n = \|A\|\, e^{(|t|+1) \|A\|}\) converges; by normal convergence of the bound (independent of \(h\)), we may exchange limit and sum: \(\Delta(h) \to 0\) as \(h \to 0\). Hence \(t \mapsto \exp(tA)\) is differentiable at \(t\) with derivative \(A \exp(tA)\). The continuity of \(t \mapsto A \exp(tA)\) follows from the continuity of \(\exp\) (previous Proposition) and the continuity of \(M \mapsto AM\), so \(t \mapsto \exp(tA)\) is of class \(\mathcal{C}^1\). The equality \(A \exp(tA) = \exp(tA) A\) is the commutation lemma (since \(A\) and \(tA\) commute).
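The derivative formula can be tested with a finite difference. The sketch below is an illustration under stated assumptions: it uses a symmetric difference quotient (not the one-sided quotient of the proof) because it is numerically more accurate, and the matrix `A`, the point `t` and the step `h` are arbitrary sample values.

```python
def matmul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def exp_series(A, terms=60):
    """Partial sum sum_{k <= terms} A^k / k! of the exponential series."""
    n = len(A)
    total = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in total]
    for k in range(1, terms + 1):
        term = [[x / k for x in row] for row in matmul(term, A)]
        total = [[total[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return total

def exp_tA(t, A):
    """Approximation of exp(tA) by the truncated series."""
    return exp_series([[t * x for x in row] for row in A])

def max_gap(X, Y):
    """Largest entrywise absolute difference between X and Y."""
    return max(abs(x - y) for rx, ry in zip(X, Y) for x, y in zip(rx, ry))

A = [[0.0, 1.0], [-2.0, -3.0]]   # hypothetical sample matrix
t, h = 0.5, 1e-5

plus, minus = exp_tA(t + h, A), exp_tA(t - h, A)
quotient = [[(p - m) / (2 * h) for p, m in zip(rp, rm)]
            for rp, rm in zip(plus, minus)]

E = exp_tA(t, A)
left, right = matmul(A, E), matmul(E, A)   # A exp(tA) and exp(tA) A

err_derivative = max_gap(quotient, left)   # difference quotient vs A exp(tA)
err_commute = max_gap(left, right)         # commutation lemma
print(err_derivative, err_commute)
```

Both printed gaps should be small: the first is limited by the step `h`, the second only by rounding.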
Example — The derivative at \(t = 0\)
Specialising the previous Theorem at \(t = 0\): $$ \left. \frac{d}{dt} \exp(tA) \right|_{t = 0} \;=\; A \exp(0) \;=\; A I_n \;=\; A. $$ This is the seed of the chapter's third section: the map \(t \mapsto \exp(tA)\) solves the matrix differential equation \(X' = AX\) with \(X(0) = I_n\), and \(t \mapsto \exp(tA)\, X_0\) solves the vector equation with \(X(0) = X_0\).
Method — Differentiating \(\exp(u(t)A)\) with \(A\) fixed
For \(A \in \mathcal{M}_n(K)\) fixed and \(u : I \to \mathbb{R}\) a \(\mathcal{C}^1\) real-valued scalar function (\(I\) a real interval), the chain rule applied to \(t \mapsto \exp(u(t) A)\) gives $$ \frac{d}{dt} \exp(u(t) A) \;=\; u'(t)\, A\, \exp(u(t) A) \;=\; u'(t)\, \exp(u(t) A)\, A. $$ Warning. The analogous formula for a varying matrix family \(t \mapsto A(t)\), $$ \frac{d}{dt} \exp(A(t)) \;\overset{?}{=}\; A'(t) \exp(A(t)), $$ is false in general; a sufficient condition for it to hold is the local commutation \([A(t), A'(t)] = 0\) at every \(t\) (and more; see references). The family \(A(t) = u(t) A\) satisfies this condition, since \(A(t)\) and \(A'(t) = u'(t) A\) are both scalar multiples of \(A\); a general \(A(t)\) does not.
Skills to practice
- Differentiating expressions in \(\exp(tA)\)
I.3
The morphism property and similar matrices
The scalar exponential is a morphism \((\mathbb{R}, +) \to (\mathbb{R}^*, \times)\): \(e^{x + y} = e^x e^y\). The matrix exponential extends this only on commuting pairs; without commutation, the Baker-Campbell-Hausdorff correction terms appear (beyond our scope). Two consequences follow: \(\exp(A)\) is invertible with inverse \(\exp(-A)\), and similar matrices have similar exponentials.
Theorem — Morphism property under commutation
If \(A, B \in \mathcal{M}_n(K)\) commute (\(AB = BA\)), then $$ \textcolor{colorprop}{\exp(A + B) \;=\; \exp(A)\, \exp(B) \;=\; \exp(B)\, \exp(A).} $$
Define \(\varphi : \mathbb{R} \to \mathcal{M}_n(K)\) by \(\varphi(t) = \exp(t(A+B))\, \exp(-tA)\, \exp(-tB)\). By the commutation Proposition of \S 1.2, \(A+B\), \(A\), \(B\) each commute with each \(\exp(\pm sA)\) and \(\exp(\pm sB)\) for every \(s\) (since the underlying pairs commute). Differentiate using the derivative theorem of \S 1.2: $$ \begin{aligned} \varphi'(t) & = (A+B)\exp(t(A+B))\, \exp(-tA)\, \exp(-tB) && \text{(derivative of the 1st factor)}\\
& \quad + \exp(t(A+B))\, (-A)\exp(-tA)\, \exp(-tB) && \text{(derivative of the 2nd factor)}\\
& \quad + \exp(t(A+B))\, \exp(-tA)\, (-B)\exp(-tB) && \text{(derivative of the 3rd factor)}\\
& = (A+B)\, \varphi(t) && \text{(1st term is already \((A+B)\cdot \varphi(t)\))}\\
& \quad + (-A)\, \exp(t(A+B))\, \exp(-tA)\, \exp(-tB) && \text{(comm.\ Prop.: \(A\) commutes with \(\exp(t(A+B))\))}\\
& \quad + (-B)\, \exp(t(A+B))\, \exp(-tA)\, \exp(-tB) && \text{(comm.\ Prop.: \(B\) commutes with \(\exp(t(A+B))\) and \(\exp(-tA)\))}\\
& = (A+B)\, \varphi(t) \,-\, A\, \varphi(t) \,-\, B\, \varphi(t) && \text{(re-collect \(\varphi(t)\))}\\
& = (A + B - A - B)\, \varphi(t) \;=\; 0. \end{aligned} $$ Hence \(\varphi\) is constant on \(\mathbb{R}\); evaluating at \(t = 0\), \(\varphi(0) = \exp(0)\exp(0)\exp(0) = I_n\). So \(\varphi(1) = I_n\), i.e. $$ \exp(A+B)\,\exp(-A)\,\exp(-B) \;=\; I_n. \qquad (\star) $$ Special case \(B = 0\) (invertibility). The same \(\varphi\) argument with \(B\) replaced by \(0\) (which commutes trivially with \(A\)) gives \(\exp(tA)\exp(-tA) = I_n\) for every \(t \in \mathbb{R}\); at \(t = 1\), \(\exp(A)\exp(-A) = I_n\), and by \(A \leftrightarrow -A\) also \(\exp(-A)\exp(A) = I_n\). So \(\exp(A) \in \mathrm{GL}_n(K)\) with inverse \(\exp(-A)\) for every \(A \in \mathcal{M}_n(K)\) (this is the content of the next Proposition --- the morphism Theorem and the Invertibility Proposition share one \(\varphi\) argument). The same applies to \(B\) and to \(-A, -B\).
Rearranging \((\star)\). Now that we know \(\exp(B)\), \(\exp(A)\) are invertible: right-multiply \((\star)\) by \(\exp(B)\) to get \(\exp(A+B)\exp(-A)\cdot I_n = \exp(B)\), i.e.\ \(\exp(A+B)\exp(-A) = \exp(B)\); then right-multiply by \(\exp(A)\) to get \(\exp(A+B) = \exp(B)\exp(A)\). By the commutation Proposition of \S 1.2 (with \(A\) commuting with \(\exp(B)\), since \(A\) commutes with \(B\)), \(\exp(B)\exp(A) = \exp(A)\exp(B)\). Hence \(\exp(A+B) = \exp(A)\exp(B) = \exp(B)\exp(A)\).
Proposition — Invertibility
For every \(A \in \mathcal{M}_n(K)\), \(\exp(A) \in \textcolor{colorprop}{\mathrm{GL}_n(K)}\), and \(\exp(A)^{-1} = \exp(-A)\).
Already established as the \(B = 0\) special case of the morphism Theorem's \(\varphi\) argument (see the Proof above): \(\varphi(t) = \exp(tA)\exp(-tA)\) at \(t = 1\) gives \(\exp(A)\exp(-A) = I_n\), and by \(A \leftrightarrow -A\) also \(\exp(-A)\exp(A) = I_n\). So \(\exp(A) \in \mathrm{GL}_n(K)\) with inverse \(\exp(-A)\).
Proposition — Similar matrices
If \(A = PBP^{-1}\) with \(P \in \mathrm{GL}_n(K)\), then \(\exp(A) = \textcolor{colorprop}{P \exp(B) P^{-1}}\). In particular, \(\exp(A)\) and \(\exp(B)\) are similar through the same \(P\).
By induction on \(k\), \((PBP^{-1})^k = PB^k P^{-1}\) (base \(k = 0\): \(I_n = P I_n P^{-1}\); step: \((PBP^{-1})^{k+1} = PBP^{-1} \cdot PB^k P^{-1} = PB^{k+1} P^{-1}\)). Pass to the limit in the partial sum: $$ \sum_{k \leq n} \frac{(PBP^{-1})^k}{k!} \;=\; \sum_{k \leq n} \frac{P B^k P^{-1}}{k!} \;=\; P \left( \sum_{k \leq n} \frac{B^k}{k!} \right) P^{-1}. $$ The map \(M \mapsto P M P^{-1}\) is linear on the finite-dim space \(\mathcal{M}_n(K)\), hence continuous (recalled from Compactness, connectedness, finite dimension). Letting \(n \to +\infty\), \(\exp(PBP^{-1}) = P \exp(B) P^{-1}\).
Example — Failure without commutation
Take \(A = \begin{pmatrix}0 & 1\\ 0 & 0\end{pmatrix}\) and \(B = \begin{pmatrix}0 & 0\\ 1 & 0\end{pmatrix}\). Both are nilpotent of index \(2\), so \(\exp(A) = I_2 + A = \begin{pmatrix}1 & 1\\ 0 & 1\end{pmatrix}\) and \(\exp(B) = I_2 + B = \begin{pmatrix}1 & 0\\ 1 & 1\end{pmatrix}\). Compute the product: $$ \exp(A)\exp(B) \;=\; \begin{pmatrix}1 & 1\\
0 & 1\end{pmatrix}\begin{pmatrix}1 & 0\\
1 & 1\end{pmatrix} \;=\; \begin{pmatrix}2 & 1\\
1 & 1\end{pmatrix}. $$ On the other hand, \(A + B = \begin{pmatrix}0 & 1\\ 1 & 0\end{pmatrix}\) is the swap-matrix, an involution: \((A+B)^2 = I_2\). By the involution formula of §1.1 with \(t = 1\), \(\exp(A + B) = \cosh(1)\, I_2 + \sinh(1)\, (A+B) = \begin{pmatrix}\cosh 1 & \sinh 1\\ \sinh 1 & \cosh 1\end{pmatrix}\) (numerically \(\approx \begin{pmatrix}1.543 & 1.175\\ 1.175 & 1.543\end{pmatrix}\)). The two matrices differ, showing that the morphism property can fail when \(AB \neq BA\): indeed \(AB = \begin{pmatrix}1 & 0\\ 0 & 0\end{pmatrix} \neq \begin{pmatrix}0 & 0\\ 0 & 1\end{pmatrix} = BA\).
Method — Invoking the morphism property
Before using \(\exp(A + B) = \exp(A)\exp(B)\), always check \(AB = BA\). Three common in-program cases where commutation is automatic:
- One of \(A\), \(B\) is scalar (\(\lambda I_n\) commutes with every matrix).
- Both diagonalise in the same basis (\(A = PDP^{-1}\), \(B = PD'P^{-1}\) with \(D, D'\) diagonal commute, so \(A, B\) commute).
- One is a polynomial in the other (e.g.\ \(B = p(A)\) commutes with \(A\)).
The morphism property can be summarised by a conditional diagram: the map \(\exp\) sends a commuting sum of matrices to the product of their exponentials.
The condition \(AB = BA\) sits in the middle of the diagram as a reminder that the two paths (exponentiate the sum, or multiply the exponentials) agree only when that condition holds.
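The morphism property, its failure without commutation, and the inverse formula can all be observed numerically. This is a minimal Python sketch: the commuting pair is built as \(B = A^2/2\) (a polynomial in \(A\)), the non-commuting pair is the one from the Example, and all names and sample values are illustrative assumptions.

```python
def matmul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def exp_series(A, terms=60):
    """Partial sum sum_{k <= terms} A^k / k! of the exponential series."""
    n = len(A)
    total = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in total]
    for k in range(1, terms + 1):
        term = [[x / k for x in row] for row in matmul(term, A)]
        total = [[total[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return total

def add(X, Y):
    """Entrywise sum X + Y."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def max_gap(X, Y):
    """Largest entrywise absolute difference between X and Y."""
    return max(abs(x - y) for rx, ry in zip(X, Y) for x, y in zip(rx, ry))

# Commuting pair: B is a polynomial in A.
A = [[1.0, 1.0], [0.0, 2.0]]
B = [[0.5 * x for x in row] for row in matmul(A, A)]
commuting_gap = max_gap(exp_series(add(A, B)),
                        matmul(exp_series(A), exp_series(B)))

# Non-commuting pair from the Example above.
U = [[0.0, 1.0], [0.0, 0.0]]
V = [[0.0, 0.0], [1.0, 0.0]]
product = matmul(exp_series(U), exp_series(V))   # exp(U) exp(V)
noncommuting_gap = max_gap(exp_series(add(U, V)), product)

# Invertibility: exp(A) exp(-A) should be the identity.
neg_A = [[-x for x in row] for row in A]
inverse_gap = max_gap(matmul(exp_series(A), exp_series(neg_A)),
                      [[1.0, 0.0], [0.0, 1.0]])

print(commuting_gap, noncommuting_gap, inverse_gap)
```

The first and third gaps should be at rounding level, while the second stays of order \(1\).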
Skills to practice
- Applying the morphism property
II
Practical computation of the exponential
II.1
The diagonalisable case
Diagonalisation is the operational route: it reduces \(\exp(A)\) to a diagonal matrix of scalar exponentials, which is trivial. The route is restricted to diagonalisable \(A\); §2.2 handles the non-diagonalisable case via the spectrum, the determinant, and the single-eigenvalue \(\lambda I + N\) shortcut.
Theorem — Exponential of a diagonalisable matrix
Let \(A = PDP^{-1}\) with \(D = \operatorname{diag}(\lambda_1, \ldots, \lambda_n)\) diagonal. Then \(\exp(A)\) is diagonalisable in the same basis: $$ \textcolor{colorprop}{\exp(A) \;=\; P\, \operatorname{diag}(e^{\lambda_1}, \ldots, e^{\lambda_n})\, P^{-1}.} $$
By the similarity Proposition of \S 1.3, \(\exp(A) = P \exp(D) P^{-1}\). It remains to compute \(\exp(D)\) for \(D = \operatorname{diag}(\lambda_1, \ldots, \lambda_n)\). The \(k\)-th power \(D^k = \operatorname{diag}(\lambda_1^k, \ldots, \lambda_n^k)\), so the series sums componentwise: $$ \exp(D) \;=\; \sum_{k \geq 0} \frac{D^k}{k!} \;=\; \operatorname{diag}\!\left( \sum_{k \geq 0} \frac{\lambda_1^k}{k!}, \ldots, \sum_{k \geq 0} \frac{\lambda_n^k}{k!} \right) \;=\; \operatorname{diag}(e^{\lambda_1}, \ldots, e^{\lambda_n}). $$ Combining gives the announced formula.
Example — A real \(2 \times 2\) diagonalisation
Compute \(\exp(A)\) for \(A = \begin{pmatrix}0 & -2\\ -1 & 1\end{pmatrix}\). Step 1: characteristic polynomial. \(\chi_A(X) = X^2 - X - 2 = (X-2)(X+1)\). Spectrum \(\operatorname{Sp}(A) = \{2, -1\}\), two distinct real eigenvalues, so \(A\) is diagonalisable over \(\mathbb{R}\). Step 2: eigenvectors. For \(\lambda_1 = 2\), \((A - 2I_2)X = 0\) gives \(X_1 = \begin{pmatrix}1\\ -1\end{pmatrix}\). For \(\lambda_2 = -1\), \((A + I_2)X = 0\) gives \(X_2 = \begin{pmatrix}2\\ 1\end{pmatrix}\). Set \(P = \begin{pmatrix}1 & 2\\ -1 & 1\end{pmatrix}\), \(D = \operatorname{diag}(2, -1)\). Step 3: inverse of \(P\). \(\det P = 1 + 2 = 3\), so \(P^{-1} = \frac{1}{3}\begin{pmatrix}1 & -2\\ 1 & 1\end{pmatrix}\). Step 4: assemble. $$ \exp(A) \;=\; P\, \operatorname{diag}(e^2, e^{-1})\, P^{-1} \;=\; \frac{1}{3}\begin{pmatrix}1 & 2\\
-1 & 1\end{pmatrix}\begin{pmatrix}e^2 & 0\\
0 & e^{-1}\end{pmatrix}\begin{pmatrix}1 & -2\\
1 & 1\end{pmatrix} \;=\; \frac{1}{3}\begin{pmatrix}e^2 + 2 e^{-1} & -2 e^2 + 2 e^{-1}\\
-e^2 + e^{-1} & 2 e^2 + e^{-1}\end{pmatrix}. $$
Example — Antisymmetric matrix and rotation
For \(\theta \in \mathbb{R}\), consider the antisymmetric matrix \(C(\theta) = \begin{pmatrix}0 & -\theta\\ \theta & 0\end{pmatrix}\) (sign convention: this is the standard rotation generator). Direct computation gives \(C(\theta)^2 = -\theta^2 I_2\), and by induction $$ C(\theta)^{2k} \,=\, (-1)^k \theta^{2k} I_2, \qquad C(\theta)^{2k+1} \,=\, (-1)^k \theta^{2k} C(\theta) \qquad (k \in \mathbb{N}). $$ Split the exponential series by parity: $$ \begin{aligned} \exp(C(\theta)) & = \sum_{k \geq 0} \frac{C(\theta)^{2k}}{(2k)!} + \sum_{k \geq 0} \frac{C(\theta)^{2k+1}}{(2k+1)!} && \text{(split even / odd powers)}\\
& = \left( \sum_{k \geq 0} \frac{(-1)^k \theta^{2k}}{(2k)!} \right) I_2 + \left( \sum_{k \geq 0} \frac{(-1)^k \theta^{2k}}{(2k+1)!} \right) C(\theta) && \text{(substitute the power formulas)}\\
& = \cos(\theta)\, I_2 + \frac{\sin(\theta)}{\theta}\, C(\theta) && \text{(Taylor series of \(\cos\), \(\sin\)).} \end{aligned} $$ (The factor \(\sin(\theta)/\theta\) at \(\theta = 0\) is read as the limit \(1\), giving \(\exp(0) = I_2\) --- consistent with \(\exp\) of the zero matrix.) Substituting \(C(\theta) = \begin{pmatrix}0 & -\theta\\ \theta & 0\end{pmatrix}\), $$ \exp(C(\theta)) \;=\; \begin{pmatrix}\cos\theta & -\sin\theta\\
\sin\theta & \cos\theta\end{pmatrix} $$ --- the rotation matrix of angle \(\theta\) in \(\mathbb{R}^2\).
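As a quick check of this identity, the following Python sketch compares a truncated exponential series of \(C(\theta)\) with the rotation matrix. The angle `theta = 1.2` and the helper names are illustrative assumptions.

```python
import math

def matmul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def exp_series(A, terms=60):
    """Partial sum sum_{k <= terms} A^k / k! of the exponential series."""
    n = len(A)
    total = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in total]
    for k in range(1, terms + 1):
        term = [[x / k for x in row] for row in matmul(term, A)]
        total = [[total[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return total

theta = 1.2                                  # sample angle
C = [[0.0, -theta], [theta, 0.0]]            # antisymmetric generator C(theta)
rotation = [[math.cos(theta), -math.sin(theta)],
            [math.sin(theta), math.cos(theta)]]

E = exp_series(C)
rotation_gap = max(abs(E[i][j] - rotation[i][j])
                   for i in range(2) for j in range(2))
print(rotation_gap)
```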
Method — Computing \(\exp(A)\) by diagonalisation
Four-step procedure for \(A \in \mathcal{M}_n(K)\) diagonalisable:
- Spectrum. Compute \(\chi_A\) and find \(\operatorname{Sp}(A)\) (over \(K\) if \(A\) diagonalises over \(K\); over \(\mathbb{C}\) otherwise). List the eigenvalues \((\lambda_1, \ldots, \lambda_n)\) with multiplicity, in the order matching an eigenvector basis chosen at the next step.
- Eigenvectors. For each \(\lambda_i\), solve \((A - \lambda_i I)X = 0\) to find a basis of \(E_{\lambda_i}\); concatenate into the change-of-basis matrix \(P = [X_1 | \cdots | X_n]\) and set \(D = \operatorname{diag}(\lambda_1, \ldots, \lambda_n)\).
- Diagonal exponential. \(\exp(D) = \operatorname{diag}(e^{\lambda_1}, \ldots, e^{\lambda_n})\) (trivial).
- Assemble. \(\exp(A) = P \exp(D) P^{-1}\) (matrix product).
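The four steps can be replayed on the \(2 \times 2\) Example above. In this Python sketch the matrices \(P\), \(P^{-1}\) and the eigenvalues are copied from that Example by hand (no eigen-solver is used), and the result is compared both with a truncated series and with the closed form; the helper names are illustrative assumptions.

```python
import math

def matmul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def exp_series(A, terms=60):
    """Partial sum sum_{k <= terms} A^k / k! of the exponential series."""
    n = len(A)
    total = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in total]
    for k in range(1, terms + 1):
        term = [[x / k for x in row] for row in matmul(term, A)]
        total = [[total[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return total

def max_gap(X, Y):
    """Largest entrywise absolute difference between X and Y."""
    return max(abs(x - y) for rx, ry in zip(X, Y) for x, y in zip(rx, ry))

A = [[0.0, -2.0], [-1.0, 1.0]]
P = [[1.0, 2.0], [-1.0, 1.0]]                  # columns: eigenvectors for 2 and -1
P_inv = [[1 / 3, -2 / 3], [1 / 3, 1 / 3]]      # inverse of P (det P = 3)
e2, em1 = math.exp(2.0), math.exp(-1.0)
exp_D = [[e2, 0.0], [0.0, em1]]                # exp(diag(2, -1))

via_diag = matmul(matmul(P, exp_D), P_inv)     # P exp(D) P^{-1}
closed_form = [[(e2 + 2 * em1) / 3, (-2 * e2 + 2 * em1) / 3],
               [(-e2 + em1) / 3, (2 * e2 + em1) / 3]]

gap_closed = max_gap(via_diag, closed_form)
gap_series = max_gap(via_diag, exp_series(A))
print(gap_closed, gap_series)
```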
Skills to practice
- Computing \(\exp(A)\) by diagonalisation
II.2
Spectrum and determinant of the exponential
Even when \(A\) is not diagonalisable, the spectrum of \(\exp(A)\) is determined by the complex spectrum of \(A\). Each eigenvector of \(A\) is an eigenvector of \(\exp(A)\) (one-way: \(\exp(A)\) may have additional eigenvectors when distinct eigenvalues of \(A\) share an exponential, e.g.\ \(\lambda\) and \(\lambda + 2\pi i\) both map to \(e^\lambda\) over \(\mathbb{C}\)). The determinant of \(\exp(A)\) has a closed form via the trace. Both results use trigonalisation over \(\mathbb{C}\), which is always possible because \(\chi_A\) splits over \(\mathbb{C}\) (d'Alembert-Gauss).
Proposition — Powers of an upper-triangular matrix
If \(T \in \mathcal{M}_n(K)\) is upper-triangular with diagonal \((t_{11}, \ldots, t_{nn})\), then for every \(k \in \mathbb{N}\), \(T^k\) is upper-triangular with diagonal \((t_{11}^k, \ldots, t_{nn}^k)\).
The product of two upper-triangular matrices is upper-triangular, and its diagonal is the entrywise product of the diagonals (the \((i,i)\) entry of \(UV\) is \(\sum_j u_{ij} v_{ji}\); since \(u_{ij} = 0\) for \(j < i\) and \(v_{ji} = 0\) for \(j > i\), only \(j = i\) contributes, giving \(u_{ii} v_{ii}\)). An induction on \(k\), starting from \(T^0 = I_n\) and using \(T^{k+1} = T^k \cdot T\), gives \(T^k\) upper-triangular with diagonal \((t_{ii}^k)\).
Proposition — Spectrum of \(\exp(A)\) over \(\mathbb{C}\)
For every \(A \in \mathcal{M}_n(K)\), $$ \textcolor{colorprop}{\operatorname{Sp}_{\mathbb{C}}(\exp(A)) \;=\; \{ e^\lambda \,:\, \lambda \in \operatorname{Sp}_{\mathbb{C}}(A) \},} $$ counted with multiplicity. Moreover, if \(X \in \mathbb{C}^n\) satisfies \(AX = \lambda X\), then \(\exp(A) X = e^\lambda X\).
Eigenvector statement. If \(AX = \lambda X\), induction gives \(A^k X = \lambda^k X\) for every \(k \geq 0\). Hence the partial sum \(S_n = \sum_{k \leq n} A^k/k!\) satisfies \(S_n X = \big( \sum_{k \leq n} \lambda^k/k! \big) X\). The map \(M \mapsto MX\) is linear on \(\mathcal{M}_n(\mathbb{C})\), hence continuous; passing to the limit \(n \to +\infty\), \(\exp(A) X = e^\lambda X\).
Spectrum. Over \(\mathbb{C}\), \(\chi_A\) splits (d'Alembert-Gauss, recalled from MPSI). By Reduction: eigen-elements and diagonalisation, \(A\) is therefore trigonalisable over \(\mathbb{C}\): \(A = P T P^{-1}\) with \(T \in \mathcal{M}_n(\mathbb{C})\) upper-triangular and diagonal \((\lambda_1, \ldots, \lambda_n)\) (the complex eigenvalues, counted with multiplicity). By the previous Proposition, \(T^k\) is upper-triangular with diagonal \((\lambda_i^k)\), so the partial sum \(\sum_{k \leq n} T^k/k!\) is upper-triangular with diagonal \(\big(\sum_{k \leq n} \lambda_i^k / k!\big)\). Passing to the limit, \(\exp(T)\) is upper-triangular with diagonal \((e^{\lambda_1}, \ldots, e^{\lambda_n})\). The eigenvalues of an upper-triangular matrix are its diagonal entries, so \(\operatorname{Sp}_{\mathbb{C}}(\exp(T)) = \{e^{\lambda_i}\}\) with multiplicities matching the diagonal repetitions. The similarity Proposition gives \(\exp(A) = P \exp(T) P^{-1}\), and similar matrices have equal spectra (recalled from Reduction).
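The eigenvector statement \(\exp(A) X = e^\lambda X\) can be illustrated on the matrix of the diagonalisation Example of §2.1. This Python sketch reuses its eigenpairs \((2, (1,-1))\) and \((-1, (2,1))\); the helper names are illustrative assumptions.

```python
import math

def matmul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def exp_series(A, terms=60):
    """Partial sum sum_{k <= terms} A^k / k! of the exponential series."""
    n = len(A)
    total = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in total]
    for k in range(1, terms + 1):
        term = [[x / k for x in row] for row in matmul(term, A)]
        total = [[total[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return total

def matvec(M, x):
    """Matrix-vector product M x."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

A = [[0.0, -2.0], [-1.0, 1.0]]
eigenpairs = [(2.0, [1.0, -1.0]), (-1.0, [2.0, 1.0])]   # (lambda, X) with A X = lambda X

E = exp_series(A)
gaps = []
for lam, X in eigenpairs:
    lhs = matvec(E, X)                       # exp(A) X
    rhs = [math.exp(lam) * x for x in X]     # e^lambda X
    gaps.append(max(abs(a - b) for a, b in zip(lhs, rhs)))
print(gaps)
```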
Proposition — Determinant of the exponential
For every \(A \in \mathcal{M}_n(K)\), $$ \textcolor{colorprop}{\det(\exp(A)) \;=\; e^{\operatorname{tr}(A)}.} $$
Trigonalise \(A\) over \(\mathbb{C}\) as in the previous Proof: \(A = PTP^{-1}\) with \(T\) upper-triangular of diagonal \((\lambda_1, \ldots, \lambda_n)\), the complex eigenvalues with multiplicity. The trace and determinant are invariant under similarity, so $$ \begin{aligned} \operatorname{tr}(A) & = \operatorname{tr}(T) \;=\; \sum_{i = 1}^n \lambda_i && \text{(similarity invariance of trace; diagonal entries of \(T\))}\\
\det(\exp(A)) & = \det(P \exp(T) P^{-1}) \;=\; \det(\exp(T)) && \text{(similarity invariance of determinant)}\\
& = \prod_{i=1}^n e^{\lambda_i} && \text{(determinant of upper-triangular = product of diagonal)}\\
& = e^{\sum_i \lambda_i} \;=\; e^{\operatorname{tr}(A)}. && \text{} \end{aligned} $$
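The identity can be checked numerically on a small matrix. The sketch below is only an illustration under stated assumptions: the helper names `mat_mul`, `mat_exp` and `det2` are hypothetical, the exponential is approximated by a partial sum truncated at 40 terms (an arbitrary choice that is ample for a matrix of this size), and the test matrix is picked for convenience.

```python
import math

def mat_mul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(a, terms=40):
    """Approximate exp(a) by the partial sum of a^k / k! for k < terms."""
    n = len(a)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]  # current term a^k / k!, starting at I
    for k in range(1, terms):
        term = [[x / k for x in row] for row in mat_mul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result

def det2(m):
    """Determinant of a 2 x 2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

A = [[1.0, 1.0], [2.0, 0.0]]        # illustrative matrix, trace 1
lhs = det2(mat_exp(A))              # det(exp(A)), computed from the series
rhs = math.exp(A[0][0] + A[1][1])   # e^{tr(A)}
print(lhs, rhs)                     # the two values should agree closely
```

A truncated series is used here only because it mirrors the definition; it is not a recommended general-purpose algorithm for large or badly scaled matrices.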
Example — A trigonalisable but not diagonalisable matrix
Let \(A = \begin{pmatrix}1 & 1\\ 0 & 1\end{pmatrix}\). Then \(\chi_A(X) = (X - 1)^2\), single eigenvalue \(\lambda = 1\) of multiplicity \(2\), but \(E_1(A) = \ker(A - I_2)\) is the line \(\mathbb{R} e_1\) of dimension \(1 < 2\), so \(A\) is not diagonalisable. Yet \(A\) is trigonalisable (it is already triangular!). Decompose \(A = I_2 + N\) with \(N = \begin{pmatrix}0 & 1\\ 0 & 0\end{pmatrix}\) nilpotent of index \(2\). The summands \(I_2\) and \(N\) commute (everything commutes with \(I_2\)), and \(\exp(N) = I_2 + N\) because \(N^2 = 0\) stops the series after two terms. By the morphism Theorem of \S 1.3, $$ \exp(A) \;=\; \exp(I_2)\, \exp(N) \;=\; e\, I_2 \cdot (I_2 + N) \;=\; e \begin{pmatrix}1 & 1\\ 0 & 1\end{pmatrix}. $$ Verification of the determinant: \(\det(\exp(A)) = e^2 = e^{1 + 1} = e^{\operatorname{tr}(A)}\), in agreement with the determinant Proposition.
Method — Single eigenvalue and trigonalisable cases
When \(\operatorname{Sp}_{\mathbb{C}}(A) = \{\lambda\}\) (a single eigenvalue of full multiplicity), \(\chi_A = (X - \lambda)^n\) and Cayley-Hamilton give \((A - \lambda I_n)^n = 0\). Decompose $$ A \;=\; \lambda I_n \,+\, (A - \lambda I_n), $$ where the two summands commute (\(\lambda I_n\) commutes with everything) and the second is nilpotent. By the morphism property, $$ \exp(A) \;=\; e^\lambda \cdot \sum_{k = 0}^{n - 1} \frac{(A - \lambda I_n)^k}{k!}, $$ a finite sum, fully explicit.
The Dunford decomposition \(A = D + N\) with \(D\) diagonalisable, \(N\) nilpotent, \(DN = ND\) exists for any \(A\) whose characteristic polynomial splits (in particular for every complex matrix) and gives \(\exp(A) = \exp(D)\exp(N)\), generalising the single-eigenvalue case, but its existence theorem is outside the syllabus (see Going further). For non-diagonalisable matrices with several distinct eigenvalues, the route that stays within the syllabus is to invoke the Dunford decomposition as an admitted result or to fall back on explicit case-by-case techniques (e.g.\ Jordan blocks for \(n \leq 3\)).
Skills to practice
- Using the spectrum and the single-eigenvalue shortcut
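The single-eigenvalue shortcut translates directly into a finite computation. The following is a minimal sketch, assuming the caller already knows that `lam` is the only complex eigenvalue of the matrix; the function name `exp_single_eigenvalue` and the \(3 \times 3\) test matrix are hypothetical choices made for illustration.

```python
import math

def mat_mul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def exp_single_eigenvalue(a, lam):
    """exp(a) = e^lam * sum_{k < n} (a - lam I)^k / k!.

    Only valid when lam is the sole eigenvalue of a, so that
    (a - lam I)^n = 0 by Cayley-Hamilton.
    """
    n = len(a)
    nil = [[a[i][j] - (lam if i == j else 0.0) for j in range(n)]
           for i in range(n)]
    total = [[float(i == j) for j in range(n)] for i in range(n)]
    power = [row[:] for row in total]  # (a - lam I)^k / k!, starting at k = 0
    for k in range(1, n):
        power = [[x / k for x in row] for row in mat_mul(power, nil)]
        total = [[total[i][j] + power[i][j] for j in range(n)]
                 for i in range(n)]
    scale = math.exp(lam)
    return [[scale * x for x in row] for row in total]

# Illustrative matrix with the single eigenvalue 2 and nilpotent part of index 3.
A = [[2.0, 1.0, 0.0],
     [0.0, 2.0, 1.0],
     [0.0, 0.0, 2.0]]
E = exp_single_eigenvalue(A, 2.0)
for row in E:
    print(row)
```

For this matrix the formula gives \(\exp(A) = e^2 \big(I_3 + N + N^2/2\big)\), so the top-right entry of `E` should be \(e^2/2\).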
III
The constant-coefficient linear differential system
III.1
Solving the homogeneous system \(X' = AX\)
With the exponential established and computable, the homogeneous linear system with constant matrix coefficient is solved in one stroke: the general solution is \(\exp(tA)\) applied to a free constant vector. We distinguish the vector system \(X' = AX\) (with \(X : \mathbb{R} \to K^n\) and \(A \in \mathcal{M}_n(K)\) constant) from the scalar linear equation \(x' = a(t) x + b(t)\) studied in Linear differential equations. The two are related by the matrix-system viewpoint of Linear differential equations §1.4, but the constant-coefficient case admits a cleaner closed form via the exponential.
Theorem — The Cauchy problem \(X' = AX\)
Let \(A \in \mathcal{M}_n(K)\) be a constant matrix. The Cauchy problem $$ X' \;=\; AX, \qquad X(t_0) \;=\; X_0 $$ (where \(X : \mathbb{R} \to K^n\), \(t_0 \in \mathbb{R}\), \(X_0 \in K^n\)) has the unique solution on \(\mathbb{R}\) $$ \textcolor{colorprop}{X(t) \;=\; \exp((t - t_0) A)\, X_0.} $$ The general solution of \(X' = AX\) is therefore \(X(t) = \exp(tA)\, C\) for \(C \in K^n\), with \(C = \exp(-t_0 A) X_0\) fitting the initial condition.
Existence. The function \(X(t) = \exp((t - t_0) A) X_0\) is of class \(\mathcal{C}^1\) on \(\mathbb{R}\) (the chain-rule Method of \S 1.2 with \(u(t) = t - t_0\), of derivative \(u'(t) = 1\)). Differentiate by the same Method: $$ X'(t) \;=\; A\, \exp((t - t_0) A)\, X_0 \;=\; A\, X(t). $$ At \(t = t_0\), \(X(t_0) = \exp(0) X_0 = I_n X_0 = X_0\). So \(X\) solves the Cauchy problem.
Uniqueness. The system \(X' = AX\) is a linear differential equation in the sense of Linear differential equations (with constant continuous coefficient \(a(t) \equiv A\) and right-hand side \(b(t) \equiv 0\)). The linear Cauchy theorem of that chapter (recalled, never re-proved here) gives existence-uniqueness on the whole of \(\mathbb{R}\). Our explicit \(X\) is therefore the solution.
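The existence part of the proof can be sanity-checked numerically. The sketch below is an illustration under assumptions of our own: the matrix (a rotation generator), the values of `t0` and `X0`, the helper names, the 40-term truncation of the series and the finite-difference step `h` are all chosen here and are not part of the Theorem.

```python
import math

def mat_mul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(a, terms=40):
    """Approximate exp(a) by the partial sum of a^k / k! for k < terms."""
    n = len(a)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in mat_mul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result

def mat_vec(m, v):
    """Matrix-vector product."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

A = [[0.0, 1.0], [-1.0, 0.0]]   # illustrative matrix, A^2 = -I
t0, X0 = 0.5, [1.0, 0.0]        # illustrative Cauchy data

def solution(t):
    """X(t) = exp((t - t0) A) X0, the formula of the Theorem."""
    scaled = [[(t - t0) * x for x in row] for row in A]
    return mat_vec(mat_exp(scaled), X0)

# Compare a central finite difference of X with A X(t) at one sample time.
t, h = 1.3, 1e-5
derivative = [(p - m) / (2 * h)
              for p, m in zip(solution(t + h), solution(t - h))]
rhs = mat_vec(A, solution(t))
residual = max(abs(d - r) for d, r in zip(derivative, rhs))
print(residual)  # expected to be small (finite-difference error only)
```

Because \(A^2 = -I_2\) for this choice, \(\exp(uA) = \cos(u)\, I_2 + \sin(u)\, A\), so `solution(t)` should match \((\cos(t - t_0), -\sin(t - t_0))\).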
Example — The scalar case
For \(n = 1\), \(A = (a)\) with \(a \in K\), the system \(X' = AX\) is the scalar equation \(x' = a x\), and the Theorem recovers the MPSI elementary result \(x(t) = e^{a(t - t_0)} x_0\). The matrix exponential in dimension \(1\) is just the scalar exponential.
Proposition — Variation of constants
For \(A \in \mathcal{M}_n(K)\) constant and \(B \in \mathcal{C}(I, K^n)\) continuous on an interval \(I \ni t_0\), the Cauchy problem $$ X' \;=\; AX + B(t), \qquad X(t_0) \;=\; X_0 $$ has the unique solution on \(I\) $$ \textcolor{colorprop}{X(t) \;=\; \exp((t - t_0) A)\, X_0 \,+\, \int_{t_0}^t \exp((t - s) A)\, B(s)\, ds.} $$
Factor the integral. For all \(t, s \in I\), \(tA\) and \(-sA\) commute (both scalar multiples of \(A\)); by the morphism Theorem of \S 1.3, \(\exp((t - s) A) = \exp(tA)\exp(-sA)\). So $$ \int_{t_0}^t \exp((t-s)A)\, B(s)\, ds \;=\; \exp(tA) \int_{t_0}^t \exp(-sA)\, B(s)\, ds. $$ Let \(J(t) = \int_{t_0}^t \exp(-sA) B(s) ds\) --- a \(\mathcal{C}^1\) function of \(t\) (continuous integrand, FTC for vector-valued maps recalled from Vector-valued functions of a real variable) with \(J'(t) = \exp(-tA) B(t)\) and \(J(t_0) = 0\).
Differentiate \(X\). Rewriting \(X(t) = \exp((t-t_0)A) X_0 + \exp(tA) J(t)\), apply the product rule and the chain-rule Method of \S 1.2 (applied to \(u(t) = t - t_0\) and to \(u(t) = t\)): $$ \begin{aligned} X'(t) & = A \exp((t-t_0) A)\, X_0 \,+\, A \exp(tA)\, J(t) \,+\, \exp(tA)\, J'(t) && \text{(product rule + chain-rule Method)}\\ & = A \exp((t-t_0) A)\, X_0 \,+\, A \exp(tA)\, J(t) \,+\, \exp(tA)\, \exp(-tA)\, B(t) && \text{(formula for \(J'\))}\\ & = A \big[ \exp((t-t_0) A) X_0 + \exp(tA) J(t) \big] \,+\, I_n\, B(t) && \text{(\(\exp(tA)\exp(-tA) = I_n\), \S 1.3)}\\ & = A\, X(t) \,+\, B(t). && \end{aligned} $$ At \(t = t_0\), \(J(t_0) = 0\), so \(X(t_0) = \exp(0) X_0 = X_0\). Uniqueness follows from the linear Cauchy theorem of Linear differential equations.
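A worked instance may help fix the formula. The data below are chosen here for illustration and are not part of the Proposition: take \(A = \begin{pmatrix}0 & 1\\ 0 & 0\end{pmatrix}\), \(B(t) = (0, 1)^\mathsf{T}\), \(t_0 = 0\) and \(X_0 = (a, b)^\mathsf{T}\). Since \(A^2 = 0\), \(\exp(uA) = I_2 + uA\) for every \(u \in \mathbb{R}\), so $$ \begin{aligned} \exp(tA)\, X_0 & = \begin{pmatrix}1 & t\\ 0 & 1\end{pmatrix} \begin{pmatrix}a\\ b\end{pmatrix} \;=\; \begin{pmatrix}a + bt\\ b\end{pmatrix}, \\ \int_0^t \exp((t - s) A)\, B(s)\, ds & = \int_0^t \begin{pmatrix}t - s\\ 1\end{pmatrix} ds \;=\; \begin{pmatrix}t^2/2\\ t\end{pmatrix}. \end{aligned} $$ The formula therefore gives \(X(t) = \big(a + bt + t^2/2,\; b + t\big)^\mathsf{T}\). A direct check confirms it: \(X'(t) = (b + t,\, 1)^\mathsf{T}\), while \(A X(t) + B(t) = (b + t,\, 0)^\mathsf{T} + (0,\, 1)^\mathsf{T}\), and \(X(0) = (a, b)^\mathsf{T} = X_0\).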
Method — Solving \(X' = AX\) via the matrix exponential
Three steps for \(A \in \mathcal{M}_n(K)\) constant:
- Compute \(\exp(tA)\). Use diagonalisation (\S 2.1) if \(A\) is diagonalisable, otherwise the single-eigenvalue \(\lambda I + N\) decomposition (\S 2.2) or direct trigonalisation.
- Write the general solution. \(X(t) = \exp(tA)\, C\) for \(C \in K^n\).
- Fit the initial condition. \(X(t_0) = X_0\) gives \(C = \exp(-t_0 A) X_0\), so \(X(t) = \exp((t - t_0) A) X_0\).
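The three steps can be followed on the non-diagonalisable matrix \(A = \begin{pmatrix}1 & 1\\ 0 & 1\end{pmatrix}\) of the \S 2.2 example; the Cauchy data \(X_0 = (x_1, x_2)^\mathsf{T}\) at time \(t_0\) are notation introduced here for the illustration. Writing \(tA = t I_2 + tN\) with \(N^2 = 0\), $$ \begin{aligned} \exp(tA) & = e^{t}\, (I_2 + tN) \;=\; e^{t} \begin{pmatrix}1 & t\\ 0 & 1\end{pmatrix} && \text{(step 1, single-eigenvalue decomposition)}\\ X(t) & = e^{t} \begin{pmatrix}c_1 + c_2 t\\ c_2\end{pmatrix}, \quad (c_1, c_2) \in K^2 && \text{(step 2, general solution)}\\ X(t) & = e^{t - t_0} \begin{pmatrix}x_1 + (t - t_0)\, x_2\\ x_2\end{pmatrix} && \text{(step 3, \(X(t_0) = X_0\)).} \end{aligned} $$ One checks directly that the last expression satisfies \(X' = AX\): both sides equal \(e^{t - t_0} \big(x_1 + (t - t_0) x_2 + x_2,\; x_2\big)^\mathsf{T}\).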
Skills to practice
- Solving \(X' = AX\) via the matrix exponential
III.2
Practical resolution in the diagonalisable case
When \(A\) is diagonalisable, computing \(\exp(tA)\) in full is unnecessary. The change of variable \(Y = P^{-1} X\) decouples the system into \(n\) independent scalar equations, each solved at once. Recombining gives the general solution as a sum of contributions \(e^{\lambda_i t} X_i\), one per eigenvector: the "spectral decomposition" of the solution.
Theorem — Diagonalisable resolution
Let \(A \in \mathcal{M}_n(K)\) be diagonalisable, \((X_1, \ldots, X_n)\) a basis of eigenvectors with associated eigenvalues \((\lambda_1, \ldots, \lambda_n)\) (listed with multiplicity in the order of the eigenvectors). The general solution of \(X' = AX\) is $$ \textcolor{colorprop}{X(t) \;=\; C_1\, e^{\lambda_1 t}\, X_1 \,+\, \cdots \,+\, C_n\, e^{\lambda_n t}\, X_n, \qquad C_1, \ldots, C_n \in K.} $$
Two routes.
- Via \(\exp(tA)\). Apply §2.1: \(\exp(tA) = P\, \operatorname{diag}(e^{\lambda_i t})\, P^{-1}\) where \(P\) has columns \(X_1, \ldots, X_n\). The general solution of \(X' = AX\) (Theorem of §3.1) is \(X(t) = \exp(tA) C\) for \(C \in K^n\); expanding, $$ X(t) \;=\; P\, \operatorname{diag}(e^{\lambda_i t})\, P^{-1}\, C \;=\; \sum_{i = 1}^n (P^{-1} C)_i\, e^{\lambda_i t}\, X_i. $$ Relabel the free constants as \(C_i = (P^{-1} C)_i\).
- Via the change of variable \(Y = P^{-1} X\). The system \(X' = AX\) with \(A = PDP^{-1}\) becomes \(Y' = D Y\) in the new variables (multiply both sides by \(P^{-1}\)). Since \(D = \operatorname{diag}(\lambda_i)\) is diagonal, \(Y' = DY\) decouples into \(y_i' = \lambda_i y_i\) for \(i = 1, \ldots, n\). Each scalar equation has general solution \(y_i(t) = C_i e^{\lambda_i t}\). Recombining, \(X(t) = P Y(t) = \sum_i C_i e^{\lambda_i t} X_i\).
Example — A \(2 \times 2\) saddle
Solve \(X' = A X\) for \(A = \begin{pmatrix}1 & 1\\ 2 & 0\end{pmatrix}\).
Spectrum. \(\chi_A = X^2 - X - 2 = (X-2)(X+1)\), so \(\operatorname{Sp}(A) = \{2, -1\}\). Two real distinct eigenvalues of opposite signs: the saddle case.
Eigenvectors. For \(\lambda_1 = 2\), \((A - 2I_2)X_1 = 0\) gives \(X_1 = (1, 1)^\mathsf{T}\). For \(\lambda_2 = -1\), \((A + I_2) X_2 = 0\) gives \(X_2 = (1, -2)^\mathsf{T}\).
General solution. $$ X(t) \;=\; C_1\, e^{2t} \begin{pmatrix}1\\ 1\end{pmatrix} \,+\, C_2\, e^{-t} \begin{pmatrix}1\\ -2\end{pmatrix}, \qquad C_1, C_2 \in \mathbb{R}. $$ The phase portrait in the \((x_1, x_2)\)-plane has hyperbolic trajectories asymptotic to the eigenvector lines: \(X_1\) is the unstable direction (\(e^{2t} \to +\infty\)), \(X_2\) is the stable direction (\(e^{-t} \to 0\)). The origin is a saddle point.
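For comparison with the first route of the proof, \(\exp(tA)\) can also be written out for this matrix. With \(P = \begin{pmatrix}1 & 1\\ 1 & -2\end{pmatrix}\) (columns \(X_1, X_2\)) one finds \(\det P = -3\) and \(P^{-1} = \frac{1}{3}\begin{pmatrix}2 & 1\\ 1 & -1\end{pmatrix}\), hence $$ \exp(tA) \;=\; P \begin{pmatrix}e^{2t} & 0\\ 0 & e^{-t}\end{pmatrix} P^{-1} \;=\; \frac{1}{3} \begin{pmatrix}2e^{2t} + e^{-t} & e^{2t} - e^{-t}\\ 2e^{2t} - 2e^{-t} & e^{2t} + 2e^{-t}\end{pmatrix}. $$ Two quick checks: at \(t = 0\) the right-hand side is \(I_2\), and its derivative at \(t = 0\) is \(\frac{1}{3}\begin{pmatrix}3 & 3\\ 6 & 0\end{pmatrix} = A\). Applying this matrix to \(C = (c_1, c_2)^\mathsf{T}\) recovers the general solution above with \(C_1 = (2c_1 + c_2)/3\) and \(C_2 = (c_1 - c_2)/3\), the coordinates of \(P^{-1} C\).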
Method — Solving by eigenvalue decoupling
For \(A \in \mathcal{M}_n(K)\) constant and diagonalisable, three steps:
- Diagonalise. Compute \(\chi_A\) to find \(\operatorname{Sp}(A)\) and a basis of eigenvectors \((X_1, \ldots, X_n)\).
- Write the general solution. \(X(t) = \sum_{i = 1}^n C_i\, e^{\lambda_i t}\, X_i\) with \(C_i \in K\) free constants.
- Fit the Cauchy condition. \(X(t_0) = X_0\) gives a linear system for \((C_1, \ldots, C_n)\) --- solve it.
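The last step is a small linear solve. The sketch below carries it out for the saddle example, taking the eigenvalues and eigenvectors from the text; the function names `fit_constants` and `solution`, the choice \(t_0 = 0\) and the initial vector \((3, 0)\) are assumptions made for the illustration.

```python
import math

# Eigen-data of the saddle example A = [[1, 1], [2, 0]], as computed in the text.
eigenvalues = (2.0, -1.0)
X1, X2 = (1.0, 1.0), (1.0, -2.0)

def fit_constants(x0):
    """Solve C1 * X1 + C2 * X2 = x0 by Cramer's rule (t0 = 0 assumed)."""
    det = X1[0] * X2[1] - X2[0] * X1[1]
    c1 = (x0[0] * X2[1] - X2[0] * x0[1]) / det
    c2 = (X1[0] * x0[1] - X1[1] * x0[0]) / det
    return c1, c2

def solution(t, x0):
    """X(t) = C1 e^{2t} X1 + C2 e^{-t} X2 with constants fitted to X(0) = x0."""
    c1, c2 = fit_constants(x0)
    w1 = c1 * math.exp(eigenvalues[0] * t)
    w2 = c2 * math.exp(eigenvalues[1] * t)
    return (w1 * X1[0] + w2 * X2[0], w1 * X1[1] + w2 * X2[1])

C1, C2 = fit_constants((3.0, 0.0))
print(C1, C2)                     # 2.0 1.0
print(solution(0.0, (3.0, 0.0)))  # (3.0, 0.0)
```

Cramer's rule is used only because the system is \(2 \times 2\); for larger \(n\) one would solve \(P\,(C_1, \ldots, C_n)^\mathsf{T} = X_0\) by Gaussian elimination instead.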
Going further
The chapter built the matrix exponential and applied it to the constant-coefficient linear system. Four threads continue.
- Antisymmetric matrices and the special orthogonal group. The §2.1 rotation example is the visible \(n = 2\) instance of a more general fact: \(\exp(\mathcal{A}_n(\mathbb{R})) \subset \mathrm{SO}_n(\mathbb{R})\). Proof sketch: if \(A^\mathsf{T} = -A\), then \(\exp(A)^\mathsf{T} = \exp(A^\mathsf{T}) = \exp(-A) = \exp(A)^{-1}\) (transpose commutes with the series; then morphism Theorem with \(-A\) and \(A\), which commute trivially), so \(\exp(A)\) is orthogonal; and \(\det \exp(A) = e^{\operatorname{tr}(A)} = e^0 = 1\) (since \(\operatorname{tr}(A) = 0\) for antisymmetric \(A\)), so \(\exp(A) \in \mathrm{SO}_n(\mathbb{R})\). This is an instance of the Lie-algebra-to-Lie-group correspondence, here in dimension \(n\).
- The Dunford decomposition. It writes a non-diagonalisable matrix (with split characteristic polynomial, e.g.\ over \(\mathbb{C}\)) as \(A = D + N\) with \(D\) diagonalisable, \(N\) nilpotent, \(DN = ND\); then \(\exp(tA) = \exp(tD)\exp(tN)\), where \(\exp(tN)\) is polynomial in \(t\) and \(\exp(tD)\) is the exponential of a diagonalisable matrix, so the solutions of \(X' = AX\) have a polynomial-times-exponential structure. The Dunford existence theorem is outside the syllabus (see Polynomials of an endomorphism); practical alternatives within the syllabus are the single-eigenvalue \(\lambda I + N\) shortcut of §2.2 and the Dunford decomposition used as an admitted result on a case-by-case basis.
- Phase portraits in \(\mathbb{R}^2\). For \(X' = AX\) they admit a six-fold classification by the signs and reality of the eigenvalues (saddle, stable/unstable node, stable/unstable spiral, centre); the §3.2 example treats only the saddle.
- Non-linear systems. Systems \(X' = F(X)\) admit a local existence-uniqueness theorem (Cauchy-Lipschitz), beyond MP.
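The antisymmetric case mentioned above is easy to probe numerically in dimension \(2\). This is a minimal sketch under assumptions of our own: the angle `theta`, the helper names and the 40-term truncation of the series are illustrative choices, and the check covers a single matrix rather than the general statement.

```python
import math

def mat_mul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(a, terms=40):
    """Approximate exp(a) by the partial sum of a^k / k! for k < terms."""
    n = len(a)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in mat_mul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result

theta = 0.7                                   # illustrative angle
A = [[0.0, -theta], [theta, 0.0]]             # antisymmetric: A^T = -A
R = mat_exp(A)

R_transpose = [[R[j][i] for j in range(2)] for i in range(2)]
gram = mat_mul(R_transpose, R)                # should be close to I_2
det = R[0][0] * R[1][1] - R[0][1] * R[1][0]   # should be close to 1
print(gram, det)
```

For this \(A\), \(\exp(A)\) is the rotation of angle \(\theta\), so `R[0][0]` should be close to \(\cos\theta\) and `R[1][0]` close to \(\sin\theta\).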
Skills to practice
- Solving diagonalisable systems by eigenvalue decoupling