CommeUnJeu · L1 PCSI

Linear systems

⌚ ~107 min ▢ 13 blocks ✓ 42 exercises ➣ Prerequisites : Matrix calculus, Systems of Linear Equations

A linear system of $n$ equations in $p$ unknowns translates verbatim into a matrix equation $A X = B$, with $A \in M_{n,p}(\mathbb{K})$, $X \in M_{p,1}(\mathbb{K})$ and $B \in M_{n,1}(\mathbb{K})$. From this single rewriting, all the matrix calculus of the previous chapter becomes available: linearity of $X \mapsto A X$ gives a structure theorem (every solution is one particular solution plus a solution of the homogeneous system), elementary row operations preserve the solution set, and the Gauss pivot algorithm reduces any system to an echelon form from which the solutions can be read directly. The chapter closes with a short application: when $A$ is square invertible, the system is called a Cramer system and admits the unique solution $X = A^{-1} B$.
This chapter is deliberately brief: by convention, all technicality is excluded. Examples accordingly stay small (at most three equations, at most one free unknown); heavier drill lives in the exercise file. Determinantal Cramer formulas are deferred to Determinants, the rank-based existence and uniqueness criteria to Finite-dimensional vector spaces and Change of basis, the geometric interpretation to Affine subspaces, and the formal theory of the $\mathrm{Vect}$ notation to Vector spaces.

Conventions

Throughout this chapter, $\mathbb{K}$ denotes either $\mathbb{R}$ or $\mathbb{C}$. The letter $n$ counts the number of equations of the system, the letter $p$ counts the number of unknowns; in general $n \ne p$. The zero column of size $n$ is written $0_{n,1}$ (often abbreviated to $0$ when the size is clear from context). All row operations are written $L_i \leftrightarrow L_j$ (swap), $L_i \leftarrow \lambda L_i$ (dilation, $\lambda \ne 0$ --- also called « multiplication of a row by a non-zero scalar ») and $L_i \leftarrow L_i + \lambda L_j$ (transvection, $i \ne j$), consistently with the conventions of Matrix calculus.

I Matrix form of a linear system

A linear system of $n$ equations in $p$ unknowns is a list of equations of the form $a_{i1} x_1 + a_{i2} x_2 + \dots + a_{ip} x_p = b_i$ for $i = 1, \ldots, n$, each side linear in the unknowns $x_1, \ldots, x_p$. The whole list rewrites as a single matrix equation $A X = B$, with $A$ the matrix of coefficients, $X$ the column of unknowns and $B$ the column of right-hand sides. This unique rewriting will be enough to import all the matrix calculus of the previous chapter into the study of systems.

Definition — Linear system$\virgule$ matrix form

Let $n, p \ge 1$ be positive integers, not necessarily equal. A linear system of $n$ equations in $p$ unknowns with coefficients in $\mathbb{K}$ is a list $$ \begin{aligned} & a_{11} x_1 + a_{12} x_2 + \cdots + a_{1p} x_p = b_1, \\ & a_{21} x_1 + a_{22} x_2 + \cdots + a_{2p} x_p = b_2, \\ & \qquad\qquad\qquad\qquad \vdots \\ & a_{n1} x_1 + a_{n2} x_2 + \cdots + a_{np} x_p = b_n, \end{aligned} $$ where the $a_{ij} \in \mathbb{K}$ are the coefficients of the system, the $x_j$ are the unknowns and the $b_i \in \mathbb{K}$ are the right-hand sides (or « second members »). Writing $A = (a_{ij}) \in M_{n,p}(\mathbb{K})$, $X = (x_j) \in M_{p,1}(\mathbb{K})$ and $B = (b_i) \in M_{n,1}(\mathbb{K})$, the system reads $$ A X = B, $$ which we call its matrix form. The matrix $A$ is the matrix of the system, and $B$ is its right-hand side.

Example

The $3 \times 3$ system in unknowns $(x, y, z) \in \mathbb{R}^3$ $$ \begin{aligned} & 2x + y - 3z = 3, \\ & \phantom{2x +\,} 5y + z = 2, \\ & 9x + 10y + 2z = 1 \end{aligned} \quad \text{rewrites} \quad \begin{pmatrix} 2 & 1 & -3 \\ 0 & 5 & 1 \\ 9 & 10 & 2 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 3 \\ 2 \\ 1 \end{pmatrix}. $$ Here $n = p = 3$ and the matrix $A$ is square.

Example

The non-square $2 \times 3$ system $\{x + 2y - z = 4 \;;\; 3x - y + 5z = 0\}$ (two equations, three unknowns, so $n = 2$ and $p = 3$) rewrites $$ \begin{pmatrix} 1 & 2 & -1 \\ 3 & -1 & 5 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 4 \\ 0 \end{pmatrix}. $$ Conversely, the non-square system $\{2x + y = 1 \;;\; x - 3y = 4 \;;\; -x + y = 2\}$ (three equations, two unknowns, so $n = 3$, $p = 2$) has matrix form $$ \begin{pmatrix} 2 & 1 \\ 1 & -3 \\ -1 & 1 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 1 \\ 4 \\ 2 \end{pmatrix}. $$ The case $n \ne p$ is the rule, not the exception.

Definition — Homogeneous system

The homogeneous system associated with $A X = B$ is the system $$ A X = 0_{n,1}, $$ obtained by replacing the right-hand side by the zero column of size $n$. When the size is clear from context, we abbreviate $0_{n,1}$ to $0$ and simply write $A X = 0$.

Example

The homogeneous system associated with the $3 \times 3$ system of Example 1 is $$ \begin{pmatrix} 2 & 1 & -3 \\ 0 & 5 & 1 \\ 9 & 10 & 2 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}. $$ The matrix $A$ is identical; only the right-hand side has changed. In particular, $X = (0, 0, 0)^\top$ is always a solution of the homogeneous system --- it is called the trivial solution.

Definition — Compatible system

The system $A X = B$ is compatible if it admits at least one solution $X \in M_{p,1}(\mathbb{K})$, and incompatible otherwise. The homogeneous system $A X = 0$ is always compatible (the trivial solution $X = 0$ is a solution).

Proposition — Compatibility via columns of $A$

Let $A \in M_{n,p}(\mathbb{K})$ have columns $C_1, \ldots, C_p \in M_{n,1}(\mathbb{K})$ and let $B \in M_{n,1}(\mathbb{K})$. Then $A X = B$ is compatible if and only if $B$ is a linear combination of $C_1, \ldots, C_p$, that is, $$ \textcolor{colorprop}{A X = B \text{ compatible } \iff \exists\, (x_1, \ldots, x_p) \in \mathbb{K}^p,\ B = x_1 C_1 + x_2 C_2 + \cdots + x_p C_p.} $$

Proof

For $X = (x_1, \ldots, x_p)^\top \in M_{p,1}(\mathbb{K})$, the matrix product $A X$ unfolds (as proved in Matrix calculus) into the linear combination of the columns of $A$ weighted by the entries of $X$: $$ \begin{aligned} A X & = \begin{pmatrix} C_1 & C_2 & \cdots & C_p \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_p \end{pmatrix} && \text{(block view of $A$)} \\ & = x_1 C_1 + x_2 C_2 + \cdots + x_p C_p && \text{(column $\times$ unknown rule).} \end{aligned} $$ Hence $A X = B$ has a solution $X$ if and only if there exist scalars $x_1, \ldots, x_p$ such that $x_1 C_1 + \cdots + x_p C_p = B$, which is the announced equivalence.
This criterion is mostly conceptual: in practice, Gauss reduction (covered below) is the standard way to test compatibility.

Example

Consider $A X = B$ with $A = \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}$ and $B = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$. The two columns of $A$ are $C_1 = C_2 = (1, 1)^\top$. Any linear combination $x_1 C_1 + x_2 C_2 = (x_1 + x_2)(1, 1)^\top$ has identical entries. The right-hand side $B = (1, 0)^\top$ has different entries, so $B$ is not a linear combination of the columns. By Proposition 1, the system is incompatible.

Example

Take $A = \begin{pmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 0 \end{pmatrix}$ and $B = \begin{pmatrix} 2 \\ 3 \\ 0 \end{pmatrix}$. The columns of $A$ are $C_1 = (1, 0, 0)^\top$, $C_2 = (0, 1, 0)^\top$, $C_3 = (1, 1, 0)^\top$. Observe that $B = 2 C_1 + 3 C_2$ already (taking $x_3 = 0$), hence $B$ is a column-combination and by Proposition 1 the system is compatible. The solution $X = (2, 3, 0)^\top$ is read off the same identity.

Skills to practice

Writing a system in matrix form
Checking compatibility

II Structure of the solution set

The next theorem is the most important result of the chapter. It reduces the question « what does the solution set of $A X = B$ look like? » to two independent sub-questions: (1) does there exist one solution? and (2) what does the solution set of the homogeneous system $A X = 0$ look like? The answer is built once and for all: every solution is one particular solution plus an arbitrary solution of the homogeneous system. The proof is a two-line consequence of the linearity of $X \mapsto A X$.

Theorem — Structure of the solution set

Let $A \in M_{n,p}(\mathbb{K})$ and $B \in M_{n,1}(\mathbb{K})$. Suppose the system $A X = B$ is compatible, and let $X_p \in M_{p,1}(\mathbb{K})$ be any particular solution, so that $A X_p = B$. Then the set $\mathcal{S}$ of all solutions of $A X = B$ is $$ \textcolor{colorprop}{\mathcal{S} = \{\, X_p + Y \;:\; Y \in M_{p,1}(\mathbb{K}),\ A Y = 0_{n,1} \,\}.} $$ In words: every solution of $A X = B$ is the sum of the fixed particular solution $X_p$ and a solution $Y$ of the homogeneous system associated.

Proof

Apply the linearity of the map $X \mapsto A X$ (proven in Matrix calculus): for any $X \in M_{p,1}(\mathbb{K})$, set $Y := X - X_p$. Then $$ \begin{aligned} A X = B & \iff A X = A X_p && \text{(since $A X_p = B$ by hypothesis)} \\ & \iff A X - A X_p = 0_{n,1} && \text{(subtract $A X_p$ from both sides)} \\ & \iff A (X - X_p) = 0_{n,1} && \text{(linearity of $X \mapsto A X$)} \\ & \iff A Y = 0_{n,1} && \text{(setting $Y := X - X_p$).} \end{aligned} $$ Hence $X$ is a solution of $A X = B$ if and only if $Y = X - X_p$ is a solution of the homogeneous system, that is, $X = X_p + Y$ with $A Y = 0$. The two sets are equal.

Definition — Notation $\mathrm{Vect}$ (writing convention)

For columns $X_1, \ldots, X_r \in M_{p,1}(\mathbb{K})$, we write $$ \mathrm{Vect}(X_1, \ldots, X_r) := \{\, \lambda_1 X_1 + \lambda_2 X_2 + \cdots + \lambda_r X_r \;:\; \lambda_1, \ldots, \lambda_r \in \mathbb{K} \,\} $$ for the set of all linear combinations of $X_1, \ldots, X_r$.
Remark. At this point in the course, $\mathrm{Vect}(\ldots)$ is only a writing convention for that set --- nothing more. The formal theory of subspaces generated by a family (sous-espace engendré, dimension, etc.) is developed in the chapter Vector spaces; we use the notation here purely to write the solution set of a system in a compact form.

Example

Solve the $2 \times 3$ compatible system $\{x + 2y - z = 1 \;;\; 2x + 5y + z = 2\}$ in $(x, y, z) \in \mathbb{R}^3$ and express the solution set both as an explicit parametric family and using the $\mathrm{Vect}$ notation.

Answer

Apply the row operation $L_2 \leftarrow L_2 - 2 L_1$ to the system: $$ \begin{aligned} & \begin{cases} x + 2y - z = 1, \\ 2x + 5y + z = 2 \end{cases} && \text{(initial)} \\ \iff{} & \begin{cases} x + 2y - z = 1, \\ \phantom{x + 2}y + 3z = 0 \end{cases} && \text{($L_2 \leftarrow L_2 - 2 L_1$, transvection).} \end{aligned} $$ The second equation gives $y = -3z$. Substituting into the first gives $x = 1 - 2y + z = 1 - 2(-3z) + z = 1 + 7z$. Hence the solution set is, with $\lambda := z$ as a free parameter, $$ \mathcal{S} = \left\{\, \begin{pmatrix} 1 + 7\lambda \\ -3\lambda \\ \lambda \end{pmatrix} \;:\; \lambda \in \mathbb{R} \,\right\}. $$ Equivalently, splitting the column into its $\lambda$-independent part (particular solution $X_p$) plus its $\lambda$-dependent part: $$ \mathcal{S} = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} + \mathrm{Vect}\!\left( \begin{pmatrix} 7 \\ -3 \\ 1 \end{pmatrix} \right). $$ The two forms describe the same set; the parametric form makes the family of solutions visible coordinate by coordinate, the $\mathrm{Vect}$ form makes the structure « particular + homogeneous » of Theorem 1 visible at a glance.

Method — Solving a system as particular plus homogeneous

To describe the solution set of a compatible system $A X = B$:

find one particular solution $X_p$ such that $A X_p = B$ (by inspection or by Gauss);
describe all solutions of the homogeneous system $A Y = 0$ (typically as $\mathrm{Vect}(Y_1, \ldots, Y_r)$ for explicit columns $Y_1, \ldots, Y_r$);
conclude $\mathcal{S} = X_p + \mathrm{Vect}(Y_1, \ldots, Y_r)$.

On first use, write the answer in both forms (explicit parametric family and $\mathrm{Vect}$ form). Afterwards the $\mathrm{Vect}$ form alone is enough.

Skills to practice

Expressing the solution set as particular plus homogeneous

III Geometric interpretation

The structure theorem above acquires a familiar geometric face when the unknowns lie in $\mathbb{R}^2$ or $\mathbb{R}^3$. In $\mathbb{R}^2$, a single non-degenerate linear equation defines a line; in $\mathbb{R}^3$, a single non-degenerate linear equation defines a plane. A system is then the intersection of these lines or planes. We sketch the picture informally; the rigorous affine theory (affine subspace, direction, dimension) is the subject of the chapter Affine subspaces.

In the plane $\mathbb{R}^2$

A single linear equation $a x + b y = c$ with $(a, b) \ne (0, 0)$ defines a line $D$ in $\mathbb{R}^2$. If $(a, b) = (0, 0)$, the equation degenerates: it is trivially satisfied if $c = 0$ (the whole plane is « solution »), or incompatible if $c \ne 0$.
For a $2 \times 2$ system $\{a_1 x + b_1 y = c_1 \;;\; a_2 x + b_2 y = c_2\}$ with both pairs $(a_i, b_i)$ non-zero, the solution set is the intersection of two lines $D_1$ and $D_2$. Three typical situations:

$D_1$ and $D_2$ are secant: they share exactly one point, and the system admits a unique solution;
$D_1$ and $D_2$ are parallel and distinct: they share no point, and the system is incompatible;
$D_1$ and $D_2$ are identical: the two equations describe the same line, and the solution set is the line itself.

Example

Solve and interpret geometrically each of the two $2 \times 2$ systems $$ (S_1): \begin{cases} x + y = 3, \\ 2x - y = 0; \end{cases} \qquad (S_2): \begin{cases} x + y = 1, \\ 2x + 2y = 5. \end{cases} $$

Answer

For $(S_1)$, add the two equations: $3x = 3$, hence $x = 1$ and $y = 3 - x = 2$. The unique solution is $(1, 2)$. Geometrically, the lines $x + y = 3$ and $2x - y = 0$ are secant at the point $(1, 2)$.
For $(S_2)$, apply $L_2 \leftarrow L_2 - 2 L_1$: the second equation becomes $0 = 5 - 2 = 3 \ne 0$. The system is incompatible. Geometrically, $x + y = 1$ and $2x + 2y = 5$ (equivalent to $x + y = 5/2$) are parallel and distinct, hence have no common point.

In space $\mathbb{R}^3$

A single linear equation $a x + b y + c z = d$ with $(a, b, c) \ne (0, 0, 0)$ defines a plane $P$ in $\mathbb{R}^3$.
For a $2 \times 3$ system (two equations in three unknowns), the solution set is the intersection of two planes $P_1$ and $P_2$. Depending on dependence and compatibility, this intersection is a plane (the two equations describe the same plane), a line (secant planes), or empty (parallel distinct planes).
For a $3 \times 3$ system, a further possible situation is a single point, when the three planes meet in general position.
No taxonomy table: the cases are illustrated through examples.

Example

Solve $\{x + 2y + z = 5 \;;\; 3x + y - 2z = 0\}$ in $(x, y, z) \in \mathbb{R}^3$ and interpret geometrically.

Answer

Apply two elementary row operations, one micro-step per line: $$ \begin{aligned} & \begin{cases} x + 2y + z = 5, \\ 3x + y - 2z = 0 \end{cases} && \text{(initial)} \\ \iff{} & \begin{cases} x + 2y + z = 5, \\ \phantom{3x +} -5y - 5z = -15 \end{cases} && \text{($L_2 \leftarrow L_2 - 3 L_1$, transvection)} \\ \iff{} & \begin{cases} x + 2y + z = 5, \\ \phantom{3x + 5}y + z = 3 \end{cases} && \text{($L_2 \leftarrow -\tfrac{1}{5} L_2$, dilation).} \end{aligned} $$ Take $\lambda := z$ as free parameter. The second equation gives $y = 3 - \lambda$, the first gives $x = 5 - 2y - z = 5 - 2(3 - \lambda) - \lambda = -1 + \lambda$. The solution set is $$ \mathcal{S} = \left\{ \begin{pmatrix} -1 + \lambda \\ 3 - \lambda \\ \lambda \end{pmatrix} \;:\; \lambda \in \mathbb{R} \right\} = \begin{pmatrix} -1 \\ 3 \\ 0 \end{pmatrix} + \mathrm{Vect}\!\left( \begin{pmatrix} 1 \\ -1 \\ 1 \end{pmatrix} \right). $$ Geometrically, this is the line in $\mathbb{R}^3$ passing through the point $A = (-1, 3, 0)$ and directed by $\vec{d} = (1, -1, 1)$. The two planes $\{x + 2y + z = 5\}$ and $\{3x + y - 2z = 0\}$ are secant; they meet along this line.

Skills to practice

Solving a 2-equation system in $\mathbb{R}^2$
Solving a 2- or 3-equation system in $\mathbb{R}^3$

IV Augmented matrix and elementary row operations on a system

The Gauss pivot algorithm is performed by manipulating the coefficients of the system. To run it cleanly, one packs all those coefficients --- both the matrix $A$ and the right-hand side $B$ --- into a single object, the augmented matrix $(A \mid B)$, and works on it as a single matrix. The three elementary row operations of Matrix calculus then act on this augmented matrix exactly as before, and (this is the key fact) they preserve the solution set of the underlying system.

Definition — Augmented matrix of a system

For a linear system $A X = B$ with $A \in M_{n,p}(\mathbb{K})$ and $B \in M_{n,1}(\mathbb{K})$, the augmented matrix is the block matrix $$ (A \mid B) \in M_{n,\, p+1}(\mathbb{K}) $$ obtained by appending the column $B$ to the right of $A$. The vertical bar « $\mid$ » is purely visual; it separates the $p$ « coefficient » columns from the single « right-hand side » column. Elementary row operations on the system are performed on this single matrix.

Example

The $3 \times 3$ system $\{2x + y - 3z = 3 \;;\; 5y + z = 2 \;;\; 9x + 10y + 2z = 1\}$ has matrix $A$ and right-hand side $B$ given by $$ A = \begin{pmatrix} 2 & 1 & -3 \\ 0 & 5 & 1 \\ 9 & 10 & 2 \end{pmatrix}, \quad B = \begin{pmatrix} 3 \\ 2 \\ 1 \end{pmatrix}, $$ and the augmented matrix is $$ (A \mid B) = \left( \begin{array}{ccc|c} 2 & 1 & -3 & 3 \\ 0 & 5 & 1 & 2 \\ 9 & 10 & 2 & 1 \end{array} \right) \in M_{3, 4}(\mathbb{R}). $$ The first three columns are the columns of $A$, the fourth column is $B$, the vertical bar separates them.

Definition — Elementary row operations on a system

The three elementary row operations on a system $A X = B$ (equivalently on its augmented matrix $(A \mid B)$) are:

Swap $L_i \leftrightarrow L_j$ ($i \ne j$): exchange rows $i$ and $j$.
Dilation $L_i \leftarrow \lambda L_i$ ($\lambda \in \mathbb{K} \setminus \{0\}$): multiply row $i$ by the non-zero scalar $\lambda$.
Transvection $L_i \leftarrow L_i + \lambda L_j$ ($i \ne j$, $\lambda \in \mathbb{K}$): add $\lambda$ times row $j$ to row $i$ (in place).

These are the same three operations introduced in Matrix calculus, applied here to the augmented matrix of a system.

Proposition — Row operations preserve the solution set

Let $A X = B$ be a linear system with augmented matrix $(A \mid B)$, and let $A' X = B'$ be the system obtained from $A X = B$ by applying one elementary row operation to $(A \mid B)$. Then $A X = B$ and $A' X = B'$ have the same set of solutions.
In one sentence: an elementary row operation changes the equations but not their set of solutions.

Proof

Each elementary row operation is reversible by an operation of the same type, so it suffices to check that no solution is created or destroyed. Treat the three cases.

Swap $L_i \leftrightarrow L_j$. Swapping two equations of the system gives an equivalent system: the two systems have the same equations as sets, so the same solutions. The inverse swap restores the original.
Dilation $L_i \leftarrow \lambda L_i$, $\lambda \ne 0$. Replacing equation $i$ « $a_{i1} x_1 + \cdots + a_{ip} x_p = b_i$ » by « $\lambda(a_{i1} x_1 + \cdots + a_{ip} x_p) = \lambda b_i$ » is equivalent (since $\lambda \ne 0$, we can divide by $\lambda$ to recover the original). The other equations are unchanged. Hence same solutions. The inverse dilation $L_i \leftarrow \tfrac{1}{\lambda} L_i$ restores the original.
Transvection $L_i \leftarrow L_i + \lambda L_j$, $i \ne j$. If $X$ satisfies $A X = B$, in particular it satisfies equations $i$ and $j$, hence also any linear combination of them, in particular the modified equation $i$. The other equations are unchanged, so $X$ satisfies the new system. Conversely, the inverse transvection $L_i \leftarrow L_i - \lambda L_j$ recovers the original equation $i$ from the modified one (using the unchanged equation $j$), so any solution of the new system is a solution of the original.

In all three cases the two systems have the same solutions.

Example

Illustrate the three operations on the $2 \times 2$ system $(S)$: $\{x + 2y = 3 \;;\; 3x + 5y = 7\}$, working throughout on the augmented matrix.

Answer

The augmented matrix is $$ (A \mid B) = \left( \begin{array}{cc|c} 1 & 2 & 3 \\ 3 & 5 & 7 \end{array} \right). $$

Swap $L_1 \leftrightarrow L_2$. $$ \left( \begin{array}{cc|c} 1 & 2 & 3 \\ 3 & 5 & 7 \end{array} \right) \xrightarrow[L_1 \leftrightarrow L_2]{} \left( \begin{array}{cc|c} 3 & 5 & 7 \\ 1 & 2 & 3 \end{array} \right). $$ The new system $\{3x + 5y = 7 \;;\; x + 2y = 3\}$ is the same system with equations listed in reverse order.
Dilation $L_2 \leftarrow 2 L_2$. $$ \left( \begin{array}{cc|c} 1 & 2 & 3 \\ 3 & 5 & 7 \end{array} \right) \xrightarrow[L_2 \leftarrow 2 L_2]{} \left( \begin{array}{cc|c} 1 & 2 & 3 \\ 6 & 10 & 14 \end{array} \right). $$ The new equation $6x + 10y = 14$ is just twice the old equation; same solutions.
Transvection $L_2 \leftarrow L_2 - 3 L_1$. $$ \left( \begin{array}{cc|c} 1 & 2 & 3 \\ 3 & 5 & 7 \end{array} \right) \xrightarrow[L_2 \leftarrow L_2 - 3 L_1]{} \left( \begin{array}{cc|c} 1 & 2 & 3 \\ 0 & -1 & -2 \end{array} \right). $$ The transvection has cleared the coefficient of $x$ in equation $2$. The new system $\{x + 2y = 3 \;;\; -y = -2\}$ is immediately solvable: $y = 2$, then $x = 3 - 2y = -1$. The unique solution is $(-1, 2)$.

In all three cases, the underlying solution set $\{(-1, 2)\}$ is unchanged.

Remark --- composite shortcut $L_i \leftarrow \alpha L_i + \beta L_j$

In practice, one frequently writes a composite operation $L_i \leftarrow \alpha L_i + \beta L_j$ with $\alpha \ne 0$, $i \ne j$. This is a shorthand for two successive elementary operations: the dilation $L_i \leftarrow \alpha L_i$ (legal since $\alpha \ne 0$) followed by the transvection $L_i \leftarrow L_i + \beta L_j$. The condition $\alpha \ne 0$ is essential: applying « $L_2 \leftarrow 0 \cdot L_2 + L_1 $ » would replace row 2 by $L_1$, destroying row-2 information and changing the solution set. Whenever a problem involves a parameter that could vanish, treat the vanishing case separately before writing the composite operation.

Skills to practice

Writing the augmented matrix
Applying elementary row operations

V Gauss pivot algorithm

The Gauss pivot algorithm is a systematic procedure that, applied to the augmented matrix $(A \mid B)$ of any linear system, returns an equivalent system in echelon form --- a form from which the solution set can be read in a few back-substitution steps. This section stays brief and free of technical complications; the algorithm itself is the point, not the heaviness of the case analysis. Heavier examples live in the exercise file.

Definition — Echelon form of a system

A system (equivalently, an augmented matrix $(A \mid B)$) is in echelon form when the following two conditions hold:

every row that is entirely zero appears below every row that is not entirely zero (zero rows accumulate at the bottom);
in each non-zero row, the first non-zero entry --- called the pivot of that row --- is strictly to the right of the pivot of the row above.

A system is in reduced echelon form if, in addition, each pivot equals $1$ and is the only non-zero entry of its column.
Remark. These notions are kept here at an operational level. The formal uniqueness of the reduced echelon form, the row-rank theory, and the column-pivot theory belong to chapters Finite-dimensional vector spaces and Change of basis.

Method — Gauss pivot algorithm

To solve a linear system $A X = B$ by the Gauss pivot algorithm:

Step 1 --- write the augmented matrix. Form $(A \mid B) \in M_{n,\, p+1}(\mathbb{K})$.
Step 2 --- choose a pivot. In the leftmost non-zero coefficient column (one of the first $p$ columns of $(A \mid B)$, not the right-hand-side column), choose a non-zero entry as pivot; permute rows (swap $L_i \leftrightarrow L_1$) to bring this entry to the top, when needed. Prefer a pivot equal to $1$ to simplify later computations. If, after Step 3, all coefficient entries in some row are zero but the right-hand side is non-zero, the row reads $0 = c$ with $c \ne 0$ and the system is incompatible.
Step 3 --- eliminate below the pivot. Let $r$ be the pivot row, $k$ the pivot column, and $a_{rk}$ the pivot value. For each row $i > r$, apply the transvection $L_i \leftarrow L_i - \dfrac{a_{ik}}{a_{rk}} L_r$ to clear the entry below the pivot. After this step, the entire column $k$ below the pivot is zero.
Step 4 --- recurse on the sub-system. Hide the pivot row and the column of the pivot; the resulting smaller augmented matrix is a sub-system to which steps 2--3 are reapplied. Stop when no non-zero column remains.
Step 5 --- back-substitute. The system is now in echelon form. Read the solution by solving the bottom non-zero equation for its pivot unknown, then substituting upward. Equivalently (the remontée step), clear entries above each pivot by transvections, until the reduced echelon form is reached --- the solution is then read directly off the augmented matrix.

The last non-zero row of the echelon form is the first one used in back-substitution; one then moves upward row by row. An empty row $(0\;\,0\;\,\cdots\;\,0 \mid c)$ with $c \ne 0$ signals that the system is incompatible.

Example

Solve the $3 \times 3$ system $\{x + y + z = 4 \;;\; x + 2y + 3z = 9 \;;\; x + 3y + 6z = 16\}$ by Gauss, working on the augmented matrix.

Answer

Form $(A \mid B)$ and reduce step by step: $$ \begin{aligned} & \left( \begin{array}{ccc|c} 1 & 1 & 1 & 4 \\ 1 & 2 & 3 & 9 \\ 1 & 3 & 6 & 16 \end{array} \right) && \text{(initial)} \\ \xrightarrow[L_2 \leftarrow L_2 - L_1]{} \ & \left( \begin{array}{ccc|c} 1 & 1 & 1 & 4 \\ 0 & 1 & 2 & 5 \\ 1 & 3 & 6 & 16 \end{array} \right) && \text{(transvection)} \\ \xrightarrow[L_3 \leftarrow L_3 - L_1]{} \ & \left( \begin{array}{ccc|c} 1 & 1 & 1 & 4 \\ 0 & 1 & 2 & 5 \\ 0 & 2 & 5 & 12 \end{array} \right) && \text{(transvection)} \\ \xrightarrow[L_3 \leftarrow L_3 - 2 L_2]{} \ & \left( \begin{array}{ccc|c} 1 & 1 & 1 & 4 \\ 0 & 1 & 2 & 5 \\ 0 & 0 & 1 & 2 \end{array} \right) && \text{(transvection).} \end{aligned} $$ The system is in echelon form. Back-substitute: the third equation gives $z = 2$; the second gives $y = 5 - 2z = 1$; the first gives $x = 4 - y - z = 1$. The unique solution is $(x, y, z) = (1, 1, 2)$.

Method — Reading the solution set after Gauss reduction

Once the augmented matrix has been reduced to echelon form:

identify the pivot columns and the non-pivot columns;
designate the unknowns associated with non-pivot columns as free parameters (one per non-pivot column among $x_1, \ldots, x_p$);
express the unknowns associated with pivot columns by back-substitution, working from the bottom non-zero equation upward;
write the solution column as a sum of one $\lambda$-independent column (the particular solution $X_p$) and a linear combination of $\lambda$-dependent columns; rewrite in the form $X_p + \mathrm{Vect}(Y_1, \ldots, Y_r)$ when the structure is to be exposed.

If a row of the form $(0\,\cdots\,0 \mid c)$ with $c \ne 0$ appears, the system is incompatible and the solution set is empty.

Example

Solve the $3 \times 4$ compatible system whose augmented matrix is $$ \left( \begin{array}{cccc|c} 1 & 2 & 0 & 1 & 3 \\ 0 & 1 & 0 & -1 & 1 \\ 0 & 0 & 1 & 2 & 4 \end{array} \right) $$ (already in echelon form). Identify the pivot columns and express the solution set in both parametric and $\mathrm{Vect}$ form.

Answer

The pivots are at positions $(1, 1)$, $(2, 2)$, $(3, 3)$. Pivot columns: $1$, $2$, $3$; non-pivot column: $4$. Designate $\lambda := x_4$ as the single free parameter. Back-substitute from the bottom row: $$ \begin{aligned} \text{(eq. 3)} \quad & x_3 + 2 \lambda = 4 \quad \Longrightarrow \quad x_3 = 4 - 2\lambda, \\ \text{(eq. 2)} \quad & x_2 - \lambda = 1 \quad \Longrightarrow \quad x_2 = 1 + \lambda, \\ \text{(eq. 1)} \quad & x_1 + 2 x_2 + \lambda = 3 \quad \Longrightarrow \quad x_1 = 3 - 2(1 + \lambda) - \lambda = 1 - 3\lambda. \end{aligned} $$ The solution set is, in parametric form, $$ \mathcal{S} = \left\{ \begin{pmatrix} 1 - 3\lambda \\ 1 + \lambda \\ 4 - 2\lambda \\ \lambda \end{pmatrix} \;:\; \lambda \in \mathbb{R} \right\} = \begin{pmatrix} 1 \\ 1 \\ 4 \\ 0 \end{pmatrix} + \mathrm{Vect}\!\left( \begin{pmatrix} -3 \\ 1 \\ -2 \\ 1 \end{pmatrix} \right). $$ There is exactly one parameter because there is exactly one non-pivot column ($4 - 3 = 1$).

Example

Determine whether the system $\{x + y + z = 1 \;;\; 2x + 2y + 2z = 1\}$ is compatible.

Answer

Apply $L_2 \leftarrow L_2 - 2 L_1$ on the augmented matrix: $$ \left( \begin{array}{ccc|c} 1 & 1 & 1 & 1 \\ 2 & 2 & 2 & 1 \end{array} \right) \xrightarrow[L_2 \leftarrow L_2 - 2 L_1]{} \left( \begin{array}{ccc|c} 1 & 1 & 1 & 1 \\ 0 & 0 & 0 & -1 \end{array} \right). $$ The second row reads $0 = -1$, which is impossible. The system is incompatible. Geometrically, the two planes $\{x + y + z = 1\}$ and $\{2x + 2y + 2z = 1\} = \{x + y + z = 1/2\}$ are parallel and distinct in $\mathbb{R}^3$.

Example

Let $t \in \mathbb{R}$ be a parameter. Discuss the solutions of the $3 \times 3$ system $$ \begin{cases} x + y + z = 1, \\ x + t y + z = 2, \\ x + y + t z = 3 \end{cases} $$ according to the value of $t$. Keep the case analysis as small as possible.

Answer

Apply two transvections on the augmented matrix (each on its own line of the reduction chain): $$ \begin{aligned} & \left( \begin{array}{ccc|c} 1 & 1 & 1 & 1 \\ 1 & t & 1 & 2 \\ 1 & 1 & t & 3 \end{array} \right) && \text{(initial)} \\ \xrightarrow[L_2 \leftarrow L_2 - L_1]{} \ & \left( \begin{array}{ccc|c} 1 & 1 & 1 & 1 \\ 0 & t - 1 & 0 & 1 \\ 1 & 1 & t & 3 \end{array} \right) && \text{(transvection)} \\ \xrightarrow[L_3 \leftarrow L_3 - L_1]{} \ & \left( \begin{array}{ccc|c} 1 & 1 & 1 & 1 \\ 0 & t - 1 & 0 & 1 \\ 0 & 0 & t - 1 & 2 \end{array} \right) && \text{(transvection).} \end{aligned} $$

Case $t \ne 1$. The reduced augmented matrix is already in echelon form, with pivots $1$, $t - 1$, $t - 1$, all non-zero. Back-substitute directly from the system: the last equation gives $(t-1)\, z = 2$, hence $z = 2/(t - 1)$; the second equation gives $(t-1)\, y = 1$, hence $y = 1/(t - 1)$; the first equation gives $x = 1 - y - z = 1 - 3/(t - 1) = (t - 4)/(t - 1)$. The system has a unique solution.
Case $t = 1$. The reduced augmented matrix becomes $$ \left( \begin{array}{ccc|c} 1 & 1 & 1 & 1 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 2 \end{array} \right). $$ The second row reads $0 = 1$, which is impossible. The system is incompatible.

Skills to practice

Solving by Gauss with unique solution
Solving by Gauss with parametric solutions
Identifying incompatible systems
Solving a parametric system

VI Square invertible systems: Cramer systems

This short final section is an application of the invertible-matrix criterion proven in the previous chapter Matrix calculus. When the matrix of the system is square and invertible, the equation $A X = B$ admits a unique solution given by the explicit formula $X = A^{-1} B$. Such systems are called Cramer systems. The determinantal Cramer formulas (expressing $A^{-1}$ via determinants) are deferred to the chapter Determinants.

Definition — Cramer system

A linear system $A X = B$ is called a Cramer system when the matrix $A$ is square and invertible, that is, $A \in \mathrm{GL}_n(\mathbb{K})$ for some $n \ge 1$. By construction, this requires $n = p$ (square system).

Proposition — Unique solution of a Cramer system

Let $A \in \mathrm{GL}_n(\mathbb{K})$ and $B \in M_{n,1}(\mathbb{K})$. The Cramer system $A X = B$ admits a unique solution, given by the explicit formula $$ \textcolor{colorprop}{X = A^{-1} B.} $$

Proof

Since $A$ is invertible, multiplying $A X = B$ on the left by $A^{-1}$ gives $A^{-1} (A X) = A^{-1} B$, i.e.\ $(A^{-1} A) X = A^{-1} B$, i.e.\ $I_n X = A^{-1} B$, i.e.\ $X = A^{-1} B$. Conversely, the column $X := A^{-1} B$ satisfies $A X = A (A^{-1} B) = (A A^{-1}) B = I_n B = B$, hence is a solution. Therefore $A^{-1} B$ is the unique solution.

Recall from Matrix calculus

For $A \in M_n(\mathbb{K})$, the following three conditions are equivalent:

$A$ is invertible (i.e.\ $A \in \mathrm{GL}_n(\mathbb{K})$);
for every $B \in M_{n,1}(\mathbb{K})$, the system $A X = B$ admits a unique solution;
the homogeneous system $A X = 0_{n,1}$ admits only the trivial solution $X = 0$.

The proof was given in Matrix calculus and is not repeated here. This recall ties the matrix-side characterization of invertibility to the system-side language: « system has a unique solution for every right-hand side ».

Method — Solving a Cramer system --- two routes

A Cramer system $A X = B$ can be solved in two equivalent ways:

Via the inverse --- compute $A^{-1}$ (by Gauss-Jordan or by the $2 \times 2$ closed-form formula of Matrix calculus), then $X = A^{-1} B$. Preferred when many right-hand sides $B_1, B_2, \ldots$ are to be processed with the same $A$ (compute $A^{-1}$ once, reuse).
Via Gauss directly --- apply the Gauss pivot algorithm to the augmented matrix $(A \mid B)$ and read $X$ by back-substitution. Preferred for a single $B$.

The first route minimises work in batch; the second minimises work for a one-shot system.

Example

Solve the $2 \times 2$ Cramer system $\{2x + y = 5 \;;\; 3x + 4y = 10\}$, both via the inverse of $A$ and via Gauss on the augmented matrix.

Answer

Write $A = \begin{pmatrix} 2 & 1 \\ 3 & 4 \end{pmatrix}$, $B = \begin{pmatrix} 5 \\ 10 \end{pmatrix}$. By the $2 \times 2$ invertibility criterion of Matrix calculus, since $2 \cdot 4 - 1 \cdot 3 = 5 \ne 0$, the matrix $A$ is invertible.

Via the inverse. The $2 \times 2$ inverse formula of Matrix calculus gives $$ A^{-1} = \frac{1}{5} \begin{pmatrix} 4 & -1 \\ -3 & 2 \end{pmatrix}, $$ hence $$ X = A^{-1} B = \frac{1}{5} \begin{pmatrix} 4 & -1 \\ -3 & 2 \end{pmatrix} \begin{pmatrix} 5 \\ 10 \end{pmatrix} = \frac{1}{5} \begin{pmatrix} 4 \cdot 5 + (-1) \cdot 10 \\ -3 \cdot 5 + 2 \cdot 10 \end{pmatrix} = \frac{1}{5} \begin{pmatrix} 10 \\ 5 \end{pmatrix} = \begin{pmatrix} 2 \\ 1 \end{pmatrix}. $$
Via Gauss. Reduce the augmented matrix: $$ \left( \begin{array}{cc|c} 2 & 1 & 5 \\ 3 & 4 & 10 \end{array} \right) \xrightarrow[L_2 \leftarrow 2 L_2 - 3 L_1]{} \left( \begin{array}{cc|c} 2 & 1 & 5 \\ 0 & 5 & 5 \end{array} \right). $$ The second row gives $5 y = 5$, so $y = 1$; the first gives $2 x + 1 = 5$, so $x = 2$.

Both routes yield the same unique solution $(x, y) = (2, 1)$.

This closes the chapter. The same matrix-side toolkit of Matrix calculus now lets us write, classify, and solve any linear system encountered in this course. The geometric interpretation given above will be deepened in Affine subspaces; the rank-based existence and uniqueness criteria, deliberately not introduced here, will be formalised in Finite-dimensional vector spaces and Change of basis; the determinantal Cramer formulas will be derived in Determinants.

Skills to practice

Recognizing a Cramer system
Solving a Cramer system