🛷

Chapter 3. Elementary Matrix Operations and Systems of Linear Equations

Created

2020/12/21 15:08

3.1. Elementary Matrix Operations and Elementary Matrices

이번 절에서는 행렬의 rank를 보전하는 기본적인 세 가지 연산의 개념을 공부한다. 우선 그 세 가지 연산을 다음과 같이 정의한다.

(Def: Elementary Row Operation) Let $A$ be an $m \times n$ matrix. Any one of the following three operations on the rows (columns) of $A$ is called an elementary row (column) operation:

•

type 1: interchanging any tow rows (columns) of $A$;

•

type 2: multiplying any row (column) of $A$ by a nonzero scalar;

•

type 3: adding any scalar multiple of a row (column) of $A$ to another row (column).

이러한 elementary operations는 가역적이다. 즉, 행렬 $Q$가 $P$에서 elementary operation들을 통해 도출될 수 있으면, 그 반대 역시 마찬가지이다. 그런 의미에서 identity matrix에 elementary operations를 가한 행렬들은 identity matrix의 성질들을 일정 부분 보전하며, 따라서 다음과 같이 미리 정의해둔다.

(Def: Elementary Matrix) An $n \times n$ elementary matrix is a matrix obtained by performing an elementary operation on $I_n$. The elementary matrix is said to be of type 1, 2, or 3 according to whether the elementary operation performed on $I_n$ is a type 1, 2, or 3 operation, respectively.

그러면 놀랍게도(?) 어떤 행렬에 elementary operation을 가하는 것은 대응되는 elementary matrix를 해당 행렬에 곱하는 것과 같다. 증명은 type 1, 2, 3에 대해 일일이 수행하면 어렵지 않다.

(Thm 3.1) Let $A \in M_{m \times n}(F)$, and suppose that $B$ is obtained from $A$ by performing an elementary row (column) operation. Then there exists an $m \times m [n \times n]$ elementary matrix $E$ such that $B = EA [B = AE]$. In fact, $E$ is obtained from $I_m [I_n]$ by performing the same elementary row (column) operation as that which was performed on $A$ to obtain $B$. Conversely, if $E$ is an elementary $m \times m [n \times n]$ matrix, then $EA [AE]$ is the matrix obtained from $A$ by performing the same elementary row (column) operation as that which produces $E$ from $I_m [I_n]$.

위에서 언급했듯, 이러한 elementary operation은 가역적이다. 이에 따라 어떤 elementary matrix $E$에 대해 $EA = I$인 elementary matrix $A$가 존재한다. 이는 사실 identity matrix에 같은 타입의 상반되는 operation을 가해 얻은 elementary matrix이다.

(Thm 3.2) Elementary matrices are invertible, and the inverse of an elementary matrix is an elementary matrix of the same type.

3.2. The Rank of a Matrix and Matrix Inverses

그렇다면 행렬의 rank는 대체 뭘까? 이번 절에서는 rank와 그것을 이용해 행렬의 역을 구하는 방법을 공부한다.

우선 행렬의 rank를 정의한다. 선형 사상의 rank는 치역(정의역 전체에 대한 상)이 되는 벡터 공간의 dimension이었다. 사실 행렬의 rank가 바로 그것이다.

(Def: Rank) If $A \in M_{m \times n}(F)$, we define the rank of $A$, denoted $rank(A)$, to be the rank of the linear transformation $L_A: F^n \rightarrow F^m$.

신기할것도 없이 선형 사상의 rank에 관한 많은 정리들이 행렬의 rank에서도 따른다. 예컨대 $A \in M_{n \times n}(F)$에 대해 그 역행렬의 존재와 $rank(A) = n$는 필요충분조건이다. (전단사 함수에 대해서만 역함수가 정의되고, 또 행렬의 역행렬은 대응되는 선형 사상(함수)가 역함수를 가져야만 존재하기 때문이다.)

행렬은 임의의 선형 사상의, 적절한 standard ordered basis에 대한 matrix representation이다. 따라서 임의의 선형 사상 $L_A$의 rank는 곧 대응하는 행렬 $A$의 rank이다. 다음 정리가 그것을 말해준다.

(Thm 3.3) Let $T: V \rightarrow W$ be a linear transformation between finite-dimensional vector spaces, and let $\beta$ and $\gamma$ be ordered bases for $V$ and $W$, respectively. Then $rank(T) = rank([T]_{\beta}^{\gamma})$.

이제 선형 사상의 rank를 찾기 위해 대응되는 행렬의 rank를 찾으면 된다. Invertible matrix의 경우에는 전단사이기 때문에 다음의 정리가 쉽게 증명된다.

(Thm 3.4) Let $A$ be an $m \times n$ matrix. If $P$ and $Q$ are invertible $m \times m$ and $n \times n$ matrices, respectively, then

•

$rank(AQ) = rank(A)$.

•

$rank(PA) = rank(A)$.

•

$rank(PAQ) = rank(A)$.

또 모든 elementary matrix는 invertible이었다.

(Cor) Elementary row and column operations on a matrix are rank-preserving.

이제 행렬의 rank를 보전하는 연산을 배웠으니, 행렬의 rank를 특정하는 방법을 배워 보자. $F^n$의 standard ordered basis $\beta = \{ e_1, ..., e_n \}$이 주어져있을 때, 선형 사상 $L_A$의 range는 $R(L_A) = span(\beta) = span(\{ L_A(e_1), ..., L_A(e_n) \})$이다. 그런데 $A$의 $j$번째 열을 $a_j$라 한다면 각각 $L_A(e_j) = a_j$이므로 $R(L_A) = span(\{ a_1, ..., a_n \})$이다. 따라서 다음의 정리가 증명된다.

(Thm 3.5) The rank of any matrix equals the maximum number of its linearly independent columns; that is, the rank of a matrix is the dimension of the subspace generated by its columns.

즉 행렬의 rank를 구하기 위해서는 elementary operation들을 통해 column들의 linear dependency를 판단하기 좋은 형태로 행렬을 변경해나간 후 maximum linearly independent column들의 개수를 세면 될 것이다. 물론 컴퓨터가 알아서 다 잘 해 주겠지.

그렇지만 다음의 정리는 사람이 행렬의 rank를 손쉽게 계산할 수 있는 강력한 가이드라인을 제공해 준다. 증명은 매우 길고 지루하니 생략한다.

(Thm 3.6) Let $A$ be an $m \times n$ matrix of rank $r$. Then $r < m$, $r < n$, and, by means of a finite number of elementary row and column operations, $A$ can be transformed into the matrix

$$ D = \begin{pmatrix} I_r & O_1 \\ O_2 & O_3 \\ \end{pmatrix} $$

이제 다음의 정리들이 자연스럽게 따른다.

(Cor1) Let $A$ be an $m \times n$ matrix of rank $r$. Then there exist invertible matrices $B$ and $C$ of sizes $m \times m$ and $n \times n$, respectively, such that $D = BAC$, where $D$ is as shown above.

(Cor2) Let $A$ be an $m \times n$ matrix. Then

•

$rank(A^t) = rank(A)$.

•

The rank of any matrix equals the maximum number of its linearly independent rows; that is, the rank of a matrix is the dimension of the subspace generated by its rows.

•

The rows and columns of any matrix generate subspaces of the same dimension, numerically equal to the rank of the matrix.

(Cor3) Every invertible matrix is a product of elementary matrices.

이제 행렬곱(혹은 선형 사상의 합성)과 rank와의 관계를 규명한다. 다음의 정리는 생각해보면 당연하다.

(Thm 3.7) Let $T: V \rightarrow W$ and $U: W \rightarrow Z$ be linear transformations on finite-dimensional vector spaces $V$, $W$, and $Z$, and let $A$ and $B$ be matrices such that the product $AB$ is defined. Then

•

$rank(UT) ≤ rank(T)$.

•

$rank(UT) ≤ rank(U)$.

•

$rank(AB) ≤ rank(A)$.

•

$rank(AB) ≤ rank(B)$.

The Inverse of a Matrix

행렬 $A$의 역은 $A \in M_{n \times n}(F)$인 경우에만 정의된다. 그리고 이러한 $A$가 역행렬을 가질 필요충분조건은 $rank(A) = n$이다. 따라서 위의 방법으로 행렬의 rank를 평가함으로써 행렬의 invertibility를 평가할 수 있겠다.

이 절의 마지막 부분에서는 이러한 행렬의 역행렬을 elementary operation들을 통해 구하는 법을 공부한다.

먼저 augmented matrix를 정의한다.

(Def: Augmented Matrix) Let $A$ and $B$ be $m \times n$ and $m \times p$ matrices, respectively. By the augmented matrix $(A | B)$, we mean the $m \times (n + p)$ matrix $(A \ B)$, that is, the matrix whose first $n$ columns are the columns of $A$, and whose last $p$ columns are the columns of $B$.

이제 만약 augmented matrix $M = (A | B)$에 대해 $CM = (CA | CB)$임을 알 수 있다. 이에 따라 $C = (A | I)$에 대해 $A^{-1}C = (I | A^{-1})$이다. 그런데 정리 3.6에 따르면 elementary row operation으로 $A$를 $I$로 만들어낼 수 있다. 이러한 성질을 이용해 inverse matrix를 구한다.

3.3. Systems of Linear Equations - Theoretical Aspects

이번 절에서는 지금까지 배운 내용을 활용해 주어진 선형 방정식의 해가 형성하는 공간에 대해 공부한다. 먼저 다음과 같은 선형 방정식이 주어졌다고 하자.

$$ a_{11} + a_{12} x_2 + ... + a_{1n} x_n = b_1 \\ a_{21} + a_{22} x_2 + ... + a_{2n} x_n = b_2 \\ ...\\ a_{m1} + a_{m2} x_2 + ... + a_{mn} x_n = b_m $$

위의 방정식을 계수 행렬(coefficient matrix) $A$와 변수 벡터 $x$, 상수 벡터 $b$를 이용해 다음과 같이 다시 쓸 수 있다.

$$ Ax = b $$

이 때 $As = b$인 벡터 $s$를 선형 방정식의 해(solution)라 하며, 이러한 해를 모두 모은 집합을 solution set이라 한다. solution set이 공집합일 경우 방정식이 inconsistent하다, 그렇지 않을 경우 consistent하다 말한다.

또한 $b = 0$인 system을 homogeneous하다고 말한다. 이러한 homogeneous한 system의 solution set을 heterogeneous한 system의 solution set으로 확장할 수 있으니 먼저 homogenous system에 대해 공부하는 것이 합리적이다. 우선 homogenous system을 다음과 같이 정의한다.

(Def: Homogenous System) A system $Ax = b$ of $m$ linear equations in $n$ unknowns is said to be homogeneous if $b = 0$. Otherwise the system is said to be nonhomogeneous.

이제 Dimension Theorem에 의해 다음의 정리는 분명하다.

(Thm 3.8) Let $Ax = 0$ be a homogeneous system of $m$ linear equations in $n$ unknowns over a field $F$. Let $K$ denote the set of all solutions to $Ax = 0$. Then $K = N(L_A)$; hence $K$ is a subspace of $F^n$ of dimension $n - rank(L_A) = n - rank(A)$.

여기서 linear system의 solution set은 벡터 공간이라는 것이 밝혀졌다. 또 $rank(A) ≤ m < n$이므로 다음의 정리가 따른다.

(Cor) If $m < n$, the system $Ax = 0$ has a nonzero solution.

이제 nonhomogeneous system의 solution set을 honogeneous system의 solution set으로부터 도출해낼 수 있다. 우선 $Ax = b$인 nonhomogenous system에 대해 homogeneous system $Ax = 0$를 homogeneous system corresponding to $Ax = b$라 부른다.

Nonhomogeneous system $Ax = b$의 임의의 두 해 $s, w$가 주어졌다고 하자. 그러면 $A(s - w) = 0$이고, $s - w \in K_H$이다. 그러면 어떤 $k \in K_H$에 대해 $w = k + s$이다. $s, w$를 임의로 잡았기 때문에 다음의 정리가 증명된다.

(Thm 3.9) Let $K$ be the solution set of a system of linear equations $Ax = b$, and let $K_H$ be the solution set of the corresponding homogeneous system $Ax = 0$. Then for any solution $s$ to $Ax = b$, $K = \{ s \} + K_H = \{ s + k: k \in K_H \}$.

또한 system이 오직 하나의 해를 가진다면 corresponding homogeneous system의 solution이 0 벡터만을 포함하고, 그 반대는 자명하므로 다음의 정리가 증명된다.

(Thm 3.10) Let $Ax = b$ be a system of $n$ linear equations in $n$ unknowns. If $A$ is invertible, then the system has exactly one solution, namely, $A^{-1}b$. Conversely, if the system has exactly one solution, then $A$ is invertible.

이제 다음의 정리는 linear system의 consistency를 판단하는 방법을 제공한다. Linear system $Ax = b$가 consistent하다는 것은 즉 선형 사상 $L_A$의 range에 $b$가 포함된다는 것이다. 이는 $span(\beta) = span(\beta, b)$를 의미하며, 따라서 정리가 증명된다.

(Thm 3.11) Let $Ax = b$ be a system of linear equations. Then the system is consistent if and only if $rank(A) = rank(A | b)$.

3.4. Systems of Linear Equations - Computational Aspects

이번 절에서는 linear system을 같은 해를 가진, 보다 풀기 쉬운 linear system으로 바꾸어 그 해를 구하는 방법을 공부한다. 우선 linear equation의 equivalency를 정의한다.

(Def: Equivalency) Two systems of linear equations are called equivalent if they have the same solution set.

그러면 다음과 같이 equivalent system을 만들어낼 수 있다.

(Thm 3.13) Let $Ax = b$ be a system of $m$ linear equations in $n$ unknowns, and let $C$ be an invertible $m \times m$ matrix. Then the system $(CA)x = Cb$ is equivalent to $Ax - b$.

또 임의의 invertible matrix는 elementary matrix의 곱으로 나타낼 수 있다. 즉, 다음의 정리가 따른다.

(Cor) Let $Ax = b$ be a system of $m$ linear equations in $n$ unknowns. If $(A' | b')$ is obtained from $(A | b)$ by a finite number of elementary row operations, then the system $A'x = b'$ is equivalent to the original system.

즉 elementary operation은 행렬의 rank뿐만 아니라 system의 해 또한 보전한다. 이제 linear system $Ax = b$에 대한 augmented matrix $(A | b)$를 reduced row echelon form으로 바꾸어 system의 해를 구해낼 수 있다.

(Def: Reduced Row Echelon Form) A matrix is said to be in reduced row echelon form if the following three conditions are satisfied.

•

Any row containing a nonzero entry precedes any row in which all the entries are zero (if any).

•

The first nonzero entry in each row is the only nonzero entry in its column.

•

The first nonzero entry in each row is 1 and it occurs in a column to the right of the first nonzero entry in the preceding row.

다음과 같이 행렬을 reduced row echelon form으로 변환하는 과정을 Gaussian elimination이라 한다. 행렬을 reduced row echelon form으로 변환하는 많은 방법이 있지만, Gaussian elimination이 알려진 알고리즘 중에서는 가장 계산비용이 낮다. 실제로 컴퓨터에서도 이 알고리즘을 사용한다고.

•

forward pass: the augmented matrix is transformed into an upper triangular matrix in which the first nonzero entry of each row is 1, and it occurs in a column to the right of the first nonzero entry of each preceding row.

•

backward pass: the upper triangular matrix is transformed into reduced row echelon form by making the first nonzero entry of each row the only nonzero entry of its column.

다음의 정리에 따라 모든 linear system은 Gaussian elimination을 통해 풀어질 수 있다.

(Thm 3.14) Gaussian elimination transforms any matrix into its reduced row echelon form.

또한 다음의 정리 역시 reduced echelon form의 정의에 의해 명확하다.

(Thm 3.15) Let $Ax = b$ be a system of $r$ nonzero equations in $n$ unknowns. Suppose that $rank(A) = rank(A | b)$ and that $(A | b)$ is in reduced row echelon form. Then

•

$rank(A) = r$.

•

If the general solution obtained by the procedure above is of the form $s = s_0 + t_1 u_1 + t_2 u_2 + ... t_{n - r} u_{n - r}$, then $\{ u_1, ..., u_{n - r} \}$ is a basis for the solution set of the corresponding homogeneous system, and $s_0$ is a solution to the original system.

또 다음의 정리는 reduced echelon form에 대한 해석을 용이하게 해 준다. 정리는 복잡하지만 어렵지는 않으니 책을 참고하자.

(Thm 3.16) Let $A$ be an $m \times n$ matrix of rank $r$, where $r > 0$, and let $B$ be the reduced row echelon form of $A$. Then

•

The number of nonzero rows in $B$ is $r$.

•

For each $i = 1,2, ..., r$, there is a column $b_{j_i}$ of $B$ such that $b_{j_i} = e_i$.

•

The columns of $A$ numbered $j_1,j_2, ..., j_r$ are linearly independent.

•

For each $k = 1, 2, ..., n$, if column $k$ of $B$ is $d_1 e_1 + d_2 e_2 + ... + d_r e_r$, then column $k$ of $A$ is $d_1 a_1 + d_2 a_2 + ... + d_r a_r$.