$\mathbf{x}: n \times 1$
Numerator layout
$\mathbf{f} : m \times 1$, $\mathbf{x} : n \times 1$
Numerator layout (the same as the Jacobian)
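For three vector variables $\mathbf{x}$, $\mathbf{y}$, $\mathbf{z}$ related by $\mathbf{y} = f(\mathbf{x})$ and $\mathbf{z} = g(\mathbf{y})$, consider $\dfrac{\partial \, \mathbf{z}}{\partial \, \mathbf{x}}$.
$$ \mathbf{x} = \begin{bmatrix} x_{1} \\ x_{2} \\ \vdots \\ x_{n} \\ \end{bmatrix} \qquad \mathbf{y} = \begin{bmatrix} y_{1} \\ y_{2} \\ \vdots \\ y_{r} \\ \end{bmatrix} \qquad \mathbf{z} = \begin{bmatrix} z_{1} \\ z_{2} \\ \vdots \\ z_{m} \\ \end{bmatrix} $$
In numerator layout this is no different from the chain rule for scalar derivatives.
If the variables were scalars, the factors of the chain rule could be written in either order. By convention, however, the chain rule is written proceeding to the right, $\dfrac{\partial \, \mathbf{z}}{\partial \, \mathbf{x}} = \dfrac{\partial \, \mathbf{z}}{\partial \, \mathbf{y}} \dfrac{\partial \, \mathbf{y}}{\partial \, \mathbf{x}}$, which is also the only order in which the $m \times r$ and $r \times n$ Jacobians can be multiplied.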
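The chain rule in numerator layout can be checked numerically. The following is a minimal sketch, assuming NumPy and two made-up functions `f` and `g` (the weights `A`, `B` and the helper `numerical_jacobian` are hypothetical, introduced only for illustration); it compares the Jacobian of the composition with the product of the two Jacobians.

```python
import numpy as np

def numerical_jacobian(func, x, eps=1e-6):
    """Central-difference Jacobian in numerator layout:
    rows follow the outputs, columns follow the inputs."""
    x = np.asarray(x, dtype=float)
    cols = []
    for k in range(x.size):
        step = np.zeros_like(x)
        step[k] = eps
        cols.append((func(x + step) - func(x - step)) / (2 * eps))
    return np.stack(cols, axis=1)

rng = np.random.default_rng(0)
n, r, m = 4, 3, 2
A = rng.normal(size=(r, n))  # hypothetical weights defining f
B = rng.normal(size=(m, r))  # hypothetical weights defining g

f = lambda x: np.tanh(A @ x)   # y = f(x), length r
g = lambda y: B @ (y ** 2)     # z = g(y), length m

x0 = rng.normal(size=n)
y0 = f(x0)

dz_dy = numerical_jacobian(g, y0)                   # m x r
dy_dx = numerical_jacobian(f, x0)                   # r x n
dz_dx = numerical_jacobian(lambda x: g(f(x)), x0)   # m x n

# Chain rule written "to the right": (m x r) times (r x n)
chain_ok = np.allclose(dz_dx, dz_dy @ dy_dx, atol=1e-6)
print(dz_dx.shape, chain_ok)
```

The reverse order `dy_dx @ dz_dy` is not even defined here, because an $r \times n$ matrix cannot be multiplied by an $m \times r$ matrix unless $n = m$.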
If $\mathbf{A}$ is a $2 \times 2$ matrix, the Kronecker product becomes:
$$ \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \otimes \mathbf{B} = \begin{bmatrix} a_{11}\mathbf{B} & a_{12}\mathbf{B} \\ a_{21}\mathbf{B} & a_{22}\mathbf{B} \end{bmatrix} $$
The (1)-transpose coincides with the ordinary transpose: $\mathbf{A}^{(1)} = \mathbf{A}^{\text{T}}$
The (number of rows)-transpose coincides with vec: $\mathbf{A}^{(\text{rows}(\mathbf{A}))} = \text{vec}(\mathbf{A})$
The number $(r)$ allowed in the transpose must be a natural number that divides the number of rows.
Hence only the (1)-transpose is defined for a row vector.
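The rules above can be tried out in code. The sketch below assumes the block-wise reading of the $(r)$-transpose that those rules imply: the matrix is cut into $r \times 1$ blocks, each block is treated as a single element, and the grid of blocks is transposed. The function name `r_transpose` is hypothetical.

```python
import numpy as np

def r_transpose(A, r):
    """(r)-transpose under the assumed block-wise definition:
    treat every r x 1 block as one element and transpose the block grid."""
    A = np.asarray(A)
    rows, cols = A.shape
    if rows % r != 0:
        raise ValueError("r must divide the number of rows")
    k = rows // r  # number of block rows
    # index layout: (block row, position inside block, column)
    blocks = A.reshape(k, r, cols)
    # swap block row and column, keep the position inside the block
    return blocks.transpose(2, 1, 0).reshape(cols * r, k)

A = np.arange(1, 7).reshape(2, 3)

ordinary = r_transpose(A, 1)            # should equal A.T
stacked = r_transpose(A, A.shape[0])    # should equal vec(A), column-stacked

print(ordinary)
print(stacked.ravel())
```

With `r = 1` the blocks are single entries, so the result is the ordinary transpose; with `r` equal to the number of rows each column is one block, so the columns are stacked into `vec(A)`.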
Applying the vec operator to the denominator also reduces the problem to differentiating a vector by a vector (confirmed in the example below).
Both approaches apply equally when a matrix is differentiated by a matrix.
$\mathbf{X} : m \times n $, $\mathbf{b} : n \times 1$, $\mathbf{Xb} : m \times 1$
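For these dimensions the vec approach can be sketched numerically. The block below assumes the column-stacking `vec` and uses the standard identity $\text{vec}(\mathbf{Xb}) = (\mathbf{b}^{\text{T}} \otimes \mathbf{I}_m)\,\text{vec}(\mathbf{X})$; the variable names are illustrative and not taken from the original text.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 3, 2
X = rng.normal(size=(m, n))
b = rng.normal(size=(n, 1))

def vec(M):
    """Column-stacking vec operator (assumed convention)."""
    return M.reshape(-1, 1, order="F")

# Candidate Jacobian of Xb with respect to vec(X), size m x mn
J = np.kron(b.T, np.eye(m))

# Xb is linear in vec(X), so J is the Jacobian if J vec(X) reproduces Xb
linear_ok = np.allclose(J @ vec(X), X @ b)

# Same check on a perturbation: (X + dX) b - X b = J vec(dX)
dX = rng.normal(size=(m, n))
perturb_ok = np.allclose((X + dX) @ b - X @ b, J @ vec(dX))

print(J.shape, linear_ok, perturb_ok)
```

Because the numerator $\mathbf{Xb}$ and the vectorized denominator are both vectors, the result is an ordinary $m \times mn$ Jacobian.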
Differentiating with the denominator kept as a matrix
$\mathbf{X} : m \times n$, $\mathbf{Y} : n \times r$, $\mathbf{Z} : p \times q$, in which case the derivative has size $mp \times rq$
$$ \frac{\partial \, (\mathbf{XY})}{\partial \, \mathbf{Z}} = \left( \frac{\partial \, \mathbf{X} }{\partial \, \mathbf{Z}} \right) \left( \mathbf{I}_{q} \otimes \mathbf{Y} \right) + \left( \mathbf{I}_{p} \otimes \mathbf{X} \right)\left( \frac{\partial \, \mathbf{Y}}{\partial \, \mathbf{Z}} \right) $$
The product rule still applies when differentiating by a matrix, but care is needed to make the dimensions match.
The derivative above does not take the form
$$ \frac{\partial \, (\mathbf{XY})}{\partial \, \mathbf{Z}} = \left( \frac{\partial \, \mathbf{X} }{\partial \, \mathbf{Z}} \right) \mathbf{Y} + \mathbf{X} \left( \frac{\partial \, \mathbf{Y}}{\partial \, \mathbf{Z}} \right) $$
because $\frac{\partial \, \mathbf{X} }{\partial \, \mathbf{Z}}$ is $mp \times nq$, so the $n \times r$ matrix $\mathbf{Y}$ cannot be multiplied onto it directly.
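The dimension argument can be made concrete with a small numerical sketch. It assumes the Kronecker layout in which block $(k, l)$ of $\frac{\partial \mathbf{X}}{\partial \mathbf{Z}}$ is $\frac{\partial \mathbf{X}}{\partial Z_{kl}}$, and it uses two made-up linear maps $\mathbf{X}(\mathbf{Z}) = \mathbf{AZB}$ and $\mathbf{Y}(\mathbf{Z}) = \mathbf{CZD}$ so that every block is known exactly. The names `A`, `B`, `C`, `D`, `unit` and `kron_layout` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
m, n, r, p, q = 2, 3, 2, 2, 3
A, B = rng.normal(size=(m, p)), rng.normal(size=(q, n))  # X(Z) = A Z B, m x n
C, D = rng.normal(size=(n, p)), rng.normal(size=(q, r))  # Y(Z) = C Z D, n x r
Z = rng.normal(size=(p, q))
X, Y = A @ Z @ B, C @ Z @ D

def unit(k, l):
    """p x q matrix with a single 1 at (k, l), i.e. dZ/dZ_kl."""
    E = np.zeros((p, q))
    E[k, l] = 1.0
    return E

def kron_layout(block):
    """Arrange the blocks dF/dZ_kl in a p x q grid."""
    return np.block([[block(k, l) for l in range(q)] for k in range(p)])

dX = kron_layout(lambda k, l: A @ unit(k, l) @ B)   # mp x nq
dY = kron_layout(lambda k, l: C @ unit(k, l) @ D)   # np x rq
# Entry-by-entry product rule for each Z_kl, mp x rq
dXY = kron_layout(lambda k, l: A @ unit(k, l) @ B @ Y + X @ C @ unit(k, l) @ D)

rhs = dX @ np.kron(np.eye(q), Y) + np.kron(np.eye(p), X) @ dY

product_rule_ok = np.allclose(dXY, rhs)
naive_shapes_match = dX.shape[1] == Y.shape[0]  # needed for dX @ Y
print(dX.shape, Y.shape, product_rule_ok, naive_shapes_match)
```

The naive form fails already at the shape level (`dX` has $nq$ columns while `Y` has $n$ rows), whereas the Kronecker-expanded factors line up.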
To see what form the right-multiplied $\mathbf{Y}$ must take so that the element-wise products are preserved, take $\mathbf{X} : 1 \times 2$, $\mathbf{Y} : 2 \times 1$, $\mathbf{Z} : 2 \times 2$ as an example.
$$ \mathbf{X}\mathbf{Y} = \begin{bmatrix} \color{RoyalBlue}{X_1} & \color{OrangeRed}{X_2} \end{bmatrix} \begin{bmatrix} \color{RoyalBlue}{Y_1} \\ \color{OrangeRed}{Y_2} \end{bmatrix} = \color{RoyalBlue}{X_1} \color{RoyalBlue}{Y_1} + \color{OrangeRed}{X_2} \color{OrangeRed}{Y_2} $$
As this shows, the product of $\mathbf{X}$ and $\mathbf{Y}$ must consist of terms of the form $X_i Y_i$.
For the derivative of $\mathbf{X}$ below to be multiplied by $\mathbf{Y}$ in this $X_i Y_i$ pattern,
$$ \frac{\partial \, \mathbf{X} }{\partial \, \mathbf{Z}} = \begin{bmatrix} \dfrac{\partial \, \color{RoyalBlue}{X_1}}{\partial \, Z_{11}} & \dfrac{\partial \, \color{OrangeRed}{X_2}}{\partial \, Z_{11}} & \dfrac{\partial \, \color{RoyalBlue}{X_1}}{\partial \, Z_{12}} & \dfrac{\partial \, \color{OrangeRed}{X_2}}{\partial \, Z_{12}} \\ \dfrac{\partial \, \color{RoyalBlue}{X_1}}{\partial \, Z_{21}} & \dfrac{\partial \, \color{OrangeRed}{X_2}}{\partial \, Z_{21}} & \dfrac{\partial \, \color{RoyalBlue}{X_1}}{\partial \, Z_{22}} & \dfrac{\partial \, \color{OrangeRed}{X_2}}{\partial \, Z_{22}} \end{bmatrix} $$
$\mathbf{Y}$ must be expanded as follows.
$$ \begin{bmatrix} \color{RoyalBlue}{Y_1} & 0 \\ \color{OrangeRed}{Y_2} & 0 \\ 0 & \color{RoyalBlue}{Y_1} \\ 0 & \color{OrangeRed}{Y_2} \end{bmatrix} = \mathbf{I}_{2} \otimes \mathbf{Y} $$
Next, consider $\dfrac{\partial \, (\mathbf{XY})}{\partial \, \mathbf{Z}}$ when $\mathbf{X} : m \times n = m \times 1$, $\mathbf{Y} : n \times r = 1 \times 1$, $\mathbf{Z} : p \times q = p \times 1$.
When this is differentiated using the vec operator, the numerator and the denominator are already both vectors, so the result is simply the $m \times p$ Jacobian.
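As a numerical companion to this example, the sketch below compares the ordinary $m \times p$ Jacobian with the stacked $mp \times 1$ Kronecker form that the surrounding text works through. It assumes two made-up functions, $\mathbf{X}(\mathbf{z}) = \mathbf{Az}$ and $Y(\mathbf{z}) = \mathbf{c}^{\text{T}}\mathbf{z}$ (`A` and `c` are hypothetical), and implements the (m)-transpose of an $mp \times 1$ vector as "each $m \times 1$ block becomes a column".

```python
import numpy as np

rng = np.random.default_rng(3)
m, p = 3, 2
A = rng.normal(size=(m, p))  # hypothetical: X(z) = A z, an m x 1 vector
c = rng.normal(size=(p, 1))  # hypothetical: Y(z) = c^T z, a scalar
z = rng.normal(size=(p, 1))

x = A @ z                    # m x 1
y = (c.T @ z).item()         # scalar

# Ordinary Jacobian of X*Y (numerator layout), m x p
jacobian = y * A + x @ c.T

# Kronecker form: every derivative is stacked block by block
dX = A.reshape(-1, 1, order="F")   # blocks dX/dz_k stacked, mp x 1
dY = c                             # p x 1
stacked = dX @ np.kron(np.eye(1), [[y]]) + np.kron(np.eye(p), x) @ dY  # mp x 1

# (m)-transpose of an mp x 1 vector: block k becomes column k
m_transposed = stacked.reshape(p, m).T

match = np.allclose(m_transposed, jacobian)
print(stacked.shape, m_transposed.shape, match)
```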
Expressed with the Kronecker-product method, however, this becomes somewhat more involved.
Here $\left( \frac{\partial \, \mathbf{X} }{\partial \, \mathbf{Z}} \right)$ is a derivative of a vector by a vector and could therefore be written as a Jacobian, but when derivatives are computed with the Kronecker product, every derivative must be written consistently in Kronecker-product form. Otherwise the dimensions no longer match. Expanding every derivative as a Kronecker product gives
$$ \underbrace{\left( \frac{\partial \,}{\partial \, \mathbf{Z}} \otimes \mathbf{X} \right)}_{mp \times 1} \underbrace{\left( \mathbf{I}_{1} \otimes \mathbf{Y} \right)}_{1 \times 1} + \underbrace{\left( \mathbf{I}_{p} \otimes \mathbf{X} \right)}_{mp \times p} \underbrace{\left( \frac{\partial \, }{\partial \, \mathbf{Z}} \otimes \mathbf{Y} \right)}_{p \times 1} $$
so the result is $mp \times 1$, and taking its (m)-transpose makes it agree with the $m \times p$ Jacobian. Computing each factor directly:
$$ \begin{align} \frac{\partial \, \mathbf{X} }{\partial \, \mathbf{Z}} &= \frac{\partial \,}{\partial \, \mathbf{Z}} \otimes \mathbf{X} = \begin{bmatrix} \dfrac{\partial}{\partial \, Z_{1}} \\ \dfrac{\partial}{\partial \, Z_{2}} \\ \vdots \\ \dfrac{\partial}{\partial \, Z_{p}} \end{bmatrix}\otimes \mathbf{X} = \left[ \begin{array}{c} \color{RoyalBlue}{\begin{matrix} \dfrac{\partial \, X_1}{\partial \, Z_1} \\ \dfrac{\partial \, X_2}{\partial \, Z_1} \\ \vdots \\ \dfrac{\partial \, X_m}{\partial \, Z_1} \end{matrix}} \\ \hline \color{OrangeRed}{\begin{matrix} \dfrac{\partial \, X_1}{\partial \, Z_2} \\ \dfrac{\partial \, X_2}{\partial \, Z_2} \\ \vdots \\ \dfrac{\partial \, X_m}{\partial \, Z_2} \end{matrix}} \\ \hline \vdots \\ \hline \color{YellowGreen}{\begin{matrix} \dfrac{\partial \, X_1}{\partial \, Z_p} \\ \dfrac{\partial \, X_2}{\partial \, Z_p} \\ \vdots \\ \dfrac{\partial \, X_m}{\partial \, Z_p} \end{matrix}} \end{array} \right] \end{align} $$
$$ \mathbf{I}_{1} \otimes \mathbf{Y} = \left[ Y_1 \right] $$
$$ \begin{align} \mathbf{I}_p \otimes \mathbf{X} &= \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix} \otimes \mathbf{X} = \left[ \begin{array}{c} \begin{matrix} X_1 & 0 & \cdots & 0 \\ X_2 & 0 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ X_m & 0 & \cdots & 0 \end{matrix} \\ \hline \begin{matrix} 0 & X_1 & \cdots & 0 \\ 0 & X_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & X_m & \cdots & 0 \end{matrix} \\ \hline \vdots \\ \hline \begin{matrix} 0 & 0 & \cdots & X_1 \\ 0 & 0 & \cdots & X_2 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & X_m \end{matrix} \end{array} \right] \end{align} $$
Multiplying out each term gives
$$ \begin{align} &\underbrace{\left( \frac{\partial \,}{\partial \, \mathbf{Z}} \otimes \mathbf{X} \right)}_{mp \times 1} \underbrace{\left( \mathbf{I}_{1} \otimes \mathbf{Y} \right)}_{1 \times 1} + \underbrace{\left( \mathbf{I}_{p} \otimes \mathbf{X} \right)}_{mp \times p} \underbrace{\left( \frac{\partial \, }{\partial \, \mathbf{Z}} \otimes \mathbf{Y} \right)}_{p \times 1} \\ =& \, \left[ \begin{array}{c} \color{RoyalBlue}{\begin{matrix} Y_1 \dfrac{\partial \, X_1}{\partial \, Z_1} \\ Y_1 \dfrac{\partial \, X_2}{\partial \, Z_1} \\ \vdots \\ Y_1 \dfrac{\partial \, X_m}{\partial \, Z_1} \end{matrix}} \\ \hline \color{OrangeRed}{\begin{matrix} Y_1 \dfrac{\partial \, X_1}{\partial \, Z_2} \\ Y_1 \dfrac{\partial \, X_2}{\partial \, Z_2} \\ \vdots \\ Y_1 \dfrac{\partial \, X_m}{\partial \, Z_2} \end{matrix}} \\ \hline \vdots \\ \hline \color{YellowGreen}{\begin{matrix} Y_1 \dfrac{\partial \, X_1}{\partial \, Z_p} \\ Y_1 \dfrac{\partial \, X_2}{\partial \, Z_p} \\ \vdots \\ Y_1 \dfrac{\partial \, X_m}{\partial \, Z_p} \end{matrix}} \end{array} \right] + \left[ \begin{array}{c} \color{RoyalBlue}{\begin{matrix} X_1 \dfrac{\partial \, Y_1}{\partial \, Z_1} \\ X_2 \dfrac{\partial \, Y_1}{\partial \, Z_1} \\ \vdots \\ X_m \dfrac{\partial \, Y_1}{\partial \, Z_1} \end{matrix}} \\ \hline \color{OrangeRed}{\begin{matrix} X_1 \dfrac{\partial \, Y_1}{\partial \, Z_2} \\ X_2 \dfrac{\partial \, Y_1}{\partial \, Z_2} \\ \vdots \\ X_m \dfrac{\partial \, Y_1}{\partial \, Z_2} \end{matrix}} \\ \hline \vdots \\ \hline \color{YellowGreen}{\begin{matrix} X_1 \dfrac{\partial \, Y_1}{\partial \, Z_p} \\ X_2 \dfrac{\partial \, Y_1}{\partial \, Z_p} \\ \vdots \\ X_m \dfrac{\partial \, Y_1}{\partial \, Z_p} \end{matrix}} \end{array} \right] \quad = \quad \left[ \begin{array}{c} \color{RoyalBlue}{\begin{matrix} Y_1 \dfrac{\partial \, X_1}{\partial \, Z_1} + X_1 \dfrac{\partial \, Y_1}{\partial \, Z_1} \\ Y_1 \dfrac{\partial \, X_2}{\partial \, Z_1} + X_2 \dfrac{\partial \, Y_1}{\partial \, 
Z_1} \\ \vdots \\ Y_1 \dfrac{\partial \, X_m}{\partial \, Z_1} + X_m \dfrac{\partial \, Y_1}{\partial \, Z_1} \end{matrix}} \\ \hline \color{OrangeRed}{\begin{matrix} Y_1 \dfrac{\partial \, X_1}{\partial \, Z_2} + X_1 \dfrac{\partial \, Y_1}{\partial \, Z_2} \\ Y_1 \dfrac{\partial \, X_2}{\partial \, Z_2} + X_2 \dfrac{\partial \, Y_1}{\partial \, Z_2} \\ \vdots \\ Y_1 \dfrac{\partial \, X_m}{\partial \, Z_2} + X_m \dfrac{\partial \, Y_1}{\partial \, Z_2} \end{matrix}} \\ \hline \vdots \\ \hline \color{YellowGreen}{\begin{matrix} Y_1 \dfrac{\partial \, X_1}{\partial \, Z_p} + X_1 \dfrac{\partial \, Y_1}{\partial \, Z_p} \\ Y_1 \dfrac{\partial \, X_2}{\partial \, Z_p} + X_2 \dfrac{\partial \, Y_1}{\partial \, Z_p} \\ \vdots \\ Y_1 \dfrac{\partial \, X_m}{\partial \, Z_p} + X_m \dfrac{\partial \, Y_1}{\partial \, Z_p} \end{matrix}} \end{array} \right] \quad = \quad \left[ \begin{array}{c} \color{RoyalBlue}{\begin{matrix} \dfrac{\partial \, X_1 Y_1}{\partial \, Z_1} \\ \dfrac{\partial \, X_2 Y_1}{\partial \, Z_1} \\ \vdots \\ \dfrac{\partial \, X_m Y_1}{\partial \, Z_1} \end{matrix}} \\ \hline \color{OrangeRed}{\begin{matrix} \dfrac{\partial \, X_1 Y_1}{\partial \, Z_2} \\ \dfrac{\partial \, X_2 Y_1}{\partial \, Z_2} \\ \vdots \\ \dfrac{\partial \, X_m Y_1}{\partial \, Z_2} \end{matrix}} \\ \hline \vdots \\ \hline \color{YellowGreen}{\begin{matrix} \dfrac{\partial \, X_1 Y_1}{\partial \, Z_p} \\ \dfrac{\partial \, X_2 Y_1}{\partial \, Z_p} \\ \vdots \\ \dfrac{\partial \, X_m Y_1}{\partial \, Z_p} \end{matrix}} \end{array} \right] \end{align} $$
Taking the (m)-transpose of the final result gives the Jacobian.
$$ \left[\left( \frac{\partial \,}{\partial \, \mathbf{Z}} \otimes \mathbf{X} \right) \left( \mathbf{I}_{1} \otimes \mathbf{Y} \right) + \left( \mathbf{I}_{p} \otimes \mathbf{X} \right)\left( \frac{\partial \, }{\partial \, \mathbf{Z}} \otimes \mathbf{Y} \right)\right]^{(m)} = \left[ \begin{array}{c} \color{RoyalBlue}{\begin{matrix} \dfrac{\partial \, X_1 Y_1}{\partial \, Z_1} \\ \dfrac{\partial \, X_2 Y_1}{\partial \, Z_1} \\ \vdots \\ \dfrac{\partial \, X_m Y_1}{\partial \, Z_1} \end{matrix}} \\ \hline \color{OrangeRed}{\begin{matrix} \dfrac{\partial \, X_1 Y_1}{\partial \, Z_2} \\ \dfrac{\partial \, X_2 Y_1}{\partial \, Z_2} \\ \vdots \\ \dfrac{\partial \, X_m Y_1}{\partial \, Z_2} \end{matrix}} \\ \hline \vdots \\ \hline \color{YellowGreen}{\begin{matrix} \dfrac{\partial \, X_1 Y_1}{\partial \, Z_p} \\ \dfrac{\partial \, X_2 Y_1}{\partial \, Z_p} \\ \vdots \\ \dfrac{\partial \, X_m Y_1}{\partial \, Z_p} \end{matrix}} \end{array} \right] ^{(m)} = \quad \begin{bmatrix} \color{RoyalBlue}{\dfrac{\partial \, X_1 Y_1}{\partial \, Z_1}} & \color{OrangeRed}{\dfrac{\partial \, X_1 Y_1}{\partial \, Z_2}} & \cdots & \color{YellowGreen}{\dfrac{\partial \, X_1 Y_1}{\partial \, Z_p}} \\ \color{RoyalBlue}{\dfrac{\partial \, X_2 Y_1}{\partial \, Z_1}} & \color{OrangeRed}{\dfrac{\partial \, X_2 Y_1}{\partial \, Z_2}} & \cdots & \color{YellowGreen}{\dfrac{\partial \, X_2 Y_1}{\partial \, Z_p}}\\ \vdots & \vdots & \ddots & \vdots \\ \color{RoyalBlue}{\dfrac{\partial \, X_m Y_1}{\partial \, Z_1}} & \color{OrangeRed}{\dfrac{\partial \, X_m Y_1}{\partial \, Z_2}} & \cdots & \color{YellowGreen}{\dfrac{\partial \, X_m Y_1}{\partial \, Z_p}} \end{bmatrix} = \frac{\partial \, \mathbf{XY}}{\partial \, \mathbf{Z}} $$
Turning to the trace: since $\text{tr}(\mathbf{AB}) = \sum_{i} \sum_{j} (\mathbf{A})_{ij}(\mathbf{B})_{ji}$, writing the derivative in index form gives
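$$ \dfrac{\partial \, }{\partial \, (\mathbf{A})_{ij}} \sum_{i} \sum_{j} (\mathbf{A})_{ij}(\mathbf{B})_{ji} = (\mathbf{B})_{ji} $$
Therefore
$$\dfrac{\partial \, }{\partial \, \mathbf{A}} \text{tr}(\mathbf{AB}) = \mathbf{B}^{\text{T}}$$
In the same way,
$$ \dfrac{\partial \, }{\partial \, (\mathbf{B})_{ji}} \sum_{i} \sum_{j} (\mathbf{A})_{ij}(\mathbf{B})_{ji} = (\mathbf{A})_{ij} $$
Therefore
$$\dfrac{\partial \, }{\partial \, \mathbf{B}} \text{tr}(\mathbf{AB}) = \mathbf{A}$$
Note that this last result is arranged in numerator layout (the shape of $\mathbf{B}^{\text{T}}$); arranged in the shape of $\mathbf{B}$, as the previous result is arranged in the shape of $\mathbf{A}$, it reads $\mathbf{A}^{\text{T}}$.
To show the formula for the derivative of a determinant, $\frac{\partial \, \lvert \mathbf{X} \rvert}{\partial \, \mathbf{X}}$, a few preliminaries are needed. First, the inverse matrix can be obtained as follows[5]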
$$ \mathbf{X}^{-1} = \frac{1}{\lvert \mathbf{X} \rvert } \left[ C_{ij} \right]^{\text{T}} $$
Here $C_{ij}$ is the cofactor defined below, and $M_{ij}$ denotes the determinant of the submatrix obtained by deleting row $i$ and column $j$ of $\mathbf{X}$.
$$ C_{ij} = (-1)^{i+j}M_{ij} $$
The transpose of this cofactor matrix, $\left[ C_{ij} \right]^{\text{T}}$, is called the adjugate matrix[6] and is written as follows.
$$ \text{adj}(\mathbf{X}) = \left[ C_{ij} \right]^{\text{T}} $$
Using this, the inverse can be rewritten as
$$ \mathbf{X}^{-1} = \frac{1}{\lvert \mathbf{X} \rvert } \text{adj}(\mathbf{X}) \tag{1} $$
For a canonical differential form, several equivalent derivative forms can be written down as follows.[7]
$$ \begin{align} dy = a dx &\implies \frac{dy}{dx} = a \\[5pt] dy = \mathbf{a} d\mathbf{x} &\implies \frac{dy}{\text{d}\mathbf{x}} = \mathbf{a} \\[5pt] dy = \text{tr}(\mathbf{A} \text{d}\mathbf{X}) &\implies \frac{dy}{\text{d}\mathbf{X}} = \mathbf{A} \end{align} \tag{2} $$
In the equations above, $dy$, $dx$, $\text{d}\mathbf{X}$ are differentials (or infinitesimals) representing infinitesimal changes in the variables, and the ratios of these infinitesimal changes, $\frac{dy}{dx}$ and $\frac{dy}{\text{d}\mathbf{X}}$, are called derivatives.[8],[9]
Using the earlier result $\frac{\partial \, }{\partial \, \mathbf{B}} \text{tr}(\mathbf{AB}) = \mathbf{A}$, the third identity follows immediately from
$$ \frac{ \text{tr}(\mathbf{A} \text{d}\mathbf{X}) }{\text{d}\mathbf{X}} = \frac{ \text{d}\left(\text{tr}(\mathbf{A} \mathbf{X})\right) }{\text{d}\mathbf{X}}= \mathbf{A} $$
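The trace derivative, and the way the chosen layout decides whether a transpose appears, can be checked with finite differences. This is a minimal sketch assuming NumPy; the helper `entrywise_gradient` is hypothetical and returns the array whose $(i, j)$ entry is $\partial f / \partial X_{ij}$, that is, the derivative arranged in the shape of $\mathbf{X}$.

```python
import numpy as np

def entrywise_gradient(f, X, eps=1e-6):
    """G[i, j] approximates d f / d X[i, j] by central differences.
    The result has the same shape as X."""
    G = np.zeros_like(X, dtype=float)
    for idx in np.ndindex(*X.shape):
        step = np.zeros_like(X, dtype=float)
        step[idx] = eps
        G[idx] = (f(X + step) - f(X - step)) / (2 * eps)
    return G

rng = np.random.default_rng(7)
m, n = 2, 3
A = rng.normal(size=(n, m))
X = rng.normal(size=(m, n))

G = entrywise_gradient(lambda M: np.trace(A @ M), X)

# Arranged in the shape of X, the derivative of tr(AX) is A^T ...
shape_of_X_ok = np.allclose(G, A.T, atol=1e-6)
# ... and transposing it (numerator layout) gives A, as in equation (2).
numerator_layout_ok = np.allclose(G.T, A, atol=1e-6)
print(G.shape, shape_of_X_ok, numerator_layout_ok)
```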
There is also Jacobi's formula[10] for the derivative of a determinant. Proving it here would be long and tedious, so for now we simply accept the following result.
$$ \text{d} \lvert \mathbf{X} \rvert = \text{tr} \left( \text{adj}(\mathbf{X}) \text{d}\mathbf{X} \right) \tag{3} $$
A very detailed proof is given on Wikipedia.
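Although the proof is skipped, equations (1) and (3) are easy to sanity-check numerically. The sketch below assumes NumPy and builds the adjugate directly from the cofactor definition; the function name `adjugate` and the test matrix are illustrative choices, not part of the original text.

```python
import numpy as np

def adjugate(X):
    """Adjugate built from cofactors C_ij = (-1)^(i+j) M_ij, then transposed."""
    n = X.shape[0]
    C = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            # minor: delete row i and column j
            minor = np.delete(np.delete(X, i, axis=0), j, axis=1)
            C[i, j] = (-1) ** (i + j) * np.linalg.det(minor)
    return C.T

X = np.array([[2.0, 1.0, 0.0],
              [0.0, 3.0, 1.0],
              [1.0, 0.0, 4.0]])   # arbitrary invertible example, det = 25
rng = np.random.default_rng(4)
dX = 1e-6 * rng.normal(size=(3, 3))   # small perturbation standing in for dX

adjX = adjugate(X)

# Equation (1): inverse = adjugate / determinant
inverse_ok = np.allclose(np.linalg.inv(X), adjX / np.linalg.det(X))

# Equation (3): first-order change of the determinant
d_det = np.linalg.det(X + dX) - np.linalg.det(X)
jacobi_ok = np.isclose(d_det, np.trace(adjX @ dX), rtol=1e-3, atol=1e-10)
print(inverse_ok, jacobi_ok)
```

The comparison for equation (3) is only approximate because a finite perturbation also carries second-order terms, hence the loose tolerance.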
With these results, the desired derivative can be shown fairly simply. From equation (1),
$$ \begin{align} \mathbf{X}^{-1} &= \frac{1}{\lvert \mathbf{X} \rvert} \text{adj}(\mathbf{X}) \\[5pt] \lvert \mathbf{X} \rvert \mathbf{X}^{-1} &= \text{adj}(\mathbf{X}) \\[5pt] \lvert \mathbf{X} \rvert \mathbf{X}^{-1} \partial \mathbf{X} &= \text{adj}(\mathbf{X})\partial \mathbf{X} \\[5pt] \text{tr}\left(\lvert \mathbf{X} \rvert \mathbf{X}^{-1} \partial \mathbf{X}\right) &= \text{tr}\left(\text{adj}(\mathbf{X})\partial \mathbf{X}\right) \end{align} $$
and by equation (3),
$$ \partial \lvert \mathbf{X} \rvert = \text{tr}\left(\lvert \mathbf{X} \rvert \mathbf{X}^{-1} \partial \mathbf{X}\right) $$
so that, by the third line of equation (2), we obtain
$$ \partial \lvert \mathbf{X} \rvert = \text{tr}\left(\lvert \mathbf{X} \rvert \mathbf{X}^{-1} \partial \mathbf{X}\right) \implies \frac{ \partial \, \lvert \mathbf{X} \rvert}{ \partial\mathbf{X}} = \lvert \mathbf{X} \rvert \mathbf{X}^{-1} $$
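The result can be checked by finite differences. The sketch below assumes NumPy and a hypothetical helper `entrywise_gradient` that returns $\partial \lvert \mathbf{X} \rvert / \partial X_{ij}$ arranged in the shape of $\mathbf{X}$; transposing that array gives the numerator-layout form $\lvert \mathbf{X} \rvert \mathbf{X}^{-1}$ derived above. A non-symmetric test matrix is used so that the two arrangements actually differ.

```python
import numpy as np

def entrywise_gradient(f, X, eps=1e-6):
    """G[i, j] approximates d f / d X[i, j] by central differences."""
    G = np.zeros_like(X, dtype=float)
    for idx in np.ndindex(*X.shape):
        step = np.zeros_like(X, dtype=float)
        step[idx] = eps
        G[idx] = (f(X + step) - f(X - step)) / (2 * eps)
    return G

X = np.array([[2.0, 1.0, 0.0],
              [0.0, 3.0, 1.0],
              [1.0, 0.0, 4.0]])   # arbitrary non-symmetric example

G = entrywise_gradient(np.linalg.det, X)
det_X = np.linalg.det(X)
X_inv = np.linalg.inv(X)

# Numerator layout (transpose of the entrywise array): |X| X^{-1}
numerator_ok = np.allclose(G.T, det_X * X_inv, atol=1e-6)
# Arranged in the shape of X: |X| (X^{-1})^T
shape_of_X_ok = np.allclose(G, det_X * X_inv.T, atol=1e-6)
# The two arrangements differ for a non-symmetric X
layouts_differ = not np.allclose(G, G.T, atol=1e-6)
print(numerator_ok, shape_of_X_ok, layouts_differ)
```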
Or, written out in more detail, by $\frac{ \text{tr}(\mathbf{A} \text{d}\mathbf{X}) }{\text{d}\mathbf{X}} = \mathbf{A}$,
$$ \frac{ \partial \, \lvert \mathbf{X} \rvert}{ \partial \, \mathbf{X}} = \frac{ \text{tr}\left(\lvert \mathbf{X} \rvert \mathbf{X}^{-1} \color{RoyalBlue}{ \partial \mathbf{X}}\right) }{\color{RoyalBlue}{ \partial\mathbf{X}}} = \lvert \mathbf{X} \rvert \mathbf{X}^{-1} $$
On the other hand, since $\text{tr}(\mathbf{AB}) = \text{tr}(\mathbf{BA})$, applying $\frac{\partial \, }{\partial \, \mathbf{A}} \text{tr}(\mathbf{AB}) = \mathbf{B}^{\text{T}}$ instead gives the same derivative arranged in the shape of $\mathbf{X}$:
$$ \frac{ \partial \, \lvert \mathbf{X} \rvert}{ \partial\mathbf{X}} = \frac{ \text{tr}\left(\lvert \mathbf{X} \rvert \mathbf{X}^{-1} \color{RoyalBlue}{ \partial \mathbf{X}}\right) }{\color{RoyalBlue}{ \partial \mathbf{X}}} = \frac{ \text{tr}\left( \color{RoyalBlue}{ \partial \mathbf{X}} \lvert \mathbf{X} \rvert \mathbf{X}^{-1} \right) }{\color{RoyalBlue}{ \partial \mathbf{X}}} = \left( \lvert \mathbf{X} \rvert \mathbf{X}^{-1} \right)^{\text{T}} = \lvert \mathbf{X} \rvert \left( \mathbf{X}^{-1} \right)^{\text{T}} $$
Using the results above,
$$ \frac{\partial \, }{\partial \, \mathbf{X}} \ln \lvert \mathbf{X} \rvert = \frac{1}{ \lvert \mathbf{X} \rvert } \frac{\partial \, \lvert \mathbf{X} \rvert}{\partial \, \mathbf{X}} = \frac{1}{ \lvert \mathbf{X} \rvert } \lvert \mathbf{X} \rvert \mathbf{X}^{-1}=\mathbf{X}^{-1} $$
or, in the transposed arrangement,
$$ \frac{\partial \, }{\partial \, \mathbf{X}} \ln \lvert \mathbf{X} \rvert = \frac{1}{ \lvert \mathbf{X} \rvert } \frac{\partial \, \lvert \mathbf{X} \rvert}{\partial \, \mathbf{X}} = \frac{1}{ \lvert \mathbf{X} \rvert } \lvert \mathbf{X} \rvert \left(\mathbf{X}^{-1}\right)^{\text{T}}= \left(\mathbf{X}^{-1}\right)^{\text{T}} $$
Next, assume an arbitrary matrix and arbitrary vectors with $\mathbf{a}^{\text{T}} : 1 \times m$, $\mathbf{X} : m \times n$, $\mathbf{b} : n \times 1$.
Since $\mathbf{a}^{\text{T}} \mathbf{X} \mathbf{b}$ is a scalar, taking its trace, $\text{tr}(\mathbf{a}^{\text{T}} \mathbf{X} \mathbf{b})$, leaves its value unchanged. Hence, using $\frac{\partial \, }{\partial \, \mathbf{A}} \text{tr}(\mathbf{AB}) = \mathbf{B}^{\text{T}}$, the derivative can be shown simply as follows.
$$ \frac{\partial \, \mathbf{a}^{\text{T}} \mathbf{X} \mathbf{b}}{\partial \, \mathbf{X}} = \frac{\partial \, \text{tr}\left(\mathbf{a}^{\text{T}} \mathbf{X} \mathbf{b}\right)}{\partial \, \mathbf{X}} = \frac{\partial \, \text{tr}\left(\mathbf{X} \mathbf{b} \mathbf{a}^{\text{T}} \right)}{\partial \, \mathbf{X}} = \left( \mathbf{b} \mathbf{a}^{\text{T}} \right)^{\text{T}} = \mathbf{a}\mathbf{b}^{\text{T}} $$
Alternatively, although it is somewhat more tedious, the same result can be shown by applying the Kronecker product and the product rule directly.
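Both routes can be sketched numerically. The block below assumes NumPy and the Kronecker layout in which $\frac{\partial \mathbf{X}}{\partial \mathbf{X}}$ is an $m \times n$ grid of $m \times n$ blocks, block $(k, l)$ having a single 1 at position $(k, l)$; the vectors are random placeholders.

```python
import numpy as np

rng = np.random.default_rng(5)
m, n = 3, 2
a = rng.normal(size=(m, 1))
b = rng.normal(size=(n, 1))

# dX/dX in Kronecker layout, m^2 x n^2:
# block (k, l) is m x n with a single 1 at its own (k, l) entry
dX_dX = np.zeros((m * m, n * n))
for k in range(m):
    for l in range(n):
        dX_dX[k * m + k, l * n + l] = 1.0

left = np.kron(np.eye(m), a.T)    # m x m^2
right = np.kron(np.eye(n), b)     # n^2 x n
kron_form = left @ dX_dX @ right  # m x n

# a^T X b is linear in X, so the derivative with respect to X[i, j] is a_i b_j
outer = a @ b.T

kron_ok = np.allclose(kron_form, outer)
print(left.shape, dX_dX.shape, right.shape, kron_ok)
```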
$\left( \mathbf{I}_m \otimes \mathbf{a}^{\text{T}}\right)$ is an $m \times m^2$ matrix made of $m$ submatrices of size $m \times m$ laid out side by side. Each submatrix has $\mathbf{a}^{\text{T}}$ as the row that corresponds to its own position in the whole matrix, and zeros elsewhere.
$$ \left( \mathbf{I}_m \otimes \mathbf{a}^{\text{T}}\right) = \begin{bmatrix} \color{RoyalBlue}{\begin{matrix}a_1 & a_2 & \cdots & a_m\end{matrix}} & | & \mathbf{0}^{\text{T}} & | & \cdots & | & \mathbf{0}^{\text{T}} \\ \mathbf{0}^{\text{T}} & | & \color{RoyalBlue}{\begin{matrix}a_1 & a_2 & \cdots & a_m\end{matrix}} & | & \cdots & | & \mathbf{0}^{\text{T}} \\ \vdots & | & \vdots & | & \ddots & | & \vdots \\ \mathbf{0}^{\text{T}} & | & \mathbf{0}^{\text{T}} & | & \cdots & | & \color{RoyalBlue}{\begin{matrix}a_1 & a_2 & \cdots & a_m\end{matrix}} \end{bmatrix} $$
$\frac{\partial \, \mathbf{X}}{\partial \, \mathbf{X}}$ is an $m^2 \times n^2$ matrix made of $m \times n$ submatrices arranged in an $m \times n$ grid, where each submatrix has a 1 only at the entry matching its own position in the grid and 0 everywhere else.
That is, as in the expression below, the first submatrix has a 1 only at $(1,1)$ and 0 elsewhere, the submatrix to its right has a 1 only at $(1,2)$ and 0 elsewhere, and so on.
$$ \frac{\partial \, \mathbf{X}}{\partial \, \mathbf{X}} = \left[ \begin{array}{c|c|c|c} \begin{matrix} 1 & 0 & \cdots & 0 \\ 0 & 0 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 0 \end{matrix} & \begin{matrix} 0 & 1 & \cdots & 0 \\ 0 & 0 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 0 \end{matrix} & \begin{matrix}\cdots \\ \cdots \\ \cdots \\ \cdots \end{matrix} & \begin{matrix} 0 & 0 & \cdots & 1 \\ 0 & 0 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 0 \end{matrix} \\ \begin{matrix}-&-&-&-\end{matrix} & \begin{matrix}-&-&-&-\end{matrix} & \begin{matrix}-&-&-&-\end{matrix} & \begin{matrix}-&-&-&-\end{matrix} \\ \begin{matrix} 0 & 0 & \cdots & 0 \\ 1 & 0 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 0 \end{matrix} & \begin{matrix} 0 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 0 \end{matrix} & \begin{matrix}\cdots \\ \cdots \\ \cdots \\ \cdots \end{matrix} & \begin{matrix} 0 & 0 & \cdots & 0 \\ 0 & 0 & \cdots & 1 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 0 \end{matrix} \\ \begin{matrix}-&-&-&-\end{matrix} & \begin{matrix}-&-&-&-\end{matrix} & \begin{matrix}-&-&-&-\end{matrix} & \begin{matrix}-&-&-&-\end{matrix} \\ \begin{matrix}\vdots&\vdots&\vdots&\vdots\end{matrix} & \begin{matrix}\vdots&\vdots&\vdots&\vdots\end{matrix} & \begin{matrix}\vdots&\vdots&\vdots&\vdots\end{matrix} & \begin{matrix}\vdots&\vdots&\vdots&\vdots\end{matrix} \\ \begin{matrix}-&-&-&-\end{matrix} & \begin{matrix}-&-&-&-\end{matrix} & \begin{matrix}-&-&-&-\end{matrix} & \begin{matrix}-&-&-&-\end{matrix} \\ \begin{matrix} 0 & 0 & \cdots & 0 \\ 0 & 0 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 1 & 0 & \cdots & 0 \end{matrix} & \begin{matrix} 0 & 0 & \cdots & 0 \\ 0 & 0 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 1 & \cdots & 0 \end{matrix} & \begin{matrix}\cdots \\ \cdots \\ \cdots \\ \cdots \end{matrix} & 
\begin{matrix} 0 & 0 & \cdots & 0 \\ 0 & 0 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{matrix} \end{array} \right] $$
Multiplying the two matrices above first gives an $m \times n^2$ matrix made of $n$ submatrices of size $m \times n$ laid out side by side, in which each submatrix has $\mathbf{a}$ as the column that corresponds to its own position in the whole matrix.
$$ \left( \mathbf{I}_m \otimes \mathbf{a}^{\text{T}}\right) \frac{\partial \, \mathbf{X}}{\partial \, \mathbf{X}} = \left[ \begin{array}{c|c|c|c} \begin{matrix} a_1 & 0 & \cdots & 0 \\ a_2 & 0 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ a_m & 0 & \cdots & 0 \end{matrix} & \begin{matrix} 0 & a_1 & \cdots & 0 \\ 0 & a_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & a_m & \cdots & 0 \end{matrix} & \begin{matrix}\cdots \\ \cdots \\ \cdots \\ \cdots \end{matrix} & \begin{matrix} 0 & 0 & \cdots & a_1 \\ 0 & 0 & \cdots & a_2 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & a_m \end{matrix} \end{array} \right] $$
Meanwhile, $\left( \mathbf{I}_n \otimes \mathbf{b} \right)$ is an $n^2 \times n$ matrix made of $n \times n$ submatrices stacked vertically, in which each submatrix has $\mathbf{b}$ as the column that corresponds to its own position in the whole matrix.
$$ \left( \mathbf{I}_n \otimes \mathbf{b} \right) = \begin{bmatrix} \begin{matrix} b_1 & 0 & \cdots & 0 \\ b_2 & 0 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ b_n & 0 & \cdots & 0 \end{matrix} \\ \begin{matrix} - & - & - & - \end{matrix} \\ \begin{matrix} 0 & b_1 & \cdots & 0 \\ 0 & b_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & b_n & \cdots & 0 \end{matrix} \\ \begin{matrix} - & - & - & - \end{matrix} \\ \begin{matrix} \vdots & \vdots & \vdots & \vdots \end{matrix}\\ \begin{matrix} - & - & - & - \end{matrix} \\ \begin{matrix} 0 & 0 & \cdots & b_1 \\ 0 & 0 & \cdots & b_2 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & b_n \end{matrix} \end{bmatrix} $$
Finally, multiplying the two matrices gives the desired result.
$$ \left( \mathbf{I}_m \otimes \mathbf{a}^{\text{T}}\right) \frac{\partial \, \mathbf{X}}{\partial \, \mathbf{X}} \left( \mathbf{I}_n \otimes \mathbf{b} \right) = \begin{bmatrix} a_1 b_1 & a_1 b_2 & \cdots & a_1 b_n \\ a_2 b_1 & a_2 b_2 & \cdots & a_2 b_n \\ \vdots & \vdots & \ddots & \vdots \\ a_m b_1 & a_m b_2 & \cdots & a_m b_n \end{bmatrix} = \mathbf{a} \mathbf{b}^{\text{T}} $$
The derivative of $\mathbf{a}^{\text{T}} \mathbf{X}^{-1} \mathbf{b}$ with respect to $\mathbf{X}$ can be shown using the preceding derivative formula (Matrix Cookbook eq. 70) together with the following two properties of the Kronecker product[11]:
$$ \left( \mathbf{A} \otimes \mathbf{B} \right)^{-1} = \mathbf{A}^{-1} \otimes \mathbf{B}^{-1} $$
$$ \left( \mathbf{A} \otimes \mathbf{B} \right)\left( \mathbf{C} \otimes \mathbf{D} \right) = \mathbf{A}\mathbf{C} \otimes \mathbf{B}\mathbf{D} $$
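The two Kronecker-product properties can be confirmed on small matrices with `np.kron`. The matrices below are arbitrary examples chosen to be invertible where needed and to have compatible shapes for the products $\mathbf{AC}$ and $\mathbf{BD}$.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])            # 2 x 2, invertible
B = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])       # 3 x 3, invertible
rng = np.random.default_rng(6)
C = rng.normal(size=(2, 4))           # A @ C is defined
D = rng.normal(size=(3, 2))           # B @ D is defined

# (A kron B)^{-1} = A^{-1} kron B^{-1}
inverse_ok = np.allclose(
    np.linalg.inv(np.kron(A, B)),
    np.kron(np.linalg.inv(A), np.linalg.inv(B)),
)

# (A kron B)(C kron D) = (AC) kron (BD)
mixed_product_ok = np.allclose(
    np.kron(A, B) @ np.kron(C, D),
    np.kron(A @ C, B @ D),
)
print(inverse_ok, mixed_product_ok)
```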
Assume an invertible matrix $\mathbf{X} : m \times m$ and arbitrary vectors $\mathbf{a}^{\text{T}} : 1 \times m$, $\mathbf{b} : m \times 1$.
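Before working through the derivation, the identity this section arrives at, $\frac{\partial \, \mathbf{a}^{\text{T}} \mathbf{X}^{-1} \mathbf{b}}{\partial \, \mathbf{X}} = - \mathbf{X}^{-\text{T}} \mathbf{a} \mathbf{b}^{\text{T}} \mathbf{X}^{-\text{T}}$, can be sanity-checked by finite differences. This is a minimal sketch assuming NumPy; the helper `entrywise_gradient` and the concrete values of `X`, `a`, `b` are hypothetical.

```python
import numpy as np

def entrywise_gradient(f, X, eps=1e-6):
    """G[i, j] approximates d f / d X[i, j] by central differences."""
    G = np.zeros_like(X, dtype=float)
    for idx in np.ndindex(*X.shape):
        step = np.zeros_like(X, dtype=float)
        step[idx] = eps
        G[idx] = (f(X + step) - f(X - step)) / (2 * eps)
    return G

X = np.array([[2.0, 1.0, 0.0],
              [0.0, 3.0, 1.0],
              [1.0, 0.0, 4.0]])        # arbitrary invertible example
a = np.array([[1.0], [-2.0], [0.5]])
b = np.array([[0.3], [1.0], [2.0]])

# Scalar function a^T X^{-1} b
f = lambda M: (a.T @ np.linalg.inv(M) @ b).item()

G = entrywise_gradient(f, X)

X_inv_T = np.linalg.inv(X).T
closed_form = -X_inv_T @ a @ b.T @ X_inv_T

inverse_derivative_ok = np.allclose(G, closed_form, atol=1e-5)
print(inverse_derivative_ok)
```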
$$ \begin{align} \frac{\partial \, \mathbf{a}^{\text{T}} \mathbf{X}^{-1} \mathbf{b}}{\partial \, \mathbf{X}} &= \frac{\partial \, \mathbf{a}^{\text{T}} \mathbf{X}^{-1}}{\partial \, \mathbf{X}} \left( \mathbf{I}_m \otimes \mathbf{b} \right) + \left( \mathbf{I}_m \otimes \mathbf{a}^{\text{T}} \mathbf{X}^{-1} \right) \frac{\partial \, \mathbf{b}}{\partial \, \mathbf{X}} \\[5pt] &= \left( \frac{\partial \, \mathbf{a}^{\text{T}}}{\partial \, \mathbf{X}} \left( \mathbf{I}_m \otimes \mathbf{X}^{-1} \right) + \left( \mathbf{I}_m \otimes \mathbf{a}^{\text{T}} \right) \frac{\partial \, \mathbf{X}^{-1}}{\partial \, \mathbf{X}} \right) \left( \mathbf{I}_m \otimes \mathbf{b} \right) \\[5pt] &= \left( \mathbf{I}_m \otimes \mathbf{a}^{\text{T}} \right) \frac{\partial \, \mathbf{X}^{-1}}{\partial \, \mathbf{X}} \left( \mathbf{I}_m \otimes \mathbf{b} \right) \end{align} \tag{1} $$
Here the terms containing $\frac{\partial \, \mathbf{a}^{\text{T}}}{\partial \, \mathbf{X}}$ and $\frac{\partial \, \mathbf{b}}{\partial \, \mathbf{X}}$ vanish because $\mathbf{a}$ and $\mathbf{b}$ do not depend on $\mathbf{X}$. Meanwhile, from $\mathbf{X}^{-1} \mathbf{X} = \mathbf{I}$,
$$ \frac{\partial \,\mathbf{X}^{-1} \mathbf{X}}{\partial \, \mathbf{X}} = \frac{\partial \, \mathbf{I}}{\partial \, \mathbf{X}} \\[5pt] \frac{\partial \,\mathbf{X}^{-1} }{\partial \, \mathbf{X}}\left( \mathbf{I}_m \otimes \mathbf{X} \right) + \left( \mathbf{I}_m \otimes \mathbf{X}^{-1} \right) \frac{\partial \, \mathbf{X}}{\partial \, \mathbf{X}} = \mathbf{0} \\[5pt] \frac{\partial \,\mathbf{X}^{-1} }{\partial \, \mathbf{X}} \left( \mathbf{I}_m \otimes \mathbf{X} \right) \left( \mathbf{I}_m \otimes \mathbf{X} \right)^{-1} + \left( \mathbf{I}_m \otimes \mathbf{X}^{-1} \right) \frac{\partial \, \mathbf{X}}{\partial \, \mathbf{X}} \left( \mathbf{I}_m \otimes \mathbf{X} \right)^{-1}= \mathbf{0} \\[5pt] \frac{\partial \,\mathbf{X}^{-1} }{\partial \, \mathbf{X}} = - \left( \mathbf{I}_m \otimes \mathbf{X}^{-1} \right) \frac{\partial \, \mathbf{X}}{\partial \, \mathbf{X}} \left( \mathbf{I}_m \otimes \mathbf{X} \right)^{-1} $$
Now using $ \left( \mathbf{A} \otimes \mathbf{B} \right)^{-1} = \mathbf{A}^{-1} \otimes \mathbf{B}^{-1}$, this becomes
$$ \frac{\partial \,\mathbf{X}^{-1} }{\partial \, \mathbf{X}} = - \left( \mathbf{I}_m \otimes \mathbf{X}^{-1} \right) \frac{\partial \, \mathbf{X}}{\partial \, \mathbf{X}} \left( \mathbf{I}_m \otimes \mathbf{X}^{-1} \right) \tag{2} $$
Substituting (2) into (1) gives
$$ \begin{align} \frac{\partial \, \mathbf{a}^{\text{T}} \mathbf{X}^{-1} \mathbf{b}}{\partial \, \mathbf{X}} &= - \left( \mathbf{I}_m \otimes \mathbf{a}^{\text{T}} \right) \left( \mathbf{I}_m \otimes \mathbf{X}^{-1} \right) \frac{\partial \, \mathbf{X}}{\partial \, \mathbf{X}} \left( \mathbf{I}_m \otimes \mathbf{X}^{-1} \right) \left( \mathbf{I}_m \otimes \mathbf{b} \right) \\[5pt] &= - \left( \mathbf{I}_m \otimes \mathbf{a}^{\text{T}} \mathbf{X}^{-1} \right) \frac{\partial \, \mathbf{X}}{\partial \, \mathbf{X}} \left( \mathbf{I}_m \otimes \mathbf{X}^{-1} \mathbf{b} \right) \quad \because \left( \mathbf{A} \otimes \mathbf{B} \right)\left( \mathbf{C} \otimes \mathbf{D} \right) = \mathbf{A}\mathbf{C} \otimes \mathbf{B}\mathbf{D} \end{align} $$
This expression has the same form as the one that appeared when showing the earlier derivative formula
$$ \begin{align} \frac{\partial \, \mathbf{a}^{\text{T}} \mathbf{X} \mathbf{b}}{\partial \, \mathbf{X}} = \left( \mathbf{I}_m \otimes \mathbf{a}^{\text{T}}\right) \frac{\partial \, \mathbf{X}}{\partial \, \mathbf{X}} \left( \mathbf{I}_n \otimes \mathbf{b} \right) = \mathbf{a} \mathbf{b}^{\text{T}} \end{align} $$
with $\mathbf{a}^{\text{T}}$ replaced by $\mathbf{a}^{\text{T}} \mathbf{X}^{-1}$ and $\mathbf{b}$ replaced by $\mathbf{X}^{-1} \mathbf{b}$ (and $n = m$), so following the same steps finally gives
$$ \frac{\partial \, \mathbf{a}^{\text{T}} \mathbf{X}^{-1} \mathbf{b}}{\partial \, \mathbf{X}} = - \mathbf{X}^{-\text{T}} \mathbf{a} \mathbf{b}^{\text{T}} \mathbf{X}^{-\text{T}} $$
[1] Montana State University, 2012
[2] Old and New Matrix Algebra Useful for Statistics, Thomas P. Minka, MIT Media Lab note (1997; revised December 28, 2000). Retrieved 5 February 2016.
[3] Linear Algebra & Matrix Calculus: https://www.slideshare.net/ssuser7e10e4/matrix-calculus, 임성빈
[4] The Matrix Cookbook, Kaare Brandt Petersen & Michael Syskind Pedersen, 2012
[5] Advanced Engineering Mathematics 7.7 & 7.8, Erwin Kreyszig, Wiley
[6] Adjugate_matrix: https://en.wikipedia.org/wiki/Adjugate_matrix
[7] Matrix_calculus: https://en.wikipedia.org/wiki/Matrix_calculus
[8] Differential_(infinitesimal): https://en.wikipedia.org/wiki/Differential_(infinitesimal) (Note: the address includes everything through "(infinitesimal)", and there is an underscore before the "(")
[9] Derivative: https://en.wikipedia.org/wiki/Derivative
[10] Jacobi's_formula: https://en.wikipedia.org/wiki/Jacobi%27s_formula
[11] Kronecker product: https://en.wikipedia.org/wiki/Kronecker_product