Coefficients and Characteristic Polynomials
Last time, in our discussion of invariant polynomials, I mentioned that the coefficients $f_k(A)$ of the characteristic polynomial of a matrix $A$ are, up to sign, the sums of the principal $k \times k$ minors of $A$. This post will be a short one, giving some justification for this claim.
Before we prove the result mentioned above, let’s recall what is meant by the adjoint, or adjugate, of a matrix. Given an $n \times n$ matrix $A = [a^i_j]$, the $(i,j)$ cofactors $C^i_j$, defined by
\[C^i_j := (-1)^{i+j}M^i_j,\]where $M^i_j$ is the $(i,j)$ minor of $A$ (the determinant of the submatrix obtained by deleting the $i$th row and $j$th column), can be arranged into a matrix $C = [C^i_j]$, giving rise to the following definition: the adjugate of $A$ is the transpose of the cofactor matrix, $\operatorname{adj}(A) := C^\top$.
The adjugate satisfies some reasonable properties, such as $\operatorname{adj}(cA) = c^{n-1}\operatorname{adj}(A)$ for any scalar $c$, but the one we are mostly interested in today is Jacobi’s formula, which expresses the derivative of the determinant of $A$ in terms of the adjugate of $A$ and the derivative of $A$: for a differentiable family of matrices $A(t)$,
\[\frac{d}{dt}\det A(t) = \operatorname{tr}\!\left(\operatorname{adj}(A(t))\,\frac{dA(t)}{dt}\right).\]
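If you want to play with this, here is a quick numerical sanity check, a sketch in Python with numpy; the `adjugate` helper and the test matrices are my own, not from any particular library:

```python
import numpy as np

def adjugate(A):
    """Adjugate of A: transpose of the cofactor matrix C^i_j = (-1)^(i+j) M^i_j."""
    n = A.shape[0]
    C = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            minor = np.delete(np.delete(A, i, axis=0), j, axis=1)
            C[i, j] = (-1) ** (i + j) * np.linalg.det(minor)
    return C.T

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
B = np.array([[0.0, 1.0, 2.0],
              [1.0, 0.0, 1.0],
              [2.0, 1.0, 0.0]])

# Fundamental identity: A adj(A) = det(A) I.
print(np.allclose(A @ adjugate(A), np.linalg.det(A) * np.eye(3)))  # True

# Jacobi's formula: d/dt det(A + tB) at t = 0 equals tr(adj(A) B);
# compare against a central finite difference.
h = 1e-6
numeric = (np.linalg.det(A + h * B) - np.linalg.det(A - h * B)) / (2 * h)
print(np.isclose(numeric, np.trace(adjugate(A) @ B)))  # True
```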
The above formula will be extremely useful in proving the following fact about the coefficients of the characteristic polynomial $\det(t I - A)$: the coefficient $f_k(A)$ of $t^{n-k}$ is $(-1)^k$ times the sum of the principal $k \times k$ minors of $A$.
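The fact itself is easy to check numerically for a concrete matrix; here is a quick sketch in Python (the $4 \times 4$ test matrix is arbitrary, and `np.poly` returns the coefficients of $\det(tI - A)$ with the leading coefficient first, i.e. $f_0 = 1, f_1, \ldots, f_n$):

```python
import numpy as np
from itertools import combinations

def principal_minor_sum(A, k):
    """Sum of all k x k principal minors of A."""
    n = A.shape[0]
    return sum(np.linalg.det(A[np.ix_(S, S)])
               for S in combinations(range(n), k))

A = np.array([[4.0, 1.0, 0.0, 2.0],
              [1.0, 3.0, 1.0, 0.0],
              [0.0, 1.0, 2.0, 1.0],
              [2.0, 0.0, 1.0, 5.0]])

f = np.poly(A)  # coefficients of det(tI - A), highest power first
for k in range(1, 5):
    print(np.isclose(f[k], (-1) ** k * principal_minor_sum(A, k)))  # True
```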
The last argument in the above proof might be a bit cryptic, so here is an illustration of what’s happening. Consider a $4 \times 4$ matrix $A$ given by
\[A = \begin{pmatrix} a^1_1 & a^1_2 & a^1_3 & a^1_4 \\ a^2_1 & a^2_2 & a^2_3 & a^2_4 \\ a^3_1 & a^3_2 & a^3_3 & a^3_4 \\ a^4_1 & a^4_2 & a^4_3 & a^4_4 \end{pmatrix}.\]The principal $3 \times 3$ submatrices $A_1, A_2, A_3, A_4$, where $A_i$ is obtained by deleting the $i$th row and column of $A$, are given by
\[\begin{pmatrix} a^2_2 & a^2_3 & a^2_4 \\ a^3_2 & a^3_3 & a^3_4 \\ a^4_2 & a^4_3 & a^4_4 \end{pmatrix} \quad \begin{pmatrix} a^1_1 & a^1_3 & a^1_4 \\ a^3_1 & a^3_3 & a^3_4 \\ a^4_1 & a^4_3 & a^4_4 \end{pmatrix} \quad \begin{pmatrix} a^1_1 & a^1_2 & a^1_4 \\ a^2_1 & a^2_2 & a^2_4 \\ a^4_1 & a^4_2 & a^4_4 \end{pmatrix} \quad \begin{pmatrix} a^1_1 & a^1_2 & a^1_3 \\ a^2_1 & a^2_2 & a^2_3 \\ a^3_1 & a^3_2 & a^3_3 \end{pmatrix}.\]Looking at, for example, $A_1$, the sum $\operatorname{tr}(\operatorname{adj}(A_1))$ is given by summing the $2 \times 2$ minors
\[{\color{ForestGreen}\begin{vmatrix} a^3_3 & a^3_4 \\ a^4_3 & a^4_4 \end{vmatrix}} \quad {\color{Sepia}\begin{vmatrix} a^2_2 & a^2_4 \\ a^4_2 & a^4_4 \end{vmatrix}} \quad {\color{RoyalBlue}\begin{vmatrix} a^2_2 & a^2_3 \\ a^3_2 & a^3_3 \end{vmatrix}}.\]For $A_2$ we sum over
\[{\color{ForestGreen}\begin{vmatrix} a^3_3 & a^3_4 \\ a^4_3 & a^4_4 \end{vmatrix}} \quad {\color{Mulberry}\begin{vmatrix} a^1_1 & a^1_4 \\ a^4_1 & a^4_4 \end{vmatrix}} \quad {\color{ProcessBlue}\begin{vmatrix} a^1_1 & a^1_3 \\ a^3_1 & a^3_3 \end{vmatrix}}\]and for $A_3$
\[{\color{Sepia}\begin{vmatrix} a^2_2 & a^2_4 \\ a^4_2 & a^4_4 \end{vmatrix}} \quad {\color{Mulberry}\begin{vmatrix} a^1_1 & a^1_4 \\ a^4_1 & a^4_4 \end{vmatrix}} \quad {\color{BurntOrange}\begin{vmatrix} a^1_1 & a^1_2 \\ a^2_1 & a^2_2 \end{vmatrix}}.\]Lastly, for the matrix $A_4$ we have
\[{\color{RoyalBlue}\begin{vmatrix} a^2_2 & a^2_3 \\ a^3_2 & a^3_3 \end{vmatrix}} \quad {\color{ProcessBlue}\begin{vmatrix} a^1_1 & a^1_3 \\ a^3_1 & a^3_3 \end{vmatrix}} \quad {\color{BurntOrange}\begin{vmatrix} a^1_1 & a^1_2 \\ a^2_1 & a^2_2 \end{vmatrix}}.\]The reason this happens is that if we omit, say, the first and the third rows and columns of $A$, we obtain the minor
\[\begin{vmatrix} a^2_2 & a^2_4 \\ a^4_2 & a^4_4 \end{vmatrix}\]which shows up in the sum $\sum_{i=1}^n\operatorname{tr}(\operatorname{adj}(t I - A_i))$ once when $i = 1$ and a second time when $i = 3$, as illustrated above. Similar reasoning shows that the $1 \times 1$ minors appear $3! = 6$ times.
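The double counting can also be seen concretely in code. The sketch below checks that $\sum_{i}\operatorname{tr}(\operatorname{adj}(A_i))$ equals twice the sum of the principal $2 \times 2$ minors, for an arbitrary $4 \times 4$ matrix; for simplicity it works with the submatrices $A_i$ themselves rather than $tI - A_i$, and the `adjugate` helper is hand-rolled:

```python
import numpy as np
from itertools import combinations

def adjugate(M):
    """Adjugate: transpose of the cofactor matrix."""
    n = M.shape[0]
    C = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            minor = np.delete(np.delete(M, i, axis=0), j, axis=1)
            C[i, j] = (-1) ** (i + j) * np.linalg.det(minor)
    return C.T

A = np.array([[4.0, 1.0, 0.0, 2.0],
              [1.0, 3.0, 1.0, 0.0],
              [0.0, 1.0, 2.0, 1.0],
              [2.0, 0.0, 1.0, 5.0]])

# A_i: delete the i-th row and column of A.
subs = [np.delete(np.delete(A, i, axis=0), i, axis=1) for i in range(4)]
lhs = sum(np.trace(adjugate(Ai)) for Ai in subs)

# A 2x2 principal minor of A is indexed by a pair S of rows/columns, and it
# shows up in tr(adj(A_i)) exactly when i is NOT in S -- i.e. twice.
rhs = 2 * sum(np.linalg.det(A[np.ix_(S, S)])
              for S in combinations(range(4), 2))
print(np.isclose(lhs, rhs))  # True
```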
Before wrapping this up, I want to give a different viewpoint on the matter. The classical definition of the determinant for an $n \times n$ matrix $A$ is given by
\[\det(A) = \sum_{\pi \in S_n} \operatorname{sgn}(\pi)a^1_{\pi(1)}\cdots a^n_{\pi(n)}.\]Now, there might be a time in your life when you would like to have an alternative description for this, and it turns out there is one. The Levi-Civita symbol, defined by
\[\varepsilon^{i_{1}i_{2}i_{3}\ldots i_{n}}={\begin{cases}+1&{\text{if }}(i_{1},i_{2},i_{3},\ldots,i_{n}){\text{ is an even permutation of }}(1,2,3,\dots ,n)\\-1&{\text{if }}(i_{1},i_{2},i_{3},\ldots ,i_{n}){\text{ is an odd permutation of }}(1,2,3,\dots ,n)\\\;\;\,0&{\text{otherwise}}\end{cases}}\]allows us to write
\[\det(A) = \varepsilon^{i_{1}i_{2}i_{3}\ldots i_{n}}a^1_{i_1}\cdots a^n_{i_n}\]which has a whopping $n^n$ terms compared to the $n!$ of the classical definition, but at least we can use the Einstein summation convention here. Let’s make this even worse and get rid of the fixed row indices as well. We obtain
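The two determinant formulas are easy to compare by brute force. In the sketch below, the `sgn` helper computes the sign of a permutation by counting inversions, and the $\varepsilon$-sum simply skips index tuples with repeats, where the symbol vanishes:

```python
import numpy as np
from itertools import permutations, product

def sgn(p):
    """Sign of a permutation (given as a tuple) via its inversion count."""
    s = 1
    for a in range(len(p)):
        for b in range(a + 1, len(p)):
            if p[a] > p[b]:
                s = -s
    return s

def det_classical(A):
    """det(A) = sum over pi in S_n of sgn(pi) a^1_pi(1) ... a^n_pi(n)."""
    n = A.shape[0]
    return sum(sgn(p) * np.prod([A[r, p[r]] for r in range(n)])
               for p in permutations(range(n)))

def det_levi_civita(A):
    """det(A) = eps^{i1...in} a^1_{i1} ... a^n_{in}: n^n terms, most zero."""
    n = A.shape[0]
    total = 0.0
    for idx in product(range(n), repeat=n):
        if len(set(idx)) == n:  # eps vanishes on repeated indices
            total += sgn(idx) * np.prod([A[r, idx[r]] for r in range(n)])
    return total

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
print(det_classical(A), det_levi_civita(A), np.linalg.det(A))
```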
\[\begin{align*} \det(A) &= \frac{1}{n!}\varepsilon^{i_{1}i_{2}i_{3}\ldots i_{n}}\varepsilon_{j_{1}j_{2}j_{3}\ldots j_{n}}a^{j_1}_{i_1}\cdots a^{j_n}_{i_n} \\ &= \frac{1}{n!}\delta^{i_{1}i_{2}i_{3}\ldots i_{n}}_{j_{1}j_{2}j_{3}\ldots j_{n}}a^{j_1}_{i_1}\cdots a^{j_n}_{i_n}. \end{align*}\]The reason I wanted to do this is that now we have a more explicit description for the coefficient $f_{k}(A)$ of $t^{n-k}$ as
\[f_k(A) = \frac{(-1)^k}{k!}\delta^{i_{1}i_{2}i_{3}\ldots i_{k}}_{j_{1}j_{2}j_{3}\ldots j_{k}}a^{j_1}_{i_1}\cdots a^{j_k}_{i_k},\]which is… actually, I believe this is a good point to end this.
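Well, almost: here is one last brute-force sanity check of this formula. The generalized Kronecker delta is computed as the determinant of the $k \times k$ matrix of ordinary deltas, and the explicit factor $(-1)^k$ matches the convention that $f_k(A)$ is the coefficient of $t^{n-k}$ in $\det(tI - A)$; the helper names are my own.

```python
import numpy as np
from itertools import product
from math import factorial

def gen_delta(upper, lower):
    """Generalized Kronecker delta: det of the k x k matrix [delta(upper_a, lower_b)]."""
    k = len(upper)
    M = np.array([[1.0 if upper[a] == lower[b] else 0.0
                   for b in range(k)] for a in range(k)])
    return np.linalg.det(M) if k else 1.0

def f_k(A, k):
    """(-1)^k / k! * delta^{i1...ik}_{j1...jk} a^{j1}_{i1} ... a^{jk}_{ik}."""
    n = A.shape[0]
    total = 0.0
    for I in product(range(n), repeat=k):
        for J in product(range(n), repeat=k):
            d = gen_delta(I, J)
            if d:
                total += d * np.prod([A[J[m], I[m]] for m in range(k)])
    return (-1) ** k * total / factorial(k)

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
coeffs = np.poly(A)  # coefficients of det(tI - A): f_0, f_1, ..., f_n
for k in range(1, 4):
    print(np.isclose(f_k(A, k), coeffs[k]))  # True
```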