Originally published at Inside the Mind of the G Machine. You can comment here or there.

Jordan normal form states that every square matrix is similar to a Jordan normal form matrix, one of the form

where each of the is square of the form

.

This is constructed via generalized eigenvectors. One can observe that each block matrix corresponds to an invariant subspace, and generalized eigenvectors (of a matrix) are a set of chains, each of which is its own invariant subspace.

We let be any linear transformation from to , where is of course a vector space.

It is more common knowledge that is the mechanism used to solve for eigenvectors. Let us first observe that is such that also since commutes with and that this extends to for any natural number . This gives us a way to identify larger invariant subspaces.

Let . is obvious, and there will be some and a smallest at which . Afterwards, the must all be equal. If not, there will be in the intermediary on iterating against a vector from which is contradicted.

Next, we observe that . Suppose not. Then, we would have some but not in .

Rank nullity theorem says that the remainder after pulling out for some eigenvalue is . We can run the same algorithm on that for another eigenvalue. So this is resolved by induction.

The result is that if has distinct eigenvalues , there are such that the domain of is

.

Now does correspond to a irreducible invariant subspace necessarily? No, as there is a difference between algebraic and geometric multiplicity.

Now we will show, as the **second part** of the proof, to be invoked on the components in the direct sum decomposition from the preceding **first part** of the proof that if is nilpotent, meaning that for some , then there are and such that is a basis (linearly independent by definition) for the domain of , with and . (Note that here is the smallest with .)

For any eigenvalue, there is an eigenvector space associated with it. Take its preimage with respect to . Do this successively until nothing remains, which will be at the th iteration. In particular, take to be the basis of the eigenvector space. For each one of these that has non-empty preimage, we take the element with the kernel projected out. This accumulates a set of vectors of the format specified. It has to be exhaustive with respect to the vector space under the nilpotence assumption, from which termination is also guaranteed. It remains to show that these are linearly independent. We can using our eigenvector space as our base case take an inductive hypothesis where the vectors accumulated prior to the th iteration are linearly independent. Now we show that the vector set remains so after adding in the ones obtained from taking preimage. We note that first, the added ones are linearly independent themselves (if a nontrivial linear combinations gives zero, applying would violate our inductive hypothesis). There is also that a nontrivial linear combination of the newly added ones cannot equal a linear combination of the rest. To show this, assume otherwise, and apply just enough times ( times) for one side to disappear. The other side should be a linear combination with respect to our designated basis of the eigenvector space, which cannot disappear. This concludes our construction.

Essentially we have chains (of vectors) which terminate when an element is no longer found after our preimage operation. Applying to this , we see that for some element in our chain, , where is the previous element of the chain, with signifying that we are the at an eigenvector (non-generalized), at the front. Ones along the superdiagonal correspond to the coefficient of above.