Optional: Convergence of exp

Convergence

Weierstrass M-test

We will now prove that the power series $\exp(X) = \sum_{n \ge 0} \frac{1}{n!} X^n$ converges absolutely and uniformly on bounded sets. This will be relatively painless, but then I will show that the power series for the partial derivatives converge uniformly on bounded sets (which is needed to deduce that exp is continuously differentiable and that we can differentiate term by term). This will take a lot longer, and should only be watched by those desiring a course of analytic self-flagellation.

We will use the Weierstrass M-test:

Theorem:

If $f_k(X)$ is a sequence of functions such that $\|f_k(X)\| \le M_k$ for all $k$ and $X$ then, if $\sum_k M_k$ converges, $\sum_k f_k$ converges uniformly to a function $f(X)$.
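To make the M-test concrete, here is a quick scalar sanity check (my own toy example, not from the notes): on $|x| \le R$ the terms $f_k(x) = x^k/k!$ are dominated by $M_k := R^k/k!$, and $\sum_k M_k$ converges to $\exp(R)$.

```python
import math

# Toy illustration of the Weierstrass M-test hypothesis (scalar case):
# on |x| <= R the terms f_k(x) = x^k / k! satisfy |f_k(x)| <= M_k := R^k / k!,
# and sum(M_k) converges (to exp(R)), so sum(f_k) converges uniformly there.
R = 2.0

def term(k, x):
    return x**k / math.factorial(k)

M = [R**k / math.factorial(k) for k in range(50)]
# The sum of the bounds converges to e^R (tail beyond k=49 is negligible).
assert abs(sum(M) - math.exp(R)) < 1e-12

# Check the termwise bound |f_k(x)| <= M_k on a grid of points in [-R, R].
for k in range(50):
    for x in [-R + i * R / 10 for i in range(21)]:
        assert abs(term(k, x)) <= M[k] + 1e-15
print("M-test bounds verified")
```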

Uniform convergence on bounded sets

So our strategy will be to find an upper bound for the operator norm of the partial sums $f_k(X) = \sum_{n=0}^{k} \frac{1}{n!} X^n$ which is independent of $X$. This will fail because exp doesn't converge uniformly everywhere: we will need to assume $\|X\|_{op} \le R$ for some $R$. We will then deduce uniform convergence on this bounded subset of $\mathfrak{gl}(n, \mathbf{R})$. But if we're interested in a particular matrix then it will satisfy $\|X\|_{op} \le R$ for some $R$, so this is all we need.

Omitting the subscript $op$, we have
$$\|f_k(X)\| = \Big\|\sum_{n=0}^{k} \frac{1}{n!} X^n\Big\| \le \sum_{n=0}^{k} \Big\|\frac{1}{n!} X^n\Big\| \le \sum_{n=0}^{k} \frac{1}{n!} \|X\|^n$$
by the triangle inequality and the fact that $\|X^n\| \le \|X\|^n$. Now, assuming $\|X\| \le R$, we get
$$\|f_k(X)\| \le \sum_{n=0}^{k} \frac{R^n}{n!}.$$
Define $M_k := \sum_{n=0}^{k} \frac{R^n}{n!}$ and observe that $M_k$ converges to $\exp(R)$ as $k \to \infty$: in other words, the terms $\frac{1}{n!} X^n$ of our series are dominated by the terms $\frac{R^n}{n!}$ of a convergent series.

Note that $R$ is a number, so here we're just using convergence of the usual exponential function rather than matrix exp. We now apply the Weierstrass M-test and deduce uniform convergence of exp for $\|X\|_{op} \le R$.
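This uniform bound can be sanity-checked numerically. The sketch below (pure Python, my own adaptation) uses the $L^1$ norm instead of the operator norm, because the same triangle-inequality argument works for any submultiplicative norm and the $L^1$ norm is easy to compute; the truncation error of the partial sum is compared against the scalar tail $\sum_{n > K} R^n/n!$.

```python
import math, random

# Sanity check of the tail bound for matrix exp (adaptation: the notes use the
# operator norm, but the L1 norm -- sum of |entries| -- is also submultiplicative,
# so the same bound ||X^n|| <= ||X||^n holds and is easy to compute here).
N, R, K = 3, 1.5, 20

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]

def l1_norm(A):
    return sum(abs(x) for row in A for x in row)

def exp_partial_sum(X, K):
    """Partial sum sum_{n=0}^K X^n / n!  (starts from the identity, the n=0 term)."""
    S = [[float(i == j) for j in range(N)] for i in range(N)]
    P = [[float(i == j) for j in range(N)] for i in range(N)]
    for n in range(1, K + 1):
        P = mat_mul(P, X)
        for i in range(N):
            for j in range(N):
                S[i][j] += P[i][j] / math.factorial(n)
    return S

random.seed(0)
X = [[random.uniform(-1, 1) for _ in range(N)] for _ in range(N)]
scale = R / l1_norm(X)
X = [[scale * x for x in row] for row in X]          # now ||X||_L1 == R

S_K = exp_partial_sum(X, K)
S_big = exp_partial_sum(X, 2 * K)                    # proxy for the limit
tail = sum(R**n / math.factorial(n) for n in range(K + 1, 100))
diff = l1_norm([[S_big[i][j] - S_K[i][j] for j in range(N)] for i in range(N)])
assert diff <= tail + 1e-12
print("truncation error", diff, "<= tail bound", tail)
```

Note the bound depends only on $R$, not on which $X$ with $\|X\| \le R$ we picked: that is the uniformity.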

Absolute convergence

Remark:

We've actually proved absolute convergence along the way. This means that if you take norms of every term in the power series then it still converges.

Remark:

Absolute convergence is the property that allows us to do rearrangements without changing the value of the sum.
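A quick illustration of why this matters (my own example, not part of the notes): the alternating harmonic series converges but not absolutely, and the classical one-positive-two-negatives rearrangement halves its sum; nothing like this can happen to the absolutely convergent exp series.

```python
import math

# The alternating harmonic series 1 - 1/2 + 1/3 - ... converges to ln 2,
# but not absolutely (the harmonic series diverges).
def alt_harmonic(n_terms):
    return sum((-1) ** (k + 1) / k for k in range(1, n_terms + 1))

# Classical rearrangement: one positive term, then two negative terms,
# i.e. blocks  1/(2b-1) - 1/(4b-2) - 1/(4b).  This rearrangement sums to ln(2)/2.
def rearranged(n_blocks):
    s = 0.0
    for b in range(1, n_blocks + 1):
        s += 1 / (2 * b - 1) - 1 / (4 * b - 2) - 1 / (4 * b)
    return s

print(alt_harmonic(10**6))   # ~ ln 2    = 0.693...
print(rearranged(10**6))     # ~ ln(2)/2 = 0.346...
```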

Derivatives

We also want to be able to differentiate exp ( X ) term-by-term. For that, we need to show that the sequence of partial derivatives of partial sums converges uniformly on bounded sets. This is where the nightmare begins. Watch on at your own risk.

What do we mean by partial derivative? $\exp(X)$ is a matrix whose entries are functions of the $n^2$ matrix entries $X_{11}, X_{12}, \ldots, X_{1n}, X_{21}, X_{22}, \ldots, X_{nn}$. I'm interested in taking the partial derivative of an entry of $\exp(X)$ with respect to a variable $X_{ij}$.

Example:

For example, $\frac{\partial}{\partial X_{12}}(X_{11} X_{12}) = X_{11}$ and $\frac{\partial}{\partial X_{11}}(X_{22}) = 0$.

We are therefore interested in applying the Weierstrass M-test to the sequence of partial sums:
$$f_K(X) = \frac{\partial}{\partial X_{ij}}\Big(\sum_{n=0}^{K} \frac{1}{n!} X^n\Big).$$
We will now prove that the $L^1$-norm of $f_K(X)$ is bounded by $M_K$ for some convergent sequence $M_K$. Since the $L^1$-norm of a matrix is the sum of absolute values of entries, this means we need to bound $\Big|\frac{\partial}{\partial X_{ij}}\Big(\sum_{n=0}^{K} \frac{1}{n!} X^n\Big)_{k\ell}\Big|$.

This is a finite sum, so we can take the derivative inside the sum and get $\Big|\sum_{n=0}^{K} \frac{1}{n!} \frac{\partial}{\partial X_{ij}} (X^n)_{k\ell}\Big|$. For a start, what is $(X^n)_{k\ell}$? If $X$ is an $N$-by-$N$ matrix (to avoid notation-clashes),
$$(X^n)_{k\ell} = \sum_{i_1=1}^{N} \cdots \sum_{i_{n-1}=1}^{N} X_{k i_1} X_{i_1 i_2} \cdots X_{i_{n-1} \ell}$$
so, using the product rule, and just writing one big sum instead of lots of sums, we get
$$\frac{\partial}{\partial X_{ij}} (X^n)_{k\ell} = \sum \Big( \frac{\partial X_{k i_1}}{\partial X_{ij}} X_{i_1 i_2} \cdots X_{i_{n-1} \ell} + X_{k i_1} \frac{\partial X_{i_1 i_2}}{\partial X_{ij}} \cdots X_{i_{n-1} \ell} + \cdots + X_{k i_1} \cdots X_{i_{n-2} i_{n-1}} \frac{\partial X_{i_{n-1} \ell}}{\partial X_{ij}} \Big).$$
Note that $\partial X_{k i_1} / \partial X_{ij}$ is either 1 or 0: it's 1 if $k = i$ and $i_1 = j$. In terms of the Kronecker delta
$$\delta_{ab} = \begin{cases} 0 & \text{if } a \ne b \\ 1 & \text{if } a = b, \end{cases}$$
this means we have
$$\sum \Big( \delta_{ki} \delta_{i_1 j} X_{i_1 i_2} \cdots X_{i_{n-1} \ell} + X_{k i_1} \delta_{i_1 i} \delta_{i_2 j} X_{i_2 i_3} \cdots X_{i_{n-1} \ell} + \cdots + X_{k i_1} X_{i_1 i_2} \cdots X_{i_{n-2} i_{n-1}} \delta_{i_{n-1} i} \delta_{\ell j} \Big).$$
In the first term, we can group $\delta_{i_1 j} X_{i_1 i_2} \cdots X_{i_{n-1} \ell}$ and when we sum over $i_1, i_2, \ldots, i_{n-1}$ this is just the $j\ell$ matrix entry of $I X \cdots X = X^{n-1}$ (because $\delta_{i_1 j}$ is the $i_1 j$ matrix entry of $I$).

Similarly, in the second term, we can group $X_{k i_1} \delta_{i_1 i}$ and $\delta_{i_2 j} X_{i_2 i_3} \cdots X_{i_{n-1} \ell}$ and, when we perform all the sums, these become $X_{ki}$ and $(X^{n-2})_{j\ell}$.

Proceeding in this manner, the sum goes away and we get:
$$\frac{\partial}{\partial X_{ij}} (X^n)_{k\ell} = \delta_{ki} (X^{n-1})_{j\ell} + X_{ki} (X^{n-2})_{j\ell} + \cdots + (X^{n-1})_{ki} \delta_{\ell j}.$$
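This formula can be sanity-checked with a finite difference. The sketch below (pure Python, my own check, not from the notes) compares the right-hand side, written as $\sum_{m=0}^{n-1} (X^m)_{ki} (X^{n-1-m})_{j\ell}$, against a central difference quotient for a random $3 \times 3$ matrix and $n = 4$.

```python
import random

# Numerical check of
#   d(X^n)_{kl} / dX_{ij} = sum_{m=0}^{n-1} (X^m)_{ki} (X^{n-1-m})_{jl}
# (the m=0 and m=n-1 powers X^0 = I supply the Kronecker deltas)
# via a central finite difference.
N, n = 3, 4

def mat_mul(A, B):
    return [[sum(A[a][c] * B[c][b] for c in range(N)) for b in range(N)]
            for a in range(N)]

def mat_pow(X, p):
    P = [[float(a == b) for b in range(N)] for a in range(N)]  # X^0 = I
    for _ in range(p):
        P = mat_mul(P, X)
    return P

random.seed(2)
X = [[random.uniform(-1, 1) for _ in range(N)] for _ in range(N)]
i, j, k, l = 1, 2, 0, 1
h = 1e-6

# Central finite difference of (X^n)_{kl} in the X_{ij} direction.
Xp = [row[:] for row in X]; Xp[i][j] += h
Xm = [row[:] for row in X]; Xm[i][j] -= h
fd = (mat_pow(Xp, n)[k][l] - mat_pow(Xm, n)[k][l]) / (2 * h)

exact = sum(mat_pow(X, m)[k][i] * mat_pow(X, n - 1 - m)[j][l] for m in range(n))
assert abs(fd - exact) < 1e-6
print("finite difference", fd, "formula", exact)
```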

We're trying to bound $\Big|\frac{\partial}{\partial X_{ij}}\Big(\sum_{n=0}^{K} \frac{1}{n!} X^n\Big)_{k\ell}\Big|$ and we now know this is equal to
$$\Big|\sum_{n=0}^{K} \frac{1}{n!} \big( \delta_{ki} (X^{n-1})_{j\ell} + X_{ki} (X^{n-2})_{j\ell} + \cdots + (X^{n-1})_{ki} \delta_{\ell j} \big)\Big|.$$
Using the triangle inequality, this is bounded above by
$$\sum_{n=0}^{K} \frac{1}{n!} \big( |\delta_{ki}| \, |(X^{n-1})_{j\ell}| + |X_{ki}| \, |(X^{n-2})_{j\ell}| + \cdots + |(X^{n-1})_{ki}| \, |\delta_{\ell j}| \big).$$
Note that these are really absolute values because we are working with matrix entries rather than matrices.

Each term inside the bracket has the form $|(X^m)_{ki}| \, |(X^{n-m-1})_{j\ell}|$ and we want to bound such quantities. By definition of the $L^1$ norm, we have $|(X^m)_{ki}| \le \|X^m\|_{L^1}$ and, because the $L^1$ and operator norms are Lipschitz equivalent, we have $\|X^m\|_{L^1} \le C \|X^m\|_{op} \le C \|X\|_{op}^m$ for some Lipschitz constant $C$.

Again, using Lipschitz equivalence we get $C \|X\|_{op}^m \le C D^m \|X\|_{L^1}^m$ for some Lipschitz constant $D$. Therefore we get
$$|(X^m)_{ki}| \, |(X^{n-m-1})_{j\ell}| \le C D^m \|X\|_{L^1}^m \cdot C D^{n-m-1} \|X\|_{L^1}^{n-m-1} = C^2 D^{n-1} \|X\|_{L^1}^{n-1}.$$
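For concreteness (my addition, not in the notes), explicit values of the Lipschitz constants work for $N$-by-$N$ matrices: every entry satisfies $|A_{ab}| = |e_a^T A e_b| \le \|A\|_{op}$, and the operator norm is at most the Frobenius norm, which is at most the $L^1$ norm.

```latex
% Summing |A_{ab}| <= ||A||_op over all N^2 entries:
\|A\|_{L^1} = \sum_{a,b} |A_{ab}| \le N^2 \, \|A\|_{op} \qquad (\text{so } C = N^2 \text{ works}),
% and operator norm <= Frobenius norm <= L^1 norm:
\|A\|_{op} \le \Big( \sum_{a,b} A_{ab}^2 \Big)^{1/2} \le \sum_{a,b} |A_{ab}| = \|A\|_{L^1} \qquad (\text{so } D = 1 \text{ works}).
```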

All together (there are $n$ terms inside the bracket), we get
$$\Big|\frac{\partial}{\partial X_{ij}}\Big(\sum_{n=0}^{K} \frac{1}{n!} X^n\Big)_{k\ell}\Big| \le \sum_{n=0}^{K} \frac{1}{n!} \, n \, C^2 D^{n-1} \|X\|_{L^1}^{n-1} = C^2 \sum_{n=1}^{K} \frac{1}{(n-1)!} \big( D \|X\|_{L^1} \big)^{n-1}.$$
So if we assume $\|X\|_{L^1} \le R$ then this is bounded above by $C^2 \exp(DR)$ and Weierstrass's M-test applies, so the partial derivatives of the partial sums converge uniformly on bounded sets of matrices.
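As a final sanity check (my own simplification, not the notes' argument): since the $L^1$ norm is submultiplicative, one can bound $|(X^m)_{ki}| \le \|X\|_{L^1}^m$ directly, and the bound above becomes simply $\exp(R)$ with no constants. The sketch below verifies numerically that every finite-difference partial derivative of the partial sum stays below $\exp(R)$ when $\|X\|_{L^1} = R$.

```python
import math, random

# Spot-check of the uniform derivative bound, using the simplification that the
# L1 norm is submultiplicative: |(X^m)_{ki}| <= ||X||_{L1}^m, so for ||X||_{L1} <= R
# every partial derivative of every partial sum is bounded by sum n R^{n-1}/n! <= exp(R).
N, K, R = 3, 15, 1.2

def mat_mul(A, B):
    return [[sum(A[a][c] * B[c][b] for c in range(N)) for b in range(N)]
            for a in range(N)]

def exp_partial_sum(X):
    S = [[float(a == b) for b in range(N)] for a in range(N)]  # n=0 term: identity
    P = [[float(a == b) for b in range(N)] for a in range(N)]
    for n in range(1, K + 1):
        P = mat_mul(P, X)
        for a in range(N):
            for b in range(N):
                S[a][b] += P[a][b] / math.factorial(n)
    return S

random.seed(3)
X = [[random.uniform(-1, 1) for _ in range(N)] for _ in range(N)]
scale = R / sum(abs(x) for row in X for x in row)
X = [[scale * x for x in row] for row in X]          # now ||X||_L1 == R

h = 1e-6
worst = 0.0
for i in range(N):
    for j in range(N):
        Xp = [row[:] for row in X]; Xp[i][j] += h
        Xm = [row[:] for row in X]; Xm[i][j] -= h
        Sp, Sm = exp_partial_sum(Xp), exp_partial_sum(Xm)
        for k in range(N):
            for l in range(N):
                worst = max(worst, abs(Sp[k][l] - Sm[k][l]) / (2 * h))
assert worst <= math.exp(R)
print("largest partial derivative", worst, "<= exp(R) =", math.exp(R))
```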

Don't say I didn't warn you.