Properties of the matrix exponential

In the last video, we introduced the exponential of a matrix, $\exp(X) = \sum_{n=0}^{\infty} \frac{1}{n!} X^n$. In this video, we'll prove some nice properties of $\exp$.

Lemma:
  1. This power series converges for all $X$. Moreover, it converges (a) absolutely and (b) uniformly along with all its partial derivatives on any bounded set of matrices. This is a slightly technical result to state and prove which we'll establish in some optional videos (video about matrix norms and video proving convergence). Absolute convergence means that we can reorder terms in the power series without worrying; uniform convergence along with partial derivatives means that $\exp$ is differentiable and we can differentiate inside the sum.

  2. Let $t$ be a real variable. Then $\frac{d}{dt} \exp(tX) = X \exp(tX)$. Here, $\exp(tX)$ is a matrix whose entries are functions of $t$; $\frac{d}{dt} \exp(tX)$ means the matrix you get by differentiating each entry of $\exp(tX)$.

  3. If $XY = YX$ then $\exp(X) \exp(Y) = \exp(X + Y)$.

  4. $\exp(X)$ is invertible, with inverse $\exp(-X)$.
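To get a feel for the convergence claimed in part 1, here is a minimal sketch in plain Python that computes partial sums of the series for the matrix $X = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$ and compares them with the rotation matrix from the last video. The helper names (`mat_mul`, `exp_partial_sum`, `max_error`) are made up for this illustration, and a numerical check like this is only a sanity test, not a proof.

```python
import math

def mat_mul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def exp_partial_sum(X, N):
    """Partial sum sum_{n=0}^{N} X^n / n! of the exponential series."""
    size = len(X)
    # Start with the n = 0 term, X^0 / 0! = I.
    term = [[float(i == j) for j in range(size)] for i in range(size)]
    total = [row[:] for row in term]
    for n in range(1, N + 1):
        # Build X^n / n! from X^(n-1) / (n-1)!: multiply by X, divide by n.
        term = [[entry / n for entry in row] for row in mat_mul(term, X)]
        total = [[total[i][j] + term[i][j] for j in range(size)]
                 for i in range(size)]
    return total

X = [[0.0, -1.0], [1.0, 0.0]]
# exp(X) is rotation by 1 radian (the last video's formula with t = 1).
exact = [[math.cos(1.0), -math.sin(1.0)], [math.sin(1.0), math.cos(1.0)]]

def max_error(N):
    """Largest entrywise gap between the N-th partial sum and exp(X)."""
    S = exp_partial_sum(X, N)
    return max(abs(S[i][j] - exact[i][j]) for i in range(2) for j in range(2))

cutoffs = (2, 5, 10, 20)
errors = [max_error(N) for N in cutoffs]
for N, err in zip(cutoffs, errors):
    print(N, err)
```

The printed errors shrink rapidly as the cutoff grows, which is what the factorial in the denominator suggests.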

Proof of the lemma

Proof:

To prove 2, we write the definition $\exp(tX) = \sum_{n=0}^{\infty} \frac{1}{n!} t^n X^n$ and then differentiate term-by-term (allowed by part 1 of the Lemma). This gives: $\frac{d}{dt} \exp(tX) = \sum_{n=0}^{\infty} \frac{1}{n!} n t^{n-1} X^n$. The $n = 0$ term vanishes and the $n$ over $n!$ gives $1/(n-1)!$, so (pulling out a common factor of $X$) this equals $X \sum_{n=1}^{\infty} \frac{t^{n-1}}{(n-1)!} X^{n-1}$. Now relabelling $m = n - 1$ this becomes $X \sum_{m=0}^{\infty} \frac{t^m}{m!} X^m = X \exp(tX)$, as required.

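Let's just illustrate this with an example: take $X = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$; then by the calculation from the last video, we get: $\frac{d}{dt} \exp\left(t \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}\right) = \frac{d}{dt} \begin{pmatrix} \cos t & -\sin t \\ \sin t & \cos t \end{pmatrix} = \begin{pmatrix} -\sin t & -\cos t \\ \cos t & -\sin t \end{pmatrix}$ and this should be equal to $X \exp(tX)$, which is $\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} \cos t & -\sin t \\ \sin t & \cos t \end{pmatrix} = \begin{pmatrix} -\sin t & -\cos t \\ \cos t & -\sin t \end{pmatrix}$ as expected.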
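The same example can be checked numerically. The sketch below approximates $\frac{d}{dt}\exp(tX)$ by a central difference at one arbitrarily chosen value $t = 0.7$ and compares it with $X\exp(tX)$. It assumes a truncated power series (30 terms) is accurate enough for this small matrix; the helper names `mat_mul`, `mat_exp` and `scale` are hypothetical and only serve this illustration.

```python
import math

def mat_mul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(X, terms=30):
    """Truncated series sum_{n < terms} X^n / n!, approximating exp(X)."""
    size = len(X)
    term = [[float(i == j) for j in range(size)] for i in range(size)]
    total = [row[:] for row in term]
    for n in range(1, terms):
        term = [[entry / n for entry in row] for row in mat_mul(term, X)]
        total = [[total[i][j] + term[i][j] for j in range(size)]
                 for i in range(size)]
    return total

def scale(c, X):
    """The matrix c * X."""
    return [[c * entry for entry in row] for row in X]

X = [[0.0, -1.0], [1.0, 0.0]]
t, h = 0.7, 1e-5  # sample point and step size, both arbitrary choices

# Central difference approximation to d/dt exp(tX), entry by entry.
plus = mat_exp(scale(t + h, X))
minus = mat_exp(scale(t - h, X))
numeric = [[(plus[i][j] - minus[i][j]) / (2 * h) for j in range(2)]
           for i in range(2)]

# The right-hand side X exp(tX), and the closed form from the example.
predicted = mat_mul(X, mat_exp(scale(t, X)))
closed_form = [[-math.sin(t), -math.cos(t)], [math.cos(t), -math.sin(t)]]

gap_numeric = max(abs(numeric[i][j] - predicted[i][j])
                  for i in range(2) for j in range(2))
gap_closed = max(abs(predicted[i][j] - closed_form[i][j])
                 for i in range(2) for j in range(2))
print(gap_numeric, gap_closed)
```

Both gaps should be tiny: the first is limited by the finite-difference step, the second only by rounding.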

(4) follows from (3) by taking $Y = -X$: since $X$ commutes with $-X$, we get $\exp(X) \exp(-X) = \exp(X - X) = \exp(0) = I$.
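As a quick sanity check of (4), the following sketch multiplies approximations of $\exp(X)$ and $\exp(-X)$ for one sample matrix and measures how far the product is from the identity. The matrix entries are an arbitrary choice, the exponential is approximated by a 30-term truncated series, and the helper names are hypothetical.

```python
def mat_mul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(X, terms=30):
    """Truncated series sum_{n < terms} X^n / n!, approximating exp(X)."""
    size = len(X)
    term = [[float(i == j) for j in range(size)] for i in range(size)]
    total = [row[:] for row in term]
    for n in range(1, terms):
        term = [[entry / n for entry in row] for row in mat_mul(term, X)]
        total = [[total[i][j] + term[i][j] for j in range(size)]
                 for i in range(size)]
    return total

def distance_from_identity(M):
    """Largest entrywise gap between M and the identity matrix."""
    n = len(M)
    return max(abs(M[i][j] - float(i == j)) for i in range(n) for j in range(n))

X = [[1.0, 2.0], [0.5, -1.0]]          # arbitrary sample matrix
minus_X = [[-entry for entry in row] for row in X]

# exp(X) exp(-X) and exp(-X) exp(X) should both be (numerically) the identity.
gap_right = distance_from_identity(mat_mul(mat_exp(X), mat_exp(minus_X)))
gap_left = distance_from_identity(mat_mul(mat_exp(minus_X), mat_exp(X)))
print(gap_right, gap_left)
```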

Finally, we will prove (3). Let's write out the product $\exp(X) \exp(Y) = \left(\sum_{i=0}^{\infty} \frac{1}{i!} X^i\right) \left(\sum_{j=0}^{\infty} \frac{1}{j!} Y^j\right)$. We will do various manipulations which are justified by the absolute convergence claimed in part (1) of the Lemma. First, let's take all the summations outside: $\exp(X) \exp(Y) = \sum_{i=0}^{\infty} \sum_{j=0}^{\infty} \frac{1}{i!\,j!} X^i Y^j$. Next, we will write out the double sum as a grid.

$$\begin{array}{c|cccc}
 & i = 0 & i = 1 & i = 2 & \cdots \\ \hline
j = 0 & I & X & \frac{1}{2} X^2 & \cdots \\
j = 1 & Y & XY & \cdots & \\
j = 2 & \frac{1}{2} Y^2 & \cdots & & \\
\vdots & \vdots & & &
\end{array}$$

Let's group together the constant term $I$, the linear terms $X + Y$, the quadratic terms $\frac{1}{2} X^2 + XY + \frac{1}{2} Y^2$, etc. Let's call these groups the $k = 0$, $k = 1$, $k = 2$ terms etc (so $k$ is the order of the terms in question).

With this regrouping of terms, our infinite sum becomes $\sum_{k=0}^{\infty} \sum_{i=0}^{k} \frac{1}{i!\,(k-i)!} X^i Y^{k-i}$. (Here, in each group, $j$ is determined by $i$ as $j = k - i$.)

The inner sum is a finite sum. It looks a lot like a binomial expansion, but not quite. To make it look more like a binomial expansion, we multiply the sum by $k!$ and simultaneously divide it by $k!$ (leaving the answer unchanged): $\sum_{k=0}^{\infty} \frac{1}{k!} \sum_{i=0}^{k} \frac{k!}{i!\,(k-i)!} X^i Y^{k-i}$. The term $\frac{k!}{i!\,(k-i)!}$ is the binomial coefficient $\binom{k}{i}$, so by the binomial theorem the inner sum is $(X + Y)^k$ and the whole expression equals $\sum_{k=0}^{\infty} \frac{1}{k!} (X + Y)^k$, provided that $XY = YX$.

To finish the proof, note that $\sum_{k=0}^{\infty} \frac{1}{k!} (X + Y)^k$ equals $\exp(X + Y)$ by definition.

We really need $XY = YX$. For example, $(X + Y)^2 = X^2 + XY + YX + Y^2 \neq X^2 + 2XY + Y^2$, unless $XY = YX$.
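To see the failure concretely, here is a small sketch comparing $\exp(X)\exp(Y)$ with $\exp(X + Y)$ for a non-commuting pair and for a commuting pair. The matrices are our own choice for illustration: $X = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$ and $Y = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}$ do not commute, while $A$ and $B = A^2$ do. The exponential is again approximated by a truncated series, and all helper names are hypothetical.

```python
def mat_mul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_add(A, B):
    """Entrywise sum of two matrices of the same size."""
    return [[a + b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]

def mat_exp(X, terms=30):
    """Truncated series sum_{n < terms} X^n / n!, approximating exp(X)."""
    size = len(X)
    term = [[float(i == j) for j in range(size)] for i in range(size)]
    total = [row[:] for row in term]
    for n in range(1, terms):
        term = [[entry / n for entry in row] for row in mat_mul(term, X)]
        total = [[total[i][j] + term[i][j] for j in range(size)]
                 for i in range(size)]
    return total

def product_rule_gap(P, Q):
    """Largest entrywise gap between exp(P) exp(Q) and exp(P + Q)."""
    lhs = mat_mul(mat_exp(P), mat_exp(Q))
    rhs = mat_exp(mat_add(P, Q))
    n = len(P)
    return max(abs(lhs[i][j] - rhs[i][j]) for i in range(n) for j in range(n))

# Non-commuting pair: XY and YX are different matrices.
X = [[0.0, 1.0], [0.0, 0.0]]
Y = [[0.0, 0.0], [1.0, 0.0]]
gap_noncommuting = product_rule_gap(X, Y)

# Commuting pair: B is a power of A, so AB = BA.
A = [[0.1, 0.2], [0.3, 0.4]]
B = mat_mul(A, A)
gap_commuting = product_rule_gap(A, B)

print(gap_noncommuting, gap_commuting)
```

For the non-commuting pair the gap is far from zero, whereas for the commuting pair it is at the level of rounding error, in line with part 3 of the Lemma.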

Remark:
  • The double-sum rearrangement in the proof of 3 is called the Cauchy product formula, and works whenever you have absolutely convergent power series.