# Local logarithm

## Some notation

We have introduced the exponential map, which takes any $n$ -by-$n$ matrix and returns an invertible $n$ -by-$n$ matrix. Symbolically, we write: $\exp\colon\mathfrak{gl}(n,\mathbf{R})\to GL(n,\mathbf{R}).$ (We could replace $\mathbf{R}$ with $\mathbf{C}$ .) Here, $GL(n,\mathbf{R})$ denotes the group of invertible $n$ -by-$n$ matrices with real entries (G stands for "general", L stands for "linear") and $\mathfrak{gl}(n,\mathbf{R})$ denotes the set of all $n$ -by-$n$ matrices with real entries. This notation may look strange now, but as the course goes on we will see that to any group $G$ of matrices there is an associated set $\mathfrak{g}$ of matrices whose exponentials live in $G$ , and the convention is always to turn the name of the group into lower-case Gothic font (e.g. $SO(n)$ gives us $\mathfrak{so}(n)$ ).
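As a quick numerical sanity check (not part of the notes; this uses SciPy's `expm`), we can exponentiate an arbitrary matrix and confirm the result is invertible, using the identity $\det(\exp M)=e^{\operatorname{tr}M}>0$:

```python
import numpy as np
from scipy.linalg import expm

# Any square matrix exponentiates to an invertible matrix.
rng = np.random.default_rng(0)
M = rng.standard_normal((3, 3))   # an element of gl(3, R)

A = expm(M)                       # an element of GL(3, R)

# det(exp(M)) = exp(tr(M)) > 0, so A is always invertible.
assert np.isclose(np.linalg.det(A), np.exp(np.trace(M)))
```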

## Logarithm

Can we define an inverse $\log\colon GL(n,\mathbf{R})\to\mathfrak{gl}(n,\mathbf{R})$ ? It turns out that we can, though we run into some issues: even for complex numbers there is a problem, namely that $e^{0}=e^{2\pi i}=e^{4\pi i}=\cdots$ , so there are many choices of logarithm. In that setting, we fix this by taking a branch cut from the origin along the negative $x$ -axis and only defining the logarithm for numbers away from the branch cut (when we cross the branch cut we have to change which branch of the logarithm we're using if we want the logarithm to stay continuous). We use a similar trick here, in that we only define the logarithm for a subset of matrices, those close to the identity matrix.
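The scalar ambiguity is easy to see numerically (a small illustration, not part of the notes): NumPy's `log` can only return one branch, so the round trip through $e^{2\pi i}$ loses the $2\pi i$:

```python
import numpy as np

# e^0 = e^{2 pi i} = e^{4 pi i} = ... : the complex exponential is not
# injective, so np.log must pick one branch of the logarithm.
z = np.exp(2j * np.pi)            # equals 1 (up to rounding)

# The principal branch gives log(1) = 0, not 2 pi i.
assert abs(np.log(z)) < 1e-12
```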

Theorem (Local invertibility of exp):

There exist neighbourhoods $U\subset\mathfrak{gl}(n,\mathbf{R})$ (containing the zero matrix) and $V\subset GL(n,\mathbf{R})$ (containing the identity matrix) such that $\exp(U)=V$ and $\exp|_{U}\colon U\to V$ is invertible. We call the inverse $\log\colon V\to U$ .
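A numerical sketch of the theorem (not part of the notes; SciPy's `logm` computes a matrix logarithm): for a matrix $M$ close to zero, $\exp(M)$ lands close to the identity, and the logarithm recovers $M$:

```python
import numpy as np
from scipy.linalg import expm, logm

# For M close to the zero matrix, exp(M) is close to I and log inverts exp.
M = 0.1 * np.array([[0.0, 1.0], [-1.0, 0.5]])

A = expm(M)          # lands in the neighbourhood V of the identity
M_back = logm(A)     # logm brings us back into the neighbourhood U of 0

assert np.allclose(M_back, M)
```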

Remark:

This logarithm is infinitely differentiable and even analytic; we'll write down its Taylor series later.

## Inverse function theorem

We will prove this theorem by appealing to a result from multivariable calculus called the inverse function theorem. We won't prove the inverse function theorem, but we will explain how to apply it to deduce local invertibility of the exponential map:

Theorem (Inverse function theorem):

Suppose $F\colon\mathbf{R}^{N}\to\mathbf{R}^{N}$ is a continuously differentiable function. Think of $F$ as a vector of functions $(F_{1}(x_{1},\ldots,x_{N}),\ldots,F_{N}(x_{1},\ldots,x_{N}))$ . We define the derivative of $F$ at the origin $d_{0}F$ to be the matrix of partial derivatives: $d_{0}F:=\begin{pmatrix}\frac{\partial F_{1}}{\partial x_{1}}(0,\ldots,0)&\cdots&\frac{\partial F_{1}}{\partial x_{N}}(0,\ldots,0)\\ \vdots&&\vdots\\ \frac{\partial F_{N}}{\partial x_{1}}(0,\ldots,0)&\cdots&\frac{\partial F_{N}}{\partial x_{N}}(0,\ldots,0)\end{pmatrix}$ If $d_{0}F$ is invertible then there exist neighbourhoods $U\subset\mathbf{R}^{N}$ (containing $0$ ) and $V\subset\mathbf{R}^{N}$ (containing $F(0)$ ) such that $F(U)=V$ and $F|_{U}\colon U\to V$ is invertible. Moreover, $F^{-1}$ is differentiable with derivative $d_{F(0)}F^{-1}=(d_{0}F)^{-1}$ .
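To see the theorem in action (a toy example of my own, not from the notes), take $F(x,y)=(x+y^{2},\,\sin y)$: its linearisation at the origin is the identity matrix, so $F$ is locally invertible there. A finite-difference Jacobian confirms the linearisation is invertible:

```python
import numpy as np

# Toy illustration of the inverse function theorem:
# F(x, y) = (x + y^2, sin(y)) has d_0 F = identity, hence F is
# locally invertible near the origin.
def F(v):
    x, y = v
    return np.array([x + y**2, np.sin(y)])

# Central-difference approximation to the Jacobian d_0 F.
h = 1e-6
d0F = np.column_stack([(F(h * e) - F(-h * e)) / (2 * h) for e in np.eye(2)])

# The linearisation is (approximately) the identity, so it is invertible.
assert np.allclose(d0F, np.eye(2), atol=1e-6)
```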

Remark:

The nice thing about calculus is that it replaces nonlinear objects (like $F$ ) with much simpler linear approximations (like $d_{0}F$ ), then it proves theorems of the form "assume something about the linearisation, then deduce something local about the nonlinear object". Here we're assuming the linearisation is invertible and we deduce that $F$ is locally invertible.

## Proof of local invertibility

Proof:

Let's see how to apply this to the exponential map and deduce the existence of the local logarithm. The exponential map goes from $\mathfrak{gl}(n,\mathbf{R})$ to $GL(n,\mathbf{R})$ , and we can think of $\mathfrak{gl}(n,\mathbf{R})$ as the Euclidean space $\mathbf{R}^{n^{2}}$ with coordinates given by the matrix entries. Similarly, $GL(n,\mathbf{R})$ is contained in $\mathbf{R}^{n^{2}}$ (again using matrix entries as local coordinates). So we can think of $\exp$ as a map $\mathbf{R}^{n^{2}}\to\mathbf{R}^{n^{2}}$ , which puts us in the setting of the inverse function theorem.

It remains to compute $d_{0}\exp$ . When we identify $\mathfrak{gl}(n,\mathbf{R})$ with $\mathbf{R}^{n^{2}}$ , we're writing each $n$ -by-$n$ matrix as a vector just by writing the $n^{2}$ matrix entries in a column instead of a grid. In other words, using matrix entries as coordinates, we need to compute the matrix whose entries are: $\begin{pmatrix}\frac{\partial(\exp M)_{11}}{\partial M_{11}}&\frac{\partial(\exp M)_{11}}{\partial M_{12}}&\cdots&\frac{\partial(\exp M)_{11}}{\partial M_{nn}}\\ \frac{\partial(\exp M)_{12}}{\partial M_{11}}&\frac{\partial(\exp M)_{12}}{\partial M_{12}}&\cdots&\frac{\partial(\exp M)_{12}}{\partial M_{nn}}\\ \vdots&\vdots&\ddots&\vdots\\ \frac{\partial(\exp M)_{nn}}{\partial M_{11}}&\frac{\partial(\exp M)_{nn}}{\partial M_{12}}&\cdots&\frac{\partial(\exp M)_{nn}}{\partial M_{nn}}\end{pmatrix}$ Here, all partial derivatives are evaluated at $M=0$ . If this big matrix is confusing, think about the $4$ -by-$4$ matrix you get when $n=2$ .

We know the Taylor series $\exp(M)=I+M+\cdots$ where the dots are higher-order. When we differentiate $\exp(M)$ with respect to $M_{ij}$ , the constant term $I$ goes away, and the higher-order terms still have factors of $M_{k\ell}$ in them (for some $k,\ell$ ) so when we set $M=0$ these go away. All we are left with is the linear term $M$ , so $\frac{\partial(\exp M)_{k\ell}}{\partial M_{ij}}=\frac{\partial M_{k\ell}}{\partial M_{ij}}$ . The $M_{ij}$ s are just independent variables, so for example $\partial M_{11}/\partial M_{11}=1$ while $\partial M_{12}/\partial M_{11}=0$ . Therefore the matrix equals $\begin{pmatrix}\frac{\partial M_{11}}{\partial M_{11}}&\frac{\partial M_{11}}{\partial M_{12}}&\cdots&\frac{\partial M_{11}}{\partial M_{nn}}\\ \frac{\partial M_{12}}{\partial M_{11}}&\frac{\partial M_{12}}{\partial M_{12}}&\cdots&\frac{\partial M_{12}}{\partial M_{nn}}\\ \vdots&\vdots&\ddots&\vdots\\ \frac{\partial M_{nn}}{\partial M_{11}}&\frac{\partial M_{nn}}{\partial M_{12}}&\cdots&\frac{\partial M_{nn}}{\partial M_{nn}}\end{pmatrix}=\begin{pmatrix}1&0&\cdots&0\\ 0&1&\cdots&0\\ \vdots&\vdots&\ddots&\vdots\\ 0&0&\cdots&1\end{pmatrix}$ which is the identity matrix.
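This computation can be checked numerically (a sketch of my own, assuming SciPy's `expm`): build the $n^{2}$ -by-$n^{2}$ Jacobian of $\exp$ at the zero matrix by finite differences and compare it with the identity.

```python
import numpy as np
from scipy.linalg import expm

# Finite-difference check that d_0 exp is the identity on gl(2, R) = R^4.
n, h = 2, 1e-6
N = n * n

J = np.zeros((N, N))
for j in range(N):
    E = np.zeros(N)
    E[j] = h
    Ej = E.reshape(n, n)            # h times the j-th basis matrix
    # Column j of the Jacobian: central-difference directional derivative.
    J[:, j] = ((expm(Ej) - expm(-Ej)) / (2 * h)).reshape(N)

assert np.allclose(J, np.eye(N), atol=1e-8)
```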

Another way of thinking about this is that the Taylor expansion of a function $F$ is $F(x)=F(0)+d_{0}F(x)+\cdots$ , that is $d_{0}F(x)$ is the first order part of the Taylor expansion of $F$ . Therefore $\exp(M)=I+d_{0}\exp(M)+\cdots$ and we know the first order part of the Taylor expansion of $\exp(M)$ is $M$ , so $(d_{0}\exp)(M)=M$ . So $d_{0}\exp$ is the identity.

Since $d_{0}\exp$ is invertible (it's the identity!), we can apply the inverse function theorem to deduce that $\exp$ admits a local inverse near the identity.

Remark:

It turns out that the logarithm is analytic, that is, its Taylor series converges on some set of matrices. The Taylor series is exactly the same as the Taylor series for the usual logarithm, namely $\log(I+M)=M-\frac{1}{2}M^{2}+\frac{1}{3}M^{3}-\frac{1}{4}M^{4}+\cdots$ Here, any matrix inside $V$ is close to the identity, so it can be written as $I+M$ for a small matrix $M$ .
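We can watch the series converge numerically (again a sketch, not part of the notes): for a small matrix $M$, the partial sums of $M-\frac{1}{2}M^{2}+\frac{1}{3}M^{3}-\cdots$ agree with SciPy's `logm` applied to $I+M$.

```python
import numpy as np
from scipy.linalg import logm

# Partial sums of log(I + M) = M - M^2/2 + M^3/3 - ... converge for small M.
M = np.array([[0.0, 0.2], [-0.1, 0.1]])
I = np.eye(2)

series = np.zeros((2, 2))
P = I
for k in range(1, 30):
    P = P @ M                              # P = M^k
    series += (-1) ** (k + 1) * P / k      # alternating-sign term M^k / k

assert np.allclose(series, logm(I + M))
```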