Local logarithm

Some notation

We have introduced the exponential map, which takes any n-by-n matrix and returns an invertible n-by-n matrix. Symbolically, we write: exp from little g l n R to big G L n R. (We could replace the real numbers with the complex numbers.) Here, big G L n R denotes the group of invertible n-by-n matrices with real entries (L stands for "linear", G stands for "general") and little g l n R denotes the set of all n-by-n matrices with real entries. This notation may look weird now, but as the course goes on we will see that to any group big G of matrices there is a set little g of matrices whose exponentials live in big G, and the notation is always to turn the name of the group into lower-case Gothic font (e.g. SO(n) gives us little s o n).
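
As a quick sanity check (a sketch of my own, assuming Python with numpy and scipy, which are not part of the course), scipy.linalg.expm computes the matrix exponential, and the identity det of exp M equals exp of trace M confirms that the output is always invertible:

    import numpy as np
    from scipy.linalg import expm

    M = np.array([[0.0, -3.0],
                  [2.0,  1.0]])          # any real 2-by-2 matrix, not necessarily invertible
    A = expm(M)                          # exp M

    # exp M is always invertible: det(exp M) = exp(tr M) > 0.
    print(np.linalg.det(A))              # approximately e = 2.718..., since tr M = 1
    print(np.exp(np.trace(M)))           # the same number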

Logarithm

Can we define an inverse \log\colon GL(n,\RR)\to\mathfrak{gl}(n,\RR)? It turns out that we can, though we run into an issue: even for numbers there are many choices of logarithm, because exp of 0 equals exp of 2 pi i equals exp of 4 pi i, and so on. For numbers, we fix this by taking a branch cut from the origin along the negative x-axis and only defining the logarithm for numbers away from the branch cut (when we cross the branch cut we have to change which branch of the logarithm we're using if we want the logarithm to be continuous). We use a similar trick here: we only define the logarithm for a subset of matrices, namely those close to the identity matrix.
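
Here is the phenomenon for numbers (an illustration of my own using numpy, not part of the notes): the complex exponential is periodic, so taking a logarithm requires a choice of branch.

    import numpy as np

    # exp(0) = exp(2 pi i) = exp(4 pi i) = ... = 1, so "the" logarithm of 1
    # is only defined up to multiples of 2 pi i.
    for k in range(3):
        print(np.exp(2j * np.pi * k))    # 1 each time (up to rounding)

    # numpy's log chooses the principal branch, with imaginary part in (-pi, pi]:
    print(np.log(-1 + 0j))               # pi*i, one choice among -pi*i, 3*pi*i, ...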

Theorem (Local invertibility of exp):

There exist neighbourhoods U inside little g l n R (containing the zero matrix) and V inside big G L n R (containing the identity matrix) such that exp of U equals V and exp restricted to U, going from U to V, is invertible. We call the inverse log, going from V to U.
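
To see the theorem in action numerically (my own sketch, assuming scipy; scipy.linalg.logm computes a principal matrix logarithm), we can round-trip a matrix close to the identity:

    import numpy as np
    from scipy.linalg import expm, logm

    A = np.eye(2) + np.array([[0.03, -0.10],
                              [0.05,  0.02]])   # a matrix in V, close to the identity
    X = logm(A)                                  # log A, a matrix in U near 0

    print(np.allclose(expm(X), A))               # True: exp(log A) = A
    print(np.allclose(logm(expm(X)), X))         # True: log(exp X) = X on these neighbourhoods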

Remark:

This logarithm is infinitely differentiable and even analytic; we'll write down its Taylor series later.

Inverse function theorem

We will prove this theorem by appealing to a result from multivariable calculus called the inverse function theorem. We won't prove the inverse function theorem, but we will explain how to apply it to deduce local invertibility of the exponential map:

Theorem (Inverse function theorem):

Suppose F from R N to R N is a continuously differentiable function. Think of F as a vector of functions F 1 of x 1 up to x N, dot dot dot, F N of x 1 up to x N. We define the derivative of F at the origin, d at zero of F, to be the matrix of partial derivatives evaluated at the origin: d at zero of F equals the matrix with first row partial F 1 by partial x 1, across to partial F 1 by partial x N, down to final row partial F N by partial x 1, across to partial F N by partial x N. If d at zero of F is invertible then there exist neighbourhoods U in R N (containing 0) and V in R N (containing F of zero) such that F sends U to V and F restricted to U, going from U to V, is invertible. Moreover, F inverse is differentiable with derivative d at F of zero of F inverse equals the inverse of d at zero of F.
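
To make this concrete (a toy example of my own, assuming numpy and scipy; the map F below is invented purely for illustration): the derivative of F at the origin is invertible, so we can numerically solve F of v equals w for w near F of zero.

    import numpy as np
    from scipy.optimize import fsolve

    # A toy F from R^2 to R^2:
    def F(v):
        x, y = v
        return np.array([x + y**2, y + np.sin(x)])

    # Its derivative at the origin is [[1, 0], [1, 1]], which is invertible...
    d0F = np.array([[1.0, 0.0],
                    [1.0, 1.0]])
    print(np.linalg.det(d0F))                    # 1.0, nonzero

    # ...so F has a local inverse near F(0) = 0: we can solve F(v) = w
    # for any w sufficiently close to 0.
    w = np.array([0.05, -0.02])
    v = fsolve(lambda u: F(u) - w, np.zeros(2))
    print(np.allclose(F(v), w))                  # True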

Remark:

The nice thing about calculus is that it replaces nonlinear objects (like F) with much simpler linear approximations (like d at zero of F), then it proves theorems of the form "assume something about the linearisation, then deduce something local about the nonlinear object". Here we're assuming the linearisation is invertible and we deduce that F is locally invertible.

Proof of local invertibility

Proof:

Let's see how to apply this to the exponential map and deduce the existence of the local logarithm. The exponential map goes from little g l n R to big G L n R, and we can think of little g l n R as the Euclidean space R (n squared) with coordinates given by the matrix entries. Similarly, big G L n R is contained in R (n squared) (again using matrix entries as local coordinates). So we can think of \exp as a map from R (n squared) to R (n squared), which puts us in the setting of the inverse function theorem.
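
In code (my own sketch, assuming numpy and scipy), this identification is nothing more than reshaping and flattening:

    import numpy as np
    from scipy.linalg import expm

    n = 2

    # exp as a map from R^(n^2) to R^(n^2): reshape a vector into a matrix,
    # exponentiate, then flatten the result back into a vector.
    def exp_flat(v):
        return expm(v.reshape(n, n)).flatten()

    v = np.array([0.1, -0.4, 0.2, 0.3])          # a point of R^4, i.e. a 2-by-2 matrix
    print(exp_flat(v))                           # another point of R^4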

It remains to compute d at zero of exp. When we identify little g l n R with R (n squared), we're writing each n-by-n matrix as a vector just by writing the n squared matrix entries in a column instead of a grid. In other words, using matrix entries as coordinates, we need to compute the matrix whose entries are: the partial derivative of the 1,1 entry of exp M by the 1,1 entry of M, the partial derivative of the 1,1 entry of exp M by the 1,2 entry of M, dot dot dot, across to the partial derivative of the 1,1 entry of exp M by the n,n entry of M, down to the partial derivative of the n,n entry of exp M by the 1,1 entry of M, and across to the partial derivative of the n,n entry of exp M by the n,n entry of M. Here, all partial derivatives are evaluated at M = 0. If this big matrix is confusing, think about the 4-by-4 matrix you get when n = 2.

We know the Taylor series exp M equals I plus M plus dot dot dot, where the dots are higher-order terms. When we differentiate exp M with respect to M i j, the constant term I goes away, and the higher-order terms still have factors of M k l in them (for some k, l), so when we set M = 0 these go away too. All we are left with is the linear term M, so partial of exp M k l by partial M i j equals partial M k l by partial M i j (at M = 0). The M i js are just independent variables, so for example partial M 1 1 by partial M 1 1 equals 1 while partial M 1 2 by partial M 1 1 equals 0. Therefore the matrix has first row partial M 1 1 by partial M 1 1, partial M 1 1 by partial M 1 2, dot dot dot, second row partial M 1 2 by partial M 1 1, partial M 1 2 by partial M 1 2, dot dot dot, etc, which is first row 1, 0, 0, dot dot dot, second row 0, 1, 0, dot dot dot, etc: the identity matrix.
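
We can confirm this numerically for n = 2 (my own sketch, assuming numpy and scipy): approximating each partial derivative at M = 0 by a difference quotient produces a 4-by-4 matrix which is the identity up to rounding.

    import numpy as np
    from scipy.linalg import expm

    n, h = 2, 1e-6

    def exp_flat(v):                             # exp in matrix-entry coordinates
        return expm(v.reshape(n, n)).flatten()

    # Approximate each partial derivative at M = 0 by a difference quotient;
    # column j of the Jacobian is (exp_flat(h*e_j) - exp_flat(0)) / h.
    J = np.column_stack([(exp_flat(h * e) - exp_flat(0 * e)) / h
                         for e in np.eye(n * n)])
    print(np.round(J, 4))                        # the 4-by-4 identity matrix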

Another way of thinking about this is that the Taylor expansion of a function F is: F of x equals F of zero plus d at zero of F applied to the vector x plus dot dot dot. That is, d at zero of F applied to the vector x is the first-order part of the Taylor expansion of F. Therefore exp of M equals I plus d at zero of exp applied to M plus dot dot dot, and we know the first-order part of the Taylor expansion of exp of M is M, so d at zero of exp applied to M equals M. So d at zero of exp is the identity.
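
Equivalently (another sketch of my own, assuming numpy and scipy), we can watch the first-order approximation at work: replacing exp of t M by I plus t M leaves an error of size t squared.

    import numpy as np
    from scipy.linalg import expm

    M = np.array([[0.2, -1.0],
                  [0.7,  0.1]])

    # exp(tM) = I + tM + (higher order), so the error of the linear
    # approximation I + tM should shrink like t^2 as t shrinks:
    for t in [0.1, 0.01, 0.001]:
        print(np.linalg.norm(expm(t * M) - (np.eye(2) + t * M)))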

Since d at zero of exp is invertible (it's the identity!), we can apply the inverse function theorem to deduce that \exp admits a local inverse near the identity.

Remark:

It turns out that the logarithm is analytic, that is, its Taylor series converges on some set of matrices. The Taylor series is exactly the same as the Taylor series for the usual logarithm, that is: log of I plus M equals M minus a half M squared plus a third M cubed minus a quarter M to the fourth, and so on. Here, any matrix inside V is close to the identity, so can be written as I plus M for a small matrix M.
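
As a final check (my own sketch, assuming numpy and scipy; the small matrix M is an arbitrary choice), the partial sums of this series agree with scipy's matrix logarithm:

    import numpy as np
    from scipy.linalg import logm

    M = np.array([[0.05, -0.10],
                  [0.08,  0.03]])                # small, so I + M lies in V

    # Partial sum of M - M^2/2 + M^3/3 - M^4/4 + ... up to the 9th power:
    series = sum((-1) ** (k + 1) * np.linalg.matrix_power(M, k) / k
                 for k in range(1, 10))

    print(np.allclose(series, logm(np.eye(2) + M)))   # True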