38. Linear maps

Two definitions of linearity

At the outset of this course, we talked about the geometric transformations coming from matrices (rotations, reflections, shears, etc.). These geometric transformations have a name: they are called linear maps. In this video we'll give two definitions of linear maps and show that they're equivalent. The first definition encapsulates how we've been dealing with linear maps so far:

Definition:

A map f : 𝐑ⁿ → 𝐑ᵐ is linear if there exists an m-by-n matrix A such that f(v) = Av for all v ∈ 𝐑ⁿ.

However, some linear maps are more natural to describe in another way, without giving the matrix A.

Fix n. Consider the space of polynomials P(x) of degree at most n. Differentiation gives a map d/dx : P ↦ dP/dx from this space to itself. This map is linear. To understand why, we need to understand polynomials as vectors. We encode a polynomial P(x) = aₙxⁿ + ⋯ + a₁x + a₀ as its vector of coefficients (a₀, a₁, …, aₙ). Then dP/dx = naₙxⁿ⁻¹ + ⋯ + 2a₂x + a₁, which corresponds to the vector of coefficients (a₁, 2a₂, …, naₙ, 0). This is the same as

\[
\begin{pmatrix}
0 & 1 & 0 & \cdots & 0 \\
0 & 0 & 2 & \cdots & 0 \\
\vdots & & & \ddots & \vdots \\
0 & 0 & 0 & \cdots & n \\
0 & 0 & 0 & \cdots & 0
\end{pmatrix}
\begin{pmatrix} a_0 \\ a_1 \\ \vdots \\ a_n \end{pmatrix},
\]

so differentiation is linear (given by a matrix). In other words, if v_P is the vector of coefficients of the polynomial P then D v_P = v_{dP/dx}, where D is this matrix.
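If you like to check things on a computer, here is a quick numerical sketch of this (in Python with NumPy, which is not part of the course; the variable names and the sample degree n = 4 are my own choices):

```python
import numpy as np

n = 4  # sample degree bound; any n works the same way

# The matrix D above: 1, 2, ..., n on the superdiagonal, zeros elsewhere.
D = np.diag(np.arange(1, n + 1), k=1)

# A polynomial P(x) = a_0 + a_1 x + ... + a_n x^n stored as (a_0, a_1, ..., a_n).
v_P = np.array([5.0, 3.0, 0.0, -2.0, 1.0])             # P(x) = 5 + 3x - 2x^3 + x^4

# Coefficients of dP/dx computed directly: k*a_k drops into slot k - 1.
v_dP = np.append(np.arange(1, n + 1) * v_P[1:], 0.0)   # dP/dx = 3 - 6x^2 + 4x^3

print(D @ v_P)   # [ 3.  0. -6.  4.  0.]
print(v_dP)      # the same vector, i.e. D v_P = v_{dP/dx}
```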

This way of encoding polynomials as vectors is a bit artificial. For example, I could have chosen to write the vector with a n at the top and a 0 at the bottom, and the matrix D would have ended up looking quite different. The fact that differentiation of polynomials is a linear map is an intrinsic fact about differentiation, and our proof above obscures that. So here's an equivalent definition of linearity which is more intrinsic.

Definition:

A map f is linear if:

  • f(v + w) = f(v) + f(w) for all v, w

  • f(λv) = λ f(v) for all v and for all λ ∈ 𝐑.

I haven't specified the domain and target of f because I want to be intentionally vague: this definition makes sense whenever the domain and target of f admit operations of addition and rescaling (e.g. spaces of polynomials or functions, as well as just 𝐑ⁿ). In the final video of the course, we'll see that the natural setting for this definition is the setting of vector spaces.

Differentiation of polynomials is linear because d/dx(P + Q) = dP/dx + dQ/dx and d(λP)/dx = λ dP/dx for any constant λ and polynomials P(x), Q(x).
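Purely as an optional sanity check, here is a short Python/NumPy sketch of these two identities for a couple of sample polynomials (the names and sample values are mine):

```python
import numpy as np
from numpy.polynomial import polynomial as Poly

# Sample polynomials as coefficient vectors (a_0, a_1, ..., a_n) and a scalar.
P = np.array([1.0, -2.0, 0.0, 4.0])   # 1 - 2x + 4x^3
Q = np.array([0.0, 5.0, 3.0, 0.0])    # 5x + 3x^2
lam = 7.0

# d/dx(P + Q) = dP/dx + dQ/dx
print(np.allclose(Poly.polyder(P + Q), Poly.polyder(P) + Poly.polyder(Q)))  # True

# d(lam P)/dx = lam dP/dx
print(np.allclose(Poly.polyder(lam * P), lam * Poly.polyder(P)))            # True
```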

The function f : 𝐑 → 𝐑 which converts metres to feet is linear. Since 1 metre is 3.281 feet, f(x) = 3.281x. If you double the number of metres, you double the number of feet. If you take two distances of x metres and y metres, you can add them and then convert to feet (f(x + y)), or you can convert and then add (f(x) + f(y)), and you get the same answer. So f is linear.
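If you want to see those two checks spelled out numerically, here is a tiny Python sketch (the sample distances are arbitrary):

```python
import math

def to_feet(x):
    """Metres to feet: f(x) = 3.281 x."""
    return 3.281 * x

x, y, lam = 12.0, 7.5, 2.0

# Convert-then-add equals add-then-convert.
print(math.isclose(to_feet(x + y), to_feet(x) + to_feet(y)))  # True

# Doubling (or otherwise rescaling) the metres rescales the feet the same way.
print(math.isclose(to_feet(lam * x), lam * to_feet(x)))       # True
```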

The function f : 𝐑 → 𝐑 which converts Celsius to Kelvin is not linear. Recall that f(0) ≈ 273. Any linear map satisfies f(0) = 0, because f(0) = f(0 · 0) = 0 · f(0) = 0 (some of those 0s are numbers, some are vectors!).

I'm told the way they used to mark exams in Oxford was to take the marks from each question, square them and add them up. For example, if there were two questions and you got marks x and y then your final score would be x² + y². This rewards those who do very well on a couple of questions (instead of scatter-gunning a little bit of knowledge over all questions). This function f(x, y) = x² + y² is not a linear map! For example, if you score 1 and 0 then you get 1 in total, but if you double your score for x then you quadruple your total. Sadly for those taking the exam, f(0, 0) = 0.
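Here is one more optional Python sketch, spot-checking why these last two examples fail to be linear (the sample marks are arbitrary):

```python
def celsius_to_kelvin(c):
    return c + 273.15            # so f(0) = 273.15, not 0

def exam_score(x, y):
    return x**2 + y**2           # the squared-marks scoring rule

# A linear map must send 0 to 0, but Celsius-to-Kelvin does not:
print(celsius_to_kelvin(0))                              # 273.15

# Doubling the mark on one question quadruples the total, so f(2v) != 2 f(v):
print(exam_score(1, 0), exam_score(2, 0))                # 1 4

# Additivity fails as well: f((1,0) + (1,0)) != f(1,0) + f(1,0).
print(exam_score(1 + 1, 0), exam_score(1, 0) + exam_score(1, 0))  # 4 2
```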

Equivalence

Lemma:

These two definitions of linearity are equivalent. In other words, the conditions

  • f(v + w) = f(v) + f(w) and f(λv) = λ f(v)

imply there exists a matrix A such that f(v) = Av, and any map of this form satisfies these conditions.

If f(v) = Av for some matrix A then f(v + w) = A(v + w) = Av + Aw = f(v) + f(w) and f(λv) = A(λv) = λAv = λ f(v).
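This direction is nothing more than the usual algebra of matrix-vector multiplication; if you want, you can spot-check it numerically with a random matrix (an optional Python/NumPy sketch, with arbitrary sizes):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(3, 4))                  # an arbitrary m-by-n matrix
v, w = rng.normal(size=4), rng.normal(size=4)
lam = 2.5

def f(u):
    return A @ u                             # f(u) = Au

print(np.allclose(f(v + w), f(v) + f(w)))    # True: f(v + w) = f(v) + f(w)
print(np.allclose(f(lam * v), lam * f(v)))   # True: f(lam v) = lam f(v)
```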

Conversely, consider the basis vectors e₁ = (1, 0, …, 0), e₂ = (0, 1, …, 0), …, eₙ = (0, 0, …, 1). Let A be the matrix whose columns are f(e₁), f(e₂), …, f(eₙ).

Then f(v) = f((v₁, …, vₙ)) = f(v₁e₁ + ⋯ + vₙeₙ) = v₁ f(e₁) + ⋯ + vₙ f(eₙ), using the two conditions repeatedly. Since f(eₖ) is the kth column of A, it agrees with A eₖ (also the kth column of A). Therefore f(v) = v₁ A e₁ + ⋯ + vₙ A eₙ = A(v₁e₁ + ⋯ + vₙeₙ) = Av, which shows that f is linear in the sense that it has the form f(v) = Av for some matrix A.
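This proof is also a practical recipe: to find the matrix of a linear map, apply it to each basis vector and record the outputs as the columns. As an optional Python/NumPy sketch (reusing differentiation on coefficient vectors as the map f, written without any matrix), this recovers exactly the matrix D from earlier:

```python
import numpy as np

n = 4  # degree bound, so coefficient vectors have n + 1 entries

def f(v):
    """Differentiation on coefficient vectors (a_0, ..., a_n), written
    without a matrix: the coefficient k*a_k drops into slot k - 1."""
    return np.append(np.arange(1, n + 1) * v[1:], 0.0)

# The columns of A are f applied to the standard basis vectors e_1, ..., e_{n+1}.
A = np.column_stack([f(e) for e in np.eye(n + 1)])
print(A)   # 1, 2, ..., n on the superdiagonal: the matrix D from before

# And f(v) = Av for every v, as the proof says.
v = np.array([5.0, 3.0, 0.0, -2.0, 1.0])
print(np.allclose(f(v), A @ v))   # True
```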