38. Linear maps

38. Linear maps

Two definitions of linearity

At the outset of this course, we talked about the geometric transformations coming from matrices (rotations, reflections, shears etc). These geometric transformations have a name: they are called linear maps. In this video we'll give two definitions of linear maps and show they're equivalent. The first definition encapsulates how we've been dealing with linear maps so far:

Definition:

A map f from R^n to R^m is linear if there exists an m-by-n matrix A such that f of v = A v for all v in R^n.

However, some linear maps are more natural to describe in another way, without giving the matrix A.

Fix n. Consider the space of polynomials P(x) of degree at most n. Differentiation gives a map d by d x sending P to d P by d x from this space to itself. This map is linear. To understand why, we need to understand polynomials as vectors. We encode a polynomial P(x) = a_n x to the n + dot dot dot + a_1 x + a_0 as its vector of coefficients a_0, a_1, up to a_n. Then d P by d x = n a_n x to the n minus 1 + dot dot dot + 2 a_2 x + a_1 which corresponds to the vector of coefficients a_1, 2 a_2, dot dot dot, n a_n. This is the same as applying the matrix 0, 1, 0, dot dot dot; 0, 0, 2, 0, 0, dot dot dot; 0, 0, 0, 3, 0, dot dot dot etc to the vector a_0, a_1, a_2, down to a_n so differentiation is linear (given by a matrix). In other words, if v_P is the vector of coefficients of the polynomial P then D v_P = v_{d P by d x}, where D is this matrix.

This way of encoding polynomials as vectors is a bit artificial. For example, I could have chosen to write the vector with a_n at the top and a_0 at the bottom, and the matrix D would have ended up looking quite different. The fact that differentiation of polynomials is a linear map is an intrinsic fact about differentiation, and our proof above obscures that. So here's an equivalent definition of linearity which is more intrinsic.

Definition:

A map f is linear if:

  • f of v + w = f of v + f of w for all v, w

  • f of lambda v = lambda times f of v for all v and for all lambda in R.

I haven't specified the domain and target of f because I want to be intentionally vague: this definition makes sense whenever the domain and target of f admit operations of addition and rescaling (e.g. spaces of polynomials or functions as well as just R^n). In the final video of the course, we'll see that the natural setting for this definition is the setting of vector spaces.

Differentiation of polynomials is linear because d by d x of P + Q = d P by d x + d Q by d x and d (lambda P) by d x = lambda d P by d x for any constant lambda and polynomials P(x), Q(x).

The function f from R to R which converts metres to feet is linear. Since 1 metre is approximately 3 point 2 8 1 feet, f of x = 3 point 2 8 1 x. If you double the number of metres, you double the number of feet. If you take two distances x metres and y metres you can add them and then convert to feet (f of x + y) or you can convert and then add (f of x + f of y) and you get the same answer. So f is linear.

The function f from R to R which converts Celsius to Kelvin is not linear. Recall that f of 0 is approximately 273. Any linear map satisfies f(0) = 0, because f of 0 = f of 0 times 0, which equals 0 times f of 0, which equals 0 (some of those 0s are numbers, some are vectors!).

I'm told the way they used to mark exams in Oxford was to take the marks from each question, square them and add them up. For example, if there were two questions and you got marks x and y then your final score would be x squared + y squared. This rewards those who do very well on a couple of questions (instead of scatter-gunning a little bit of knowledge over all questions). This function f of x, y = x squared + y squared is not a linear map! For example, if you score 1 and 0 then you get 1 in total, but if you double your score for x then you quadruple your total. Sadly for those taking the exam, f of 0, 0 = 0.

Equivalence

Lemma:

These two definitions of linearity are equivalent. In other words, the conditions

  • f of v + w = f of v + f of w and f of lambda v = lambda times f of v

imply there exists a matrix A such that f of v = A v, and any map of this form satisfies these conditions.

If f of v = A v for some matrix A then f of v + w = A times (v + w) = A w + A w = f of v + f of w and f of lambda v = A of lambda v = lambda A v = lambda f of v.

Conversely, consider the basis vectors e_1 = 1, 0, dot dot dot, 0; e_2 = 0, 1, 0, dot dot dot, up to e_n = 0, 0, dot dot dot, 0, 1 Let A be the matrix whose columns are f of e_1, f of e_2 across to f of e_n.

Then f of v equals f of v_1 down to v_n, which equals f of v_1 e_1 + v_2 e_2 + dot dot dot + v_n e_n, which equals v_1 f of e_1 + dot dot dot + v_n f of e_n Since f of e_k is the kth column of A, it agrees with A e_k (also the kth column of A). Therefore f of v = v A_1 e_1 + dot dot dot + v_n A e_n, which equals A of (v_1 e_1 + dot dot dot + v_n e_n), which shows that f is linear in the sense that it has the form f of v = A v for some matrix A.