$\begin{pmatrix} a & b & c \\ d & e & f \end{pmatrix}$ is a 2-by-3 matrix.
03. Bigger matrices
Just as a 2-by-2 matrix defines a transformation of the plane, an $m$-by-$n$ matrix defines a transformation $\mathbf{R}^n \to \mathbf{R}^m$. An $m$-by-$n$ matrix is a rectangular array of numbers with $m$ rows and $n$ columns.
The transformation $\mathbf{R}^n \to \mathbf{R}^m$ associated to an $m$-by-$n$ matrix $M$ is the map $v \mapsto Mv$ where:
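- $v = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}$
- $M = \begin{pmatrix} M_{11} & M_{12} & \cdots & M_{1n} \\ M_{21} & M_{22} & \cdots & M_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ M_{m1} & M_{m2} & \cdots & M_{mn} \end{pmatrix}$
- $Mv$ is the vector whose $j$th entry is obtained by multiplying the $j$th row of $M$ into the column vector $v$, that is
$$Mv = \begin{pmatrix} M_{11}x_1 + M_{12}x_2 + \cdots + M_{1n}x_n \\ M_{21}x_1 + M_{22}x_2 + \cdots + M_{2n}x_n \\ \vdots \\ M_{m1}x_1 + M_{m2}x_2 + \cdots + M_{mn}x_n \end{pmatrix}$$
This vector $Mv$ has height $m$ because there are $m$ rows of $M$ to multiply into the vector $v$.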
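The row-into-column rule can be sketched in a few lines of plain Python. This is only an illustration: the name `mat_vec` and the choice to store a matrix as a list of rows are assumptions made here, not notation from these notes.

```python
def mat_vec(M, v):
    """Multiply an m-by-n matrix M (a list of m rows) into a vector v of height n."""
    if any(len(row) != len(v) for row in M):
        raise ValueError("each row of M must have as many entries as v")
    # The j-th entry of Mv is the j-th row of M multiplied into v.
    return [sum(M_jk * x_k for M_jk, x_k in zip(row, v)) for row in M]


# A 2-by-3 matrix takes a vector of height 3 and returns one of height 2.
M = [[1, 2, 3],
     [4, 5, 6]]
v = [1, 0, -1]
print(mat_vec(M, v))  # [-2, -2]
```

The output has one entry per row of `M`, which is the statement that $Mv$ has height $m$.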
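For example,
$$\begin{pmatrix} a & b & c \\ d & e & f \\ g & h & i \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} ax + by + cz \\ dx + ey + fz \\ gx + hy + iz \end{pmatrix}.$$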
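Take $M = \begin{pmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix}$. For $v = \begin{pmatrix} x \\ y \\ z \end{pmatrix}$ we get $Mv = \begin{pmatrix} x\cos\theta - y\sin\theta \\ x\sin\theta + y\cos\theta \\ z \end{pmatrix}$. The $z$-coordinate is unchanged, while $x$ and $y$ are rotated by the angle $\theta$.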
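As a numerical sanity check of this example, here is a small sketch in plain Python. The helper names `mat_vec` and `rotation_z` are made up for illustration, and the quarter-turn angle is just a convenient test value.

```python
import math


def mat_vec(M, v):
    # j-th entry of Mv = j-th row of M multiplied into v
    return [sum(a * x for a, x in zip(row, v)) for row in M]


def rotation_z(theta):
    """The 3-by-3 matrix from the example: rotate (x, y) by theta, leave z alone."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0],
            [s, c, 0.0],
            [0.0, 0.0, 1.0]]


# A quarter turn should send (1, 0, 5) to (0, 1, 5), up to floating-point rounding.
w = mat_vec(rotation_z(math.pi / 2), [1.0, 0.0, 5.0])
print([round(t, 12) for t in w])
```

The first entry is not exactly zero in floating point, which is why the result is rounded before printing.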
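Take $M = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix}$. We need to feed $M$ a vector of height 3; it will output a vector of height 2. In other words, $M$ defines a transformation $\mathbf{R}^3 \to \mathbf{R}^2$. What is the transformation?
$$\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} x \\ y \end{pmatrix}.$$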
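Take $M = \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{pmatrix}$. This gives a map $\mathbf{R}^2 \to \mathbf{R}^3$:
$$\begin{pmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} x \\ y \\ 0 \end{pmatrix}.$$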
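The two nonsquare examples above can be tried out directly. In this sketch the names `mat_vec`, `drop_z` and `add_zero` are illustrative labels chosen here, not terminology from the notes.

```python
def mat_vec(M, v):
    # j-th entry of Mv = j-th row of M multiplied into v
    return [sum(a * x for a, x in zip(row, v)) for row in M]


drop_z = [[1, 0, 0],
          [0, 1, 0]]      # 2-by-3: a map from R^3 to R^2

add_zero = [[1, 0],
            [0, 1],
            [0, 0]]       # 3-by-2: a map from R^2 to R^3

print(mat_vec(drop_z, [7, 8, 9]))   # [7, 8]
print(mat_vec(add_zero, [7, 8]))    # [7, 8, 0]
```

In both cases the height of the input matches the number of columns and the height of the output matches the number of rows.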

These rectangular (nonsquare) matrices change the dimension of the space we're working with: they map from a lower-dimensional space to a higher-dimensional one, or vice versa. You might wonder why we need matrices which are bigger than 3-by-3, given that we live in a 3-dimensional universe. In fact:
- The theory of special relativity treats space and time on an equal footing, and the Lorentz transformations, which describe all the weird relativistic effects like time dilation and length contraction, mix up space and time, and are given by 4-by-4 matrices.
- In statistics, data is often represented as a vector of samples; the more samples you have, the bigger the dimension of the vector you need to encode them.
More examples
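Take $M = \begin{pmatrix} 1 & 1 \\ 2 & 0 \\ 0 & 1 \end{pmatrix}$. This defines a map $\mathbf{R}^2 \to \mathbf{R}^3$:
$$M \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} x + y \\ 2x \\ y \end{pmatrix}.$$
- The $x$-axis (vectors of the form $\begin{pmatrix} x \\ 0 \end{pmatrix}$) goes to the set of vectors $\begin{pmatrix} x \\ 2x \\ 0 \end{pmatrix}$.
- The $y$-axis (vectors of the form $\begin{pmatrix} 0 \\ y \end{pmatrix}$) goes to the set of vectors $\begin{pmatrix} y \\ 0 \\ y \end{pmatrix}$.
The image of $M$ is the unique plane containing these two lines.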
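One way to see that every output really lies in a single plane is to compute a normal vector to the two lines and check that all outputs are perpendicular to it. The cross product is used here only as a convenient device and is not part of the argument in these notes; the helper names are hypothetical.

```python
def mat_vec(M, v):
    # j-th entry of Mv = j-th row of M multiplied into v
    return [sum(a * x for a, x in zip(row, v)) for row in M]


def cross(a, b):
    """Cross product of two vectors in R^3."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


M = [[1, 1],
     [2, 0],
     [0, 1]]

image_of_x_axis = mat_vec(M, [1, 0])   # direction (1, 2, 0)
image_of_y_axis = mat_vec(M, [0, 1])   # direction (1, 0, 1)
normal = cross(image_of_x_axis, image_of_y_axis)
print(normal)  # [2, -1, -2]

# Every output M(x, y) is perpendicular to the normal, so it lies in the plane.
for x, y in [(1, 1), (3, -2), (-5, 4)]:
    print(dot(normal, mat_vec(M, [x, y])))  # 0 each time
```

Algebraically this is the identity $2(x+y) - 2x - 2y = 0$, so under these assumptions the plane is $2X - Y - 2Z = 0$ in coordinates $(X, Y, Z)$ on $\mathbf{R}^3$.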

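Take $M = \begin{pmatrix} 1 & 0 & -1 \\ 0 & 1 & -1 \end{pmatrix}$. This defines a map $\mathbf{R}^3 \to \mathbf{R}^2$:
$$\begin{pmatrix} 1 & 0 & -1 \\ 0 & 1 & -1 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} x - z \\ y - z \end{pmatrix}.$$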

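Every vector of the form $\begin{pmatrix} t \\ t \\ t \end{pmatrix}$ is sent to zero, since $x - z = y - z = 0$ there, so $M$ projects $\mathbf{R}^3$ onto $\mathbf{R}^2$ along the line $x = y = z$. This line along which we're projecting has a name: it's called the kernel of $M$. More on this later.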
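A quick check of this, assuming the line in question is the one spanned by $(1, 1, 1)$ (which is what solving $x - z = 0$, $y - z = 0$ gives). The helper `mat_vec` and the sample vectors are illustrative choices.

```python
def mat_vec(M, v):
    # j-th entry of Mv = j-th row of M multiplied into v
    return [sum(a * x for a, x in zip(row, v)) for row in M]


M = [[1, 0, -1],
     [0, 1, -1]]

# Points on the line x = y = z are all sent to the zero vector.
for t in (-2, 0, 3):
    print(mat_vec(M, [t, t, t]))  # [0, 0]

# Moving a vector along that line does not change its image.
p = [4, 5, 6]
q = [4 + 10, 5 + 10, 6 + 10]
print(mat_vec(M, p), mat_vec(M, q))  # [-2, -1] [-2, -1]
```

The second check is what "projecting along the line" means in practice: two inputs that differ by a multiple of $(1, 1, 1)$ cannot be told apart from their outputs.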