词汇表

选择左侧的一个关键字...

Linear AlgebraDot Products

阅读时间: ~35 min

Consider a shop inventory which lists unit prices and quantities for each of the products they carry. For example, if the store has 32 small storage boxes at $4.99 each, 18 medium-sized boxes at $7.99 each, and 14 large boxes at $9.99 each, then the inventory's price vector \mathbf{p} and quantity vector \mathbf{q} are

\begin{align*}\mathbf{p} = \begin{bmatrix} 4.99 \\\ 7.99 \\\ 9.99 \end{bmatrix}, \quad \mathbf{q} = \begin{bmatrix} 32 \\\ 18 \\\ 14 \end{bmatrix}.\end{align*}

The total value of the boxes in stock is

\begin{align*}(32)($4.99) + (18)($7.99) + (14)($9.99) = $443.36.\end{align*}

This operation—multiplying two vectors' entries in pairs and summing—arises often in applications of linear algebra and is also foundational in the theory of linear algebra.

Definition
The dot product of two vectors in \mathbb{R}^n is defined by

\begin{align*}\mathbf{x} \cdot \mathbf{y} = x_1y_1 + x_2y_2 + \cdots + x_n y_n.\end{align*}

Example
If \mathbf{x} = \begin{bmatrix} 1 \\\ 3 \\\ 5 \\\ 7 \end{bmatrix} and \mathbf{y} =\begin{bmatrix} 2 \\\ 4 \\\ 6 \\\ 8 \end{bmatrix}, then \mathbf{x} \cdot \mathbf{y} = + + + = 100.

One of the most algebraically useful features of the dot product is its linearity (which may be checked using the definition):

\begin{align*}\mathbf{x} \cdot (c\mathbf{y} + \mathbf{z}) = c \mathbf{x} \cdot \mathbf{y} + \mathbf{x} \cdot \mathbf{z}\end{align*}

The dot product also has two fundamental connections to geometry. The first is the identity

\begin{align*}|\mathbf{a}|^2 = \mathbf{a} \cdot \mathbf{a}\end{align*}

for all vectors \mathbf{a}. Let's see how this identity can work in conjunction with linearity of the dot product.

Exercise
Show that |\mathbf{a} +\mathbf{b}|^2 = |\mathbf{a}|^2 + 2 \mathbf{a}\cdot \mathbf{b} + |\mathbf{b}| ^2 for all vectors \mathbf{a} and \mathbf{b} in \mathbb{R}^n.

Solution. Using linearity of the dot product, we get

\begin{align*}(\mathbf{a} + \mathbf{b}) \cdot (\mathbf{a} + \mathbf{b}) &= \mathbf{a} \cdot (\mathbf{a} + \mathbf{b}) + \mathbf{b}\cdot (\mathbf{a} + \mathbf{b}) \\\ &= \mathbf{a} \cdot \mathbf{a} + \mathbf{a}\cdot\mathbf{b} + \mathbf{b} \cdot \mathbf{a} + \mathbf{b} \cdot \mathbf{b} \\\ &= |\mathbf{a}|^2 + 2\mathbf{a}\cdot\mathbf{b} + |\mathbf{b}|^2\end{align*}

as required.

The second connection between geometry and the dot product pertains to angle. If \theta is the angle between two vectors \mathbf{x} and \mathbf{y} (when they are situated so that their tails coincide), then

\begin{align*}\mathbf{x} \cdot \mathbf{y} = |\mathbf{x}| |\mathbf{y}|\cos\theta.\end{align*}

It follows that \mathbf{x} \cdot \mathbf{y} = 0 if and only if \mathbf{x} and \mathbf{y} meet at a angle. We say that two vectors \mathbf{x} and \mathbf{y} which satisfy \mathbf{x} \cdot \mathbf{y} = 0 are orthogonal.

Exercise
In natural language processing, one basic way to compare a finite number of text documents is to use vectorized word counts. Suppose the documents have a combined total of n distinct words, which are arranged in some order. Each document is then associated with a vector of length n whose $i$th entry indicates the number of times the i th word occurs in the associated document.

One way to measure similarity between two documents is to take the dot product of the associated unit vectors: If two documents A and B have associated vectors \mathbf{a} and \mathbf{b} respectively, their similarity is defined by

\begin{align*}S(A, B) = \frac{\mathbf{a} \cdot \mathbf{b}}{|\mathbf{a}| |\mathbf{b}|}.\end{align*}

By the dot product cosine formula, we have 0 \leq S(A, B) \leq 1 for any two documents A and B. Documents with no words in common are associated with orthogonal vectors and thus have 0 similarity. If two documents have similarity 1, their associated vectors are scalar multiples of each other, meaning that they have the same words and that the words appear in the same proportions.

The vectorized word count similarity between the sentences

"The rain in Spain falls mainly in the plain"

"The plain lane in Spain is mainly a pain"

is .

Solution. Listing the words in the order the, in, rain, Spain, falls, mainly, plain, lane, pain, is, a, the two vectorized word counts are [2,2,1,1,1,1,1,0,0,0,0] and [1,1,0,1,0,1,1,1,1,1,1]. Substituting into the definition of S, we get a similarity of approximately 0.647.

Exercise
Let \mathbf{v}_1, \dots, \mathbf{v}_n be a list of orthogonal non-zero vectors, that is, for all i, j \in \{1, \dots, n\}, suppose that \mathbf{v}_i \cdot \mathbf{v}_j = 0 whenever i \neq j. Show that this list is linearly independent.

Solution. Suppose, for the sake of contradiction, that the vectors are linearly . Then one of the vectors can be written as a linear combination of the others. Suppose \mathbf{v}_1 is such a vector. Then there exists a list of weights c_2, \dots, c_n such that

\begin{align*}\mathbf{v}_1 = c_2\mathbf{v}_2 + \cdots + c_n\mathbf{v}_n.\end{align*}

Then

\begin{align*}0 &= \mathbf{v}_1 \cdot \mathbf{v}_2 \\\ & = (c_2\mathbf{v}_2 + \cdots + c_n\mathbf{v}_n) \cdot \mathbf{v}_2 \\\ & = c_2|\mathbf{v}_2|^2.\end{align*}

Since |\mathbf{v}_2|^2 = 1, this implies that c_2 is zero. Repeating this for all vectors \mathbf{v}_3, \dots, \mathbf{v}_n we see that c_2=c_3 = \cdots = c_n = 0. Thus \mathbf{v}_1 is also zero (since it's a linear combination of the other vectors, with all zero weights), and that contradicts the fact that |\mathbf{v}_1|^2 = .

The same reasoning tells us that none of the vectors in the list can be equal to a linear combination of the others. Therefore the vectors must be linearly .

The following exercise illustrates another way of calculating matrix products. We will call it the matrix product dot formula:

Exercise
Let A = \begin{bmatrix} 3 & -1 & 2 \\\ 4 & 2 & 0 \end{bmatrix} and B = \begin{bmatrix} 4 & -5 & 0 & 1 \\\ 2 & 8 & 0 & 0 \\\ -1 & 5 & 3 & 2 \end{bmatrix}. Consider the matrix C whose $(i,j)$th entry is equal to the dot product of the $i$th row of A and the $j$th column of B. Show that C = AB, and use this fact to work out the full product AB.

Solution. By the product column rule, the first column of AB is A\mathbf{b}_1, where \mathbf{b}_1 is the first of B. Therefore, the first entry of that column is A_{1,1}B_{1,1} + A_{1,2}B_{2,1} +\cdots + A_{1,n}B_{n,1}. This is the dot product of the first row of A and the first column of B. The same reasoning applies to the other entries.

Calculating all eight such dot products, we find that

\begin{align*}\begin{bmatrix} 8 & -13 & 6 & 7 \\\ 20 & -4 & 0 & 4 \end{bmatrix}\end{align*}

Block multiplication

A block matrix is a matrix defined using smaller matrices which are called blocks. For example, suppose that

\begin{align*}A &= \begin{bmatrix} 2 & 4 & 7 & 6 \\ 5 & 2 & 4 & 5 \end{bmatrix} \\ B &= \begin{bmatrix} 1 & 3 \\\ 0 & 2 \end{bmatrix} \\ C &= \begin{bmatrix} 3 & 0 & 1 & 7 \end{bmatrix} \\ D &= \begin{bmatrix} 6 & 4 \end{bmatrix}\end{align*}

Then the block matrix \begin{bmatrix} A & B \\\ C & D \end{bmatrix} defined in terms of the blocks A, B, C, and D is

\begin{align*}\begin{bmatrix} A & B \\ C & D \end{bmatrix} = \begin{bmatrix} 2 & 4 & 7 & 6 & 1 & 3 \\ 5 & 2 & 4 & 5 & 0 & 2 \\ 3 & 0 & 1 & 7 & 6 & 4 \end{bmatrix}.\end{align*}

The advantage of writing a matrix in block form is that we can formally carry out the matrix multiplication dot formula, treating the blocks as matrix entries, and we get the correct result (in block form). For example,

\begin{align*}\begin{bmatrix} A & B \\ C & D \end{bmatrix}\begin{bmatrix} E & F \\ G & H \end{bmatrix} = \begin{bmatrix} AE + BG & AF + BH \\ CE + DG & CF + DH \end{bmatrix}\end{align*}

if \begin{bmatrix} A & B \\\ C & D \end{bmatrix} and \begin{bmatrix} E & F \\\ G & H \end{bmatrix} are block matrices with blocks A, B, C, D, E, F, G, blocks H. We call this the block matrix product formula.

Exercise
Verify the matrix product block formula above with

\begin{align*}E &= \begin{bmatrix} 7 \\\ 0 \\\ 2 \\\ 4 \end{bmatrix}, F = \begin{bmatrix} 5 & 3 \\\ 3 & 2 \\\ 0 & 6 \\\ 2 & 1 \end{bmatrix}, G = \begin{bmatrix} 6 \\\ 1 \end{bmatrix}, \text{ and } H = \begin{bmatrix} 2 & 0 \\\ 0 & 2 \end{bmatrix}.\end{align*}

Solution. We have

\begin{align*}AE + BG &= \begin{bmatrix} 61 \\\ 65 \end{bmatrix} \\ CE + DG &= \begin{bmatrix} 91 \end{bmatrix} \\ AF + BH &= \begin{bmatrix} 36 & 68 \\\ 41 & 52 \end{bmatrix} \\ CF + DH &= \begin{bmatrix} 41 & 30 \end{bmatrix}\end{align*}

while

\begin{align*}\begin{bmatrix} A & B \\\ C & D \end{bmatrix}\begin{bmatrix} E & F \\\ G & H \end{bmatrix} = \begin{bmatrix} 61 & 36 & 68 \\\ 65 & 41 & 52 \\\ 91 & 41 & 30 \end{bmatrix}.\end{align*}

So the block matrix product formula checks out.

Exercise
Show that if A is a matrix whose columns are \mathbf{a}_1, \ldots, \mathbf{a}_n and B is a matrix whose rows are \mathbf{b}_1', \ldots, \mathbf{b}_n', then AB = \mathbf{a}_1\mathbf{b}_1' + \mathbf{a}_2\mathbf{b}_2' + \cdots + \mathbf{a}_n\mathbf{b}_n'

Solution. This follows directly from the block matrix product formula by writing A is a block matrix with its columns as blocks and B with its rows as blocks.

Bruno
Bruno Bruno