```python
import torch

u, v = torch.arange(3), torch.arange(3)  # torch.tensor([0, 1, 2])
u @ v  # torch.dot(u, v)
```

```
tensor(5)
```
Theo POMIES
August 29, 2025
September 4, 2025
\[ \mathbf{u} \cdot \mathbf{v} = \sum_{i} u_i v_i \]
\[ (\mathbf{A}\mathbf{u})_{i} = \mathbf{a}_{i,:} \cdot \mathbf{u} = \sum_{j} a_{i,j} u_j \]
\[ (\mathbf{A}\mathbf{B})_{i,j} = \mathbf{a}_{i,:} \cdot \mathbf{b}_{:, j} = \sum_{k} a_{i,k} b_{k,j} \]
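Here is a quick PyTorch illustration of these three products (the shapes and values are just ones I picked for the example):

```python
import torch

A = torch.arange(6).reshape(2, 3).float()   # 2x3 matrix
B = torch.arange(12).reshape(3, 4).float()  # 3x4 matrix
u = torch.arange(3).float()                 # length-3 vector

torch.dot(u, u)   # sum_i u_i * u_i
(A @ u).shape     # torch.Size([2]);    entry i is a_{i,:} . u
(A @ B).shape     # torch.Size([2, 4]); entry (i, j) is a_{i,:} . b_{:,j}
```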
\[ \mathbf{A}\mathbf{A}^{-1} = \mathbf{A}^{-1}\mathbf{A} = \mathbf{I},\quad \mathbf{A} \in \mathbb{R}^{n \times n} \] Note: the inverse might not exist; \(\mathbf{A}\) is not invertible when \(\det(\mathbf{A}) = 0\)
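A quick check in PyTorch (the matrix is arbitrary, as long as its determinant is non-zero):

```python
import torch

A = torch.tensor([[2., 1.],
                  [1., 3.]])             # det = 5, so the inverse exists
A_inv = torch.linalg.inv(A)

torch.allclose(A @ A_inv, torch.eye(2))  # True (up to floating-point error)
torch.allclose(A_inv @ A, torch.eye(2))  # True
```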
\[ \det(\mathbf{A}^\top) = \det(\mathbf{A}) \] \[ \det(\mathbf{A}\mathbf{B}) = \det(\mathbf{A})\det(\mathbf{B}) \]
\[ A = \operatorname{diag}(a_1, \dots, a_n), \quad \det(A) = \prod_{i=1}^n a_i \]
\[ \det(T) = \prod_{i} t_{ii}, \quad \text{where $T$ is triangular} \]
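These properties are easy to sanity-check numerically; here is a small sketch with matrices I made up:

```python
import torch

A = torch.tensor([[2., 1.], [5., 3.]])
B = torch.tensor([[1., 4.], [0., 2.]])

torch.allclose(torch.linalg.det(A.T), torch.linalg.det(A))   # True
torch.allclose(torch.linalg.det(A @ B),
               torch.linalg.det(A) * torch.linalg.det(B))    # True

T = torch.tensor([[2., 7., 1.],
                  [0., 3., 5.],
                  [0., 0., 4.]])                              # upper triangular
torch.linalg.det(T), torch.prod(torch.diag(T))                # both 24 = 2 * 3 * 4
```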
\[ \mathbf{A} \, \textnormal{is orthogonal} \iff \mathbf{A}^\top\mathbf{A} = \mathbf{I}\: (= \mathbf{A}\mathbf{A}^\top) \]
Interestingly,
\[ \mathbf{A} \, \textnormal{is orthogonal} \iff \mathbf{A}^{-1} = \mathbf{A}^\top \]
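A classic example of an orthogonal matrix is a 2D rotation; here is a quick check (the angle is arbitrary):

```python
import math
import torch

theta = 0.3
Q = torch.tensor([[math.cos(theta), -math.sin(theta)],
                  [math.sin(theta),  math.cos(theta)]])   # 2D rotation matrix

torch.allclose(Q.T @ Q, torch.eye(2), atol=1e-6)           # True
torch.allclose(torch.linalg.inv(Q), Q.T, atol=1e-6)        # True
```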
In linear algebra, vectors are column vectors by default, so \[ \mathbf{u} \in \mathbb{R}^{n \times 1}, \quad \mathbf{u} = \begin{bmatrix} u_1 \\ \vdots \\ u_n \end{bmatrix} \]
It follows that when doing a matrix-vector product, the matrix is on the left: \(\mathbf{A}\mathbf{u}\).
Similarly, when we say “we apply a linear transformation \(\mathbf{A}\) to \(\mathbf{B}\)”, we mean \(\mathbf{A}\mathbf{B}\).
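In PyTorch you can make the column-vector convention explicit by giving the vector shape (n, 1), with the matrix on the left (the values here are arbitrary):

```python
import torch

A = torch.tensor([[1., 2., 3.],
                  [4., 5., 6.]])        # 2x3
u = torch.tensor([[1.], [0.], [2.]])    # explicit column vector, shape (3, 1)

A @ u             # matrix on the left; result has shape (2, 1), still a column
A @ u.squeeze()   # torch also happily takes a 1-D tensor and returns shape (2,)
```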
I think of the determinant of \(\mathbf{A}\) as the (signed) factor by which the linear transformation represented by \(\mathbf{A}\) scales volumes.
This explains why a matrix whose determinant is 0 is not invertible: it “collapses” space onto a lower-dimensional subspace, so two different inputs \(\mathbf{x}\) can end up with the same image, and there is no way to recover the original input.
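For instance (with a matrix I picked so that its rows are linearly dependent), two different inputs land on the same image:

```python
import torch

A = torch.tensor([[1., 2.],
                  [2., 4.]])   # second row = 2 * first row, so det = 0

x1 = torch.tensor([1., 0.])
x2 = torch.tensor([3., -1.])   # a different input

A @ x1, A @ x2                 # both are tensor([1., 2.]): the inputs "collapse"
torch.linalg.det(A)            # tensor(0.)
```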
Later!
Gaussian elimination — or row reduction — is an algorithm for solving systems of linear equations, based on the following operations:
Computing the determinant of a matrix is not trivial at first glance.
But consider the following facts:
Knowing that, we want to find a factorization of our original matrix \(\mathbf{A}\) that involves an upper triangular matrix \(\mathbf{U}\) and one or more other matrices whose determinants are known or trivial to compute, namely \(\mathbf{P}\mathbf{A} = \mathbf{L}\mathbf{U}\).
To go from \(\mathbf{A}\) to \(\mathbf{U}\) we’ll use Gaussian elimination: \(\mathbf{P}\) tracks our permutations (row swaps) and \(\mathbf{L}\) tracks our row operations (row additions).
Now, because \(\mathbf{P}\) is orthogonal (yes, since it’s just the identity matrix with rows swapped, when performing \(\mathbf{P}^\top\mathbf{P}\) the ones in the rows meet the ones in the columns at the diagonal, and zeros everywhere else, so we get \(\mathbf{P}^\top\mathbf{P} = \mathbf{I}\)), we then have \[ \mathbf{P}\mathbf{A} = \mathbf{L}\mathbf{U} \implies \mathbf{A} = \mathbf{P}^{-1}\mathbf{L}\mathbf{U} = \mathbf{P}^\top\mathbf{L}\mathbf{U} \]
Finally, this means that \[ \det(\mathbf{A}) = \det(\mathbf{P}^\top) \cdot \det(\mathbf{L}) \cdot \det(\mathbf{U}) = \det(\mathbf{P}) \cdot \det(\mathbf{L}) \cdot \det(\mathbf{U}) \]
Now I’m not gonna prove that, but:
Now, if we just keep track of row swaps, we can easily compute \(\det(\mathbf{A})\)!
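Here is a minimal sketch of that idea in PyTorch: run Gaussian elimination with partial pivoting, count the row swaps, and multiply the diagonal of the resulting \(\mathbf{U}\). (The function name and the test matrix are mine; for real work you would just call torch.linalg.det, or torch.linalg.lu to get \(\mathbf{P}\), \(\mathbf{L}\), \(\mathbf{U}\) directly.)

```python
import torch

def det_via_elimination(A: torch.Tensor) -> torch.Tensor:
    """det(A) = (-1)^(number of row swaps) * prod(diag(U)).

    A rough sketch, not meant to be numerically bulletproof.
    """
    U = A.clone().float()
    n = U.shape[0]
    swaps = 0
    for k in range(n):
        # Partial pivoting: bring the largest |entry| of column k (at or below row k) up to row k
        pivot = int(torch.argmax(U[k:, k].abs())) + k
        if pivot != k:
            U[[k, pivot]] = U[[pivot, k]]   # row swap -> flips the sign of det
            swaps += 1
        if U[k, k] == 0:                     # the whole column is zero: singular matrix
            return torch.tensor(0.)
        # Row additions: eliminate entries below the pivot (these don't change det)
        U[k + 1:] -= (U[k + 1:, k:k + 1] / U[k, k]) * U[k:k + 1]
    return (-1) ** swaps * torch.prod(torch.diag(U))

A = torch.tensor([[0., 2., 1.],
                  [3., 1., 4.],
                  [1., 5., 9.]])
det_via_elimination(A), torch.linalg.det(A)   # the two should agree
```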