Trianam's notes
a blog about machine learning and artificial intelligence

Linear Algebra

April 10, 2018

Stefano Martina



Notation

Vectors

A vector is a point in a multi-dimensional space. It can be viewed as an ordered array of coordinates with respect to a basis¹. A vector can be arranged as a column or as a row; we assume column vectors when not otherwise specified.

Vectors are usually denoted by bold lower-case letters, and their elements with the same letter in italics with indices. A vector of an $n$-dimensional space $\vec{x}\in \mR^n$ is written as:

$$\vec{x}=\begin{bmatrix}x_1\\x_2\\\vdots\\x_n\end{bmatrix}$$

where $x_1$ is the coordinate of the first dimension, $x_2$ of the second one, and so on.
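
To make the notation concrete, here is a minimal sketch using Python with NumPy (the choice of NumPy is an assumption of these examples, not something the notation requires):

```python
import numpy as np

# A vector of R^4, an ordered array of coordinates.
x = np.array([1.0, 2.0, 3.0, 4.0])

print(x.shape)  # (4,) -- a point in a 4-dimensional space
print(x[0])     # 1.0  -- x_1; NumPy indices start at 0, the notation at 1
```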

Matrices

A two-dimensional generalization of a vector is called a matrix. Matrices are denoted by bold upper-case letters, and their elements with the same letter in italics with indices. A matrix with $n$ rows and $m$ columns $\mat{A}\in\mR^{n\times m}$ is written as:

$$\mat{A}=\begin{bmatrix}A_{1,1}&A_{1,2}&\cdots&A_{1,m}\\A_{2,1}&A_{2,2}&\cdots&A_{2,m}\\\vdots&\vdots&\ddots&\vdots\\A_{n,1}&A_{n,2}&\cdots&A_{n,m}\end{bmatrix}$$

Sometimes it is useful to denote the elements of $\mat{A}$ with $a_{i,j}$ instead of $A_{i,j}$.

It is possible to denote the $i$-th row of $\mat{A}$ with $\mat{A}_{i,:}$ and the $j$-th column with $\mat{A}_{:,j}$.
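
Continuing the NumPy sketch, rows and columns of a matrix correspond to slices of a two-dimensional array:

```python
import numpy as np

# A matrix of R^{2x3}: 2 rows and 3 columns.
A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])

print(A[0, 1])  # 2.0 -- the element A_{1,2} (0-based indices in NumPy)
print(A[0, :])  # [1. 2. 3.] -- the first row
print(A[:, 2])  # [3. 6.] -- the third column
```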

Tensors

Tensors are the generalization of matrices to an arbitrary number of dimensions. They are multidimensional arrays of numbers arranged on a regular grid. Tensors are denoted with bold upper-case sans-serif letters, and their elements with the same letter with indices.

A tensor $\ten{A}\in\mR^{n\times m\times\cdots\times p}$ has $n\times m\times\cdots\times p$ elements identified by $\tenE{A}_{i,j,\dots,k}$, with $i\in[1,n]$, $j\in[1,m]$, $\dots$ , $k\in[1,p]$.
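
In the same NumPy sketch, a tensor is simply an array with more than two dimensions:

```python
import numpy as np

# A tensor of R^{2x3x4}: 2 * 3 * 4 = 24 elements on a regular grid.
T = np.arange(24.0).reshape(2, 3, 4)

print(T.shape)     # (2, 3, 4)
print(T[0, 1, 2])  # 6.0 -- the element with indices (1, 2, 3) in 1-based notation
```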

Operations

Transposition

Vectors and matrices can be transposed. The transpose of a matrix $\mat{A}\in\mR^{n\times m}$ is the matrix $\mat{A}^T\in\mR^{m\times n}$ obtained by reflecting it along the main diagonal. For each element:

$$(\mat{A}^T)_{i,j}=A_{j,i}$$

The transpose of a vector $\vec{x}$ transforms it into a row vector if it was a column vector, and vice versa.
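
A quick NumPy check of transposition (again a sketch, assuming NumPy):

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])

# Reflecting along the main diagonal swaps the dimensions: (2, 3) -> (3, 2),
# and (A^T)_{i,j} = A_{j,i} for each element.
print(A.T.shape)             # (3, 2)
print(A.T[2, 0] == A[0, 2])  # True
```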

Matrix product

It is possible to multiply two matrices $\mat{A}\in\mR^{m\times n}$ and $\mat{B}\in\mR^{n\times p}$, provided that the number of columns of the first equals the number of rows of the second. The result is the matrix $\mat{C}\in\mR^{m\times p}$:

$$\mat{C}=\mat{A}\mat{B}$$

where each element is defined as:

$$C_{i,j}=\sum_{k=1}^{n}A_{i,k}B_{k,j}$$

Matrix multiplication has the distributive property

$$\mat{A}(\mat{B}+\mat{C})=\mat{A}\mat{B}+\mat{A}\mat{C}$$

and the associative property

$$\mat{A}(\mat{B}\mat{C})=(\mat{A}\mat{B})\mat{C}$$

but it does not have the commutative property: $\mat{A}\mat{B}=\mat{B}\mat{A}$ does not always hold.

The transpose of a matrix product has the property:

$$(\mat{A}\mat{B})^T=\mat{B}^T\mat{A}^T$$
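
The properties above can be checked numerically. Here is a small sketch with NumPy and random matrices (floating-point arithmetic is approximate, hence `np.allclose`):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((2, 3))
B = rng.standard_normal((3, 4))
C = rng.standard_normal((3, 4))
D = rng.standard_normal((4, 5))

print((A @ B).shape)                            # (2, 4): (m x n)(n x p) -> (m x p)
print(np.allclose(A @ (B + C), A @ B + A @ C))  # True: distributive
print(np.allclose(A @ (B @ D), (A @ B) @ D))    # True: associative
print(np.allclose((A @ B).T, B.T @ A.T))        # True: (AB)^T = B^T A^T
```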

Hadamard product

It is also possible to define an element-wise product of matrices of the same dimensions. The result of the Hadamard product of the matrices $\mat{A}\in\mR^{n\times m}$ and $\mat{B}\in\mR^{n\times m}$ is the matrix $\mat{C}\in\mR^{n\times m}$:

$$\mat{C}=\mat{A}\odot\mat{B}$$

where each element is defined as:

$$C_{i,j}=A_{i,j}B_{i,j}$$
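
In NumPy the `*` operator on same-shape arrays is exactly this element-wise product, while `@` is the matrix product:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
B = np.array([[10.0, 20.0],
              [30.0, 40.0]])

print(A * B)  # Hadamard product: [[10. 40.], [90. 160.]]
print(A @ B)  # matrix product, a different operation: [[70. 100.], [150. 220.]]
```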

Vector dot product

The dot product (or scalar product) between two same-dimensionality vectors $\vec{x}\in\mR^n$ and $\vec{y}\in\mR^n$ is defined as the matrix product $\vec{x}^T\vec{y}$.

Formally, it is the scalar defined as:

$$\vec{x}^T\vec{y}=\sum_{i=1}^{n}x_iy_i$$

Unlike the matrix product, the dot product is commutative:

$$\vec{x}^T\vec{y}=\vec{y}^T\vec{x}$$
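
A short sketch of the dot product in NumPy:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, 5.0, 6.0])

print(np.dot(x, y))                  # 32.0 -- the scalar sum_i x_i y_i
print(np.sum(x * y))                 # 32.0 -- the same value, written element-wise
print(np.dot(x, y) == np.dot(y, x))  # True -- the dot product is commutative
```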

Linear combination and span

Given a set of vectors

\begin{equation}\label{eq:setSpan}\{\vec{x}^{(1)},\dots,\vec{x}^{(n)}\}\end{equation}

it is possible to define a linear combination of such vectors with scalar coefficients $c_1,\dots,c_n$:

$$\sum_{i=1}^{n}c_i\vec{x}^{(i)}$$

The span of the set \eqref{eq:setSpan} is the set of all vectors that can be obtained as a linear combination of its elements:

$$\mathrm{span}\left(\vec{x}^{(1)},\dots,\vec{x}^{(n)}\right)=\left\{\sum_{i=1}^{n}c_i\vec{x}^{(i)}\;\middle|\;c_1,\dots,c_n\in\mR\right\}$$
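
For example, a linear combination sketched in NumPy with two vectors of $\mR^3$:

```python
import numpy as np

# Two vectors of R^3 and scalar coefficients c_1, c_2.
x1 = np.array([1.0, 0.0, 0.0])
x2 = np.array([0.0, 1.0, 0.0])
c1, c2 = 2.0, -3.0

print(c1 * x1 + c2 * x2)  # [ 2. -3.  0.]
# Every combination of x1 and x2 has a zero third coordinate, so the span
# of {x1, x2} is the plane of R^3 spanned by the first two axes.
```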

Linear independence

A set of vectors like \eqref{eq:setSpan} is linearly independent if no vector in it can be expressed as a linear combination of the others.
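
A common numerical check of linear independence, sketched with NumPy's `matrix_rank`: the vectors are independent exactly when the matrix having them as columns has rank equal to the number of vectors.

```python
import numpy as np

# Stack the vectors as columns of a matrix.
independent = np.column_stack(([1.0, 0.0, 0.0],
                               [0.0, 1.0, 0.0]))
dependent = np.column_stack(([1.0, 2.0, 3.0],
                             [2.0, 4.0, 6.0]))  # second column = 2 * first

print(np.linalg.matrix_rank(independent))  # 2 -> linearly independent
print(np.linalg.matrix_rank(dependent))    # 1 -> linearly dependent
```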

Identity and Inverse matrix

Identity matrix

The identity matrix is a special square matrix that does not change the value of a vector when the vector is multiplied by it. It is denoted with $\mat{I}_n\in\mR^{n\times n}$ and it is equal to:

$$\mat{I}_n=\begin{bmatrix}1&0&\cdots&0\\0&1&\cdots&0\\\vdots&\vdots&\ddots&\vdots\\0&0&\cdots&1\end{bmatrix}$$

where

$$(\mat{I}_n)_{i,j}=\begin{cases}1&\text{if }i=j\\0&\text{otherwise}\end{cases}$$

This matrix has the property that:

$$\mat{I}_n\vec{x}=\vec{x}\quad\forall\vec{x}\in\mR^n$$
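
As a sketch in NumPy, where the identity matrix is built with `np.eye`:

```python
import numpy as np

I3 = np.eye(3)  # the identity matrix I_3
x = np.array([7.0, -1.0, 2.0])

print(I3 @ x)                     # [ 7. -1.  2.]
print(np.array_equal(I3 @ x, x))  # True: multiplying by I_n leaves x unchanged
```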

Inverse matrix

Given a matrix $\mat{A}\in\mR^{n\times m}$, its inverse is denoted with $\mat{A}^{-1}\in\mR^{m\times n}$ and it is the matrix such that:

$$\mat{A}^{-1}\mat{A}=\mat{I}_m$$

The inverse of a matrix does not always exist. The existence of $\mat{A}^{-1}$ is related to the solutions of a system of linear equations. Consider the system:

\begin{equation}\label{eq:systemBig}\begin{cases}A_{1,1}x_1+A_{1,2}x_2+\cdots+A_{1,m}x_m=b_1\\A_{2,1}x_1+A_{2,2}x_2+\cdots+A_{2,m}x_m=b_2\\\quad\vdots\\A_{n,1}x_1+A_{n,2}x_2+\cdots+A_{n,m}x_m=b_n\end{cases}\end{equation}

where the $A_{i,j}$ are coefficients, the $x_i$ variables and the $b_i$ constant terms. It is possible to express \eqref{eq:systemBig} in a compact form using matrix notation:

\begin{equation}\label{eq:system}\mat{A}\vec{x}=\vec{b}\end{equation}

If $\mat{A}^{-1}$ exists, then we can use it to solve \eqref{eq:system}:

$$\mat{A}^{-1}\mat{A}\vec{x}=\mat{A}^{-1}\vec{b}\implies\vec{x}=\mat{A}^{-1}\vec{b}$$

Thus, in order for $\mat{A}^{-1}$ to exist, \eqref{eq:system} needs to have one and only one solution for every $\vec{b}$.

Note that, by the definition of matrix product, $\mat{A}\vec{x}$ is a vector in $\mR^n$, and

$$\mat{A}\vec{x}=x_1\mat{A}_{:,1}+x_2\mat{A}_{:,2}+\cdots+x_m\mat{A}_{:,m}$$

or more compactly

$$\mat{A}\vec{x}=\sum_{j=1}^{m}x_j\mat{A}_{:,j}$$

In other terms, $\vec{b}$ is a linear combination of the columns of $\mat{A}$. Thus, we need to verify if $\vec{b}\in\mathrm{span}\left(\mat{A}_{:,1},\dots,\mat{A}_{:,m}\right)$ to determine if \eqref{eq:system} has a solution².

Without going into details, in order for \eqref{eq:system} to have one and only one solution for every $\vec{b}$ (and thus for $\mat{A}^{-1}$ to exist), $\mat{A}$ needs to be a square matrix with all its columns linearly independent³.
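
A final NumPy sketch for this section: solving \eqref{eq:system} for a square, non-singular $\mat{A}$, both with the explicit inverse and with `np.linalg.solve` (the preferred route in practice, since it avoids forming $\mat{A}^{-1}$):

```python
import numpy as np

# A square matrix with linearly independent columns, so A^{-1} exists.
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([3.0, 5.0])

x = np.linalg.inv(A) @ b      # x = A^{-1} b
print(x)                      # [0.8 1.4]
print(np.allclose(A @ x, b))  # True: x solves Ax = b

# Numerically safer and cheaper: solve the system without forming A^{-1}.
print(np.linalg.solve(A, b))  # [0.8 1.4]
```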

Note that for square matrices the left inverse and the right inverse coincide:

$$\mat{A}^{-1}\mat{A}=\mat{A}\mat{A}^{-1}=\mat{I}_n$$






Footnotes

  1. E.g. the canonical basis, whose $i$-th vector has a one in the $i$-th coordinate and zeros elsewhere. ↩︎

  2. The span of the columns of $\mat{A}$, $\mathrm{span}\left(\mat{A}_{:,1},\dots,\mat{A}_{:,m}\right)$, is called the column space or the range of $\mat{A}$. ↩︎

  3. A square matrix with all its columns linearly independent is called non-singular. ↩︎