Permutation matrices - from shuffle to symmetry

Left multiplication (\( PM \)) shuffles the rows of \( M \); right multiplication (\( MP \)) shuffles its columns.
A permutation matrix simply permutes (or shuffles) the rows or columns of a matrix. This seemingly naive operation leads to some interesting observations, especially in conjunction with other operations like transpose or inverse. This article explores some elementary yet interesting properties of this class of matrices.
Prerequisites
You should be familiar with two complementary ways to look at matrix-vector products:
1. As a linear combination of columns of the matrix - usually denoted by \( Mv \) where \( v \) is a column vector.
2. As a linear combination of rows of the matrix - usually denoted by \( vM \) where \( v \) is a row vector.
If this is something unfamiliar to you, it is explained in chapter one and chapter two from the ground up.
Once you get the hang of it, it should not be difficult to imagine that a matrix whose rows and columns are all one-hot vectors at unique indices will permute the rows or columns of any matrix it is multiplied with.
Shuffling columns and rows
Consider the following permutation matrix \( P \): \( \begin{bmatrix} 0 & 0 & 1 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \)Multiplying a matrix to the right with \( P \) shuffles the columns of the matrix.\( \begin{bmatrix} | & | & | & | \\ c_1 & c_2 & c_3 & c_4 \\ | & | & | & | \end{bmatrix} \)\( \begin{bmatrix} 0 & 0 & 1 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \)=\( \begin{bmatrix} | & | & | & | \\ c_2 & c_3 & c_1 & c_4 \\ | & | & | & | \end{bmatrix} \)
Column shuffle: \( 1 \rightarrow 3, 2 \rightarrow 1, 3 \rightarrow 2, 4 \rightarrow 4 \)
Multiplying a matrix to the left with the same matrix \( P \) shuffles the rows of the matrix.\( \begin{bmatrix} 0 & 0 & 1 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \)\( \begin{bmatrix} - & r_1 & - \\ - & r_2 & - \\ - & r_3 & - \\ - & r_4 & - \end{bmatrix} \)=\( \begin{bmatrix} - & r_3 & - \\ - & r_1 & - \\ - & r_2 & - \\ - & r_4 & - \end{bmatrix} \)
Row shuffle: \( 1 \rightarrow 2, 2 \rightarrow 3, 3 \rightarrow 1, 4 \rightarrow 4 \)
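The two shuffles above can be reproduced in a few lines of code. Below is a minimal pure-Python sketch (a naive triple-loop product, no libraries; the helper name `matmul` is mine, not from the article). The matrix \( M \) is labeled so that entry \( (i, j) \) reads as \( 10i + j \), which makes the shuffles easy to see.

```python
# Demonstrate that right multiplication by P shuffles columns and
# left multiplication by P shuffles rows, using the article's P.

def matmul(A, B):
    """Naive matrix product, enough for illustration."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

P = [[0, 0, 1, 0],
     [1, 0, 0, 0],
     [0, 1, 0, 0],
     [0, 0, 0, 1]]

# Entry (i, j) of M is 10i + j (1-indexed), so rows/columns are easy to read off.
M = [[10 * i + j for j in range(1, 5)] for i in range(1, 5)]

MP = matmul(M, P)  # columns shuffled to [c2, c3, c1, c4]
PM = matmul(P, M)  # rows shuffled to    [r3, r1, r2, r4]

print([row[0] for row in MP])  # first column of MP is c2: [12, 22, 32, 42]
print(PM[0])                   # first row of PM is r3:    [31, 32, 33, 34]
```

Reading off the first column of \( MP \) and the first row of \( PM \) confirms the two shuffle maps stated above.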
Notice that the same matrix shuffles the rows and columns in different ways. In fact, the column shuffle is the inverse of the row shuffle and vice versa. Let's explore this idea further.
Row column duality
Consider the following bilinear form \( u^TPv \) where \( u \) and \( v \) are column vectors and \( P \) is a permutation matrix.\( \begin{bmatrix} u_1 & u_2 & ... & u_{n-1} & u_n \end{bmatrix} \) \( \begin{bmatrix} p_{11} & p_{12} & ... & p_{1n} \\ p_{21} & p_{22} & ... & p_{2n} \\ ... & ... & ... & ... \\ p_{n1} & p_{n2} & ... & p_{nn} \end{bmatrix} \) \( \begin{bmatrix} v_i \\ v_j \\ ... \\ v_k \\ v_l \end{bmatrix} \)
We want the permutation matrix \( P \) that ensures \( u^TPv = \sum{u_iv_i} \), i.e. each \( u_i \) is multiplied by the matching \( v_i \). Since matrix multiplication is associative, \( u^T(Pv) = (u^TP)v \): we can view \( P \) as acting either on \( v \) or on \( u^T \).
Note that \( Pv = \)\( \begin{bmatrix} v_1 \\ v_2 \\ ... \\ v_{n-1} \\ v_n \end{bmatrix} \). In other words, \( P \) de-shuffles the rows of \( v \) to match the order of \( u \).
Also note that \( u^TP = \)\( \begin{bmatrix} u_i & u_j & ... & u_k & u_l \end{bmatrix} \) \( \Rightarrow P^Tu = \)\( \begin{bmatrix} u_i \\ u_j \\ ... \\ u_k \\ u_l \end{bmatrix} \)In other words, \( P \) shuffles the columns of \( u^T \) to match the order of \( v \).
It is interesting to see how \( P \) and \( P^T \) shuffle in opposite ways: \( P \) un-shuffles exactly the order that \( P^T \) produces. This is true for any permutation matrix \( P \).
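This duality is easy to check numerically. A small pure-Python sketch (the helper name `matvec` is mine): \( P \) applied to a shuffled vector restores the original order, while \( P^T \) applies that same shuffle to an ordered vector.

```python
# P de-shuffles a shuffled vector; P^T re-applies the shuffle.

def matvec(A, x):
    """Naive matrix-vector product, enough for illustration."""
    return [sum(A[i][k] * x[k] for k in range(len(x))) for i in range(len(A))]

P = [[0, 0, 1, 0],
     [1, 0, 0, 0],
     [0, 1, 0, 0],
     [0, 0, 0, 1]]
PT = [list(col) for col in zip(*P)]  # transpose of P

v = [2, 3, 1, 4]                 # a shuffled version of [1, 2, 3, 4]
print(matvec(P, v))              # P de-shuffles it: [1, 2, 3, 4]
print(matvec(PT, [1, 2, 3, 4]))  # P^T applies the same shuffle: [2, 3, 1, 4]
```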
Inverse of a permutation matrix
In case you did not notice, the actions of the matrices \( P, P^T \) are inverses of each other. In other words, the inverse of a permutation matrix is its transpose: \( P^{-1} = P^T \). Consider the earlier permutation matrix as an example:\( PP^T = \)\( \begin{bmatrix} 0 & 0 & 1 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \)\( \begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \)\(= I_4 = P^TP\)
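The identity \( PP^T = P^TP = I \) can also be verified mechanically, here as a pure-Python sketch using the same \( P \) (the helper name `matmul` is mine):

```python
# Verify P P^T = P^T P = I for the article's permutation matrix.

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

P = [[0, 0, 1, 0],
     [1, 0, 0, 0],
     [0, 1, 0, 0],
     [0, 0, 0, 1]]
PT = [list(col) for col in zip(*P)]
I4 = [[1 if i == j else 0 for j in range(4)] for i in range(4)]

assert matmul(P, PT) == I4
assert matmul(PT, P) == I4
```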
Permutation and inverses: an interesting observation
If \( M \) is an invertible matrix then note that \( (MP)^{-1} = P^TM^{-1} \). In other words, the inverse of a matrix after permuting its columns is the same as permuting the rows of the inverse of the original matrix.
If:\( ( \)\( \begin{bmatrix} | & | & | & | \\ c_1 & c_2 & c_3 & c_4 \\ | & | & | & | \end{bmatrix} \)\( )^{-1} = \)\( \begin{bmatrix} - & r_1 & - \\ - & r_2 & - \\ - & r_3 & - \\ - & r_4 & - \end{bmatrix} \)Then:\( ( \)\( \begin{bmatrix} | & | & | & | \\ c_2 & c_3 & c_1 & c_4 \\ | & | & | & | \end{bmatrix} \)\( )^{-1} = \)\( \begin{bmatrix} - & r_2 & - \\ - & r_3 & - \\ - & r_1 & - \\ - & r_4 & - \end{bmatrix} \)
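The identity \( (MP)^{-1} = P^TM^{-1} \) can be checked exactly. The sketch below uses a small Gauss-Jordan inverse over `fractions.Fraction` so the comparison is exact; the matrix `M` is an arbitrary invertible example of my choosing, and the helper names are assumptions, not from the article.

```python
# Check (MP)^{-1} = P^T M^{-1} with exact rational arithmetic.
from fractions import Fraction

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def inverse(M):
    """Gauss-Jordan elimination on [M | I], returning M^{-1}."""
    n = len(M)
    aug = [[Fraction(x) for x in row] + [Fraction(i == j) for j in range(n)]
           for i, row in enumerate(M)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [x / aug[col][col] for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

P = [[0, 0, 1, 0],
     [1, 0, 0, 0],
     [0, 1, 0, 0],
     [0, 0, 0, 1]]
PT = [list(col) for col in zip(*P)]
M = [[2, 1, 0, 0],
     [0, 1, 1, 0],
     [0, 0, 3, 1],
     [1, 0, 0, 1]]  # an arbitrary invertible matrix (determinant 5)

# Inverting the column-shuffled matrix equals row-shuffling the inverse.
assert inverse(matmul(M, P)) == matmul(PT, inverse(M))
```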
Application: Removing rows and columns from a matrix
Given the inverse \( M^{-1} \) of a matrix \( M \), under certain conditions it is faster to compute the inverse of the matrix obtained by removing \( k \) rows and \( k \) columns by manipulating \( M^{-1} \) directly. A special case is easier to handle: removing the last \( k \) rows and columns. So how do we convert an arbitrary removal of \( k \) rows and \( k \) columns into a removal of the last \( k \) rows and columns? By shuffling the rows and columns as shown below.
As an example, consider an 8-by-8 matrix from which you want to remove the columns [1,4,6] and the rows [2,3,5]. We can multiply this matrix on the left and on the right by two permutation matrices chosen such that the rows and columns to remove are moved to the end.
Shuffle the columns and rows so that the rows and columns to remove (colored gray) are moved to the end.
Column shuffle: [0,7,1,2,6,3,5,4], row shuffle: [0,1,7,6,2,5,3,4] (0-indexed: entry \( j \) is the new position of column or row \( j \)).
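Here is a pure-Python sketch of that construction: building the two permutation matrices from the shuffle lists above and checking that the targeted rows and columns of an 8-by-8 matrix land at the end. The helper names are mine, not from the article.

```python
# Build permutation matrices from the shuffle lists and verify that
# columns [1, 4, 6] and rows [2, 3, 5] are moved to the end.

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

n = 8
col_shuffle = [0, 7, 1, 2, 6, 3, 5, 4]  # column j moves to position col_shuffle[j]
row_shuffle = [0, 1, 7, 6, 2, 5, 3, 4]  # row i moves to position row_shuffle[i]

# Right-multiplying by Q sends column j of M to position col_shuffle[j].
Q = [[1 if col_shuffle[j] == k else 0 for k in range(n)] for j in range(n)]
# Left-multiplying by R sends row i of M to position row_shuffle[i].
R = [[1 if row_shuffle[i] == k else 0 for i in range(n)] for k in range(n)]

M = [[10 * i + j for j in range(n)] for i in range(n)]  # entry (i, j) = 10i + j
shuffled = matmul(R, matmul(M, Q))

# The last three columns are the original columns 1, 4, 6 (in some order) ...
print(sorted(x % 10 for x in shuffled[0][5:8]))
# ... and the last three rows are the original rows 2, 3, 5.
print(sorted(shuffled[k][0] // 10 for k in range(5, 8)))
```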
I have written two articles that explore this problem in greater detail.
Permutation group
The set of all n-by-n permutation matrices forms a group under matrix multiplication, isomorphic to the symmetric group \( S_n \). The identity element of this group is the identity matrix, the inverse of a permutation matrix is its transpose, and the product of two permutation matrices is another permutation matrix.
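Closure can be checked exhaustively for small \( n \). A pure-Python sketch (the helper names are assumptions): a matrix is a permutation matrix exactly when every row and every column is one-hot.

```python
# Exhaustively check closure: the product of any two 4x4 permutation
# matrices is again a permutation matrix.
from itertools import permutations

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def perm_matrix(perm):
    """Row i is the one-hot vector at index perm[i]."""
    n = len(perm)
    return [[1 if perm[i] == j else 0 for j in range(n)] for i in range(n)]

def is_perm_matrix(M):
    """Every row and every column contains exactly one 1 and zeros elsewhere."""
    one_hot = [0] * (len(M) - 1) + [1]
    return (all(sorted(row) == one_hot for row in M)
            and all(sorted(col) == one_hot for col in zip(*M)))

# All 4! = 24 permutation matrices; check all 576 pairwise products.
mats = [perm_matrix(p) for p in permutations(range(4))]
assert all(is_perm_matrix(matmul(A, B)) for A in mats for B in mats)
```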
It is this closure property that guarantees we can always reach a "good" permutation (e.g. one that moves certain rows and columns to the end) by composing simpler permutations.