Permutation Matrices: Linear Algebra for Programmers

Permutation matrices - from shuffle to symmetry

\( \begin{bmatrix} v_1 \\ \amber{v_2} \\ \rose{v_3} \\ \zinc{v_4} \end{bmatrix} \mathrel{ \overset{P} {\underset{P^T} {\substack{\longrightarrow \\[-0.3em] \longleftarrow}} }} \begin{bmatrix} \amber{v_2} \\ \zinc{v_4} \\ v_1 \\ \rose{v_3} \end{bmatrix} \)

A permutation matrix simply permutes (or shuffles) the elements of the input vector it is multiplied with. When it is multiplied with a matrix, it shuffles the rows or columns of the matrix. This seemingly naive operation leads to some interesting observations, especially in conjunction with other operations like transpose or inverse. This article explores some elementary yet interesting properties of this class of matrices.

Shuffling columns and rows of a matrix

Click the play/pause button to pause the animation and observe the effect of permutation matrices on the matrix.

Right multiplication with a permutation matrix shuffles the columns of the matrix. Left multiplication with a permutation matrix shuffles the rows of the matrix.

Consider the following permutation matrix:

\( P = \) \( \begin{bmatrix} 0 & 0 & 1 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \)

Multiplying a matrix to the right with \( P \) shuffles the columns of the matrix.\( \begin{bmatrix} | & | & | & | \\ c_1 & \amber{c_2} & \rose{c_3} & \zinc{c_4} \\ | & | & | & | \end{bmatrix} \) \( P = \) \( \begin{bmatrix} | & | & | & | \\ \amber{c_2} & \rose{c_3} & c_1 & \zinc{c_4} \\ | & | & | & | \end{bmatrix} \)

Column shuffle: \( 1 \rightarrow 3, 2 \rightarrow 1, 3 \rightarrow 2, 4 \rightarrow 4 \)

Multiplying a matrix to the left with the same matrix \( P \) shuffles the rows of the matrix.\(P\) \( \begin{bmatrix} - & r_1 & - \\ - & \amber{r_2} & - \\ - & \rose{r_3} & - \\ - & \zinc{r_4} & - \end{bmatrix} \) \( = \) \( \begin{bmatrix} - & \rose{r_3} & - \\ - & r_1 & - \\ - & \amber{r_2} & - \\ - & \zinc{r_4} & - \end{bmatrix} \)

Row shuffle: \( 1 \rightarrow 2, 2 \rightarrow 3, 3 \rightarrow 1, 4 \rightarrow 4 \)

Notice that the same matrix shuffles the rows and columns in different ways. In fact, the column shuffle is the inverse of the row shuffle and vice versa. Let's explore this idea further.

Inverse of a permutation matrix

In case you did not notice, the actions of the matrices \( P, P^T \) are inverses of each other. In other words, the inverse of a permutation matrix is its transpose: \( P^{-1} = P^T \). Consider the earlier permutation matrix as an example:\( PP^T = \)\( \begin{bmatrix} 0 & 0 & 1 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \)\( \begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \)\(= I_4 = P^TP\)This is true for any permutation matrix.

Row column duality

Consider the following bilinear expression:\( x^TPy = \begin{bmatrix} x_1 & \amber{x_2} & \rose{x_3} & \zinc{x_4} \end{bmatrix} \begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} \rose{y_3} \\ y_1 \\ \amber{y_2} \\ \zinc{y_4} \end{bmatrix} = \sum_{i=1}^{4} x_i y_i \)Note that \( x^TPy = \langle x, \underbrace{Py}_{\amber{\text{shuffle} \ y}} \rangle = \langle \underbrace{P^Tx}_{\amber{\text{unshuffle} \ x}}, y \rangle \) where shuffle and unshuffle are inverse permutations of each other. This provides us two ways to align the elements of the two vectors as shown below.

<Pᵀx, y> (shuffle x by Pᵀ)

<x, Py> (shuffle y by P)

Permutation and inverses: an interesting observation

If \( M \) is an invertible matrix then note that \( (MP)^{-1} = P^TM^{-1} \). One can interpret each side as shown below.

\( \underbrace{(MP)^{-1}}_{\text{shuffle columns then invert}} = \underbrace{P^TM^{-1}}_{\text{invert then unshuffle rows}} \)

This insight is useful in cases where shuffling rows or columns converts the matrix into a form that is easier to invert. For example, one can swap columns to find a better pivot or to find the inverse of a sparse matrix.

Application: Removing rows and columns from a matrix

Given the inverse \( M^{-1} \) of a matrix \( M \), under certain conditions it is faster to calculate the inverse of the matrix after removing a fixed number k of rows and columns by manipulating \( M^{-1} \). A specific version of this problem is easier to calculate: if the last k rows and columns are removed. So how do we convert an arbitrary removal of k rows and k columns into a removal of the last k rows and k columns? By shuffling the columns and rows as shown below.

As an example, consider a 8-by-8 matrix. You want to remove the columns [1,4,6] and the rows [2,3,5] from the matrix. We can multiply this matrix on both the sides by two permutation matrices chosen such that the rows and columns to remove are moved to the end.

Shuffle the columns and rows so that the rows and columns to remove (colored gray) are moved to the end.

Column shuffle: [0,7,1,2,6,3,5,4], row shuffle: [0,1,7,6,2,5,3,4]

I have written two articles that explore this problem in greater detail:

Inverse of a matrix after row and column removal

Inverse of a block of a matrix

Permutation group

The set of all n-by-n permutation matrices forms a group under matrix multiplication. This group is called the permutation group. The identity element of this group is the identity matrix. The inverse of a permutation matrix is its transpose. The product of two permutation matrices is another permutation matrix.

Series: From permutations to Sinkhorn