# 7. Reordering Method

Denote the following.

• $$V$$ is a $$n$$ x $$n$$ weakness matrix where $$v_{ij}$$ represents some dominance over the i-th team by the j-th team.

• $$D$$ is $$V^T$$ and referred to as the strength matrix where $$d_{ij}$$ represents dominance of the i-th team over the j-th team.

• $$R$$ is the rank-differential matrix that is $$n$$ x $$n$$ and an upper triangle matrix of the following form.

$$R=\left[\begin{matrix}0 & 1 & 2 & \dots & n - 1 \\ & & 1 & \dots & n - 2 \\ & & \ddots & \ddots & \vdots \\ & & & \ddots & 1 \\ & & & & 0 \end{matrix}\right]$$

The idea of the Reordering Method is to find some $$Q$$ to reorder/transform $$D$$ to be as close as possible to $$R$$. $$Q$$ is referred to as the permutation matrix. The minimization problem is formulated as follows.

$$\underset{Q}{\min}||Q^TDQ - R||$$

$$Q$$ is a $$n$$ x $$n$$ sparse matrix with $$q_{ij} \in {0, 1}$$. Also, $$Q$$ is specified by a candidate ranking vector $$r$$. Each index and value of $$r$$ determines $$Q$$. For example, $$r = [4, 1, 3, 2, 0]$$ will create a $$Q$$ such that

• $$r_0 = 4$$: column 0, row 4 will be 1,

• $$r_1 = 1$$: column 1, row 1 will be 1,

• $$r_2 = 3$$: column 2, row 3 will be 1,

• $$r_3 = 2$$: column 3, row 2 will be 1, and

• $$r_4 = 0$$: column 4, row 0 will be 1.

$$Q(r=[4, 1, 3, 2, 0]) = \left[\begin{matrix}0 & 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 & 0 \end{matrix}\right]$$

For a matrix of size $$n$$, there will be $$n!$$ permutations of $$r$$ possible. Note

• $$5! = 120$$ and

• $$10! = 3,628,800$$.

It is not computationally tractable to exhaustively try all permutations of $$r$$ to find the solution for the minimization problem. Instead, exploration and exploitation theoretical frameworks like genetic algorithms are advised to find the (near-) optimal solution.

## 7.1. NCAAF ACC 2005

This data $$V$$ comes from the NCAAF ACC 2005 result, and is the point differential. The order of the columns and rows are Duke, Miami, UNC, UVA and VT. $$D$$ is $$V^T$$ and $$R$$ is given as follows. What we want to do is to find a near-optimal $$Q$$ to permute $$D$$ into $$R$$ as closely as possible.

[1]:

import numpy as np

V = np.array([
[0, 45, 3, 31, 45],
[0, 0, 0, 0, 0],
[0, 18, 0, 0, 27],
[0, 8, 2, 0, 38],
[0, 20, 0, 0, 0]
])

D = V.T

R = np.array([
[0, 1, 2, 3, 4],
[0, 0, 1, 2, 3],
[0, 0, 0, 1, 2],
[0, 0, 0, 0, 1],
[0, 0, 0, 0, 0]
])


Let’s normalize $$D$$ and $$R$$ over the sum of all elements.

[2]:

D = D / D.sum()
D

[2]:

array([[0.        , 0.        , 0.        , 0.        , 0.        ],
[0.18987342, 0.        , 0.07594937, 0.03375527, 0.08438819],
[0.01265823, 0.        , 0.        , 0.00843882, 0.        ],
[0.13080169, 0.        , 0.        , 0.        , 0.        ],
[0.18987342, 0.        , 0.11392405, 0.16033755, 0.        ]])

[3]:

R = R / R.sum()
R

[3]:

array([[0.  , 0.05, 0.1 , 0.15, 0.2 ],
[0.  , 0.  , 0.05, 0.1 , 0.15],
[0.  , 0.  , 0.  , 0.05, 0.1 ],
[0.  , 0.  , 0.  , 0.  , 0.05],
[0.  , 0.  , 0.  , 0.  , 0.  ]])


Here’s a nifty function to produce $$Q$$ from $$r$$.

[4]:

def get_Q(r):
if 0 not in r:
r = [i - 1 for i in r]

n = len(r)
Q = np.zeros((n, n))
for col, row in enumerate(r):
Q[row][col] = 1
return Q


For $$r=[5, 1, 4, 3, 2]$$, let’s see what $$Q^TDQ$$ looks like. Note that $$r$$ as specified here states that - Duke is 4, - Miami is 1, - UNC is 4, - UVA is 3, and - VT is 2.

[5]:

Q = get_Q([5, 1, 4, 3, 2])
Q.T.dot(D).dot(Q)

[5]:

array([[0.        , 0.18987342, 0.16033755, 0.11392405, 0.        ],
[0.        , 0.        , 0.        , 0.        , 0.        ],
[0.        , 0.13080169, 0.        , 0.        , 0.        ],
[0.        , 0.01265823, 0.00843882, 0.        , 0.        ],
[0.08438819, 0.18987342, 0.03375527, 0.07594937, 0.        ]])


This result is what $$Q^TDQ - R$$ looks like.

[6]:

Q.T.dot(D).dot(Q) - R

[6]:

array([[ 0.        ,  0.13987342,  0.06033755, -0.03607595, -0.2       ],
[ 0.        ,  0.        , -0.05      , -0.1       , -0.15      ],
[ 0.        ,  0.13080169,  0.        , -0.05      , -0.1       ],
[ 0.        ,  0.01265823,  0.00843882,  0.        , -0.05      ],
[ 0.08438819,  0.18987342,  0.03375527,  0.07594937,  0.        ]])


This result is what $$||Q^TDQ - R||$$ looks like.

[7]:

np.linalg.norm(Q.T.dot(D).dot(Q) - R)

[7]:

0.4265304195816933


Interestingly, when VT is specified as 1 and Miami is specified as 2, then the best $$Q$$ is found!

[8]:

Q = get_Q([5, 2, 4, 3, 1])
np.linalg.norm(Q.T.dot(D).dot(Q) - R)

[8]:

0.14836639449076508


Here’s an exhuastive search over all the permutations of $$r$$. It’s disturbing that the second best solution is when Duke is second and Miami is last!

[9]:

from itertools import permutations, chain
import pandas as pd

candidates = permutations(range(D.shape[0]), D.shape[0])
candidates = map(lambda p: (p, get_Q(p)), candidates)
candidates = map(lambda tup: (tup[0], [np.linalg.norm(tup[1].T.dot(D).dot(tup[1]) - R)]), candidates)
candidates = map(lambda tup: list(chain(*tup)), candidates)

dist_df = pd.DataFrame(candidates, columns=['r0', 'r1', 'r2', 'r3', 'r4', 'distance'])\
.sort_values(['distance'])\
.reset_index(drop=True)


[9]:

r0 r1 r2 r3 r4 distance
0 4 1 3 2 0 0.148366
1 1 4 3 2 0 0.173290
2 4 1 2 3 0 0.180447
3 1 4 2 3 0 0.201440
4 4 3 1 2 0 0.234359
5 4 1 3 0 2 0.235257
6 1 3 4 2 0 0.241453
7 1 4 3 0 2 0.251720
8 4 1 2 0 3 0.254222
9 4 2 1 3 0 0.267958

Just out of curiosity, we can also look at the distribution of the top 10 candidates.

[10]:

dist_df.head(10).r0.value_counts()

[10]:

4    6
1    4
Name: r0, dtype: int64

[11]:

dist_df.head(10).r1.value_counts()

[11]:

1    4
4    3
3    2
2    1
Name: r1, dtype: int64

[12]:

dist_df.head(10).r2.value_counts()

[12]:

3    4
2    3
1    2
4    1
Name: r2, dtype: int64

[13]:

dist_df.head(10).r3.value_counts()

[13]:

2    4
0    3
3    3
Name: r3, dtype: int64

[14]:

dist_df.head(10).r4.value_counts()

[14]:

0    7
2    2
3    1
Name: r4, dtype: int64