23. Spherical Payoff

The spherical payoff SP is a strictly proper scoring rule used to evaluate probabilistic classifiers. For a binary problem, it is defined as follows.

\(SP = \dfrac{1}{N} \sum_{i=1}^{N} \dfrac{p_c}{\sqrt{p_0^2 + p_1^2}}\), where

  • \(N\) is the number of samples,

  • \(p_c\) is the probability predicted for the correct value,

  • \(p_0\) is the probability predicted for y=0, and

  • \(p_1\) is the probability predicted for y=1.

The value of SP is in the range \([0, 1]\), where

  • a value closer to 0 indicates “worst skill”, and

  • a value closer to 1 indicates “best skill”.
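As a quick sanity check on the formula, consider a single sample with true label y = 1 and a predicted probability of 0.8 for that label (the values here are chosen purely for illustration).

```python
import numpy as np

# single-sample spherical payoff for y = 1 with p1 = 0.8
p0, p1 = 0.2, 0.8
sp = p1 / np.sqrt(p0 ** 2 + p1 ** 2)
# a confident, correct prediction scores close to 1
print(sp)
```

A confident correct prediction scores near 1, while an unsure prediction (p0 = p1 = 0.5) scores \(0.5 / \sqrt{0.5} \approx 0.707\).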

Although defined here for binary classification, SP generalizes to multiclass problems: the numerator is still the probability assigned to the correct class, and the denominator becomes the Euclidean norm of the full predicted probability vector.
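The multiclass generalization can be sketched as follows; the probability matrix and labels below are made up for illustration.

```python
import numpy as np

def spherical_payoff_multiclass(y_true, probs):
    """Spherical payoff for K classes.

    probs  : (N, K) array of predicted class probabilities
    y_true : (N,) array of integer class labels
    """
    probs = np.asarray(probs)
    y_true = np.asarray(y_true)
    # probability assigned to the true class of each sample
    p_c = probs[np.arange(len(y_true)), y_true]
    # Euclidean norm of each predicted probability vector
    norms = np.linalg.norm(probs, axis=1)
    return float(np.mean(p_c / norms))

# illustrative 3-class predictions for two samples
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1]])
y_true = np.array([0, 1])
```

With K = 2 this reduces to the binary formula above, since the norm of \((p_0, p_1)\) is \(\sqrt{p_0^2 + p_1^2}\).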

23.1. Load data

Let’s import a dataset about students and whether they have conducted research. The independent variables in X are the students’ scores and performance measures, and the dependent variable y is whether they have done research (y = 1) or not (y = 0).

[1]:
import pandas as pd
import numpy as np

url = 'https://raw.githubusercontent.com/selva86/datasets/master/Admission.csv'
Xy = pd.read_csv(url) \
    .drop(columns=['Chance of Admit ', 'Serial No.'])

Xy.shape
[1]:
(400, 7)
[2]:
Xy.columns
[2]:
Index(['GRE Score', 'TOEFL Score', 'University Rating', 'SOP', 'LOR ', 'CGPA',
       'Research'],
      dtype='object')
[3]:
Xy.head()
[3]:
GRE Score TOEFL Score University Rating SOP LOR CGPA Research
0 337 118 4 4.5 4.5 9.65 1
1 324 107 4 4.0 4.5 8.87 1
2 316 104 3 3.0 3.5 8.00 1
3 322 110 3 3.5 2.5 8.67 1
4 314 103 2 2.0 3.0 8.21 0

23.2. Create Xy

We will split Xy into X and y.

[4]:
X = Xy.drop(columns=['Research'])
y = Xy['Research']

X.shape, y.shape
[4]:
((400, 6), (400,))

23.3. Split Xy into training and testing

We then split X and y into training and testing folds.

[5]:
from sklearn.model_selection import StratifiedKFold

tr_idx, te_idx = next(StratifiedKFold(n_splits=10, random_state=37, shuffle=True).split(X, y))

X_tr, X_te, y_tr, y_te = X.loc[tr_idx], X.loc[te_idx], y.loc[tr_idx], y.loc[te_idx]
X_tr.shape, X_te.shape, y_tr.shape, y_te.shape
[5]:
((360, 6), (40, 6), (360,), (40,))

Let’s make sure the proportions of 1’s and 0’s are preserved with the splitting.

[6]:
y_tr.value_counts() / y_tr.value_counts().sum()
[6]:
1    0.547222
0    0.452778
Name: Research, dtype: float64
[7]:
y_te.value_counts() / y_te.value_counts().sum()
[7]:
1    0.55
0    0.45
Name: Research, dtype: float64

23.4. Model learning

Let’s train a logistic regression model on the training data.

[8]:
from sklearn.linear_model import LogisticRegression

m = LogisticRegression(solver='saga', max_iter=5_000, random_state=37, n_jobs=-1)
m.fit(X_tr, y_tr)
[8]:
LogisticRegression(max_iter=5000, n_jobs=-1, random_state=37, solver='saga')

23.5. Scoring

The null (reference) model always predicts the same constant: the expected probability of y=1, defined as follows.

\(\hat{y} = \dfrac{1}{N} \sum_{i=1}^{N} y_i\)

Notice in the code that this constant prediction is computed from the training data, not the testing data (although the stratified split preserves the class proportions in both folds).

[9]:
p1 = (y_tr.value_counts() / y_tr.value_counts().sum()).sort_index().loc[1]

y_null = np.full(y_te.shape, p1)
y_pred = m.predict_proba(X_te)[:,1]

The SP of the null and alternative models are shown below. Since higher is better for SP, the alternative model has the better skill.

[10]:
def get_spherical_payoff(y_t, y_p):
    # numerator: probability assigned to the correct class
    n = y_p if y_t == 1 else 1 - y_p
    # denominator: Euclidean norm of the probability vector (p0, p1)
    d = np.sqrt(y_p ** 2 + (1 - y_p) ** 2)
    return n / d

def spherical_payoff(y_true, y_pred):
    # average the per-sample payoffs
    sp = [get_spherical_payoff(y_t, y_p) for y_t, y_p in zip(y_true, y_pred)]
    return np.mean(sp)

b = pd.Series([
    spherical_payoff(y_te, y_null),
    spherical_payoff(y_te, y_pred)
], ['null', 'alt'])

b
[10]:
null    0.710623
alt     0.792187
dtype: float64

The alternative model improves on the null model by a factor of about 1.11.

[11]:
(b.loc['alt'] / b.loc['null'])
[11]:
1.1147782039437109
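The per-sample loop above can also be written as a single vectorized NumPy function; this is an equivalent sketch (not part of the code above) that avoids the Python-level loop and is faster on large arrays.

```python
import numpy as np

def spherical_payoff_vec(y_true, y_prob):
    """Vectorized binary spherical payoff.

    y_true : array of 0/1 labels
    y_prob : array of predicted probabilities for y = 1
    """
    y_true = np.asarray(y_true)
    y_prob = np.asarray(y_prob)
    # probability assigned to the correct class of each sample
    p_c = np.where(y_true == 1, y_prob, 1 - y_prob)
    # Euclidean norm of each probability vector (p0, p1)
    d = np.sqrt(y_prob ** 2 + (1 - y_prob) ** 2)
    return float(np.mean(p_c / d))
```

Because it operates on whole arrays, `spherical_payoff_vec(y_te, y_pred)` returns the same value as the loop-based `spherical_payoff` above.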