5. Multi-Objective Optimization for Demand Curve
In general, we can model a demand curve where we predict quantity \(q\) from price \(p\) and other covariates.
\(q = f(\theta) = f(p, x_0, x_1, \ldots, x_n)\)
If we have the model \(f(\theta)\), then we can find \((q^*, \theta^*)\) that maximizes either the total revenue or profit. Note the following.
\(r = qp\), and
\(t = q(p - c)\)
where
\(r\) is the total revenue,
\(q\) is quantity,
\(p\) is unit price,
\(t\) is the total profit, and
\(c\) is the unit cost.
It is not necessarily the case that maximizing revenue is also maximizing profit. In this notebook, we will generate fake data and illustrate how we can model the demand curve and then use this model to optimize for revenue, profit and both.
5.1. Data
The data is simulated as follows
\(Q \sim \mathcal{N}(50 - 1.5 P + 100 M_q, 1) + \epsilon\),
where
\(Q\) is the quantity,
\(P\) is the unit price,
\(M_q\) is the corresponding yearly quarter for the specified month, and
\(\epsilon \sim \mathcal{N}(10, 5)\) is the error.
[1]:
import numpy as np
import pandas as pd
import random
random.seed(37)
np.random.seed(37)
def f(p, month):
e = np.random.normal(10, 5, p.shape[0])
quarter = np.select([month < 4, month < 7, month < 11], [1, 2, 3], 4)
return np.random.normal(50 - 1.5 * p + 100 * quarter) + e
df = pd.DataFrame({
'p': np.random.randint(50, 100, 1_000),
'month': np.random.randint(1, 12, 1_000)
}) \
.assign(q=lambda d: f(d['p'], d['month']))
5.2. Visualize quantity over price trends
Below, we show the
spread of quantity by price and
also the sum of the quantity by price.
[2]:
import matplotlib.pyplot as plt
import seaborn as sns
fig, ax = plt.subplots(2, 1, figsize=(15, 6))
sns.boxplot(df, x='p', y='q', ax=ax[0])
df.groupby(['p'])['q'].sum().plot(kind='line', ylabel='q', ax=ax[1])
ax[0].set_title('Box plots of quantity at different prices')
ax[1].set_title('Sum of quantities for different prices')
fig.tight_layout()
We can breakdown the data by quarters and observe the spread of quantities by price. In all quarters except the last, all quantities trend downward.
[3]:
_temp = df.assign(quarter=lambda d: np.select([d.month < 4, d.month < 7, d.month < 10], [1, 2, 3], 4))
fig, axes = plt.subplots(4, 1, figsize=(20, 12))
for q, ax in zip([1, 2, 3, 4], np.ravel(axes)):
sns.boxplot(
_temp[_temp.quarter==q],
x='p',
y='q',
ax=ax
)
ax.set_title(f'Quantities by price, Q{q}')
fig.tight_layout()
5.3. Demand data
The demand curve data will be grouped by price-month and quantities summed over.
[4]:
demand_df = df.groupby(['p', 'month'])['q'].sum().to_frame().reset_index()
demand_df
[4]:
p | month | q | |
---|---|---|---|
0 | 50 | 1 | 97.630858 |
1 | 50 | 2 | 170.132677 |
2 | 50 | 3 | 181.190757 |
3 | 50 | 4 | 367.611310 |
4 | 50 | 5 | 185.221784 |
... | ... | ... | ... |
461 | 99 | 6 | 443.360809 |
462 | 99 | 7 | 210.213176 |
463 | 99 | 8 | 818.645468 |
464 | 99 | 9 | 430.012356 |
465 | 99 | 10 | 210.007255 |
466 rows × 3 columns
5.4. Linear modeling
We can certainly model the demand curve using linear regression. As you can see below, the coefficient corresponding to price is -3.20
. The mean absolute error MAE
and weighted absolute percentage error WAPE
are very big.
[5]:
from sklearn.linear_model import LinearRegression
X = demand_df[['p', 'month']]
y = demand_df['q']
m = LinearRegression()
m.fit(X, y)
pd.Series([m.intercept_, m.coef_[0], m.coef_[1]], ['intercept', 'p', 'month'])
Intel(R) Extension for Scikit-learn* enabled (https://github.com/intel/scikit-learn-intelex)
[5]:
intercept 221.907460
p -3.199247
month 66.123215
dtype: float64
[6]:
from sklearn.metrics import mean_absolute_error
mean_absolute_error(y, m.predict(X))
[6]:
175.57064604298589
[7]:
np.sum(np.abs(y - m.predict(X))) / np.sum(y)
[7]:
0.4666900845671383
5.5. Non-linear modeling
In the non-linear modeling approach, we use a random forest to model the demand curve. Interestingly, month
is more important than price. Also, using a non-linear model cuts the MAE and WAPE by over half of what we saw with the linear model. Let’s keep this model.
[8]:
from sklearn.ensemble import RandomForestRegressor
X = demand_df[['p', 'month']]
y = demand_df['q']
m = RandomForestRegressor(n_jobs=-1, random_state=37, n_estimators=40)
m.fit(X, y)
pd.Series(m.feature_importances_, X.columns)
[8]:
p 0.434937
month 0.565063
dtype: float64
[9]:
mean_absolute_error(y, m.predict(X))
[9]:
76.80770760690886
[10]:
np.sum(np.abs(y - m.predict(X))) / np.sum(y)
[10]:
0.20416508320929747
5.6. Visualizing quantities space
We can visualize the relationship between quantity (z-axis), price (x-axis) and month (y-axis). As you can tell, the surface is very complicated of how price and month interact to influence quantity.
[11]:
x, y = np.meshgrid(np.arange(50, 100.1, 0.1), np.arange(1, 13, 1))
z = m.predict(pd.DataFrame([(_x, _y) for _x, _y in zip(np.ravel(x), np.ravel(y))], columns=['p', 'month'])).reshape(x.shape)
x.shape, y.shape, z.shape
[11]:
((12, 501), (12, 501), (12, 501))
[12]:
import plotly.graph_objects as go
fig = go.Figure(data=[go.Surface(
x=x,
y=y,
z=z,
colorscale='Viridis',
opacity=0.5,
)])
fig.update_layout(
title=r'$q = f(p, m_q)$',
autosize=False,
width=600,
height=600,
margin={'l': 65, 'r': 50, 'b': 65, 't': 90}
)
fig.update_xaxes(title='p')
fig.update_yaxes(title=r'$m_q$')
fig.show()