5. Multi-Objective Optimization for Demand Curve

In general, we can model a demand curve by predicting quantity \(q\) from price \(p\) and other covariates.

\(q = f(\theta) = f(p, x_0, x_1, \ldots, x_n)\)

If we have the model \(f(\theta)\), then we can find the \(\theta^*\) (and corresponding \(q^* = f(\theta^*)\)) that maximizes either the total revenue or the total profit. Note the following.

  • \(r = qp\), and

  • \(t = q(p - c)\)

where

  • \(r\) is the total revenue,

  • \(q\) is quantity,

  • \(p\) is unit price,

  • \(t\) is the total profit, and

  • \(c\) is the unit cost.

Maximizing revenue does not necessarily maximize profit. In this notebook, we will generate synthetic data, illustrate how to model the demand curve, and then use that model to optimize for revenue, for profit and for both.
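
To see why the two optima can differ, consider a hypothetical linear demand curve \(q = a - bp\): revenue \(pq\) peaks at \(p = a/(2b)\), while profit \((p - c)q\) peaks at \(p = a/(2b) + c/2\), which is strictly higher whenever \(c > 0\). A minimal sketch with made-up numbers:

[ ]:
import numpy as np

# Hypothetical linear demand curve q = a - b * p (illustrative numbers only).
a, b, c = 200, 2.0, 25  # intercept, slope and unit cost

p = np.linspace(0, 100, 1_001)
q = a - b * p
revenue = p * q
profit = (p - c) * q

# revenue peaks at p = a / (2b) = 50, profit at p = a / (2b) + c / 2 = 62.5
p[revenue.argmax()], p[profit.argmax()]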

5.1. Data

The data is simulated as follows:

\(Q \sim \mathcal{N}(50 - 1.5 P + 100 M_q, 1) + \epsilon\),

where

  • \(Q\) is the quantity,

  • \(P\) is the unit price,

  • \(M_q\) is the quarter corresponding to the month (the exact mapping is in the code below), and

  • \(\epsilon \sim \mathcal{N}(10, 5)\) is the error.

[1]:
import numpy as np
import pandas as pd
import random

random.seed(37)
np.random.seed(37)

def f(p, month):
    # noise term: epsilon ~ N(10, 5)
    e = np.random.normal(10, 5, p.shape[0])
    # map month to quarter: 1-3 -> 1, 4-6 -> 2, 7-10 -> 3, 11-12 -> 4
    quarter = np.select([month < 4, month < 7, month < 11], [1, 2, 3], 4)
    # Q ~ N(50 - 1.5 P + 100 M_q, 1) + epsilon
    return np.random.normal(50 - 1.5 * p + 100 * quarter) + e

df = pd.DataFrame({
        'p': np.random.randint(50, 100, 1_000),    # unit price in [50, 99]
        'month': np.random.randint(1, 12, 1_000)   # month in [1, 11] (upper bound is exclusive)
    }) \
    .assign(q=lambda d: f(d['p'], d['month']))

5.3. Demand data

The demand curve data is obtained by grouping on price and month and summing the quantities.

[4]:
demand_df = df.groupby(['p', 'month'])['q'].sum().to_frame().reset_index()
demand_df
[4]:
p month q
0 50 1 97.630858
1 50 2 170.132677
2 50 3 181.190757
3 50 4 367.611310
4 50 5 185.221784
... ... ... ...
461 99 6 443.360809
462 99 7 210.213176
463 99 8 818.645468
464 99 9 430.012356
465 99 10 210.007255

466 rows × 3 columns

5.4. Linear modeling

We can certainly model the demand curve using linear regression. As you can see below, the coefficient corresponding to price is about -3.20. However, the mean absolute error (MAE) and the weighted absolute percentage error (WAPE, \(\sum_i |y_i - \hat{y}_i| / \sum_i y_i\)) are both quite high, at roughly 176 and 0.47, respectively.

[5]:
from sklearn.linear_model import LinearRegression

X = demand_df[['p', 'month']]
y = demand_df['q']

m = LinearRegression()
m.fit(X, y)

pd.Series([m.intercept_, m.coef_[0], m.coef_[1]], ['intercept', 'p', 'month'])
[5]:
intercept    221.907460
p             -3.199247
month         66.123215
dtype: float64
[6]:
from sklearn.metrics import mean_absolute_error

mean_absolute_error(y, m.predict(X))
[6]:
175.57064604298589
[7]:
np.sum(np.abs(y - m.predict(X))) / np.sum(y)
[7]:
0.4666900845671383

5.5. Non-linear modeling

In the non-linear modeling approach, we use a random forest to model the demand curve. Interestingly, month is more important than price. Also, the non-linear model cuts the MAE and WAPE to less than half of what we saw with the linear model (roughly 77 and 0.20). Let's keep this model.

[8]:
from sklearn.ensemble import RandomForestRegressor

X = demand_df[['p', 'month']]
y = demand_df['q']

m = RandomForestRegressor(n_jobs=-1, random_state=37, n_estimators=40)
m.fit(X, y)

pd.Series(m.feature_importances_, X.columns)
[8]:
p        0.434937
month    0.565063
dtype: float64
[9]:
mean_absolute_error(y, m.predict(X))
[9]:
76.80770760690886
[10]:
np.sum(np.abs(y - m.predict(X))) / np.sum(y)
[10]:
0.20416508320929747
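
Note that the MAE and WAPE above are computed on the training data, so they may be optimistic for a random forest. If we want a rough out-of-sample estimate, a quick cross-validation sketch could look like this:

[ ]:
from sklearn.model_selection import cross_val_score

# 5-fold cross-validated MAE for the same random forest configuration
cv_mae = -cross_val_score(
    RandomForestRegressor(n_jobs=-1, random_state=37, n_estimators=40),
    X, y,
    cv=5,
    scoring='neg_mean_absolute_error'
)
cv_mae.mean()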

5.6. Visualizing quantities space

We can visualize the relationship between quantity (z-axis), price (x-axis) and month (y-axis). As you can tell, the surface describing how price and month interact to influence quantity is quite complicated.

[11]:
x, y = np.meshgrid(np.arange(50, 100.1, 0.1), np.arange(1, 13, 1))
z = m.predict(pd.DataFrame([(_x, _y) for _x, _y in zip(np.ravel(x), np.ravel(y))], columns=['p', 'month'])).reshape(x.shape)

x.shape, y.shape, z.shape
[11]:
((12, 501), (12, 501), (12, 501))
[12]:
import plotly.graph_objects as go

fig = go.Figure(data=[go.Surface(
    x=x,
    y=y,
    z=z,
    colorscale='Viridis',
    opacity=0.5,
)])

fig.update_layout(
    title=r'$q = f(p, m_q)$',
    autosize=False,
    width=600,
    height=600,
    margin={'l': 65, 'r': 50, 'b': 65, 't': 90}
)
# Surface plots live in a 3D scene, so the axis titles are set on the scene.
fig.update_scenes(xaxis_title='p', yaxis_title='m_q', zaxis_title='q')

fig.show()

The contour plot is also complicated, but from the interaction between month-quarter \(m_q\) and price \(p\) we can tell that most of the high quantities reside in the top region, at the later months.

[13]:
fig = go.Figure(data =
    go.Contour(
        z=z,
        x=np.unique(x),
        y=np.unique(y),
        colorscale='Viridis'
    ))
fig.update_layout(
    title=r'$q = f(p, m_q)$',
    autosize=False,
    width=600,
    height=600,
    margin={'l': 65, 'r': 50, 'b': 65, 't': 90}
)
fig.update_xaxes(title='p')
fig.update_yaxes(title=r'$m_q$')

fig.show()

5.7. Maximize total revenue

Let’s try to maximize total revenue.

[15]:
import optuna

optuna.logging.set_verbosity(optuna.logging.ERROR)

def rev_objective(trial, m):
    # sample a candidate price and (continuous) month
    p = trial.suggest_float('p', 50, 100)
    month = trial.suggest_float('month', 1, 12)
    _df = pd.DataFrame([[p, month]], columns=['p', 'month'])
    z = m.predict(_df)
    q = z[0]  # predicted quantity at this price and month

    r = p * q  # predicted total revenue, r = qp

    return r

study = optuna.create_study(**{
    'study_name': 'opt-revenue',
    'storage': 'sqlite:///_temp/opt-revenue.db',
    'load_if_exists': True,
    'direction': 'maximize',
    'sampler': optuna.samplers.TPESampler(seed=37),
    'pruner': optuna.pruners.MedianPruner(n_warmup_steps=10)
})

study.optimize(**{
    'func': lambda t: rev_objective(t, m),
    'n_trials': 1_000,
    'n_jobs': 6,
    'show_progress_bar': True
})
[16]:
study.best_params
[16]:
{'p': 88.4928117682058, 'month': 11.450811461657262}
[17]:
study.best_value
[17]:
103567.10107416332
[18]:
_p = study.best_params['p']
_m = study.best_params['month']
_q = m.predict(pd.DataFrame([study.best_params]))[0]
_rev = _p * _q
_pft = (_p - 25) * _q

_rev_results = {
    'p': _p,
    'month': _m,
    'q': _q,
    'revenue': _rev,
    'profit': _pft
}

_rev_results
[18]:
{'p': 88.4928117682058,
 'month': 11.450811461657262,
 'q': 1170.3447885173143,
 'revenue': 103567.10107416332,
 'profit': 74308.48136123046}
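
As a rough sanity check, we can compare Optuna's optimum against a brute-force search over the prediction grid computed in the visualization section above (a sketch; the grid only covers integer months, so it will not match the continuous optimum exactly):

[ ]:
# x, y, z are the price/month/quantity grids from the surface plot above
rev_grid = x * z                                              # revenue at each grid point
i, j = np.unravel_index(np.argmax(rev_grid), rev_grid.shape)
x[i, j], y[i, j], rev_grid[i, j]                              # grid-optimal price, month, revenue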

5.8. Maximize total profit

Let’s try to maximize profit.

[19]:
def pft_objective(trial, m):
    p = trial.suggest_float('p', 50, 100)
    month = trial.suggest_float('month', 1, 12)
    _df = pd.DataFrame([[p, month]], columns=['p', 'month'])
    z = m.predict(_df)
    q = z[0]

    r = (p - 25) * q

    return r

study = optuna.create_study(**{
    'study_name': 'opt-profit',
    'storage': 'sqlite:///_temp/opt-profit.db',
    'load_if_exists': True,
    'direction': 'maximize',
    'sampler': optuna.samplers.TPESampler(seed=37),
    'pruner': optuna.pruners.MedianPruner(n_warmup_steps=10)
})

study.optimize(**{
    'func': lambda t: pft_objective(t, m),
    'n_trials': 1_000,
    'n_jobs': 6,
    'show_progress_bar': True
})
[20]:
study.best_params
[20]:
{'p': 93.48324856585506, 'month': 11.324420125907485}
[21]:
study.best_value
[21]:
75902.68836214142
[22]:
_p = study.best_params['p']
_m = study.best_params['month']
_q = m.predict(pd.DataFrame([study.best_params]))[0]
_rev = _p * _q
_pft = (_p - 25) * _q

_pft_results = {
    'p': _p,
    'month': _m,
    'q': _q,
    'revenue': _rev,
    'profit': _pft
}

_pft_results
[22]:
{'p': 93.48324856585506,
 'month': 11.324420125907485,
 'q': 1108.3394837666272,
 'revenue': 103611.17545630709,
 'profit': 75902.68836214142}

5.9. Maximize profit and revenue

Let’s try to maximize both profit and revenue.

[23]:
def rev_pft_objective(trial, m):
    p = trial.suggest_float('p', 50, 100)
    month = trial.suggest_float('month', 1, 12)
    _df = pd.DataFrame([[p, month]], columns=['p', 'month'])
    z = m.predict(_df)
    q = z[0]

    pft = (p - 25) * q
    rev = p * q

    return rev, pft

study = optuna.create_study(**{
    'study_name': 'opt-rev-pft',
    'storage': 'sqlite:///_temp/opt-rev-pft.db',
    'load_if_exists': True,
    'directions': ['maximize', 'maximize'],
    'sampler': optuna.samplers.TPESampler(seed=37),
    'pruner': optuna.pruners.MedianPruner(n_warmup_steps=10)
})

study.optimize(**{
    'func': lambda t: rev_pft_objective(t, m),
    'n_trials': 1_000,
    'n_jobs': 6,
    'show_progress_bar': True
})
[24]:
best = max(study.best_trials, key=lambda t: (t.values[0], t.values[1]))
[25]:
best
[25]:
FrozenTrial(number=421, state=TrialState.COMPLETE, values=[103523.72177069388, 75815.2346765282], datetime_start=datetime.datetime(2023, 9, 7, 1, 15, 4, 325108), datetime_complete=datetime.datetime(2023, 9, 7, 1, 15, 5, 8747), params={'p': 93.40434342271607, 'month': 10.965941774282502}, user_attrs={}, system_attrs={}, intermediate_values={}, distributions={'p': FloatDistribution(high=100.0, log=False, low=50.0, step=None), 'month': FloatDistribution(high=12.0, log=False, low=1.0, step=None)}, trial_id=422, value=None)
[26]:
best.params
[26]:
{'p': 93.40434342271607, 'month': 10.965941774282502}
[27]:
best.values
[27]:
[103523.72177069388, 75815.2346765282]
[28]:
m.predict(pd.DataFrame([best.params]))[0]
[28]:
1108.3394837666272
[29]:
_p = best.params['p']
_m = best.params['month']
_q = m.predict(pd.DataFrame([best.params]))[0]
_rev = _p * _q
_pft = (_p - 25) * _q

_revpft_results = {
    'p': _p,
    'month': _m,
    'q': _q,
    'revenue': _rev,
    'profit': _pft
}

_revpft_results
[29]:
{'p': 93.40434342271607,
 'month': 10.965941774282502,
 'q': 1108.3394837666272,
 'revenue': 103523.72177069388,
 'profit': 75815.2346765282}
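
With two objectives there is usually no single best trial, only a Pareto front of non-dominated trials. Optuna can visualize it directly; a minimal sketch using optuna.visualization.plot_pareto_front (which returns a Plotly figure):

[ ]:
from optuna.visualization import plot_pareto_front

# Scatter plot of the revenue/profit trade-off across trials,
# with the Pareto-optimal trials highlighted.
fig = plot_pareto_front(study, target_names=['revenue', 'profit'])
fig.show()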

5.10. Maximize quantity, revenue and profit

Finally, let's try to maximize quantity, revenue and profit all at once.

[ ]:
def q_rev_pft_objective(trial, m):
    p = trial.suggest_float('p', 50, 100)
    month = trial.suggest_float('month', 1, 12)
    _df = pd.DataFrame([[p, month]], columns=['p', 'month'])
    z = m.predict(_df)
    q = z[0]

    pft = (p - 25) * q
    rev = p * q

    return q, rev, pft

study = optuna.create_study(**{
    'study_name': 'opt-q-rev-pft',
    'storage': 'sqlite:///_temp/opt-q-rev-pft.db',
    'load_if_exists': True,
    'directions': ['maximize', 'maximize', 'maximize'],
    'sampler': optuna.samplers.TPESampler(seed=37),
    'pruner': optuna.pruners.MedianPruner(n_warmup_steps=10)
})

study.optimize(**{
    'func': lambda t: q_rev_pft_objective(t, m),
    'n_trials': 1_000,
    'n_jobs': 6,
    'show_progress_bar': True
})
[58]:
best = max(study.best_trials, key=lambda t: (t.values[0], t.values[1]))
[60]:
best.params
[60]:
{'p': 50.184457378381126, 'month': 11.04879031751835}
[59]:
best.values
[59]:
[1640.2904172779429, 82317.08453405192, 41309.82410210335]
[61]:
m.predict(pd.DataFrame([best.params]))[0]
[61]:
1640.2904172779429
[62]:
_p = best.params['p']
_m = best.params['month']
_q = m.predict(pd.DataFrame([best.params]))[0]
_rev = _p * _q
_pft = (_p - 25) * _q

_qrp_results = {
    'p': _p,
    'month': _m,
    'q': _q,
    'revenue': _rev,
    'profit': _pft
}

_qrp_results
[62]:
{'p': 50.184457378381126,
 'month': 11.04879031751835,
 'q': 1640.2904172779429,
 'revenue': 82317.08453405192,
 'profit': 41309.82410210335}

The best parameters when maximizing revenue alone, profit alone, revenue and profit together, and quantity, revenue and profit together are shown below.

[64]:
pd.DataFrame([
    _rev_results,
    _pft_results,
    _revpft_results,
    _qrp_results,
], index=['max_revenue', 'max_profit', 'max_both', 'max_all'])
[64]:
p month q revenue profit
max_revenue 88.492812 11.450811 1170.344789 103567.101074 74308.481361
max_profit 93.483249 11.324420 1108.339484 103611.175456 75902.688362
max_both 93.404343 10.965942 1108.339484 103523.721771 75815.234677
max_all 50.184457 11.048790 1640.290417 82317.084534 41309.824102
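
Note that max_all differs sharply from the other rows because best was chosen lexicographically with quantity as the first key, which favors a low price. If we instead wanted a balanced compromise from the three-objective Pareto front, one option is to score each Pareto-optimal trial with a weighted sum of min-max normalized objectives (the equal weights below are illustrative, not part of the original analysis):

[ ]:
# Score each Pareto-optimal trial of the quantity/revenue/profit study
# by an equally weighted sum of min-max normalized objective values.
vals = np.array([t.values for t in study.best_trials])
norm = (vals - vals.min(axis=0)) / (vals.max(axis=0) - vals.min(axis=0) + 1e-12)
weights = np.array([1 / 3, 1 / 3, 1 / 3])

balanced = study.best_trials[int(np.argmax(norm @ weights))]
balanced.params, balanced.values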

5.11. Visualize monthly revenue and profit

Below, we visualize expected revenue and profit over price for each month.

  • \(p_q\) is the price that optimizes the quantity

  • \(p_r\) is the price that optimizes the revenue

  • \(p_p\) is the price that optimizes the profit

[68]:
import matplotlib.pyplot as plt

def get_estimate(month):
    return pd.DataFrame({'p': np.arange(50, 100.01, 0.01), 'month': month}) \
        .assign(
            q=lambda d: m.predict(d),
            revenue=lambda d: d['p'] * d['q'],
            profit=lambda d: (d['p'] - 25) * d['q']
        ) \
        .set_index(['p']) \
        [['q', 'revenue', 'profit']]

fig, axes = plt.subplots(6, 2, figsize=(10, 20))

for month, ax in zip(np.arange(1, 13, 1), np.ravel(axes)):
    _df = get_estimate(month)
    _df[['revenue', 'profit']].plot(kind='line', ax=ax)

    _p_qty = _df.sort_values(['q'], ascending=False).index[0]
    _p_rev = _df.sort_values(['revenue'], ascending=False).index[0]
    _p_pft = _df.sort_values(['profit'], ascending=False).index[0]

    ax.set_title(rf'Month {month}, $p_q$={_p_qty:.2f}, $p_r$={_p_rev:.2f}, $p_p$={_p_pft:.2f}')
    ax.set_ylabel('USD')

fig.tight_layout()
[Figure: expected revenue and profit versus price, one panel per month, with the quantity-, revenue- and profit-optimal prices in each panel title]

5.12. Visualizing revenue

Because we love interactive surface plots, let’s plot expected total revenue (z-axis) against price (x-axis) and month (y-axis).

[50]:
x, y = np.meshgrid(np.arange(50, 100.1, 0.1), np.arange(1, 13, 1))
z = m.predict(pd.DataFrame([(_x, _y) for _x, _y in zip(np.ravel(x), np.ravel(y))], columns=['p', 'month'])).reshape(x.shape)
r = x * z          # expected total revenue at each (price, month) grid point
t = (x - 25) * z   # expected total profit at each grid point, with unit cost c = 25

x.shape, y.shape, z.shape, r.shape, t.shape
[50]:
((12, 501), (12, 501), (12, 501), (12, 501), (12, 501))
[51]:
fig = go.Figure(data=[go.Surface(
    x=x,
    y=y,
    z=r,
    colorscale='Viridis',
    opacity=0.5,
)])

fig.update_layout(
    title=r'$r = qp = f(p, m_q) p$',
    autosize=False,
    width=600,
    height=600,
    margin={'l': 65, 'r': 50, 'b': 65, 't': 90}
)
fig.update_scenes(xaxis_title='p', yaxis_title='m_q', zaxis_title='r')

fig.show()

5.13. Visualizing profit

This next surface plot is of expected total profit (z-axis) against price (x-axis) and month (y-axis).

[52]:
fig = go.Figure(data=[go.Surface(
    x=x,
    y=y,
    z=t,
    colorscale='Viridis',
    opacity=0.5,
)])

fig.update_layout(
    title=r'$t = q(p - c) = f(p, m_q) (p - c)$',
    autosize=False,
    width=600,
    height=600,
    margin={'l': 65, 'r': 50, 'b': 65, 't': 90}
)
fig.update_scenes(xaxis_title='p', yaxis_title='m_q', zaxis_title='t')

fig.show()