5. Frontdoor/Backdoor Adjustment Formulas

In a directed acyclic graph (DAG) representing a causal model, we can use the backdoor criterion to identify sets of confounders, and the frontdoor criterion to identify sets of mediators, that allow us to estimate causal effects from observational data. A set of variables \(C\) satisfies the backdoor criterion relative to \(T\) and \(Y\) if the following are true:

  • \(C\) blocks all backdoor paths from \(T\) to \(Y\)

  • \(C\) does not contain any descendants of \(T\)

A set of variables \(M\) satisfies the frontdoor criterion relative to \(T\) and \(Y\) if the following are true:

  • All directed paths from \(T\) to \(Y\) go through \(M\)

  • There is no active backdoor path from \(T\) to \(M\)

  • All backdoor paths from \(M\) to \(Y\) are blocked by \(T\)

If \((T, C, Y)\) satisfy the backdoor criterion, then the backdoor adjustment formula may be used to estimate the causal effect of \(T\) on \(Y\).

  • \(P(y|do(t)) = \sum_c P(y|t,c) P(c)\)
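
For this graph, the formula follows by conditioning on \(C\) and noting that intervening on \(T\) does not change \(C\): \(P(y|do(t)) = \sum_c P(y|do(t),c) P(c|do(t)) = \sum_c P(y|t,c) P(c)\).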

If \((T, M, Y)\) satisfy the frontdoor criterion, then the frontdoor adjustment formula may be used to estimate the causal effect of \(T\) on \(Y\).

  • \(P(y|do(t)) = \sum_m \big( P(m|t) \sum_{t'} P(y|m,t') P(t') \big)\)
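
The formula chains two backdoor adjustments: \(P(m|do(t)) = P(m|t)\), since there is no active backdoor path from \(T\) to \(M\), and \(P(y|do(m)) = \sum_{t'} P(y|m,t') P(t')\), since \(T\) blocks all backdoor paths from \(M\) to \(Y\); composing the two yields the expression above.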

When both criteria are satisfied, the backdoor and frontdoor adjustment formulas identify the same interventional distribution, so they should produce the same causal effect estimates up to sampling error. Let’s see how these formulas work.

5.1. Causal network

The (toy) causal network below has the following variables.

  • \(T\): treatment

  • \(Y\): outcome

  • \(C\): confounder

  • \(M\): mediator

We specify the DAG so the following relationships hold.

  • \(T \leftarrow C \rightarrow Y\)

  • \(T \rightarrow M \rightarrow Y\)

Clearly, \(C\) is a confounder of \(T\) and \(Y\), and \(M\) is a mediator between \(T\) and \(Y\).

[1]:
from pybbn.graph.dag import Bbn
from pybbn.graph.edge import Edge, EdgeType
from pybbn.graph.jointree import EvidenceBuilder
from pybbn.graph.node import BbnNode
from pybbn.graph.variable import Variable
from pybbn.pptc.inferencecontroller import InferenceController

# P(C): [P(false), P(true)]
C = BbnNode(Variable(0, 'confounder', ['false', 'true']), [0.8, 0.2])
# P(T|C): one row per value of C, i.e. C=false -> [0.8, 0.2], C=true -> [0.2, 0.8]
T = BbnNode(Variable(1, 'treatment', ['false', 'true']), [0.8, 0.2, 0.2, 0.8])
# P(M|T): T=false -> [0.75, 0.25], T=true -> [0.1, 0.9]
M = BbnNode(Variable(2, 'mediator', ['false', 'true']), [0.75, 0.25, 0.1, 0.9])
# P(Y|C,M): one row per (C, M) combination: (f,f), (f,t), (t,f), (t,t)
Y = BbnNode(Variable(3, 'output', ['false', 'true']), [0.99, 0.01, 0.6, 0.4, 0.55, 0.45, 0.2, 0.8])

bbn = Bbn() \
    .add_node(C) \
    .add_node(T) \
    .add_node(M) \
    .add_node(Y) \
    .add_edge(Edge(C, T, EdgeType.DIRECTED)) \
    .add_edge(Edge(T, M, EdgeType.DIRECTED)) \
    .add_edge(Edge(C, Y, EdgeType.DIRECTED)) \
    .add_edge(Edge(M, Y, EdgeType.DIRECTED))
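
Before sampling, we can sanity-check the model with exact inference; this is why InferenceController is imported above. A minimal sketch that prints each node’s exact observational marginals from the junction tree (note that conditioning with EvidenceBuilder would give \(P(y|t)\), not \(P(y|do(t))\)):

join_tree = InferenceController.apply(bbn)

# print the exact observational marginals of each node
for node in join_tree.get_bbn_nodes():
    potential = join_tree.get_bbn_potential(node)
    print(node)
    print(potential)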

Now, let’s sample some data. Although the results are not shown here, when the number of samples is only in the tens of thousands, the frontdoor and backdoor adjustment estimates do not converge to one another; above 500,000 samples, they are nearly identical. Thus, we draw 1 million samples from the causal network to compute the causal effect.

[3]:
import pandas as pd
from pybbn.sampling.sampling import LogicSampler

sampler = LogicSampler(bbn)
samples = sampler.get_samples(n_samples=1_000_000, seed=37)

# rename the integer node ids to the letters used in the formulas
df = pd.DataFrame(samples).rename(columns={0: 'C', 1: 'T', 2: 'M', 3: 'Y'})
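
As a quick check that the sampler matches the specification, the empirical marginal distribution of \(C\) should be close to its prior of \([0.8, 0.2]\).

# empirical P(C); should be close to the specified prior [0.8, 0.2]
df['C'].value_counts(normalize=True)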

5.2. Frontdoor adjustment formula

Here we apply the frontdoor adjustment formula.

[4]:
import itertools

def safe_divide(num, den):
    # guard against empty strata (zero denominators)
    if den == 0:
        return 0
    return num / den

def get_filter(X, x):
    # build a pandas query string, e.g. 'T=="true" and M=="false"'
    f = [f'{_X}=="{_x}"' for _X, _x in zip(X, x)]
    f = ' and '.join(f)
    return f

def get_domain_product_values(df, X):
    # Cartesian product of the observed values of the variables in X
    prod = [sorted(df[x].unique()) for x in X]
    prod = itertools.product(*prod)
    prod = list(prod)

    return prod

def get_domain_product_filters(df, X):
    # one query string per combination of values of the variables in X
    prod = get_domain_product_values(df, X)
    prod = [' and '.join([f'{_x}=="{_v}"' for _x, _v in zip(X, tup)]) for tup in prod]
    return prod

def get_marg_prob(df, X, x):
    # empirical marginal probability P(X=x)
    f = get_filter(X, x)
    N = df.shape[0]
    n = df.query(f).shape[0]
    return safe_divide(n, N)

def get_cond_prob(df, X, x, Y, y):
    # empirical conditional probability P(X=x|Y=y)
    num_f = get_filter(X + Y, x + y)
    den_f = get_filter(Y, y)

    n = df.query(num_f).shape[0]
    d = df.query(den_f).shape[0]
    return safe_divide(n, d)

def get_frontdoor_intv_prob(df, T, t, M, Y, y):
    # P(y|do(t)) = sum_m P(m|t) * sum_t' P(y|m,t') P(t')
    sum_m = 0

    for m_filter in get_domain_product_filters(df, M):
        t_filter = get_filter(T, t)
        tm_filter = f'{t_filter} and {m_filter}'

        n = df.query(tm_filter).shape[0]
        d = df.query(t_filter).shape[0]

        p_m_given_t = safe_divide(n, d)

        sum_t = 0

        # the inner sum ranges over all values t' of the treatment,
        # not just the do() value t
        for tp_filter in get_domain_product_filters(df, T):
            y_filter = get_filter(Y, y)
            tpmy_filter = f'{tp_filter} and {m_filter} and {y_filter}'
            tpm_filter = f'{tp_filter} and {m_filter}'

            n = df.query(tpmy_filter).shape[0]
            d = df.query(tpm_filter).shape[0]

            p_y_given_mt = safe_divide(n, d)

            n = df.query(tp_filter).shape[0]
            d = df.shape[0]

            p_t = safe_divide(n, d)

            sum_t += (p_y_given_mt * p_t)

        sum_m += (p_m_given_t * sum_t)

    return sum_m

def get_frontdoor_probs(df, T, M, Y, y):
    def get_vals(X, x):
        return ','.join([f'{_X}={_x}' for _X, _x in zip(X, x)])

    def get_key(t):
        t_vals = get_vals(T, t)
        k = f'P({y_vals}|{t_vals})'
        return k

    def compute(t):
        return get_frontdoor_intv_prob(df, T, t, M, Y, y)

    y_vals = get_vals(Y, y)
    return {get_key(t): compute(t) for t in get_domain_product_values(df, T)}

f_probs = get_frontdoor_probs(df, ['T'], ['M'], ['Y'], ['true'])
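
For comparison, here is a more compact way one might compute the same frontdoor quantity with pandas crosstabs and groupbys; this is a sketch assuming the binary, string-valued columns created above.

p_t = df['T'].value_counts(normalize=True)                      # P(t')
p_m_given_t = pd.crosstab(df['T'], df['M'], normalize='index')  # P(m|t)
p_y_given_mt = (df.groupby(['M', 'T'])['Y']
                  .apply(lambda s: (s == 'true').mean()))       # P(Y=true|m,t')

def frontdoor(t):
    # P(Y=true|do(t)) = sum_m P(m|t) sum_t' P(Y=true|m,t') P(t')
    return sum(p_m_given_t.loc[t, m] *
               sum(p_y_given_mt[(m, tp)] * p_t[tp] for tp in p_t.index)
               for m in p_m_given_t.columns)

frontdoor('true') - frontdoor('false')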

5.3. Backdoor adjustment formula

Here we apply the backdoor adjustment formula.

[5]:
def get_backdoor_intv_prob(df, T, t, C, Y, y):
    # P(y|do(t)) = sum_c P(y|t,c) P(c)
    sum_c = 0

    for c_filter in get_domain_product_filters(df, C):
        t_filter = get_filter(T, t)
        y_filter = get_filter(Y, y)
        tc_filter = f'{t_filter} and {c_filter}'
        tcy_filter = f'{tc_filter} and {y_filter}'

        n = df.query(tcy_filter).shape[0]
        d = df.query(tc_filter).shape[0]

        p_y_given_tc = safe_divide(n, d)

        n = df.query(c_filter).shape[0]
        d = df.shape[0]

        p_c = safe_divide(n, d)

        sum_c += (p_y_given_tc * p_c)

    return sum_c

def get_backdoor_probs(df, T, C, Y, y):
    def get_vals(X, x):
        return ','.join([f'{_X}={_x}' for _X, _x in zip(X, x)])

    def get_key(t):
        t_vals = get_vals(T, t)
        k = f'P({y_vals}|{t_vals})'
        return k

    def compute(t):
        return get_backdoor_intv_prob(df, T, t, C, Y, y)

    y_vals = get_vals(Y, y)
    return {get_key(t): compute(t) for t in get_domain_product_values(df, T)}

b_probs = get_backdoor_probs(df, ['T'], ['C'], ['Y'], ['true'])
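
The same compact style works for the backdoor formula; again, a sketch under the same assumptions.

p_c = df['C'].value_counts(normalize=True)                      # P(c)
p_y_given_tc = (df.groupby(['T', 'C'])['Y']
                  .apply(lambda s: (s == 'true').mean()))       # P(Y=true|t,c)

def backdoor(t):
    # P(Y=true|do(t)) = sum_c P(Y=true|t,c) P(c)
    return sum(p_y_given_tc[(t, c)] * p_c[c] for c in p_c.index)

backdoor('true') - backdoor('false')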

5.4. Compare treatment effect estimations

As you can see below, the treatment effects estimated with the two adjustment formulas are very close, although not identical; in this synthetic example the remaining gap is due to sampling error. In practice, here are a few different ways the formulas could diverge.

  • model specification: the model (structure and/or parameters) may be wrong

  • measurement error: the data could be measured incorrectly

  • unobserved confounding: there may be other confounders that are not observed

  • violation of assumptions: assumptions required by the frontdoor and backdoor criteria may be violated

  • data quality: there may be insufficient data

[6]:
f_probs['P(Y=true|T=true)'] - f_probs['P(Y=true|T=false)']
[6]:
0.2485019432382713
[7]:
b_probs['P(Y=true|T=true)'] - b_probs['P(Y=true|T=false)']
[7]:
0.2492747359976692
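
As a final check, we can compute the exact interventional probabilities directly from the CPTs, since the truncated factorization for this graph gives \(P(y|do(t)) = \sum_c \sum_m P(c) P(m|t) P(y|c,m)\). Below is a small sketch; the nested dictionaries transcribe the tables specified earlier, assuming the rows of the \(Y\) table iterate over \((C, M)\) combinations with \(C\) varying slowest (which is consistent with the estimates above).

# CPTs transcribed from the model specification
p_c = {'false': 0.8, 'true': 0.2}
p_m_given_t = {'false': {'false': 0.75, 'true': 0.25},
               'true': {'false': 0.1, 'true': 0.9}}
p_y_true_given_cm = {('false', 'false'): 0.01, ('false', 'true'): 0.4,
                     ('true', 'false'): 0.45, ('true', 'true'): 0.8}

def exact_effect(t):
    # P(Y=true|do(t)) = sum_c sum_m P(c) P(m|t) P(Y=true|c,m)
    return sum(p_c[c] * p_m_given_t[t][m] * p_y_true_given_cm[(c, m)]
               for c in p_c for m in p_m_given_t[t])

exact_effect('true') - exact_effect('false')  # 0.2483, close to both estimates above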