Marketing Mix Models are often discussed as if the hardest problem is fitting the model. In practice, fitting the model is only the beginning. The harder challenge is turning a complex, uncertain statistical model into something that can reliably support real-world decisions.
Our approach is built around a simple idea: separate learning the system from using the system.
The diagram below illustrates this two-layer architecture.
Learning the Structure of the System
At the foundation of our approach is a Bayesian Marketing Mix Model built using Google Meridian. This model is responsible for learning how marketing inputs relate to business outcomes over time.
At a high level, the model learns:
- How channel effects accumulate and decay (lagged impact)
- How returns diminish as spend increases (saturation)
- How media-driven lift interacts with baseline trends and seasonality
- How uncertain each of these effects actually is
These are not just modeling conveniences — they are structural properties of real marketing systems. Ignoring them leads to models that appear stable historically but fail under new budget scenarios.
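To make these structural properties concrete, here is a minimal numpy sketch of the two transforms at the heart of most MMMs: geometric adstock for carryover and a Hill curve for saturation. Meridian parameterizes similar curves internally; the decay, half-saturation, and slope values below are purely illustrative, not estimates from any fitted model.

```python
import numpy as np

def geometric_adstock(spend: np.ndarray, decay: float) -> np.ndarray:
    """Carryover (lagged impact): each period retains `decay` of the prior effect."""
    adstocked = np.zeros_like(spend, dtype=float)
    carry = 0.0
    for t, x in enumerate(spend):
        carry = x + decay * carry
        adstocked[t] = carry
    return adstocked

def hill_saturation(x: np.ndarray, half_sat: float, slope: float) -> np.ndarray:
    """Diminishing returns: response flattens toward 1.0 as effective spend grows."""
    return x**slope / (x**slope + half_sat**slope)

# Illustrative weekly spend for one channel (arbitrary units).
spend = np.array([100, 100, 0, 0, 200, 200, 200, 0], dtype=float)

effective = geometric_adstock(spend, decay=0.6)                   # lagged impact
response = hill_saturation(effective, half_sat=250.0, slope=1.5)  # saturation

print(np.round(effective, 1))
print(np.round(response, 3))
```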
Bayesian inference plays a central role here. Rather than estimating a single coefficient per channel, the model learns posterior distributions over parameters. This means uncertainty is explicit and preserved, not hidden behind point estimates.
From an engineering perspective, this has important consequences:
- Poorly specified models tend to fail loudly (via convergence failures or diagnostic warnings) rather than silently
- Priors act as stabilizers when data is weak or noisy
- The model can express “we don’t know” instead of returning overconfident answers
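As an illustration of what preserved uncertainty buys downstream, the sketch below summarizes posterior draws of a channel's ROI into a credible interval and a probability statement rather than a single point. The draws here are synthetic stand-ins for samples taken from a fitted model.

```python
import numpy as np

# Synthetic stand-in for posterior draws of a channel's ROI from the fitted MMM.
rng = np.random.default_rng(7)
roi_draws = rng.lognormal(mean=0.2, sigma=0.35, size=1000)

point_estimate = roi_draws.mean()
low, high = np.percentile(roi_draws, [5, 95])

print(f"ROI point estimate: {point_estimate:.2f} "
      f"(90% credible interval: {low:.2f} to {high:.2f})")
print(f"P(ROI < 1.0) = {(roi_draws < 1.0).mean():.0%}")  # probability the channel is unprofitable
```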
Training this model is computationally expensive and intentionally treated as an offline, periodic process. Its purpose is to learn the rules of the system, not to answer every downstream question.
Why Coefficients Are Not Enough
Traditional MMM workflows often conclude once coefficients, ROI estimates, and attribution charts are produced. These outputs are useful summaries, but they are not decision engines.
Coefficient-centric MMMs implicitly assume that:
- Relationships are approximately linear near the historical operating point
- Budget changes are incremental
- Future decisions resemble past allocations
In practice, planning violates all of these assumptions.
Once budgets move meaningfully, or allocations shift across channels, the system becomes highly nonlinear. Marginal returns change, interactions matter, and historical averages lose relevance.
A single ROI number per channel cannot describe this behavior. Even a set of coefficients does not tell you:
- What happens when total budget increases by 30%
- How optimal allocations shift under new constraints
- Where diminishing returns actually begin
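A small worked example makes the gap concrete. Suppose a channel's response follows a saturating curve (the Hill form sketched earlier, with made-up parameters): its average historical ROI can look healthy while the marginal return on an extra 30% of budget is far lower.

```python
def response(spend: float, half_sat: float = 250.0, slope: float = 1.5,
             scale: float = 1000.0) -> float:
    # Illustrative saturating revenue curve; all parameters are made up.
    return scale * spend**slope / (spend**slope + half_sat**slope)

budget = 300.0
revenue = response(budget)
avg_roi = revenue / budget                       # what a single ROI number reports

new_budget = budget * 1.3                        # "what if we spend 30% more?"
marginal_roi = (response(new_budget) - revenue) / (new_budget - budget)

print(f"average ROI at current budget: {avg_roi:.2f}")
print(f"marginal ROI of the extra 30%: {marginal_roi:.2f}")
```

With these illustrative numbers the average ROI is roughly 1.9 while the marginal ROI on the extra 30% is close to 1.0; a single per-channel ROI hides exactly the information planning needs.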
Rather than collapsing the model into a small set of summaries, we treat it as a generative system that can be queried across many possible futures.
Enumerating the Prediction Space
Once the Bayesian MMM is trained, we systematically evaluate it across a wide range of scenarios. This step is where our approach diverges most clearly from traditional MMM implementations.
We evaluate the model across combinations of:
- Total budget levels: including decreases, increases, and baseline scenarios
- Allocation mixes: varying how spend is distributed across channels
Each evaluation produces a concrete prediction:
- Expected total outcome
- Channel-level contributions
- Implied efficiency at that point in the space
Over many evaluations, this produces a sampled response surface — a representation of how the model behaves across budget and allocation dimensions.
Instead of storing a single “answer,” we store many answers. Each corresponds to a plausible scenario the business might care about.
This shifts MMM from being a retrospective reporting tool to a forward-looking system model.
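Concretely, the enumeration can be as simple as a grid sweep. The sketch below assumes a hypothetical `predict(allocation)` wrapper around the trained model's posterior-predictive output; the channel names, budget multipliers, and share grid are placeholders, not a production configuration.

```python
import itertools
import numpy as np

channels = ["search", "social", "video"]        # placeholder channel names
budget_levels = [0.7, 0.85, 1.0, 1.15, 1.3]     # multiples of the baseline budget
share_grid = np.linspace(0.1, 0.8, 8)           # candidate share of budget per channel

def predict(allocation: dict) -> float:
    """Hypothetical wrapper that queries the trained MMM for the expected
    outcome at this allocation (stubbed out in this sketch)."""
    raise NotImplementedError

scenarios = []
for level in budget_levels:
    for shares in itertools.product(share_grid, repeat=len(channels)):
        if not np.isclose(sum(shares), 1.0):    # keep only allocations summing to 100%
            continue
        scenarios.append({ch: level * s for ch, s in zip(channels, shares)})

# Each stored record pairs a scenario with the model's prediction for it:
# surface = [{"allocation": a, "outcome": predict(a)} for a in scenarios]
print(f"{len(scenarios)} scenarios enumerated")
```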
Calibration and Scale Alignment
One practical challenge with MMM outputs is that model scales do not always align perfectly with business reporting scales.
To address this, we explicitly align predictions with observed historical behavior:
- Spend is normalized so modeled budgets correspond to real-world dollars
- Outcomes are calibrated so baseline predictions match observed totals over a defined reporting window
This calibration is done carefully and locally. Rather than forcing the model to perfectly explain historical totals everywhere, we anchor it around baseline conditions and small perturbations.
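A sketch of the anchoring idea, under the simplifying assumption that a single multiplicative factor fitted on the baseline reporting window is enough; the observed and predicted totals below are invented numbers.

```python
import numpy as np

# Observed outcome totals over the reporting window (e.g. weekly revenue) and the
# model's baseline prediction for the same window; values are illustrative.
observed = np.array([1020.0, 980.0, 1010.0, 995.0])
predicted_baseline = np.array([940.0, 905.0, 930.0, 915.0])

# One local scale factor, anchored on the baseline window only.
calibration = observed.sum() / predicted_baseline.sum()

def to_business_scale(model_prediction: np.ndarray) -> np.ndarray:
    """Apply the same factor to every scenario so relative differences are preserved."""
    return calibration * model_prediction

print(f"calibration factor: {calibration:.3f}")
print(to_business_scale(np.array([950.0, 1100.0])))  # e.g. baseline vs. increased-budget scenario
```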
The result is a system where:
- Predictions remain interpretable in business terms
- Relative differences across scenarios remain intact
- The model is not distorted to fit noise
This step is essential for making MMM outputs usable downstream without misleading precision.
Runtime Exploration Without Retraining
Once the prediction space exists, we no longer need to re-run Bayesian inference to answer most questions.
At runtime, we operate entirely within the learned surface:
- Interpolating between stored scenarios
- Exploring local neighborhoods around allocations of interest
- Evaluating new constraints or budget levels
- Computing marginal effects directly from predicted deltas
This is where lighter-weight ML and numerical methods come into play. They are not learning the system; they are querying the system that the Bayesian model has already defined.
This distinction matters. We are not stacking another opaque model on top of the MMM. We are extracting information that is already implied by it.
Because the expensive inference step is decoupled from exploration, runtime analysis remains fast, consistent, and repeatable.
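A minimal sketch of this kind of querying, assuming the stored surface is a set of (total budget, allocation) points with predicted outcomes. The interpolation and finite-difference marginal effects use standard scipy and numpy routines rather than any Meridian API, and the numbers are illustrative.

```python
import numpy as np
from scipy.interpolate import griddata

# Stored surface: columns are (total budget, share to channel A); values are predicted outcomes.
# These are illustrative stand-ins for the enumerated predictions.
points = np.array([[100, 0.3], [100, 0.5], [100, 0.7],
                   [130, 0.3], [130, 0.5], [130, 0.7]], dtype=float)
outcomes = np.array([210.0, 225.0, 218.0, 240.0, 262.0, 251.0])

def query(budget: float, share_a: float) -> float:
    """Interpolate the stored surface instead of re-running Bayesian inference."""
    value = griddata(points, outcomes, [(budget, share_a)], method="linear", rescale=True)
    return float(value[0])

# Marginal effect of budget near the current plan, via a finite difference on the surface.
base = query(115, 0.5)
bumped = query(116, 0.5)
print(f"predicted outcome at the plan: {base:.1f}")
print(f"approx. marginal outcome per extra unit of budget: {bumped - base:.2f}")
```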
Optimization as Exploration, Not Oracle
Budget optimization is often treated as the end goal of MMM. In many systems, this results in a single “optimal” allocation being presented as the answer.
We take a different approach.
Optimization is used as an exploratory tool rather than an oracle. Instead of producing a single recommendation, we surface:
- Efficient frontiers that show tradeoffs explicitly
- Marginal returns that explain sensitivity
- Multiple high-performing allocations rather than one fragile optimum
This allows planners to see how robust a recommendation is and where performance degrades as assumptions change.
In practice, this is far more useful than a single optimized point, especially when decisions involve constraints, uncertainty, and organizational tradeoffs.
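The sketch below shows what that looks like over stored scenarios: keep the best predicted outcome at each budget level (an efficient frontier) and surface several near-optimal mixes instead of one. The scenario records are illustrative.

```python
# Illustrative scenario records produced by the enumeration step.
scenarios = [
    {"budget": 100, "mix": {"search": 0.5, "social": 0.3, "video": 0.2}, "outcome": 225.0},
    {"budget": 100, "mix": {"search": 0.4, "social": 0.4, "video": 0.2}, "outcome": 223.5},
    {"budget": 100, "mix": {"search": 0.3, "social": 0.3, "video": 0.4}, "outcome": 210.0},
    {"budget": 130, "mix": {"search": 0.5, "social": 0.3, "video": 0.2}, "outcome": 262.0},
    {"budget": 130, "mix": {"search": 0.4, "social": 0.4, "video": 0.2}, "outcome": 259.0},
]

# Efficient frontier: best predicted outcome at each budget level.
frontier = {}
for s in scenarios:
    if s["outcome"] > frontier.get(s["budget"], {"outcome": float("-inf")})["outcome"]:
        frontier[s["budget"]] = s
for budget, best in sorted(frontier.items()):
    print(f"budget {budget}: best outcome {best['outcome']} with mix {best['mix']}")

# Multiple high-performing allocations at a budget, not one fragile optimum.
top_at_100 = sorted((s for s in scenarios if s["budget"] == 100),
                    key=lambda s: s["outcome"], reverse=True)[:2]
print("near-optimal options at budget 100:", [s["mix"] for s in top_at_100])
```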
Why This Architecture Scales Better
Separating learning from prediction has several concrete engineering advantages:
- Bayesian training can remain slow and careful without slowing down planning
- New scenarios can be evaluated without retraining
- Outputs are consistent across users and time
- The system naturally supports iterative “what-if” analysis
It also aligns with how marketing decisions are actually made: iteratively, under constraints, and with incomplete information.
Rather than forcing decision-makers to mentally interpolate between static charts, the system performs that interpolation explicitly and transparently.
MMM as a Persistent System Model
We think of our MMM not as a report, but as a persistent model of how marketing behaves.
- Bayesian inference defines the rules of the system
- Stored predictions define the space of possible outcomes
- Runtime methods allow that space to be explored in detail
This combination allows MMM to move beyond retrospective attribution and toward practical decision support.
The model does not answer a single question once. It remains available to answer many questions over time.