
Multi-Touch Attribution (MTA) & Advanced Modeling

Marketing Strategy · SQL & Snowflake · Algorithmic Attribution

A comprehensive journey from Rule-Based Heuristics to Advanced Algorithmic Frameworks. Evolving a global client's attribution strategy to uncover true incremental value.

This project documents the transformation of a Global StockMedia Client's marketing measurement. We moved from a legacy Last-Click setup to a sophisticated Algorithmic Framework over two distinct phases, overcoming significant technical and stakeholder challenges along the way.


Evolution of a Model: From Rules to Algorithms

The Stakeholders

  • Kamal (Lead Analyst): The architect of the new framework.
  • Ben (Performance Marketing): Defending the efficiency of Paid Search.
  • Sarah (Brand Marketing): Demanding credit for "Demand Creation" (Social/Display).
  • Finance: Demanding a clear, justifiable logic for budget allocation.
The Starting Point: The "Last-Click" Trap

When we began, the client was operating on a Single-Touch (Last-Click) model. This created a "Civil War" inside the marketing team.

  • The Problem: Google Paid Search was claiming 100% of the credit for almost every sale.
  • The Consequence: Brand channels (Social, Display, Blog) were being starved of budget because their ROAS looked like zero.
  • The Friction: Sarah (Brand) argued, "I create the demand, Ben (Search) just captures it." But the data only showed the capture.
Data Foundations & Attribution Scope

Before any modeling could occur, we established strict governance over the input data to ensure integrity across all stakeholders.

| Metric / Parameter | Definition & Governance Rule |
| --- | --- |
| Lookback Window | 30 days, capped to balance data relevance against cookie-decay realities. Interactions beyond 30 days are excluded to minimize noise. |
| Conversion Definition | Paid transaction. Free trials were excluded from the primary ROI model to align with Finance's revenue-recognition standards. |
| Path Construction | User-level. Paths stitched via User ID (logged in) and first-party cookie (anonymous); cross-device matching applied where deterministic links existed. |
| Direct Traffic Treatment | Collapse logic. "Direct" visits were treated as non-events if a prior marketing touch existed within 48 hours, attributing credit to the preceding channel. |
| Impression Handling | View-through logic. Display/Social impressions weighted at 10% of a click, decaying over a 7-day half-life. |
| Data Grain | Event-level. All attribution processed at the timestamped event level (not session aggregates) for maximum precision. |
| Normalization | 100% match. Total attributed revenue must equal transaction revenue ±0.01% (floating-point tolerance). |
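
To make these rules concrete, here is a minimal Python sketch of the lookback and direct-collapse logic. It is illustrative only: the tuple-based event shape is an assumption, not the production schema (the equivalent rules ran in Snowflake SQL over the event-level tables).

```python
from datetime import datetime, timedelta

LOOKBACK = timedelta(days=30)        # governance: 30-day lookback window
DIRECT_COLLAPSE = timedelta(hours=48)  # governance: 48-hour direct collapse

def clean_path(touches, conversion_ts):
    """Apply the lookback and direct-collapse rules to one user's touches.

    touches: list of (timestamp, channel) tuples, sorted by time.
    """
    # Rule 1: drop interactions older than the 30-day lookback window.
    touches = [(ts, ch) for ts, ch in touches if conversion_ts - ts <= LOOKBACK]

    # Rule 2: a "Direct" visit is a non-event if a prior marketing touch
    # exists within 48 hours -- credit stays with the preceding channel.
    path = []
    for ts, ch in touches:
        if ch == "Direct" and path and ts - path[-1][0] <= DIRECT_COLLAPSE:
            continue
        path.append((ts, ch))
    return path

conv = datetime(2024, 11, 20, 12, 0)
raw = [
    (conv - timedelta(days=45), "Display"),          # outside lookback: dropped
    (conv - timedelta(days=5), "Social"),
    (conv - timedelta(days=4, hours=23), "Direct"),  # collapses into Social
    (conv - timedelta(days=1), "Paid Search"),
]
print([ch for _, ch in clean_path(raw, conv)])  # ['Social', 'Paid Search']
```
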
Phase 1: The Rule-Based Compromise

We knew Last-Click was wrong, but we needed to explore the landscape of options before deciding on the right path. We began by evaluating Rule-Based Models to find a fairer system.

Evaluating "All Possible" Attributions & Their Limitations

Before writing any code, we sat down with the stakeholders to review the menu of standard attribution models. Each had a fatal flaw for our specific business model:

| Model | Logic | The Fatal Limitation (Why We Hesitated) |
| --- | --- | --- |
| First Touch | 100% to the 1st interaction. | Zero Incentive to Close. It rewards "clickbait" traffic that never converts; Ben (Performance) would stop bidding on high-intent keywords. |
| Linear | Equal credit to all touches. | False Equality. A 3-second accidental banner click gets the same credit ($50) as a high-intent search click, flattening the data variance and hiding true drivers. |
| Time Decay | Credit grows closer to the sale. | "Last-Click Lite". It still heavily favors bottom-funnel channels and didn't solve Sarah's problem of proving the value of early-stage awareness. |
| W-Shaped | 30% First / 30% Lead / 30% Last. | Too B2B-Focused. Our client was B2C/eCommerce; without a distinct "Lead Gen" stage for every user, this shape was a poor fit. |

Iteration 1: Testing Linear & Time Decay

Despite the known limitations, we ran a pilot using Linear and Time Decay just to see how the credit shares would shift. Both rules are sketched below.
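
For reference, the two pilot rules reduce to a few lines each. A minimal sketch, assuming a 7-day half-life for Time Decay (an illustrative value, not the client's tuned parameter):

```python
def linear_credit(path):
    # Equal share to every touch -- the "False Equality" problem.
    share = 1.0 / len(path)
    return [(ch, share) for ch in path]

def time_decay_credit(path, days_before_conv, half_life_days=7.0):
    # Each touch weighted by 2^(-age / half_life), then normalized:
    # credit concentrates near the conversion.
    weights = [2 ** (-d / half_life_days) for d in days_before_conv]
    total = sum(weights)
    return [(ch, w / total) for ch, w in zip(path, weights)]

path = ["Display", "Social", "Paid Search"]
print(linear_credit(path))                  # 33% / 33% / 33%
print(time_decay_credit(path, [14, 7, 0]))  # ~14% / ~29% / ~57%: still bottom-heavy
```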

Finance: "The Linear model says Display is driving $1M. But when we look at the paths, these users clicked a banner 6 months ago and never came back until today. This credit feels unearned."

Result: Rejected. Finance refused to allocate budget based on "False Equality."

| Channel | Last-Click Share | Linear Share | Time-Decay Share | Variance (Linear vs. LC) |
| --- | --- | --- | --- | --- |
| Paid Search | 45% | 25% | 35% | -44% |
| Social (Brand) | 5% | 20% | 10% | +300% |
| Display | 2% | 15% | 8% | +650% |
| Email | 30% | 25% | 28% | -16% |

*Diagnostic table: Linear over-corrected drastically, reducing confidence in the model's stability.*

The Phase 1 Solution: Position-Based + Adstock

After the failed pilot, we settled on a custom Position-Based (U-Shaped) Model enhanced with Adstock to account for brand memory.

The Phase 1 Accepted Model (the credit split is sketched in code after this list):
  • 40% Position: To the First Touch (The Introducer / Sarah).
  • 40% Position: To the Last Touch (The Closer / Ben).
  • 20% Shared: To the Middle (The Nurturers).
  • Adstock Integration: We didn't just count clicks; we applied a "Half-Life" decay to impressions, giving credit to the lingering effect of Brand ads.
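
A minimal sketch of the 40/40/20 split on a pure click path (how Adstock-weighted impressions were blended in is covered in the next section):

```python
def u_shaped_credit(path, first=0.40, last=0.40, middle=0.20):
    """Position-Based (U-Shaped) split: 40% first / 40% last /
    20% spread equally across the middle touches.

    Note: short paths (1-2 touches) are exactly where static weights
    get awkward -- the failure mode described under "The Catalyst" below.
    """
    n = len(path)
    if n == 1:
        return [(path[0], 1.0)]
    if n == 2:
        # No middle exists; fold its 20% evenly into the endpoints.
        return [(path[0], 0.5), (path[1], 0.5)]
    share = middle / (n - 2)
    return [(ch, first if i == 0 else last if i == n - 1 else share)
            for i, ch in enumerate(path)]

print(u_shaped_credit(["Social", "Blog", "Email", "Paid Search"]))
# [('Social', 0.4), ('Blog', 0.1), ('Email', 0.1), ('Paid Search', 0.4)]
```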

Adstock Mechanics & Rationalization

We introduced Adstock to solve the "View-Through" debate. Impressions are not clicks, but they deposit a memory effect that decays over time.

  • Half-Life Definition: The time it takes for an ad's impact to drop to 50%. We set this to 7 days for Display and 1 day for Social.
  • Mathematical Intuition: Think of an ad like a bell ringing. The sound (impact) is loudest immediately but fades over time. Adstock measures the "echo" of that bell.
  • Impression Weighting: Impressions were weighted at 10% of a Click. This prevented high-frequency banner ads from drowning out high-intent clicks in the attribution logic.
  • Assumption: We assumed diminishing returns; seeing the same ad 10 times does not equal 10x impact. We applied a log transformation to frequency to cap this effect (see the sketch after this list).
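
Read together, the half-life decay, the 10% impression weight, and the log-capped frequency give each impression a single adstock weight. A minimal sketch, assuming the three effects multiply (our reading of the rules above; the production formula may differ in detail):

```python
import math

IMPRESSION_WEIGHT = 0.10                       # impressions = 10% of a click
HALF_LIFE_DAYS = {"Display": 7.0, "Social": 1.0}

def adstock_weight(channel, days_since_impression, frequency):
    """Residual 'memory' of an impression at attribution time.

    decay    = 2^(-age / half_life)   (impact halves every half-life)
    freq_cap = log(1 + frequency)     (diminishing returns on repetition)
    """
    decay = 2 ** (-days_since_impression / HALF_LIFE_DAYS[channel])
    freq_cap = math.log1p(frequency)
    return IMPRESSION_WEIGHT * decay * freq_cap

# A Display impression seen 7 days ago (one half-life), frequency 3:
print(round(adstock_weight("Display", 7, 3), 4))  # 0.1 * 0.5 * ln(4) ≈ 0.0693
```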

This model was accepted because it was explainable. It was a political compromise that allowed us to move off Last-Click.

Executive Takeaway (Phase 1): Moving from Last-Click to Position-Based unlocked visibility into upper-funnel value, but the static weights proved too rigid for seasonal market shifts, eventually leading to inaccurate budget caps during peak seasons.
The Catalyst: Why Phase 1 Failed

For six months, the Position-Based model worked well. But then, the Holiday Season arrived, and the model broke.

During Black Friday, consumer behavior changed. Users weren't "Nurtured" over 2 weeks; they saw an ad and bought instantly. Yet, our static model still forced 20% of credit to the middle, even when the middle didn't exist.

The Breaking Point: Finance asked, "Why are we assigning 20% to the middle when the path length is only 1 click?" We couldn't answer. The heuristics were rigid, but the market was dynamic. We realized we needed a model that could "learn" from the data.

The "Road Not Taken": Why We Rejected Algorithmic (Initially)

This forced us to revisit the Algorithmic Model (Markov Chains) we had rejected in Phase 1. Understanding why we rejected it then was critical, because those were exactly the problems we now had to solve.

  • 1. The "Black Box" Problem: Finance and the CMO could not audit the math. Unlike "I get 40%," Algorithmic models output a probability score. It felt like "Magic," not math.
  • 2. Stability Issues: A small tracking outage on Monday could swing budget allocations by 30% on Tuesday. Finance needs stable, predictable monthly forecasts.
  • 3. Data Volume: The client’s lower-funnel data was robust, but upper-funnel tracking (view-through data) was spotty. Markov chains break if the "links" in the chain are missing.
Phase 2: The Advanced Re-Development

To move to Phase 2, we had to solve the "Black Box" and "Stability" issues.

The Advanced Solution: Algorithmic (Markov Chains)

We re-designed the architecture to implement a Data-Driven Attribution (DDA) model using Markov Chains.

```mermaid
graph LR
    A[Start] -->|60%| B[Social]
    A -->|40%| C[Search]
    B -->|30%| D[Conversion]
    B -->|Removal Effect| E[Loss of 30%]
    style B fill:#dcfce7,stroke:#166534
    style E fill:#fee2e2,stroke:#991b1b
```

How we solved the Phase 1 Limitations:

  • Solving "Black Box": We built a "Model Explainability" dashboard that visualized the Removal Effect. Stakeholders could see: "If we remove Facebook, conversion probability drops by 12%." This turned "Magic" into "Logic."
  • Solving Stability: We implemented Windowed Smoothing. Instead of updating weights daily, we used a rolling 30-day window to smooth out volatility, giving Finance the stability they needed.

Markov Mechanics: Deepening the Logic

We moved beyond simple explanations to show stakeholders exactly how the math makes decisions.

  • Baseline Probability: We start by calculating the conversion probability of the entire network with all channels present (e.g., 2.5% global CVR).
  • The Removal Effect vs. Marginal Contribution: We simulate removing one node (e.g., "Paid Search") and recalculate the network's total success rate. The difference, the drop in success, is that channel's attribution value. This is superior to marginal contribution because it captures synergies (e.g., Social + Search combos); a code sketch follows this list.
  • Path Dependency: Unlike linear models, Markov respects the order of events. It recognizes that "Search → Social" yields a lower probability than "Social → Search", adjusting the weights accordingly.
  • Incrementality vs. Attribution: Important Distinction: Attribution allocates credit based on observed correlation in paths. It is not causality (Incrementality). However, it is directionally robust for budget weighting because it captures the network effect of channels working together, which siloed incrementality tests often miss.
  • Decision-Usefulness: While not perfectly causal, this model allows for daily optimization at a granularity that A/B testing (which is slow and expensive) cannot support.
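
To ground the Removal Effect, here is a minimal self-contained sketch on toy paths. The four-path dataset is invented for illustration; production builds the transition matrix from the full event log and solves the absorption probabilities directly rather than recursing.

```python
from collections import defaultdict

def transition_probs(paths, removed=None):
    """First-order transition probabilities from observed paths.
    Removing a channel reroutes its traffic into the Null (no-sale) state."""
    counts = defaultdict(lambda: defaultdict(int))
    for path, converted in paths:
        states = ["Start", *path, "Conversion" if converted else "Null"]
        if removed is not None:
            states = ["Null" if s == removed else s for s in states]
        for a, b in zip(states, states[1:]):
            if a in ("Conversion", "Null"):
                break  # absorbing states have no outgoing edges
            counts[a][b] += 1
    return {a: {b: n / sum(nxt.values()) for b, n in nxt.items()}
            for a, nxt in counts.items()}

def p_conversion(probs, state="Start", depth=0):
    """Probability of absorbing into Conversion (plain recursion suffices
    for these acyclic toy paths; real data needs a linear solve)."""
    if state == "Conversion":
        return 1.0
    if state == "Null" or state not in probs or depth > 20:
        return 0.0
    return sum(p * p_conversion(probs, nxt, depth + 1)
               for nxt, p in probs[state].items())

paths = [  # (ordered touches, converted?)
    (["Social", "Search"], True),
    (["Social"], False),
    (["Search"], True),
    (["Search"], False),
]
base = p_conversion(transition_probs(paths))  # 0.5 on this toy data
for ch in ("Social", "Search"):
    drop = base - p_conversion(transition_probs(paths, removed=ch))
    print(f"{ch}: removal effect = {drop / base:.0%}")  # Social 50%, Search 100%
```
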
Executive Takeaway (Phase 2): The Algorithmic model provided the dynamic adaptability needed for finance-grade forecasting, turning "Marketing Magic" into auditable logic.
The Final Impact

The move to the Advanced Algorithmic model unlocked the next layer of growth.

  • Budget Shift: We identified that the static model was overvaluing "Retargeting" (Middle touches). The Algorithm revealed these users were already highly probable to convert.
  • Outcome: We cut Retargeting spend by 15% and reinvested it into "Prospecting" (First Touch), resulting in a 43% improvement in incremental ROI.

Decision Confidence & Stability

Even when accounting for model variance (±5% attribution swing), the signal to shift budget from Retargeting to Prospecting remained strong. The decision was directionally stable across all confidence intervals, giving the CFO the assurance needed to sign off on the reallocation.

Model Stability, Governance & Trust Controls

To ensure this model survived scrutiny from Finance and Leadership long-term, we implemented rigorous governance protocols.

  • Retraining Cadence: The Markov Transition Matrix is retrained Weekly. Daily retraining introduced too much noise from day-of-week volatility.
  • Rolling Window Logic: Attribution credits are reported based on a 30-Day Rolling Average. This smooths out short-term spikes (e.g., a viral social post) to prevent knee-jerk budget cuts.
  • Volatility Circuit Breakers: If a channel's attributed weight shifts by more than 15% week-over-week, the model triggers an alert for manual analyst review before pushing data to dashboards (see the sketch after this list).
  • Outage Handling: In the event of pixel loss (e.g., a Facebook tracking outage), the model automatically falls back to the last known stable weights (Previous Week) rather than training on zero data.
  • Manual Override Logic: In cases of extreme external anomalies (e.g., site outages), Finance retains the right to lock attribution weights to the trailing 30-day average to prevent artificial volatility.
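
A minimal sketch of the rolling-average fallback and the 15% circuit breaker (function names and the sample weights are illustrative, not the production implementation):

```python
def smooth_weights(weekly_weights, window=4):
    """30-day (~4-week) rolling average of a channel's attributed weight."""
    recent = weekly_weights[-window:]
    return sum(recent) / len(recent)

def circuit_breaker(prev_weight, new_weight, threshold=0.15):
    """Flag week-over-week shifts above 15% for manual analyst review
    before the weights reach any dashboard."""
    shift = abs(new_weight - prev_weight) / prev_weight
    return shift > threshold  # True -> hold last stable weights, alert

history = [0.22, 0.21, 0.23, 0.22]  # Paid Search share, last 4 weeks
new = 0.29                          # this week's retrained weight
if circuit_breaker(history[-1], new):
    print("ALERT: hold at", smooth_weights(history))  # fallback weights
```
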
Critical Limitations: When NOT to Use This Model

Transparency builds trust. We explicitly documented scenarios where this model fails to prevent misuse.

  • Cold Start Scenarios: New channels with low data volume (<1,000 paths) yield unstable Markov probabilities. We default to Linear Attribution for the first 30 days of any new channel launch.
  • Seasonality Breaks: During extreme events like Black Friday, user behavior changes fundamentally (paths shorten). The model's historical training data becomes less predictive. We apply Time-Decay overrides during these 48-hour windows.
  • Offline Interactions: This model relies on digital tracking pixels. It cannot see TV or Billboard impact, requiring calibration via Media Mix Modeling (MMM) for holistic views.