In Modules 4 and 5, we mathematically transformed our Media variables (Spend, Impressions). But if you run a model using only media to predict sales, the results will be laughable.
Why? Because Media is not the only thing that drives sales. In fact, for most mature brands, Media drives only 10% to 20% of sales. The other 80% to 90% is the "Baseline."
We need to create variables for the invisible forces: Seasonality, Holidays, and Macroeconomics.
If you sell winter coats, sales go up in November. That is not because your ads are genius; it is because it is cold. If you don't control for this, your model will think your November ads are 5x more effective than your July ads.
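To see why the missing control matters, here is a minimal simulation sketch. Everything in it is made up for illustration: the seasonal curve, the spend schedule, the noise levels, and a "true" media effect fixed at 2.0. It fits a plain least-squares regression twice, once with media alone and once with media plus the seasonal driver, so the two media coefficients can be compared.

```python
import numpy as np

rng = np.random.default_rng(0)
weeks = np.arange(156)  # three years of weekly data

# Hypothetical seasonal demand that peaks once per year
season = 100 * np.cos(2 * np.pi * weeks / 52.18)

# Media spend scheduled to follow the season (heavier in peak weeks)
media = 50 + 0.3 * season + rng.normal(0, 5, size=weeks.size)

# Simulated world: each unit of media adds exactly 2 units of sales
true_effect = 2.0
sales = 1000 + season + true_effect * media + rng.normal(0, 10, size=weeks.size)

def fit_ols(columns, y):
    """Least-squares fit with an intercept; returns the non-intercept coefficients."""
    X = np.column_stack([np.ones(len(y))] + columns)
    coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coefs[1:]

naive_media_coef = fit_ols([media], sales)[0]
controlled_media_coef = fit_ols([media, season], sales)[0]

print(f"media only:          {naive_media_coef:.2f}")
print(f"media + seasonality: {controlled_media_coef:.2f}")
```

In this toy setup the media-only model credits the ads with sales that the season actually produced, while the model that includes the seasonal variable lands close to the simulated effect.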
Instead of creating 12 dummy variables (one for each month), advanced MMM uses Fourier Terms (sine and cosine waves) to create a smooth seasonality curve.
    import numpy as np

    # Create sine and cosine waves to mimic yearly cycles
    # period = 52.18 for weekly data
    df['sin_year'] = np.sin(2 * np.pi * df['week_index'] / 52.18)
    df['cos_year'] = np.cos(2 * np.pi * df['week_index'] / 52.18)
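A single sine/cosine pair can only describe one smooth peak and one trough per year. A common extension, which this module does not itself prescribe, is to add higher-order harmonics so the curve can bend more than once. The sketch below assumes a `week_index` column as above; the helper name `add_fourier_terms` and its `order` parameter are hypothetical and chosen only for illustration.

```python
import numpy as np
import pandas as pd

def add_fourier_terms(df, index_col="week_index", period=52.18, order=2):
    """Return a copy of df with sin/cos columns for harmonics 1..order."""
    out = df.copy()
    for k in range(1, order + 1):
        # The k-th harmonic completes k full cycles per period
        angle = 2 * np.pi * k * out[index_col] / period
        out[f"sin_year_{k}"] = np.sin(angle)
        out[f"cos_year_{k}"] = np.cos(angle)
    return out

# Toy frame: two years of weekly rows
demo = pd.DataFrame({"week_index": np.arange(104)})
demo = add_fourier_terms(demo, order=2)
print(demo.columns.tolist())
```

Each extra harmonic adds two more columns for the model to fit, so a low `order` is usually a sensible starting point when the history is short.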
Holidays such as Black Friday, Christmas, and Singles Day are "shocks" to the system. These need to be modeled as binary flags (Dummy Variables).
The Trap: Don't just flag the day of the holiday. Flag the lead-up. People buy gifts before Christmas, not on Christmas day.
    # 1. Create a binary flag (assumes df['date'] is a datetime column)
    df['is_black_friday'] = 0
    df.loc[df['date'] == '2023-11-24', 'is_black_friday'] = 1

    # 2. Add a 'Lead Up' flag if needed
    # This captures the shopping frenzy from December 1 through Christmas Eve
    df['is_pre_xmas'] = df['date'].apply(
        lambda x: 1 if (x.month == 12 and x.day < 25) else 0
    )
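The flags above match on exact calendar dates, which fits daily data. If the table is weekly, as the 52.18 period earlier suggests, an exact-date comparison can miss the holiday entirely because Black Friday is rarely a week-start date. The sketch below is one way to handle that, assuming each row is labeled by the Monday that starts its week. The helper names `flag_week_containing` and `flag_lead_up_weeks` and the 21-day lead-up window are hypothetical choices for illustration.

```python
import pandas as pd

# Toy weekly table; each row is labeled by the Monday that starts the week (assumption)
weekly = pd.DataFrame({"date": pd.date_range("2023-11-06", periods=8, freq="W-MON")})

def flag_week_containing(week_starts, holiday):
    """1 for the week whose 7-day window contains the holiday, else 0."""
    holiday = pd.Timestamp(holiday)
    in_week = (week_starts <= holiday) & (holiday < week_starts + pd.Timedelta(days=7))
    return in_week.astype(int)

def flag_lead_up_weeks(week_starts, holiday, days_before):
    """1 for weeks that start within `days_before` days before the holiday, else 0."""
    holiday = pd.Timestamp(holiday)
    window_start = holiday - pd.Timedelta(days=days_before)
    in_window = (week_starts >= window_start) & (week_starts < holiday)
    return in_window.astype(int)

weekly["is_black_friday"] = flag_week_containing(weekly["date"], "2023-11-24")
weekly["is_pre_xmas"] = flag_lead_up_weeks(weekly["date"], "2023-12-25", days_before=21)
print(weekly)
```

The length of the lead-up window is a modeling choice; it is worth checking against your own sales history rather than taking the 21 days here as given.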
Sometimes sales drop because the economy is bad, or a competitor launched a huge promo. If you have the data, add these columns to your ABT.
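As a rough sketch of what adding these columns can look like, the example below joins a monthly macro series and a competitor promo calendar onto a weekly table. All names and numbers are invented for illustration: `consumer_confidence`, `competitor_promos`, and the toy sales values are assumptions, not data from this course.

```python
import pandas as pd

# Toy weekly ABT (values are made up for illustration)
abt = pd.DataFrame({
    "date": pd.date_range("2023-01-02", periods=9, freq="W-MON"),
    "sales": [120, 115, 130, 128, 110, 105, 118, 125, 131],
})

# Hypothetical monthly macro series, e.g. a consumer confidence index
macro = pd.DataFrame({
    "month": ["2023-01", "2023-02"],
    "consumer_confidence": [98.5, 97.1],
})

# Hypothetical competitor promo calendar (start and end dates, inclusive)
competitor_promos = [("2023-01-16", "2023-01-29")]

# Key each week by its year-month, then left-join the monthly value onto it
abt["month"] = abt["date"].dt.strftime("%Y-%m")
abt = abt.merge(macro, on="month", how="left").drop(columns="month")

# Flag weeks whose start date falls inside any competitor promo window
abt["competitor_promo"] = 0
for start, end in competitor_promos:
    in_promo = abt["date"].between(pd.Timestamp(start), pd.Timestamp(end))
    abt.loc[in_promo, "competitor_promo"] = 1

print(abt)
```

Repeating a monthly value across every week of that month is a simplification; a left join also makes missing months visible as NaN instead of silently dropping rows.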
Before we move to the Modeling Phase (Module 7), your DataFrame should now have:
We have mathematically represented reality. Now, we are ready to find the coefficients.