A/B Testing Case Study: Real-Life End-to-End Example
A marketing team wants to test whether a new checkout design improves conversion rate compared to the existing design. This case study applies the Power Analysis Framework to a real-world scenario.
Part 1: Power Analysis
Before launching the test, we must determine the sample size required to trust the results.
Assumptions
- Baseline conversion rate (control): 5% (0.05)
- Expected conversion rate (test): 6% (0.06)
- Significance Level (α): 0.05 (two-sided, 95% confidence)
- Power (1 - β): 0.80 (80% power)
Step 1: Average Conversion Rate
$$p = \frac{p_1 + p_2}{2}$$
$$p = \frac{0.05 + 0.06}{2} = 0.055$$
Step 2: Minimum Sample Size per Variant
$$
n =
\frac{
\left[
Z_{1-\alpha/2} + Z_{1-\beta}
\right]^2
\times
p(1-p)
}
{(p_2 - p_1)^2}
$$
Using standard Z values:
- $Z_{1-\alpha/2} = 1.96$ (for 95% Confidence)
- $Z_{1-\beta} = 0.84$ (for 80% Power)
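The two Z values are quantiles of the standard normal distribution. As a minimal sketch, assuming only the α and power listed in the assumptions, they can be recovered with Python's standard library (the variable names are illustrative):

```python
from statistics import NormalDist

alpha = 0.05   # significance level from the assumptions
power = 0.80   # desired power (1 - beta)

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

# Two-sided test: alpha is split across both tails.
z_alpha = std_normal.inv_cdf(1 - alpha / 2)
# Quantile corresponding to the desired power.
z_beta = std_normal.inv_cdf(power)

print(f"z_alpha = {z_alpha:.2f}, z_beta = {z_beta:.2f}")
# prints: z_alpha = 1.96, z_beta = 0.84
```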
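Substituting these values:
$$n = \frac{(1.96 + 0.84)^2 \times 0.055 \times 0.945}{(0.06 - 0.05)^2} = \frac{7.84 \times 0.051975}{0.0001} \approx 4{,}075$$
Conclusion: Rounding up, we need a minimum of ~4,100 users per variant to detect this 1-percentage-point lift (5% to 6%) with 80% power at α = 0.05.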
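The calculation in Steps 1 and 2 can be sketched as a small function. This is an illustration rather than a definitive implementation: the function name and the `variance_factor` parameter are hypothetical additions. With `variance_factor=1.0` it reproduces the formula exactly as written in Step 2. Many references for comparing two independent proportions include a factor of 2 in the numerator (to account for the sampling variance of both groups), which roughly doubles the requirement, so the sketch shows both figures.

```python
import math

def sample_size_per_variant(p1, p2, z_alpha=1.96, z_beta=0.84, variance_factor=1.0):
    """Sample size per variant based on the averaged conversion rate.

    variance_factor is a hypothetical knob (not part of the case study):
    1.0 follows the Step 2 formula as written; 2.0 gives the common
    two-sample form that counts the variance of both groups.
    """
    p_avg = (p1 + p2) / 2
    numerator = variance_factor * (z_alpha + z_beta) ** 2 * p_avg * (1 - p_avg)
    return math.ceil(numerator / (p2 - p1) ** 2)

n_as_written = sample_size_per_variant(0.05, 0.06)
n_two_sample = sample_size_per_variant(0.05, 0.06, variance_factor=2.0)

print(n_as_written)   # 4075
print(n_two_sample)   # 8150
```

It is worth settling which form the team intends before launch, since the choice changes how long the test needs to run.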
Part 2: A/B Test Execution
The test ran for 2 weeks. Here is the actual data we collected.
Observed Experiment Data
| Group | Users (n) | Conversions (x) |
|---|---|---|
| Control | 4,200 | 210 |
| Test | 4,300 | 275 |
Step 1: Conversion Rates
$$p_1 = \frac{x_1}{n_1}
\quad\quad
p_2 = \frac{x_2}{n_2}$$
$$p_1 = \frac{210}{4200} = 0.050$$
$$p_2 = \frac{275}{4300} = 0.064$$
Step 2: Pooled Conversion Rate
$$p = \frac{x_1 + x_2}{n_1 + n_2}$$
$$p = \frac{210 + 275}{4200 + 4300} = \frac{485}{8500} = 0.057$$
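Steps 1 and 2 are simple ratios of the observed counts. A minimal sketch using the figures from the table (variable names are illustrative):

```python
# Observed counts from the experiment table.
control_users, control_conversions = 4200, 210
test_users, test_conversions = 4300, 275

# Per-group conversion rates.
p1 = control_conversions / control_users
p2 = test_conversions / test_users

# Pooled rate: all conversions over all users, as if both groups were one.
p_pooled = (control_conversions + test_conversions) / (control_users + test_users)

print(f"p1 = {p1:.3f}, p2 = {p2:.3f}, pooled = {p_pooled:.3f}")
# prints: p1 = 0.050, p2 = 0.064, pooled = 0.057
```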
Step 3: Standard Error
$$se =
\sqrt{
p(1-p)
\left(
\frac{1}{n_1} + \frac{1}{n_2}
\right)
}$$
$$
se =
\sqrt{
0.057 \times 0.943
\left(
\frac{1}{4200} + \frac{1}{4300}
\right)
}
= 0.00503
$$
Step 4: Z-Score
$$z = \frac{p_2 - p_1}{se}$$
$$z = \frac{0.064 - 0.050}{0.00503} = 2.78$$
Step 5: Statistical Decision
- Critical value (95% confidence): ±1.96
- Observed z-score: 2.78
Since $|z| > 1.96$, the result is statistically significant.
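Steps 3 to 5 together form a pooled two-proportion z-test. The sketch below wraps them in one function, assuming a two-sided test. The function name is hypothetical, and the two-sided p-value is an extra output not computed in the steps above. Because the code works from unrounded rates, its results may differ slightly from the hand calculation.

```python
import math
from statistics import NormalDist

def two_proportion_z_test(x1, n1, x2, n2):
    """Pooled two-proportion z-test; returns (se, z, two-sided p-value)."""
    p1, p2 = x1 / n1, x2 / n2
    p_pooled = (x1 + x2) / (n1 + n2)
    # Standard error of the difference under the null of equal rates.
    se = math.sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
    z = (p2 - p1) / se
    # Probability of a |z| at least this large if there were no true difference.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return se, z, p_value

se, z, p_value = two_proportion_z_test(210, 4200, 275, 4300)
print(f"se = {se:.5f}, z = {z:.2f}, p-value = {p_value:.4f}")
print("significant" if abs(z) > 1.96 else "not significant")
```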
Step 6: Lift Percentage
$$\text{lift \%} =
\frac{p_2 - p_1}{p_1}$$
$$\text{lift \%} = \frac{0.064 - 0.050}{0.050} = 28\%$$
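To keep relative lift distinct from the absolute change in percentage points, both can be computed side by side. A minimal sketch using the observed counts (unrounded rates give a relative lift of 27.9%, which rounds to the 28% above):

```python
p1 = 210 / 4200   # control conversion rate
p2 = 275 / 4300   # test conversion rate

absolute_lift = p2 - p1          # difference in conversion rates
relative_lift = (p2 - p1) / p1   # change relative to the control rate

print(f"absolute lift = {absolute_lift * 100:.1f} percentage points")
print(f"relative lift = {relative_lift:.1%}")
# prints: absolute lift = 1.4 percentage points
# prints: relative lift = 27.9%
```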
Final Inference
The new checkout design increased the conversion rate from 5.0% to 6.4%, a relative lift of 28% (1.4 percentage points in absolute terms).
The improvement is statistically significant at the 95% confidence level.
Recommendation: The test variant should be rolled out.