Module 05

The P-Value Trap

What P < 0.05 actually means (and more importantly, what it doesn't).

In Module 04, we calculated sample sizes to avoid False Negatives (Beta). Now we need to talk about False Positives (Alpha), and the number everyone obsesses over: The P-Value.

Most marketers think: "P = 0.05 means there is a 95% chance my new design is better."

This is wrong. This misunderstanding leads companies to launch "winning" tests that actually have zero impact on revenue. Let's fix the definition.

1. The Definition

The P-Value is not about the hypothesis. It is about the data.

Correct Definition
"The probability of observing a difference this large (or larger) purely by random chance, assuming there is actually no difference (Null Hypothesis is true)."

Imagine you flip a coin 10 times and get 10 heads.
Is the coin rigged? Maybe.
Or did you just get super lucky with a fair coin? The P-Value calculates the odds of that "super lucky" event.
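The coin example can be computed directly. Under the Null Hypothesis (a fair coin), the probability of a result at least this extreme is:

```python
# Null Hypothesis: the coin is fair, P(heads) = 0.5.
# P-value: probability of a result at least this extreme under the null.
p_one_sided = 0.5 ** 10       # 10 heads out of 10 flips
p_two_sided = 2 * 0.5 ** 10   # 10 heads OR 10 tails (either direction)

print(f"one-sided: {p_one_sided:.6f}")   # 0.000977
print(f"two-sided: {p_two_sided:.6f}")   # 0.001953
```

A p-value of roughly 0.1% tells you the "super lucky with a fair coin" explanation is implausible. It does not tell you the coin is rigged with 99.9% probability.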

2. Common Misinterpretations

The Fallacy

"P = 0.05 means there is a 95% probability the test version is the winner."

The Truth

"P = 0.05 means there is only a 5% probability of seeing a difference at least this large if the test version actually did nothing (the Null Hypothesis is true)."

The Fallacy

"P = 0.05 means the result is important."

The Truth

"P-Value measures Surprise, not Size. A tiny lift (0.01%) can be statistically significant if you have 10 million users."

3. Statistical vs. Practical Significance

This brings us to the most expensive mistake in experimentation.

With enough traffic (Sample Size), any difference becomes statistically significant. At Google's scale, changing the shade of blue in a link might show a P-Value of 0.001, while the lift is only 0.0001%.

The Rule: Never launch a feature just because P < 0.05. Launch it only if P < 0.05 AND the Lift > MDE (Minimum Detectable Effect).
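You can see sample size driving significance with a pooled two-proportion z-test. The numbers below are hypothetical: the same 0.1-percentage-point lift (10.0% → 10.1%), tested at two very different traffic levels:

```python
import math

def two_proportion_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value for a difference in conversion rates (pooled z-test)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    return math.erfc(abs(z) / math.sqrt(2))  # P(|Z| >= |z|) under the null

# Same lift, 5,000 users per arm: statistically invisible.
print(two_proportion_p_value(500, 5_000, 505, 5_000))

# Same lift, 5 million users per arm: overwhelmingly "significant".
print(two_proportion_p_value(500_000, 5_000_000, 505_000, 5_000_000))
```

The lift never changed; only the sample size did. That is why P < 0.05 alone cannot tell you whether a result is worth launching.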

4. The Danger of "Peeking"

Because P-values fluctuate wildly at the beginning of a test (Law of Small Numbers), checking your dashboard every day is dangerous. If you check 10 times and stop at the first significant reading, your chance of seeing a False Positive rises from 5% to roughly 20%, and it keeps climbing the more often you look.
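A quick simulation makes the inflation concrete. This is an illustrative sketch (parameters are arbitrary): an A/A test where both arms are identical, so every "significant" reading is a False Positive, checked at 10 interim looks:

```python
import math
import random

random.seed(7)

def peeking_false_positive_rate(n_sims=1000, n_per_look=200, n_looks=10):
    """Simulate an A/A test (the null is TRUE: both arms are identical)
    and stop the moment any interim z-test reads p < 0.05."""
    hits = 0
    for _ in range(n_sims):
        total_a = total_b = 0.0
        n = 0
        for _ in range(n_looks):
            # The sum of n_per_look iid N(0,1) draws is N(0, sqrt(n_per_look)).
            total_a += random.gauss(0, math.sqrt(n_per_look))
            total_b += random.gauss(0, math.sqrt(n_per_look))
            n += n_per_look
            # Two-sample z-test on the cumulative means (known variance = 1).
            z = (total_a / n - total_b / n) / math.sqrt(2 / n)
            if math.erfc(abs(z) / math.sqrt(2)) < 0.05:  # two-sided p-value
                hits += 1
                break
    return hits / n_sims

print(peeking_false_positive_rate())  # well above the nominal 0.05
```

There is no real difference in any of these simulated tests, yet stopping at the first significant look declares a "winner" far more often than the 5% the significance threshold promises.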

We will cover this extensively in Module 08: The Peeking Problem.
