Power Analysis & A/B Testing Execution Framework
Modern A/B testing is reliable only when experiments are designed with clear statistical guardrails. This framework covers the complete flow, from power analysis through to final experiment inference.
Part 1: Power Analysis
Power analysis is the first foundational step before running an A/B test. It defines the acceptable error levels and determines the minimum sample size required to make statistically valid decisions.
How Experiment Errors Are Defined
| Reality \ Decision | Detect Effect | No Effect Detected |
|---|---|---|
| Effect Exists | True Positive | False Negative (β) |
| No Effect Exists | False Positive (α) | True Negative |
- α (alpha): probability of a false positive (Type I error)
- β (beta): probability of a false negative (Type II error)
- Power = $1 - \beta$
Step 1: Define Error Thresholds
Before running the experiment, we decide how much uncertainty we are willing to tolerate:
- α controls the risk of concluding the test works when it does not (typically set to 0.05).
- β controls the risk of missing a real improvement (typically set to 0.20, giving 80% Power).
Step 2: Estimate Expected Conversion Rates
Let:
- $p_1$ = expected conversion rate of control
- $p_2$ = expected conversion rate of test
Average conversion rate:

$$\bar{p} = \frac{p_1 + p_2}{2}$$
Step 3: Calculate Minimum Sample Size
Using the predefined values of α, β, and the expected lift, the minimum sample size required per variant is:

$$n = \frac{\left(z_{1-\alpha/2} + z_{1-\beta}\right)^2 \cdot 2\,\bar{p}(1-\bar{p})}{(p_2 - p_1)^2}$$
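The sample-size calculation above can be sketched in Python using only the standard library; the function name and default arguments here are illustrative, not part of the source framework:

```python
from math import ceil
from statistics import NormalDist

def min_sample_size(p1, p2, alpha=0.05, power=0.80):
    """Minimum sample size per variant for a two-sided two-proportion
    z-test, using the average conversion rate for the variance term."""
    p_bar = (p1 + p2) / 2                           # average conversion rate
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # ≈ 1.96 for alpha = 0.05
    z_beta = NormalDist().inv_cdf(power)            # ≈ 0.84 for 80% power
    n = ((z_alpha + z_beta) ** 2 * 2 * p_bar * (1 - p_bar)) / (p2 - p1) ** 2
    return ceil(n)

# e.g. detecting a lift from 10% to 12% conversion:
# min_sample_size(0.10, 0.12) → 3843 users per variant
```

Note that a smaller expected lift ($p_2 - p_1$) appears squared in the denominator, so halving the detectable lift roughly quadruples the required sample size.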
Part 2: A/B Testing Execution (BITS)
Once power analysis is complete, the next step is to execute the A/B test correctly using a statistically valid comparison between control and test groups.
Step 1: Validate Sample Size
- $n_1$ = sample size of control
- $n_2$ = sample size of test
Both $n_1$ and $n_2$ must be greater than or equal to the required sample size $n$.
Step 2: Define the Outcome Metric
- $x_1$ = number of conversions in control
- $x_2$ = number of conversions in test
Step 3: Calculate Conversion Rates

$$p_1 = \frac{x_1}{n_1}, \qquad p_2 = \frac{x_2}{n_2}$$
Step 4: Define the Test Objective
The goal is to determine whether the observed difference ($p_2 - p_1$) is statistically significant or due to random variation.
Step 5: Calculate the Pooled Conversion Rate

$$p_{\text{pool}} = \frac{x_1 + x_2}{n_1 + n_2}$$
Step 6: Calculate the Standard Error

$$SE = \sqrt{p_{\text{pool}}\,(1 - p_{\text{pool}})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$
Step 7: Calculate the Z-Score

$$z = \frac{p_2 - p_1}{SE}$$
Step 8: Statistical Decision
- $|z| \ge 1.96$ → statistically significant at α = 0.05, two-tailed (reject the null hypothesis)
- $|z| < 1.96$ → difference is likely due to chance (fail to reject the null hypothesis)
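Steps 5 through 8 can be combined into one small helper; this is a sketch using the standard pooled two-proportion z-test, with an illustrative function name:

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z_test(x1, n1, x2, n2, alpha=0.05):
    """Pooled two-proportion z-test: control (x1/n1) vs test (x2/n2)."""
    p1, p2 = x1 / n1, x2 / n2                             # Step 3: conversion rates
    p_pool = (x1 + x2) / (n1 + n2)                        # Step 5: pooled rate
    se = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))  # Step 6: standard error
    z = (p2 - p1) / se                                    # Step 7: z-score
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)          # 1.96 at alpha = 0.05
    return z, abs(z) >= z_crit                            # Step 8: decision
```

For example, 500 conversions out of 10,000 in control versus 580 out of 10,000 in test gives z ≈ 2.50, which clears the 1.96 threshold.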
Step 9: Calculate Lift Percentage

$$\text{Lift \%} = \frac{p_2 - p_1}{p_1} \times 100$$
Step 10: Final Inference
Using statistical significance, direction of impact, and lift percentage, we determine whether the test variant should be adopted, rejected, or iterated further.
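The final inference can be sketched as a small decision function. The adopt/reject/iterate labels follow the text above; the exact decision rules (e.g. treating any non-significant result as "iterate") are an illustrative assumption:

```python
from math import sqrt
from statistics import NormalDist

def final_inference(x1, n1, x2, n2, alpha=0.05):
    """Combine significance, direction of impact, and lift (Step 10)."""
    p1, p2 = x1 / n1, x2 / n2
    p_pool = (x1 + x2) / (n1 + n2)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z = (p2 - p1) / se
    significant = abs(z) >= NormalDist().inv_cdf(1 - alpha / 2)
    lift_pct = (p2 - p1) / p1 * 100
    if not significant:
        return "iterate"                    # inconclusive: redesign or collect more data
    return "adopt" if lift_pct > 0 else "reject"
```

A significant positive lift recommends adopting the variant; a significant negative lift recommends rejecting it; anything non-significant calls for iteration.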