Multi-arm A/B Test for 3 Fast Food Promotions

Tools: Python, Tableau

Why This Project?

Retail data often comes in weekly panel format with uneven performance across locations. This project mirrors real-world business experimentation where you need to compare promotions while accounting for store-level differences and repeated measures. It gave me the opportunity to demonstrate proper handling of clustered data, time structure, and model robustness.

Methodology

Before estimating the effect of each promotion on weekly sales, I conducted a series of data checks to ensure the validity of the analysis.

Data

The dataset included four weeks of sales data per store, with each store randomly assigned to one of three promotional conditions. The unit of analysis was the store-week, resulting in repeated observations per store. To account for this, I used clustered standard errors at the store level in all regression models.

Randomization and Covariate Balance

To confirm that promotion assignment was successfully randomized, I checked covariate balance across the three promotion groups using standardized mean differences (SMDs).

All SMDs were below 0.2 except store age for Promo 3, whose stores were slightly older on average, suggesting only mild imbalance.
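A minimal sketch of how pairwise SMDs might be computed, assuming one row per store and a pooled-standard-deviation definition of the SMD. The column names `Promotion` and `AgeOfStore` and the toy values are illustrative assumptions, not the project's data.

```python
import numpy as np
import pandas as pd

def standardized_mean_difference(a, b):
    """SMD between two samples, scaled by the pooled standard deviation."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    pooled_sd = np.sqrt((a.var(ddof=1) + b.var(ddof=1)) / 2.0)
    return (a.mean() - b.mean()) / pooled_sd

def smd_table(df, covariate, group_col="Promotion"):
    """Pairwise SMDs of one covariate across all promotion groups."""
    groups = sorted(df[group_col].unique())
    rows = []
    for i, g1 in enumerate(groups):
        for g2 in groups[i + 1:]:
            smd = standardized_mean_difference(
                df.loc[df[group_col] == g1, covariate],
                df.loc[df[group_col] == g2, covariate],
            )
            rows.append({"group_a": g1, "group_b": g2, "smd": smd})
    return pd.DataFrame(rows)

# Hypothetical store-level table (one row per store, made-up values).
stores = pd.DataFrame({
    "Promotion": [1, 1, 1, 2, 2, 2, 3, 3, 3],
    "AgeOfStore": [4, 6, 8, 5, 7, 9, 8, 10, 12],
})

balance = smd_table(stores, "AgeOfStore")
print(balance)
```

Because store age does not vary by week, balance is best checked on a store-level table rather than on the store-week panel, which would count each store four times.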

Exploratory Analysis

I explored the relationship between potential covariates and weekly sales:

Outlier Detection and Removal

Using the interquartile range (IQR) method, I identified and removed extreme sales values within each promotion group. This reduced the influence of outliers and improved the symmetry of the outcome distribution.
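The within-group IQR rule can be sketched as follows. The 1.5 multiplier is the conventional choice and is an assumption here, as are the `Promotion` column name and the toy data.

```python
import pandas as pd

def remove_iqr_outliers(df, value_col, group_col, k=1.5):
    """Drop rows outside [Q1 - k*IQR, Q3 + k*IQR], computed per group."""
    grouped = df.groupby(group_col)[value_col]
    q1 = grouped.transform(lambda s: s.quantile(0.25))
    q3 = grouped.transform(lambda s: s.quantile(0.75))
    iqr = q3 - q1
    lower = q1 - k * iqr
    upper = q3 + k * iqr
    keep = (df[value_col] >= lower) & (df[value_col] <= upper)
    return df[keep]

# Made-up example: one extreme value in promotion group 1.
sales = pd.DataFrame({
    "Promotion": [1] * 6 + [2] * 6,
    "SalesInThousands": [40, 42, 44, 46, 48, 120, 30, 32, 34, 36, 38, 40],
})

cleaned = remove_iqr_outliers(sales, "SalesInThousands", "Promotion")
print(len(sales), "->", len(cleaned))
```

Computing the fences per promotion group, rather than on the pooled data, avoids flagging a group's values as extreme simply because that promotion performed better overall.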

Outcome Transformation

To account for right-skew in sales, I applied a log transformation using log1p(SalesInThousands). This stabilized variance and reduced the impact of high-end values on model results.
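As a small illustration of the transformation, `numpy.log1p(x)` computes log(1 + x) and `numpy.expm1` inverts it. The helper `pct_effect` and the coefficient value 0.10 are hypothetical, included only to show how a coefficient on the log scale is usually read.

```python
import numpy as np

# Made-up weekly sales values, in thousands.
sales = np.array([0.0, 20.0, 50.0, 100.0])

log_sales = np.log1p(sales)      # log(1 + x), defined at x = 0
recovered = np.expm1(log_sales)  # exact inverse: exp(y) - 1

def pct_effect(beta):
    """Approximate percent change in sales implied by a log-scale coefficient."""
    return 100.0 * np.expm1(beta)

print(log_sales.round(3))
print(round(pct_effect(0.10), 2))  # hypothetical coefficient, not a project result
```

With `log1p`, the percent-change reading is an approximation that works well when sales are large relative to 1 (here, thousands of dollars), since log(1 + x) and log(x) are then nearly equal.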

Modeling Approach

I ran a series of OLS regression models, progressively adjusting for store-level covariates (market size and store age).

Standard errors were clustered by store to account for within-store correlation. I compared models with and without outliers, and with both raw and log-transformed outcomes, to assess the robustness of results.
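A sketch of this modeling sequence with `statsmodels`, fitted on a synthetic store-week panel. Everything about the data is made up, the column names other than `SalesInThousands` (`LocationID`, `Promotion`, `MarketSize`, `AgeOfStore`, `week`) are assumptions, and the order in which covariates enter is illustrative rather than the project's exact specification.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)

# Synthetic panel: 30 stores x 4 weeks (all values are made up).
n_stores, n_weeks = 30, 4
stores = pd.DataFrame({
    "LocationID": np.arange(n_stores),
    "Promotion": np.tile([1, 2, 3], n_stores // 3),
    "MarketSize": np.repeat(["Small", "Medium", "Large"], n_stores // 3),
    "AgeOfStore": rng.integers(1, 25, size=n_stores),
    "store_effect": rng.normal(0, 5, size=n_stores),  # induces within-store correlation
})
panel = stores.loc[stores.index.repeat(n_weeks)].reset_index(drop=True)
panel["week"] = np.tile(np.arange(1, n_weeks + 1), n_stores)
panel["SalesInThousands"] = (
    50 + panel["store_effect"] + rng.normal(0, 3, size=len(panel))
)
panel["log_sales"] = np.log1p(panel["SalesInThousands"])

def fit_clustered(formula, data):
    """OLS with standard errors clustered by store."""
    return smf.ols(formula, data=data).fit(
        cov_type="cluster", cov_kwds={"groups": data["LocationID"]}
    )

# Progressively adjusted specifications (ordering is an assumption).
formulas = {
    "unadjusted": "log_sales ~ C(Promotion)",
    "plus_market": "log_sales ~ C(Promotion) + C(MarketSize)",
    "plus_age": "log_sales ~ C(Promotion) + C(MarketSize) + AgeOfStore",
}
results = {name: fit_clustered(f, panel) for name, f in formulas.items()}

for name, res in results.items():
    print(name, res.params.filter(like="Promotion").round(3).to_dict())
```

Clustering leaves the point estimates unchanged relative to plain OLS. Only the standard errors change, so that the four weekly observations from one store are not treated as four independent data points.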

Results

I tested the impact of three promotional strategies on weekly sales using a series of OLS regression models. Models included controls for market size and store age, and used clustered standard errors to account for repeated measures within each store. Results were evaluated with and without outliers, and using both raw and log-transformed versions of the outcome variable.

Primary Findings

Effect Sizes

In the final model using log-transformed sales without outliers:

Robustness Checks

Results were consistent across specifications: with and without outliers, and with raw and log-transformed sales.

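Residuals showed mild non-normality and positive autocorrelation (Durbin-Watson < 1), which is expected when the same store is observed in several weeks. Clustering standard errors by store makes inference robust to this within-store correlation. It does not correct non-normality, which mainly affects small-sample inference and matters less as the number of stores grows.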
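One caveat when reading a Durbin-Watson statistic on stacked panel data is that it depends on row order, because it compares each residual with the one in the preceding row. The sketch below shows this on made-up residuals, using hypothetical column names `LocationID`, `week`, and `resid`. The within-store variant is an illustrative alternative, not necessarily what the project computed.

```python
import numpy as np
import pandas as pd

def durbin_watson(resid):
    """Sum of squared successive differences over sum of squared residuals."""
    resid = np.asarray(resid, dtype=float)
    return np.sum(np.diff(resid) ** 2) / np.sum(resid ** 2)

def within_store_dw(df, resid_col="resid", store_col="LocationID", time_col="week"):
    """Variant that only differences consecutive weeks inside the same store."""
    df = df.sort_values([store_col, time_col])
    diffs = df.groupby(store_col)[resid_col].diff().dropna()
    return np.sum(diffs ** 2) / np.sum(df[resid_col] ** 2)

# Made-up residuals: store A is always above the fit, store B always below.
panel = pd.DataFrame({
    "LocationID": ["A"] * 4 + ["B"] * 4,
    "week": [1, 2, 3, 4] * 2,
    "resid": [1.0] * 4 + [-1.0] * 4,
})

dw_by_store = durbin_watson(panel.sort_values(["LocationID", "week"])["resid"])
dw_by_week = durbin_watson(panel.sort_values(["week", "LocationID"])["resid"])
dw_within = within_store_dw(panel)

print(dw_by_store, dw_by_week, dw_within)  # 0.5 3.5 0.0
```

In this toy case a persistent store-level difference alone pushes the statistic well below 2 when rows are ordered by store, which is the pattern that store-level clustering is meant to handle.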

Visualization

Plots of weekly sales over time confirmed that promotion effects were stable across the 4-week period. Histograms and boxplots showed a right-skewed distribution of sales, justifying the use of a log transformation. Outliers identified via the IQR method were primarily concentrated in smaller markets and higher sales values.