Simple Linear Regression: Modeling Revenue from Ad Spend

Categories: statistics, regression, marketing analytics, revenue optimization

Estimate the relationship between advertising spend and revenue using ordinary least squares regression, with slope, confidence intervals, R-squared, and diagnostic insights.

Author: Mohammed Ali Sharafuddin

Affiliation: FlairMI

Published: October 25, 2025

Keywords: linear regression, OLS, ad spend, revenue, R-squared, slope interpretation

TL;DR: Revenue ≈ ₹50k + ₹0.30 per ₹1 of ad spend (n = 200); slope 95% CI [₹0.24, ₹0.36], R² = 0.42 (moderate fit), residuals look normal; decision: each additional ₹1k of ad spend is associated with ~₹300 more revenue, but check for nonlinearity at higher budgets.

Answer
Method: Simple linear regression (OLS).
Estimate: ₹0.30 of revenue per ₹1 of ad spend, 95% CI [₹0.24, ₹0.36].
Data: Marketing campaign, variables `spend`, `revenue`, n = 200.
Action: Each additional ₹1k of ad spend is associated with ~₹300 more revenue; monitor for nonlinearity.

Case

You manage digital advertising for an e-commerce platform and want to quantify the relationship between weekly ad spend and revenue. Specifically: For every additional rupee spent on ads, how much revenue can you expect? Is the relationship strong enough to justify continued investment? What portion of revenue variation is explained by ad spend alone?

Dataset

Synthetic sample from weekly marketing data (Schema A).

Variable	Description	Unit / value
`spend`	Weekly ad spend	₹ (rupees)
`revenue`	Weekly revenue	₹ (rupees)
`n`	Number of weeks	200
Spend range	Observed range of `spend`	₹0 to ₹100,000

Method

We use ordinary least squares (OLS) regression to model revenue as a linear function of ad spend. The regression equation is:

\[ \text{Revenue}_i = \beta_0 + \beta_1 \times \text{Spend}_i + \epsilon_i, \]

where \(\beta_0\) is the intercept (baseline revenue when spend = 0), \(\beta_1\) is the slope (marginal revenue per rupee of spend), and \(\epsilon_i\) is the error term, assumed to be independent and normally distributed with mean zero and constant variance.

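The slope estimate and its standard error are: \[ \hat{\beta}_1 = \frac{\sum(x_i - \bar{x})(y_i - \bar{y})}{\sum(x_i - \bar{x})^2}, \quad SE(\hat{\beta}_1) = \frac{s}{\sqrt{\sum(x_i - \bar{x})^2}}, \] where \(s = \sqrt{\sum(y_i - \hat{y}_i)^2 / (n - 2)}\) is the residual standard error. The intercept estimate follows as \(\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}\).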
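
The formulas above can be traced by hand on a tiny dataset. The following is a minimal sketch in plain Python; the five data points and the function name `slope_and_se` are hypothetical and are not the article's data or code (the original analysis cites R).

```python
import math

def slope_and_se(x, y):
    """Return (intercept, slope, residual_se, slope_se) for simple OLS."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)                       # sum of squares of x
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))  # cross-products
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(ss_res / (n - 2))        # residual standard error
    return intercept, slope, s, s / math.sqrt(sxx)

# Hypothetical toy data in thousands of rupees (not the article's dataset).
spend = [10, 20, 30, 40, 50]
revenue = [52, 57, 58, 63, 65]

b0, b1, s, se_b1 = slope_and_se(spend, revenue)
print(f"intercept={b0:.2f}, slope={b1:.3f}, s={s:.3f}, SE(slope)={se_b1:.4f}")
```

For this toy input the sums are small enough to verify on paper: \(\sum(x_i - \bar{x})^2 = 1000\) and \(\sum(x_i - \bar{x})(y_i - \bar{y}) = 320\), so the slope is 0.32.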

The coefficient of determination (R-squared): \[ R^2 = 1 - \frac{SS_{\text{res}}}{SS_{\text{tot}}} = 1 - \frac{\sum(y_i - \hat{y}_i)^2}{\sum(y_i - \bar{y})^2}. \]
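
As a cross-check, in a simple regression with an intercept, R-squared equals the squared Pearson correlation between the predictor and the outcome. Applied to the R² of 0.419 reported in the results, and taking the positive root because the estimated slope is positive, this gives:

\[ R^2 = r_{xy}^2 = \frac{\left[\sum(x_i - \bar{x})(y_i - \bar{y})\right]^2}{\sum(x_i - \bar{x})^2 \, \sum(y_i - \bar{y})^2}, \qquad r_{xy} = \sqrt{0.419} \approx 0.647. \]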

We report 95% confidence intervals for both the intercept and the slope, using critical values from the t-distribution with \(n - 2 = 198\) degrees of freedom.
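
Written out, the standard interval for either coefficient has the form below. The intercept's standard error is the usual textbook expression; it is not stated elsewhere in this post.

\[ \hat{\beta}_j \pm t_{0.975,\,n-2} \times SE(\hat{\beta}_j), \quad j \in \{0, 1\}, \qquad SE(\hat{\beta}_0) = s \sqrt{\frac{1}{n} + \frac{\bar{x}^2}{\sum(x_i - \bar{x})^2}}, \]

with \(t_{0.975,\,198} \approx 1.972\) for \(n = 200\).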

Calculation
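
The sketch below shows one way the calculation could be run end to end in plain Python. It does not reproduce the article's dataset: the data-generating values (intercept 50,000, slope 0.30, noise SD 10,000, uniform spend, seed 42) are assumptions chosen only to resemble the reported fit, and the t critical value is hard-coded for 198 degrees of freedom.

```python
import math
import random

random.seed(42)  # fixed seed so the sketch is reproducible

# Assumed data-generating values (hypothetical, not taken from the article).
N_WEEKS = 200
TRUE_INTERCEPT = 50_000.0
TRUE_SLOPE = 0.30
NOISE_SD = 10_000.0
T_CRIT = 1.972  # approx. 97.5th percentile of the t-distribution, 198 df

spend = [random.uniform(0, 100_000) for _ in range(N_WEEKS)]
revenue = [TRUE_INTERCEPT + TRUE_SLOPE * x + random.gauss(0, NOISE_SD) for x in spend]

# OLS estimates from the closed-form formulas.
n = len(spend)
x_bar = sum(spend) / n
y_bar = sum(revenue) / n
sxx = sum((x - x_bar) ** 2 for x in spend)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(spend, revenue))
slope = sxy / sxx
intercept = y_bar - slope * x_bar

# Fit statistics.
ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(spend, revenue))
ss_tot = sum((y - y_bar) ** 2 for y in revenue)
r_squared = 1 - ss_res / ss_tot
s = math.sqrt(ss_res / (n - 2))
se_slope = s / math.sqrt(sxx)
se_intercept = s * math.sqrt(1 / n + x_bar ** 2 / sxx)
f_stat = (ss_tot - ss_res) / (ss_res / (n - 2))

def conf_int(estimate, se):
    """95% confidence interval using the hard-coded t critical value."""
    return estimate - T_CRIT * se, estimate + T_CRIT * se

ci_slope = conf_int(slope, se_slope)
ci_intercept = conf_int(intercept, se_intercept)

print(f"intercept = {intercept:,.0f}  95% CI [{ci_intercept[0]:,.0f}, {ci_intercept[1]:,.0f}]")
print(f"slope     = {slope:.3f}  95% CI [{ci_slope[0]:.3f}, {ci_slope[1]:.3f}]")
print(f"R-squared = {r_squared:.3f}, F(1, {n - 2}) = {f_stat:.1f}")
```

Because the inputs are simulated, the printed numbers will be close to, but not identical with, the results reported in this post.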

Visualization

Results and Interpretation

The regression analysis reveals a statistically significant positive relationship between ad spend and revenue. The estimated slope is ₹0.300 (95% CI [₹0.240, ₹0.360]): each additional rupee spent on advertising is associated with approximately ₹0.30 more revenue on average (R Core Team 2024).

The intercept of ₹49,991 represents the estimated baseline revenue when ad spend is zero. While statistically significant (p < 0.001), it should be interpreted cautiously: zero sits at the very edge of the observed spend range (₹0 - ₹100,000), and the model may not hold outside that range.

Model fit: The R² value of 0.419 (42%) indicates that ad spend explains about 42% of the variation in weekly revenue. The remaining 58% is due to other factors not included in this simple model (e.g., seasonality, competitor actions, organic traffic, product mix).

Statistical significance: The F-statistic (F(1, 198) = 142.7, p < 0.001) confirms the overall model is highly significant. The slope’s p-value < 0.001 indicates strong evidence that the relationship is not due to chance.
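
With a single predictor, the F-statistic can be recovered from R² and the sample size, which gives a quick consistency check on the reported values:

\[ F = \frac{R^2 / 1}{(1 - R^2)/(n - 2)} = \frac{0.419}{0.581} \times 198 \approx 142.8, \]

which agrees with the reported F(1, 198) = 142.7 up to rounding of R².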

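Practical significance: A slope of ₹0.30 means each additional rupee of ad spend is associated with about ₹0.30 of additional revenue, i.e., incremental revenue equal to 30% of the spend (revenue, not profit). This is a gross relationship: you must subtract the ad cost and other variable costs to determine actual ROI, and same-week incremental revenue below ₹1 per rupee spent does not by itself cover the cost of the ad. The 95% CI [₹0.24, ₹0.36] suggests the true marginal revenue per rupee could plausibly range from ₹0.24 to ₹0.36.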
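
To make the revenue-versus-profit distinction concrete, here is a small sketch of the net arithmetic. The 40% contribution margin and both function names are hypothetical; the slope values are the point estimate and CI bounds reported above. It deliberately ignores lagged and lifetime effects, which this simple model does not capture.

```python
def net_return_per_rupee(revenue_per_rupee, contribution_margin):
    """Incremental profit per extra rupee of ad spend.

    revenue_per_rupee: regression slope (incremental revenue per rupee of spend).
    contribution_margin: share of revenue left after variable costs (0 to 1).
    The ad rupee itself is subtracted as a cost.
    """
    return revenue_per_rupee * contribution_margin - 1.0

def break_even_slope(contribution_margin):
    """Slope needed for an extra ad rupee to pay for itself."""
    return 1.0 / contribution_margin

MARGIN = 0.40  # hypothetical contribution margin, not from the article

for label, slope in [("CI lower", 0.24), ("estimate", 0.30), ("CI upper", 0.36)]:
    net = net_return_per_rupee(slope, MARGIN)
    print(f"{label}: slope = {slope:.2f}, net per rupee = {net:+.3f}")

print(f"break-even slope at {MARGIN:.0%} margin: {break_even_slope(MARGIN):.2f}")
```

Under these assumptions the net figure is negative across the whole confidence interval, which is why the profit calculation matters before acting on the slope alone.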

Decision framework. The positive and significant relationship supports continued ad investment, but association does not imply causation. Before making budget decisions: (1) Check residual diagnostics for violations of regression assumptions, (2) Consider confounding variables (e.g., seasonality, external events), (3) Calculate net profit impact after subtracting ad costs and other variable costs, (4) Test whether the relationship holds at different spend levels (non-linearity), (5) Consider building a multiple regression model with additional predictors for better prediction accuracy.

Residual Diagnostics

Diagnostic interpretation: The residuals vs. fitted plot should show random scatter around zero (no pattern). The Q-Q plot should show points following the diagonal line. Deviations suggest violations of assumptions (heteroscedasticity, non-normality, or non-linearity).
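
A numeric companion to the residuals vs. fitted plot is to compare the residual spread in the lower and upper halves of the fitted values. The sketch below is a crude check, not a formal test. The helper names and both simulated datasets are hypothetical: one has constant noise, the other has noise that grows with spend.

```python
import random
import statistics

def fit_line(x, y):
    """Return (intercept, slope) from closed-form simple OLS."""
    x_bar, y_bar = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    return y_bar - slope * x_bar, slope

def spread_ratio(x, y):
    """Residual SD in the upper half of fitted values divided by the lower half.

    Values near 1 are consistent with constant variance; values far from 1
    hint at heteroscedasticity.
    """
    b0, b1 = fit_line(x, y)
    pairs = sorted((b0 + b1 * xi, yi - (b0 + b1 * xi)) for xi, yi in zip(x, y))
    half = len(pairs) // 2
    low = [resid for _, resid in pairs[:half]]
    high = [resid for _, resid in pairs[half:]]
    return statistics.stdev(high) / statistics.stdev(low)

random.seed(0)
spend = [random.uniform(0, 100_000) for _ in range(200)]
# Hypothetical scenario 1: constant noise (homoscedastic).
constant_noise = [50_000 + 0.3 * x + random.gauss(0, 10_000) for x in spend]
# Hypothetical scenario 2: noise SD proportional to spend (heteroscedastic).
growing_noise = [50_000 + 0.3 * x + random.gauss(0, 0.2 * x) for x in spend]

ratio_constant = spread_ratio(spend, constant_noise)
ratio_growing = spread_ratio(spend, growing_noise)
print(f"constant noise: spread ratio = {ratio_constant:.2f}")
print(f"growing noise:  spread ratio = {ratio_growing:.2f}")
```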

Assumptions

Simple linear regression assumes:

Linearity: The relationship between spend and revenue is linear (check with scatterplot and residual plot)
Independence: Weekly observations are independent (no autocorrelation across time)
Homoscedasticity: Constant variance of residuals across all levels of spend (check residuals vs. fitted plot)
Normality of residuals: Residuals are approximately normally distributed (check Q-Q plot; less critical with n = 200)
No measurement error in predictor: Ad spend is measured accurately
No omitted variable bias: Other important predictors are not correlated with spend
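
The independence assumption matters most for weekly data. One common numeric check is the Durbin-Watson statistic, computed on residuals kept in week order: values near 2 suggest no first-order autocorrelation, and values well below 2 suggest positive autocorrelation. The sketch below uses two simulated residual series; the series and the AR(1) coefficient of 0.7 are hypothetical.

```python
import random

def durbin_watson(residuals):
    """Durbin-Watson statistic for residuals ordered in time."""
    if len(residuals) < 2:
        raise ValueError("need at least two residuals")
    num = sum((residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den

random.seed(1)

# Hypothetical independent residuals.
independent = [random.gauss(0, 1) for _ in range(200)]

# Hypothetical AR(1) residuals: each week carries over 70% of the previous week.
autocorrelated = [random.gauss(0, 1)]
for _ in range(199):
    autocorrelated.append(0.7 * autocorrelated[-1] + random.gauss(0, 1))

dw_independent = durbin_watson(independent)
dw_autocorrelated = durbin_watson(autocorrelated)
print(f"independent:    DW = {dw_independent:.2f}")
print(f"autocorrelated: DW = {dw_autocorrelated:.2f}")
```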

Limitations

This analysis does not account for:

Causality: Correlation does not prove ad spend causes revenue. Reverse causality is possible (higher revenue allows more ad spend)
Confounding variables: Seasonality, promotions, competitor actions, product launches, organic trends
Non-linear relationships: Diminishing returns at high spend levels, or threshold effects at low spend
Time series structure: Temporal autocorrelation, lagged effects (ads may influence future weeks)
Interaction effects: Ad effectiveness may vary by channel, audience segment, or time period
Costs: This models revenue, not profit. Must subtract ad costs and variable costs to assess true ROI

Recommendations for improvement:

- Add time trends and seasonal dummies to control for temporal patterns
- Use multiple regression with additional predictors (organic traffic, email campaigns, competitor spend)
- Test for non-linearity with polynomial terms or segmented regression
- Consider time series models (ARIMA, VAR) if temporal dependencies exist
- Conduct controlled experiments (randomized ad spend levels) to establish causality
- Calculate contribution margin and customer lifetime value for profit-focused analysis
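
As a rough first look at non-linearity, one can fit separate lines below and above the median spend and compare the slopes. This is a crude stand-in for segmented regression, not a formal test. The function names and both simulated datasets (one linear, one with a square-root response) are hypothetical.

```python
import math
import random

def ols_slope(x, y):
    """Slope of a simple OLS fit."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx

def split_slopes(x, y):
    """Fit separate lines below and above the median of x; return (low, high) slopes."""
    pairs = sorted(zip(x, y))
    half = len(pairs) // 2
    low, high = pairs[:half], pairs[half:]
    return (
        ols_slope([p[0] for p in low], [p[1] for p in low]),
        ols_slope([p[0] for p in high], [p[1] for p in high]),
    )

random.seed(7)
spend = [random.uniform(0, 100_000) for _ in range(200)]
# Hypothetical linear response.
linear = [50_000 + 0.3 * x + random.gauss(0, 5_000) for x in spend]
# Hypothetical diminishing-returns response (square root of spend).
concave = [50_000 + 200 * math.sqrt(x) + random.gauss(0, 5_000) for x in spend]

lin_low, lin_high = split_slopes(spend, linear)
con_low, con_high = split_slopes(spend, concave)
print(f"linear data:  low-spend slope = {lin_low:.2f}, high-spend slope = {lin_high:.2f}")
print(f"concave data: low-spend slope = {con_low:.2f}, high-spend slope = {con_high:.2f}")
```

A clearly smaller slope in the high-spend half is the pattern expected under diminishing returns. A polynomial term or a proper segmented model would be the next step if this gap shows up in real data.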

Use the format below to cite this page

Sharafuddin, M. A. (2025, October 25). Simple linear regression: Modeling revenue from ad spend. Flair Marketing Intelligence (FlairMI). https://flairmi.com/blog/posts/05-simple-regression.html

@online{sharafuddin2025-regression,
  author = {Sharafuddin, Mohammed Ali},
  title  = {Simple Linear Regression: Modeling Revenue from Ad Spend},
  year   = {2025},
  date   = {2025-10-25},
  url    = {https://flairmi.com/blog/posts/05-simple-regression.html},
  langid = {en}
}


References

R Core Team. 2024. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. https://www.r-project.org/.
