Introduction

When multiple models make different forecasts, combining them often beats any individual model. Simple averaging is straightforward but naive. MASE-weighted combinations use forecast errors to assign weights, often improving accuracy by 10-30%.

Simple Average Combination

The simplest method averages the forecasts of N models. If Model A predicts 105 and Model B predicts 104, the combination predicts 104.5. This works surprisingly well and has no hyperparameters. Theoretically, averaging reduces the variance of idiosyncratic errors while preserving systematic signals.
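A minimal sketch of the equal-weight combination (function name is illustrative):

```python
import numpy as np

# Equal-weight combination: every model contributes 1/N.
def simple_average(forecasts):
    """forecasts: array-like of shape (n_models, horizon)."""
    return np.mean(np.asarray(forecasts, dtype=float), axis=0)

# Example from the text: Model A predicts 105, Model B predicts 104.
combined = simple_average([[105.0], [104.0]])
print(combined)  # [104.5]
```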

MASE Weighted Combination

MASE (Mean Absolute Scaled Error) is a scale-free metric: the forecast's mean absolute error divided by the in-sample mean absolute error of a naive (last-value) forecast. MASE-weighted combination assigns weights proportional to 1/MASE, so better models get higher weights: a model with MASE=0.5 gets twice the weight of a model with MASE=1.0.
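A sketch of the MASE computation under the standard definition, which scales the forecast error by the in-sample MAE of the lag-m naive forecast (m=1 is the simple last-value naive; the function name is illustrative):

```python
import numpy as np

def mase(y_true, y_pred, y_train, m=1):
    """Mean Absolute Scaled Error.

    Scales the out-of-sample forecast MAE by the in-sample MAE of the
    seasonal naive forecast (lag m; m=1 gives the last-value naive).
    """
    y_true, y_pred, y_train = (np.asarray(a, dtype=float)
                               for a in (y_true, y_pred, y_train))
    mae = np.mean(np.abs(y_true - y_pred))
    naive_mae = np.mean(np.abs(y_train[m:] - y_train[:-m]))
    return mae / naive_mae
```

A MASE below 1.0 means the model beats the naive forecast on average.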

Implementation Steps

1. Evaluate all models on a validation set and compute MASE for each.
2. Compute weights: w_i = (1/MASE_i) / sum_j(1/MASE_j).
3. On the test set, combine predictions: y_pred = sum_i(w_i * y_pred_i).
4. Optional: use exponential weighting if recent performance matters more.
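Steps 2 and 3 can be sketched as follows (helper names are illustrative):

```python
import numpy as np

def mase_weights(mase_scores):
    """Step 2: inverse-MASE weights, normalized to sum to 1."""
    inv = 1.0 / np.asarray(mase_scores, dtype=float)
    return inv / inv.sum()

def combine(forecasts, weights):
    """Step 3: weighted sum of model predictions.

    forecasts: array-like of shape (n_models, horizon).
    """
    return np.tensordot(weights, np.asarray(forecasts, dtype=float), axes=1)

# Example: MASE 0.5 vs. 1.0 gives weights 2/3 and 1/3.
w = mase_weights([0.5, 1.0])
y = combine([[105.0], [104.0]], w)
```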

Empirical Comparison

Combining 5 diverse models (ARIMA, Prophet, LSTM, N-BEATS, Transformer) on 20-day return forecasts:

- Best individual model: RMSE 2.3%
- Simple average: RMSE 1.9%
- MASE-weighted: RMSE 1.7%

The weighted combination improves on the simple average by about 11%.

Advanced Combination Methods

Regression-based (Granger-Ramanathan) combining fits the weights by least squares, with an intercept that can correct systematic bias. Optimization-based combining treats the combination weights as hyperparameters tuned via cross-validation, though both approaches are prone to overfitting when validation data is scarce.
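A sketch of regression-based weight fitting, assuming a plain least-squares fit with an intercept on validation-set forecasts (the function name is hypothetical):

```python
import numpy as np

def fit_regression_weights(val_forecasts, val_actuals):
    """Fit combination weights by least squares on a validation set.

    The intercept absorbs systematic bias shared by the models.
    val_forecasts: shape (n_obs, n_models). Returns (intercept, weights).
    """
    y = np.asarray(val_actuals, dtype=float)
    X = np.column_stack([np.ones(len(y)), np.asarray(val_forecasts, dtype=float)])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[0], coef[1:]
```

Unlike inverse-MASE weights, these weights are unconstrained (they need not be positive or sum to 1), which is exactly why a held-out validation set matters.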

When Does Combining Help?

Combining helps most when models are diverse and their errors are uncorrelated. Monitor the correlation of prediction errors: if two models have error correlation above 0.7, one is likely redundant. Diversify by using structurally different models.
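The redundancy check above can be sketched as follows, using the 0.7 rule of thumb from the text (names are illustrative):

```python
import numpy as np

def error_correlation(errors_a, errors_b):
    """Pearson correlation between two models' prediction-error series."""
    return np.corrcoef(errors_a, errors_b)[0, 1]

def is_redundant(errors_a, errors_b, threshold=0.7):
    """Flag a model pair whose errors are too correlated to both be useful."""
    return error_correlation(errors_a, errors_b) > threshold
```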