Multi-Horizon Forecasting: Direct vs Recursive Strategies Explained
Introduction
Forecasting prices h steps ahead requires choosing between two fundamentally different approaches. Recursive (multi-step) forecasting trains a one-step model, then applies it repeatedly: predict day 1, use that prediction to predict day 2, then day 3. Direct forecasting trains separate models for each horizon: one model predicts day 1, a different model predicts day 2, a third predicts day 3. These approaches have vastly different properties, trade-offs, and appropriate use cases. Understanding when to use each is critical for multi-step forecasting applications.
Recursive Forecasting: The Intuitive Approach
Recursive forecasting is conceptually simple: train a model to predict one-step ahead. Then, to predict multiple steps ahead, apply the model iteratively. Predict day 1 using actual data, use day 1 prediction as input for day 2 prediction, and so on.
Advantage: parameter efficiency. The recursive approach requires training only one model, regardless of horizon. A direct approach for 5-step-ahead forecasting requires training 5 separate models.
Disadvantage: error accumulation. At step 1, the model makes a small error. At step 2, it uses its imperfect step-1 prediction as input, so the error propagates. By step 5, errors from every intermediate step have accumulated. For financial data with a low signal-to-noise ratio, accumulated error overwhelms the signal at distant horizons.
Mathematical formulation: a single one-step model y_hat(t+1) = f(y(t), y(t-1), ..., y(t-p)) applied recursively. The next step is y_hat(t+2) = f(y_hat(t+1), y(t), ..., y(t-p+1)), so each later prediction is conditioned on earlier predictions, and errors from misspecified y_hat values propagate forward.
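The recursive loop above can be sketched as follows. This is a minimal illustration, not a production forecaster: the linear one-step model, the lag count, and the synthetic random-walk series are all stand-ins, and the function name recursive_forecast is hypothetical.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

def recursive_forecast(model, history, horizon, n_lags):
    """Apply a one-step model iteratively, feeding predictions back as inputs."""
    window = list(history[-n_lags:])            # most recent actual values
    preds = []
    for _ in range(horizon):
        x = np.array(window[-n_lags:]).reshape(1, -1)
        y_hat = model.predict(x)[0]             # one-step-ahead prediction
        preds.append(y_hat)
        window.append(y_hat)                    # prediction becomes the next input
    return preds

# Train a one-step model on a toy series, then roll it forward 5 steps.
rng = np.random.default_rng(0)
series = np.cumsum(rng.normal(size=200))        # synthetic random-walk "prices"
n_lags = 3
X = np.array([series[i:i + n_lags] for i in range(len(series) - n_lags)])
y = series[n_lags:]
model = LinearRegression().fit(X, y)
print(recursive_forecast(model, series, horizon=5, n_lags=n_lags))
```

Note that every prediction after the first is built partly on earlier predictions, which is exactly where the compounding error enters.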
Direct Forecasting: Separate Models per Horizon
Direct forecasting trains h separate models: y_hat(t+1) from model1, y_hat(t+2) from model2, etc. Each model takes actual current/past data as inputs, avoiding error accumulation from earlier predictions.
Advantage: no error accumulation. Each model sees actual data, not predictions. The 5-step ahead model doesn't use step-1 or step-2 predictions; it uses actual values from 5 steps earlier.
Disadvantage: parameter explosion. For 20-step forecasting, you need 20 models. Training cost scales linearly with horizon. For very distant horizons (predicting 1 year ahead with daily data means 252 models), this becomes computationally intensive.
Mathematical formulation: separate models for each horizon. y_hat(t+h) = f_h(y(t), y(t-1), ..., y(t-p)) where f_h is specific to horizon h. No recursive dependency between models.
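The per-horizon formulation above can be sketched like this. The helper name fit_direct_models, the lag count, and the synthetic series are hypothetical illustrations; each f_h is just a linear model here.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

def fit_direct_models(series, horizons, n_lags):
    """Train one model per horizon; each maps actual lagged values to y(t+h)."""
    models = {}
    for h in horizons:
        X, y = [], []
        for t in range(n_lags, len(series) - h + 1):
            X.append(series[t - n_lags:t])      # actual past values only
            y.append(series[t + h - 1])         # target h steps ahead
        models[h] = LinearRegression().fit(np.array(X), np.array(y))
    return models

rng = np.random.default_rng(1)
series = np.cumsum(rng.normal(size=300))        # synthetic random-walk "prices"
models = fit_direct_models(series, horizons=[1, 5, 20], n_lags=3)

# Every horizon's forecast uses the same actual inputs -- no chained predictions.
latest = series[-3:].reshape(1, -1)
forecasts = {h: m.predict(latest)[0] for h, m in models.items()}
print(forecasts)
```

Each model is trained and queried independently, which is what removes the recursive dependency at the cost of h training runs.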
Empirical Comparison on Financial Data
Testing on S&P 500 daily returns (predicting 1, 5, and 20 days ahead using LSTM models):
- Recursive LSTM (1-step trained, applied recursively): 1-day MAE 0.015, 5-day MAE 0.032, 20-day MAE 0.058
- Direct LSTM (separate models per horizon): 1-day MAE 0.015, 5-day MAE 0.018, 20-day MAE 0.026
Direct forecasting dramatically outperforms at distant horizons. The recursive model's 5-day and 20-day errors are roughly 1.8x and 2.2x larger than the direct models' errors, while the two approaches tie at 1 day. This is error accumulation in action: recursive errors compound across steps, while direct models avoid that mechanism entirely.
Hybrid Approaches: Best of Both Worlds?
Combination approaches attempt to capture the advantages of both methods. One variant, sometimes called rectified forecasting, uses a recursive model for the initial steps and direct models for distant horizons. This reduces the number of models to train while avoiding the worst of the error accumulation.
Another hybrid: train a single model that explicitly predicts multiple horizons simultaneously. The model outputs are [y_hat(t+1), y_hat(t+2), y_hat(t+5), y_hat(t+20)]. The model learns to balance horizon-specific accuracy rather than optimizing just 1-step-ahead.
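A minimal sketch of the multi-horizon-output idea follows, with one caveat: with an ordinary linear model, scikit-learn fits each output column independently, so genuine parameter sharing across horizons would require a shared-representation model such as a neural network. The data and horizon choices here are hypothetical.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Build one design matrix whose targets cover several horizons at once.
rng = np.random.default_rng(2)
series = np.cumsum(rng.normal(size=400))        # synthetic random-walk "prices"
n_lags, horizons = 5, [1, 2, 5, 20]
max_h = max(horizons)

X, Y = [], []
for t in range(n_lags, len(series) - max_h + 1):
    X.append(series[t - n_lags:t])
    Y.append([series[t + h - 1] for h in horizons])   # joint targets

# One fitted estimator, four output columns: [y(t+1), y(t+2), y(t+5), y(t+20)].
model = LinearRegression().fit(np.array(X), np.array(Y))
joint_forecast = model.predict(series[-n_lags:].reshape(1, -1))[0]
print(dict(zip(horizons, joint_forecast)))
```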
Practical consideration: these hybrids add complexity without clear superiority over pure approaches. For most applications, pure direct or pure recursive (depending on horizon) works better than mixing.
Choosing Between Approaches
Decision factors:
- Forecast Horizon: Short horizons (1-5 steps) favor recursive. Long horizons (10+ steps) favor direct to avoid error accumulation.
- Data Quality: Clean, low-noise data handles recursive better. Noisy financial data favors direct.
- Computational Budget: Limited resources suggest recursive (one model). Abundant resources permit direct (many models).
- Model Type: Simple models (linear regression) are fast to train; direct approach feasible. Complex models (neural networks) are slow; recursive approach preferable to reduce training time.
Implementation Considerations
Recursive implementation: straightforward but requires careful input construction. At prediction time, feed previous predictions back into the model, and ensure they are in the same scale and format as the training inputs (for example, apply the same normalization before feeding a prediction back in).
Direct implementation: train a separate model for each horizon. Use scikit-learn's MultiOutputRegressor or similar to streamline the implementation, or loop and train h separate models explicitly. The direct approach is more cumbersome to set up but cleaner from an error-propagation perspective.
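One way to streamline the direct approach, as a sketch: MultiOutputRegressor clones its base estimator once per target column, which amounts to one independent model per horizon. The base estimator, lag count, and synthetic data below are illustrative choices.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.multioutput import MultiOutputRegressor

rng = np.random.default_rng(3)
series = np.cumsum(rng.normal(size=300))        # synthetic random-walk "prices"
n_lags, horizons = 3, [1, 5, 20]
max_h = max(horizons)

X, Y = [], []
for t in range(n_lags, len(series) - max_h + 1):
    X.append(series[t - n_lags:t])
    Y.append([series[t + h - 1] for h in horizons])

# MultiOutputRegressor fits one clone of Ridge per target column,
# i.e. one independent model per horizon -- the direct strategy.
direct = MultiOutputRegressor(Ridge()).fit(np.array(X), np.array(Y))
print(direct.predict(series[-n_lags:].reshape(1, -1)))
```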
Validation and Testing
Critical: evaluate your chosen approach on held-out data the models never saw during training. For recursive models, track how forecast error grows across horizons. For direct models, track error at each horizon separately. Compare the approaches on the same test set to verify that the chosen method actually performs better; theoretical superiority doesn't always translate to practice.
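Per-horizon evaluation on a chronological hold-out can be sketched as follows. The split point, lag count, and synthetic series are hypothetical; the point is that error is measured separately at each horizon on data after the training cutoff.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(4)
series = np.cumsum(rng.normal(size=500))        # synthetic random-walk "prices"
n_lags, horizons = 3, [1, 5, 20]
split = 400                                     # chronological train/test cutoff

def make_xy(series, h, n_lags, start, stop):
    """Lagged inputs and h-step-ahead targets for indices in [start, stop)."""
    X = [series[t - n_lags:t] for t in range(start, stop - h + 1)]
    y = [series[t + h - 1] for t in range(start, stop - h + 1)]
    return np.array(X), np.array(y)

maes = {}
for h in horizons:
    X_tr, y_tr = make_xy(series, h, n_lags, n_lags, split)
    X_te, y_te = make_xy(series, h, n_lags, split, len(series))
    model = LinearRegression().fit(X_tr, y_tr)
    maes[h] = mean_absolute_error(y_te, model.predict(X_te))
    print(f"horizon {h}: test MAE {maes[h]:.3f}")
```

Keeping the split chronological matters: shuffling time-series rows into a random train/test split would leak future information into training.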
Conclusion
Recursive forecasting (one model applied repeatedly) is intuitive and parameter-efficient but suffers from error accumulation at distant horizons. Direct forecasting (separate models per horizon) avoids error accumulation but requires more parameters and training. For financial forecasting (noisy data, multiple-step-ahead predictions), direct forecasting typically outperforms recursive, especially beyond 5-step horizons. Choose based on your specific horizon, data quality, and computational resources. Empirical validation on your own data is essential—theory and practice sometimes diverge.