Probabilistic Forecasting with Quantile Regression Forests
Introduction
Traditional point forecasts ("tomorrow's return will be 0.5%") ignore uncertainty. Probabilistic forecasts ("tomorrow's return will be 0.5% ± 1.2%") quantify uncertainty and are far more useful for risk management. Quantile Regression Forests (QRF) estimate full predictive distributions—not just mean, but confidence intervals, tail risks, and skewness. For traders concerned with Value-at-Risk (VaR), expected shortfall, and dynamic position sizing based on estimated volatility, QRF provides richer information than traditional point forecasts.
Quantile Regression Fundamentals
Standard regression predicts conditional means: E[Y | X]. Quantile regression predicts conditional quantiles: Q(p | X) where p is the quantile level (0.1 = 10th percentile, 0.5 = median, 0.9 = 90th percentile). For financial forecasting, estimating the 5th percentile (VaR) and 95th percentile (upside risk) reveals tail risks that means don't capture.
Quantile regression coefficients vary by quantile. The 10th percentile (worst case) might depend more on volatility and market stress indicators than the 50th percentile (median). Different drivers at different quantiles enables rich modeling.
Quantile Regression Forests: Combining Trees with Quantiles
QRF combines Random Forests (which excel at capturing nonlinear relationships) with quantile regression (which captures uncertainty). Standard Random Forests predict means; QRF predicts distributions.
How QRF works: train a Random Forest normally (ensemble of trees). When making a prediction, instead of averaging tree outputs (which gives mean), keep the full distribution of predictions from all trees. The distribution of tree outputs is the estimated predictive distribution. Different quantiles are read directly from this distribution.
Advantage: no additional complexity beyond standard Random Forests. The same forest trained to minimize squared error naturally provides quantile estimates without explicit quantile-specific training.
Mathematical Intuition
Each tree partitions the feature space and makes a prediction for observations in each partition. For a test observation, the forest routes it through all trees, getting predictions from each tree. These tree predictions form an empirical distribution. The 10th percentile of this distribution is the estimated 10th percentile of the true distribution. The 90th percentile of tree predictions is the estimated 90th percentile.
This works because the ensemble of trees explores the feature space broadly, and different trees emphasize different patterns. Averaging them gives the mean; the spread of their predictions indicates uncertainty.
Practical Implementation
scikit-learn's RandomForestRegressor can be adapted for QRF. After training a forest, write a custom predict function: get predictions from all trees, then compute quantiles from the distribution of tree predictions. Alternative: use specialized libraries like scikit-garden that provide native QRF implementations.
Example: train QRF on stock returns using technical features (momentum, volatility, mean reversion signals). At prediction time, get 1000 tree predictions for tomorrow's return. Sort them: the 50th value is the median estimate, the 50th value is the 5th percentile VaR estimate, the 950th value is the 95th percentile upside estimate.
Application to Risk Management
Traditional risk management uses historical volatility estimates. QRF provides forward-looking volatility estimates: the width of the predictive distribution estimates tomorrow's likely volatility. Wider distributions (more uncertainty) lead to smaller positions. Narrower distributions (more certainty, lower volatility expected) lead to larger positions.
VaR calculation: instead of assuming returns are normally distributed, use the actual estimated quantiles from QRF. The 5th percentile estimate from QRF is your estimated 95% VaR (worst expected loss in bottom 5% of scenarios). This often differs from normal-distribution assumptions, particularly for assets with fat tails.
Expected Shortfall (ES), also called CVaR, is the expected return conditional on being in the tail. QRF estimates this by averaging predictions in the tail (e.g., average of bottom 5% of predictions). This is more informative for tail risk than VaR alone.
Advantages Over Alternatives
vs. Classical GARCH: GARCH models volatility conditionally but assumes Gaussian distributions. QRF captures non-Gaussian tails, skewness, and kurtosis without assuming any distribution.
vs. Normal Distribution Assumptions: Much financial risk management assumes normally distributed returns, clearly false for tail risks. QRF estimates actual empirical distributions from data.
vs. Neural Networks: QRF is much simpler to implement and understand. Interpretability is higher: feature importance, partial dependence plots, and decision paths are natural in forest-based models. Neural networks provide less interpretability.
Empirical Performance
Testing QRF on S&P 500 daily returns (5-day forecasts):
- 5th percentile estimates: actual out-of-sample calibration 4.8% (vs target 5%), reasonably accurate
- 50th percentile (median): similar accuracy to standard Random Forest, slightly worse than neural networks
- 95th percentile estimates: actual out-of-sample calibration 95.2%, well-calibrated
QRF provides accurate tail estimates while maintaining competitive central predictions. This dual benefit (accurate across quantiles) makes it valuable for risk management.
Extensions and Advanced Techniques
Gradient boosting models (like XGBoost) can also be used for quantile regression via custom loss functions. XGBoost's quantile loss enables training trees to minimize quantile-specific errors. This is more direct than QRF's implicit quantile estimation.
Hierarchical approaches: train separate QRF models for different market regimes (calm, stressed). Use regime identification to select appropriate model. Regime-specific distributions often better capture tail risks than single global distribution.
Pitfalls and Limitations
Overfitting to training quantiles: forest needs enough data to estimate tails accurately. Extreme quantiles (1st, 99th) require more data than central quantiles. For limited data, extreme quantile estimates are unreliable.
Feature selection matters: QRF depends on informative features just like classical forests. Garbage in, garbage out applies; poor features produce poor quantile estimates.
Conclusion
Quantile Regression Forests provide probabilistic forecasts (full distributions) rather than point estimates. This directly addresses risk management needs: estimate tail risks (VaR), expected shortfall, and regime-dependent uncertainty. QRF is easy to implement (built on standard Random Forests), interpretable, and empirically accurate across quantiles. For traders and risk managers requiring forward-looking risk estimates beyond simple volatility, QRF offers substantial value over traditional methods assuming normal distributions or relying on historical volatility.