Prospect-Theory-Inspired Reward Shaping in RL Trading Agents
Introduction
Prospect theory, the behavioral-economics account of how humans evaluate risky gains and losses, deviates from classical expected-utility theory. Humans are loss-averse, feeling more pain from a loss than pleasure from an equal gain, and their preferences are reference-dependent, judged relative to a reference point rather than in absolute terms. Reinforcement learning trading agents trained with prospect-theory-based reward functions behave more realistically and often achieve better risk-adjusted performance than agents with simple return-maximization objectives.
Prospect Theory Foundations
Key aspects: (1) Loss aversion: losses are weighted more heavily than equivalent gains; (2) Reference dependence: utility is defined over deviations from a reference point, not absolute wealth levels; (3) Diminishing sensitivity: the marginal impact of a gain or loss shrinks as its magnitude grows. These elements are captured by the Tversky-Kahneman value function v(x) = x^a for gains (x >= 0) and v(x) = -l * (-x)^a for losses (x < 0), with the empirically estimated loss-aversion coefficient l ~ 2.25 and curvature a ~ 0.88; the exponent a < 1 is what produces diminishing sensitivity to extreme outcomes.
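As a minimal sketch, the three properties can be combined into a single reward-shaping function. The function name is illustrative; the defaults are Tversky and Kahneman's 1992 median estimates (loss aversion 2.25, curvature 0.88), and the input is assumed to be a per-step profit or loss measured relative to the chosen reference point:

```python
def prospect_value(x: float, alpha: float = 0.88, lam: float = 2.25) -> float:
    """Prospect-theory value of an outcome x relative to the reference point.

    Concave (x ** alpha) for gains, convex and steeper (-lam * (-x) ** alpha)
    for losses: lam > 1 encodes loss aversion, alpha < 1 encodes
    diminishing sensitivity to large outcomes.
    """
    if x >= 0:
        return x ** alpha
    return -lam * ((-x) ** alpha)
```

For example, a loss of 1 unit maps to -2.25 while a gain of 1 unit maps to +1.0, and doubling a gain less than doubles its value, reflecting diminishing sensitivity.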
RL Agent Implementation
Train PPO or A3C agents with prospect-theory reward functions, and evaluate them in backtests and, where possible, in paper or live trading. Such agents learn positions that avoid excessive drawdowns and large concentrated bets, mirroring human risk preferences. In terms of performance, prospect-theory agents typically give up some raw return but achieve markedly lower maximum drawdowns, improving risk-adjusted metrics such as the Sharpe ratio.
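The risk-adjusted comparison above rests on two standard metrics, maximum drawdown and the Sharpe ratio. A minimal, dependency-free sketch of both (function names and the zero risk-free rate are assumptions for illustration):

```python
import math

def max_drawdown(equity: list[float]) -> float:
    """Largest peak-to-trough decline of an equity curve, as a fraction of the peak."""
    peak = equity[0]
    mdd = 0.0
    for value in equity:
        peak = max(peak, value)          # running high-water mark
        mdd = max(mdd, (peak - value) / peak)
    return mdd

def sharpe_ratio(returns: list[float], periods_per_year: int = 252) -> float:
    """Annualized Sharpe ratio of per-period returns, assuming a zero risk-free rate."""
    n = len(returns)
    mean = sum(returns) / n
    var = sum((r - mean) ** 2 for r in returns) / (n - 1)  # sample variance
    return mean / math.sqrt(var) * math.sqrt(periods_per_year)
```

Comparing the two agent types then amounts to running both over the same backtest period and contrasting these numbers: the prospect-theory agent should show a smaller max_drawdown and, despite lower raw returns, a comparable or higher sharpe_ratio.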
Conclusion
Incorporating behavioral insights from prospect theory into RL reward functions produces more realistic and robust trading agents.