Solving Finite-Horizon Discounted Non-Stationary MDPs
Author(s): El Akraoui Bouchra, Cherki Daoui
Subject(s): Economy, Business Economy / Management, Financial Markets
Published by: Wydawnictwo Naukowe Uniwersytetu Szczecińskiego
Keywords: Markov Decision Process; Dynamic Programming; Backward Induction algorithm
Summary/Abstract: Research background: Markov Decision Processes (MDPs) are a powerful framework for modeling many real-world finite-horizon problems in which a sequence of actions is chosen to maximize the reward. However, many problems, such as investment and financial market problems, in which the value of a reward decreases exponentially with time, require the introduction of interest rates. Purpose: This study investigates non-stationary finite-horizon MDPs with a discount factor to account for fluctuations in rewards over time. Research methodology: To account for the fluctuations of rewards over time, the authors define new non-stationary finite-horizon MDPs with a discount factor. First, the existence of an optimal policy for the proposed finite-horizon discounted MDPs is proven. Next, a new Discounted Backward Induction (DBI) algorithm is presented to find it. To demonstrate the value of the proposal, a financial model is used as an example of a finite-horizon discounted MDP, and an adaptive DBI algorithm is used to solve it. Results: The proposed method calculates the optimal investment values that maximize the expected total return while accounting for the time value of money. Novelty: According to the authors, no existing studies have examined dynamic finite-horizon problems that account for temporal fluctuations in rewards.
Journal: Folia Oeconomica Stetinensia
- Issue Year: 23/2023
- Issue No: 1
- Page Range: 1-15
- Page Count: 15
- Language: English
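
The abstract refers to a Discounted Backward Induction (DBI) algorithm but does not give its steps. The sketch below is not the authors' DBI; it is a generic textbook backward induction with a discount factor for a non-stationary finite-horizon MDP, included only to illustrate the idea. It assumes the recursion V_t(s) = max_a [ r_t(s, a) + gamma * sum_{s'} p_t(s' | s, a) V_{t+1}(s') ] with a terminal reward V_T. The function name, the array layout `P[t][a][s][s2]` and `R[t][a][s]`, and the toy data are all hypothetical.

```python
def discounted_backward_induction(P, R, terminal, gamma):
    """Generic discounted backward induction (illustrative, not the paper's DBI).

    P[t][a][s][s2]: probability of moving from s to s2 under action a at stage t
    R[t][a][s]:     reward for taking action a in state s at stage t
    terminal[s]:    reward collected in state s at the end of the horizon
    gamma:          discount factor in (0, 1]
    Returns (V, policy), where V[t][s] is the optimal value from stage t onward
    and policy[t][s] is a maximizing action.
    """
    horizon = len(R)
    n_states = len(terminal)
    V = [[0.0] * n_states for _ in range(horizon + 1)]
    V[horizon] = [float(x) for x in terminal]
    policy = [[0] * n_states for _ in range(horizon)]

    # Sweep backward from the last decision stage to the first.
    for t in range(horizon - 1, -1, -1):
        for s in range(n_states):
            best_action, best_value = 0, float("-inf")
            for a in range(len(R[t])):
                expected_next = sum(
                    P[t][a][s][s2] * V[t + 1][s2] for s2 in range(n_states)
                )
                q = R[t][a][s] + gamma * expected_next
                if q > best_value:  # ties keep the lowest-indexed action
                    best_action, best_value = a, q
            V[t][s] = best_value
            policy[t][s] = best_action
    return V, policy


# Toy problem (hypothetical data): 2 states, 2 actions, horizon 2.
# Action 0 stays in the current state, action 1 switches state.
stay = [[1.0, 0.0], [0.0, 1.0]]
switch = [[0.0, 1.0], [1.0, 0.0]]
P = [[stay, switch], [stay, switch]]
# Rewards change with the stage: staying pays 1 in state 0 at t=0,
# and staying pays 4 in state 1 at t=1.
R = [
    [[1.0, 0.0], [0.0, 0.0]],  # stage 0: R[0][action][state]
    [[0.0, 4.0], [0.0, 0.0]],  # stage 1
]
V, policy = discounted_backward_induction(P, R, terminal=[0.0, 0.0], gamma=0.5)
print(V[0], policy[0])
```

In this toy instance the discounted stage-1 reward of 4 (worth 2 at stage 0) outweighs the immediate reward of 1, so the sketch chooses to switch from state 0 at the first stage.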