Solving Finite-Horizon Discounted Non-Stationary MDPs

Author(s): El Akraoui Bouchra, Cherki Daoui
Subject(s): Economy, Business Economy / Management, Financial Markets
Published by: Wydawnictwo Naukowe Uniwersytetu Szczecińskiego
Keywords: Markov Decision Process; Dynamic Programming; Backward Induction algorithm

Summary/Abstract: Research background: Markov Decision Processes (MDPs) are a powerful framework for modeling many real-world finite-horizon problems in which the goal is to choose a sequence of actions that maximizes the reward. However, many problems, such as investment and financial market problems in which the value of a reward decreases exponentially with time, require the introduction of interest rates. Purpose: This study investigates non-stationary finite-horizon MDPs with a discount factor to account for fluctuations in rewards over time. Research methodology: To account for the fluctuation of rewards over time, the authors define new non-stationary finite-horizon MDPs with a discount factor. First, the existence of an optimal policy for the proposed finite-horizon discounted MDPs is proven. Next, a new Discounted Backward Induction (DBI) algorithm is presented to find it. To illustrate the proposal, a financial model is used as an example of a finite-horizon discounted MDP, and an adaptive DBI algorithm is used to solve it. Results: The proposed method calculates the optimal values of the investment that maximize its expected total return, taking the time value of money into account. Novelty: No previous studies have examined dynamic finite-horizon problems that account for temporal fluctuations in rewards.
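
The abstract does not reproduce the DBI algorithm itself, so the following is only a minimal sketch of textbook backward induction with a discount factor on a non-stationary finite-horizon MDP, not the authors' method. The function name, the nested-list layout of `P` and `R`, the two-state toy model, and the value `gamma = 0.9` are all assumptions made for illustration.

```python
def discounted_backward_induction(P, R, terminal, gamma):
    """Generic backward induction with discounting (illustrative sketch).

    P[t][s][a] : list of next-state probabilities at stage t (assumed layout)
    R[t][s][a] : immediate reward at stage t (may differ from stage to stage)
    terminal   : terminal value of each state at the end of the horizon
    gamma      : discount factor in (0, 1]
    Returns (V, policy) with V[t][s] the optimal value and policy[t][s]
    a maximizing action at stage t in state s.
    """
    T = len(P)             # number of decision stages
    n = len(terminal)      # number of states
    V = [[0.0] * n for _ in range(T + 1)]
    V[T] = list(terminal)
    policy = [[0] * n for _ in range(T)]
    # Sweep backward from the last decision stage to the first.
    for t in range(T - 1, -1, -1):
        for s in range(n):
            best_a, best_q = 0, float("-inf")
            for a in range(len(P[t][s])):
                expected_next = sum(p * v for p, v in zip(P[t][s][a], V[t + 1]))
                q = R[t][s][a] + gamma * expected_next
                if q > best_q:   # ties keep the lowest-numbered action
                    best_a, best_q = a, q
            V[t][s] = best_q
            policy[t][s] = best_a
    return V, policy


# Hypothetical toy model: state 0 = "cash", state 1 = "invested";
# action 0 = stay, action 1 = switch to the other state (deterministic).
stay_or_switch = [
    [[1.0, 0.0], [0.0, 1.0]],  # from state 0: stay -> 0, switch -> 1
    [[0.0, 1.0], [1.0, 0.0]],  # from state 1: stay -> 1, switch -> 0
]
P = [stay_or_switch, stay_or_switch]
# Stage-dependent rewards: staying invested pays 1 at stage 0 and 2 at stage 1.
R = [
    [[0.0, 0.0], [1.0, 0.0]],
    [[0.0, 0.0], [2.0, 0.0]],
]
V, policy = discounted_backward_induction(P, R, terminal=[0.0, 0.0], gamma=0.9)
print(policy[0], [round(v, 3) for v in V[0]])
```

In this toy model the rewards depend on the stage index, which is what makes the problem non-stationary, and `gamma` plays the role of the time value of money: a reward collected one stage later is worth `gamma` times as much today.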

  • Issue Year: 23/2023
  • Issue No: 1
  • Page Range: 1-15
  • Page Count: 15
  • Language: English