WU Xiao, GUO Zhenbin. Convergence Problem of a Sequence of First Passage Markov Decision Processes with Varying Discount Factors[J]. Chinese Journal of Applied Probability and Statistics, 2021, 37(6): 598-610. DOI: 10.3969/j.issn.1001-4268.2021.06.004

Convergence Problem of a Sequence of First Passage Markov Decision Processes with Varying Discount Factors

  • In this paper, we study the convergence problem for a sequence of constrained first passage Markov decision processes with varying discount factors. Using "occupation measures" and their properties, we transform the constrained optimality problems into linear programming problems over the set of occupation measures (the convex analytic approach), and then prove that the optimal values and optimal policies of the original first passage Markov decision processes converge, respectively, to those of the "limit" process.
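
As a rough illustration of the convex analytic approach named in the abstract, a constrained first passage problem can be written as a linear program over occupation measures. The notation below is assumed here for illustration and is not taken from the paper: state space S, action space A, target set B, transition kernel Q, state-action dependent discount factor \alpha(x,a), initial distribution \gamma, cost functions c_0, ..., c_p, and constraint bounds d_1, ..., d_p. The precise conditions and spaces used by the authors may differ.

\[
\begin{aligned}
\text{minimize}\quad & \int_{B^c \times A} c_0(x,a)\,\eta(dx,da) \\
\text{subject to}\quad & \int_{B^c \times A} c_k(x,a)\,\eta(dx,da) \le d_k, \qquad k = 1,\dots,p, \\
& \eta(D \times A) = \gamma(D) + \int_{B^c \times A} \alpha(x,a)\,Q(D \mid x,a)\,\eta(dx,da)
\quad \text{for all measurable } D \subseteq B^c .
\end{aligned}
\]

In this sketch, \eta plays the role of the occupation measure of a policy, accumulated up to the first passage time into B. Both the objective and the constraints are linear in \eta, which is what makes the reformulation a linear program.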