Algorithms are described for determining optimal policies for finite-state, finite-action Markov decision processes over an infinite, discrete time horizon. The algorithms combine value-improvement and policy-improvement techniques and are appropriate for processes that are either finite or infinite, deterministic or stochastic, discounted or undiscounted, in any meaningful combination of these features. Computing procedures are described in terms of initial data processing, bound improvements, process reduction, and testing and solution. Application of the methodology is illustrated with an example involving natural resource management. Management implications of certain hypothesized relationships between mallard survival and harvest rates are addressed by applying the optimality procedures to mallard population models.
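As a rough illustration of what a value-improvement step looks like, the sketch below runs textbook value iteration on a discounted, finite-state, finite-action process. It is not the MARKOV procedure itself: it omits the bound improvements, process reduction, and policy-improvement steps mentioned above. The function name `value_iteration`, the two-state example data, and the discount factor of 0.9 are all assumptions made for illustration.

```python
def value_iteration(P, R, gamma, tol=1e-9, max_iter=10_000):
    """Textbook value iteration for a discounted MDP (illustrative sketch).

    P[a][s][t] is the probability of moving from state s to state t
    under action a; R[a][s] is the immediate reward for taking action a
    in state s; gamma is the discount factor, with 0 <= gamma < 1.
    """
    n_actions = len(P)
    n_states = len(P[0])

    def q_value(V, s, a):
        # Expected discounted return of taking action a in state s,
        # then following the value estimate V afterwards.
        return R[a][s] + gamma * sum(P[a][s][t] * V[t] for t in range(n_states))

    V = [0.0] * n_states
    for _ in range(max_iter):
        V_new = [max(q_value(V, s, a) for a in range(n_actions))
                 for s in range(n_states)]
        delta = max(abs(x - y) for x, y in zip(V_new, V))
        V = V_new
        if delta < tol:
            break

    # Greedy policy with respect to the final value estimate.
    policy = [max(range(n_actions), key=lambda a: q_value(V, s, a))
              for s in range(n_states)]
    return V, policy


# Hypothetical two-state, two-action deterministic example.
# Action 0 keeps the current state; action 1 switches to the other state.
P = [
    [[1.0, 0.0], [0.0, 1.0]],  # action 0: stay
    [[0.0, 1.0], [1.0, 0.0]],  # action 1: switch
]
R = [
    [1.0, 2.0],  # reward for staying in state 0 or state 1
    [0.0, 0.0],  # switching earns nothing immediately
]

V, policy = value_iteration(P, R, gamma=0.9)
print(V, policy)
```

In this toy example, staying in state 1 is worth 2 / (1 - 0.9) = 20, so the best choice in state 0 is to switch, which is worth 0.9 × 20 = 18. An undiscounted process would need a different optimality criterion, such as average reward, which this sketch does not cover.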
MARKOV: A methodology for the solution of infinite time horizon MARKOV decision processes