Search Results
2. Growth rates and average optimality in risk-sensitive Markov decision chains
- Creator:
- Sladký, Karel
- Format:
- unmediated and volume
- Type:
- model:article and TEXT
- Subject:
- risk-sensitive Markov decision chains, average optimal policies, and optimal growth rates
- Language:
- English
- Description:
- In this note we focus attention on characterizations of policies maximizing the growth rate of expected utility, along with the average of the associated certainty equivalent, in risk-sensitive Markov decision chains with finite state and action spaces. In contrast to the existing literature, the problem is handled by methods of stochastic dynamic programming under the condition that the transition probabilities are replaced by general nonnegative matrices. Using the block-triangular decomposition of a collection of nonnegative matrices, we establish necessary and sufficient conditions guaranteeing that the optimal values are independent of the starting state, along with a partition of the state space into subsets with constant optimal values. Finally, for models with a growth rate independent of the starting state, we show how the methods work if we minimize the growth rate or the average of the certainty equivalent. (A notation sketch follows this record.)
- Rights:
- http://creativecommons.org/publicdomain/mark/1.0/ and policy:public
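A brief notation sketch for the record above (the symbols are illustrative and not taken from the article): under an exponential utility with risk-sensitive coefficient γ ≠ 0, a stationary policy f with transition probabilities p_{ij}(f) and one-stage rewards r_{ij}(f) induces the nonnegative matrix Q(f) with entries q_{ij}(f) = p_{ij}(f)\,e^{\gamma r_{ij}(f)}. The expected utility of the accumulated reward then evolves multiplicatively through Q(f), so that
\[
  \text{growth rate of expected utility} = \rho\bigl(Q(f)\bigr), \qquad
  \text{average certainty equivalent} = \frac{1}{\gamma}\,\ln \rho\bigl(Q(f)\bigr),
\]
where ρ(·) denotes the spectral radius. Replacing the stochastic transition matrix by a general nonnegative matrix leaves these expressions formally unchanged, which is consistent with the abstract's use of block-triangular decompositions of nonnegative matrices.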
3. Identification of optimal policies in Markov decision processes
- Creator:
- Sladký, Karel
- Format:
- unmediated and volume
- Type:
- model:article and TEXT
- Subject:
- finite state Markov decision processes, discounted and average costs, and elimination of suboptimal policies
- Language:
- English
- Description:
- In this note we focus attention on identifying optimal policies and on eliminating suboptimal policies when minimizing optimality criteria in discrete-time Markov decision processes with finite state space and compact action set. We present a unified approach to value iteration algorithms that generates lower and upper bounds on the optimal values, as well as on the values of the current policy. Using these modified value iterations it is possible to eliminate suboptimal actions and to identify an optimal policy, or nearly optimal policies, in a finite number of steps without knowing the precise values of the performance function. (An illustrative bound-based elimination test follows this record.)
- Rights:
- http://creativecommons.org/publicdomain/mark/1.0/ and policy:public
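As an illustration of the kind of bounds referred to in the record above (discounted finite-action case; the notation is ours, not the article's), let v_n = T v_{n-1} be the value iteration recursion with discount factor β ∈ (0,1) and set Δ_n = v_n − v_{n-1}. Then, componentwise,
\[
  v_n + \frac{\beta}{1-\beta}\,\min_j \Delta_n(j) \;\le\; v^{*} \;\le\; v_n + \frac{\beta}{1-\beta}\,\max_j \Delta_n(j),
\]
and an action a may be discarded in state i as soon as the upper bound on its value falls below the lower bound on v^{*}(i), i.e. whenever
\[
  r(i,a) + \beta \sum_j p(j \mid i,a)\, v_n(j) + \frac{\beta^{2}}{1-\beta}\,\max_j \Delta_n(j)
  \;<\;
  v_n(i) + \frac{\beta}{1-\beta}\,\min_j \Delta_n(j).
\]
Both bounds tighten as n grows, so suboptimal actions are removed after finitely many iterations even though v^{*} is never computed exactly.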
4. Monotonicity and comparison results for nonnegative dynamic systems. Part I. Discrete-time case
- Creator:
- van Dijk, Nico M. and Sladký, Karel
- Format:
- unmediated and volume
- Type:
- model:article and TEXT
- Subject:
- Markov chains, monotonicity, and nonnegative matrices
- Language:
- English
- Description:
- In two subsequent parts, Part I and Part II, monotonicity and comparison results will be studied, as a generalization of the pure stochastic case, for arbitrary dynamic systems governed by nonnegative matrices. Part I covers the discrete-time case and Part II the continuous-time case. The research was initially motivated by a reliability application contained in Part II. In the present Part I it is shown that monotonicity and comparison results, as known for Markov chains, carry over rather smoothly to the general nonnegative case for marginal, total and average reward structures. These results, though straightforward, are not only of theoretical interest in themselves, but also essential for the more practical continuous-time case in Part II (see \cite{DijkSl2}). An instructive discrete-time random walk example is included. (A minimal comparison sketch follows this record.)
- Rights:
- http://creativecommons.org/publicdomain/mark/1.0/ and policy:public
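A minimal sketch of the discrete-time comparison principle discussed above (our notation): for a system governed by a nonnegative matrix A with nonnegative reward vector r, the total reward over n steps is V_n = \sum_{k=0}^{n-1} A^{k} r. If 0 ≤ A ≤ \bar A and 0 ≤ r ≤ \bar r componentwise, then A^{k} ≤ \bar A^{k} for every k, and hence
\[
  V_n \;=\; \sum_{k=0}^{n-1} A^{k} r \;\le\; \sum_{k=0}^{n-1} \bar A^{k}\, \bar r \;=\; \bar V_n
  \qquad \text{componentwise, for all } n,
\]
with the corresponding ordering of the average rewards V_n / n. For stochastic A this reduces to the familiar comparison result for Markov reward chains.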
5. Monotonicity and comparison results for nonnegative dynamic systems. Part II. Continuous-time case
- Creator:
- van Dijk, Nico M. and Sladký, Karel
- Format:
- unmediated and volume
- Type:
- model:article and TEXT
- Subject:
- Markov chains, monotonicity, and nonnegative matrices
- Language:
- English
- Description:
- This second part, Part II, which follows Part I for the discrete-time case (see \cite{DijkSl1}), deals with monotonicity and comparison results, as a generalization of the pure stochastic case, for stochastic dynamic systems with arbitrary nonnegative generators in the continuous-time case. In contrast with the discrete-time case, the generalization is no longer straightforward. A discrete-time transformation is therefore developed first; results from Part I can then be adopted. The conditions, the technicalities and the results are studied in detail for the reliability application that initiated the research: a reliability network with dependent components that can break down. A secure analytic performance bound is obtained. (A sketch of a uniformization-style transformation follows this record.)
- Rights:
- http://creativecommons.org/publicdomain/mark/1.0/ and policy:public
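The discrete-time transformation mentioned above is, in spirit, a uniformization step (sketched here under our own assumptions; the article's exact construction may differ): given a generator-type matrix Q, choose h > 0 small enough that every diagonal entry of
\[
  P_h \;=\; I + h\,Q
\]
is nonnegative. Then P_h is a nonnegative matrix, the monotonicity and comparison results of Part I apply to P_h, and statements about the continuous-time system are recovered by letting h ↓ 0.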
6. Risk-sensitive average optimality in Markov decision processes
- Creator:
- Sladký, Karel
- Format:
- unmediated and volume
- Type:
- model:article and TEXT
- Subject:
- controlled Markov processes, finite state space, asymptotic behavior, and risk-sensitive average optimality
- Language:
- English
- Description:
- In this note attention is focused on finding policies optimizing risk-sensitive optimality criteria in Markov decision chains. To this end we assume that the total reward generated by the Markov process is evaluated by an exponential utility function with a given risk-sensitive coefficient. The ratio of the first two moments depends on the value of the risk-sensitive coefficient; if the risk-sensitive coefficient equals zero we speak of risk-neutral models. Observe that the first moment of the generated reward corresponds to the expectation of the total reward, and the second central moment to the reward variance. For communicating Markov processes, and for some specific classes of unichain processes, the long-run risk-sensitive average reward is independent of the starting state. In this note we present a necessary and sufficient condition for the existence of optimal policies independent of the starting state in unichain models, and we characterize the class of average risk-sensitive optimal policies. (The role of the moments is made explicit in the sketch following this record.)
- Rights:
- http://creativecommons.org/publicdomain/mark/1.0/ and policy:public
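The role of the first two moments mentioned above can be made explicit through the certainty equivalent of the exponential utility (illustrative notation): for a total reward ξ and risk-sensitive coefficient γ,
\[
  Z_\gamma \;=\; \frac{1}{\gamma}\,\ln \mathbb{E}\, e^{\gamma \xi}
  \;=\; \mathbb{E}\,\xi \;+\; \frac{\gamma}{2}\,\operatorname{var}\,\xi \;+\; O(\gamma^{2}),
\]
so the first moment enters as the expectation of the total reward, the second central moment as its variance, and letting γ → 0 recovers the risk-neutral criterion \mathbb{E}\,\xi.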