Paper Information - The Complexity of Decentralized Control of Markov Decision Processes

We consider decentralized control of Markov decision processes and give complexity bounds on the worst-case running time for algorithms that find optimal solutions. Generalizations of both the fully observable case and the partially observable case that allow for decentralized control are described. For even two agents, the finite-horizon problems corresponding to both of these models are hard for nondeterministic exponential time. These complexity results illustrate a fundamental difference between centralized and decentralized control of Markov decision processes. In contrast to the problems involving centralized control, the problems we consider provably do not admit polynomial-time algorithms. Furthermore, assuming EXP ≠ NEXP, the problems require superexponential time to solve in the worst case.
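As a rough illustration of why exhaustive search is impractical here, the sketch below counts the deterministic joint policies a brute-force solver would have to enumerate for a finite-horizon decentralized problem. It assumes, as is not stated in the abstract, that each agent's local policy is a tree mapping its own observation history to an action, and the function names are hypothetical. The count only shows how large the naive search space is; it is not the hardness argument of the paper.

```python
def policy_tree_nodes(num_observations: int, horizon: int) -> int:
    """Number of decision points in one agent's policy tree.

    Assumption: the agent picks an action after every observation
    history of length 0, 1, ..., horizon - 1.
    """
    return sum(num_observations ** t for t in range(horizon))


def count_joint_policies(num_actions: int, num_observations: int,
                         horizon: int, num_agents: int = 2) -> int:
    """Number of deterministic joint policies under the tree assumption."""
    nodes = policy_tree_nodes(num_observations, horizon)
    # Each decision point independently chooses one of num_actions actions.
    per_agent = num_actions ** nodes
    # Agents choose their local policies independently of one another.
    return per_agent ** num_agents


if __name__ == "__main__":
    # Two agents, two actions, two observations per agent.
    for horizon in range(1, 5):
        print(horizon, count_joint_policies(2, 2, horizon))
```

With two actions and two observations per agent, the count grows from 4 at horizon 1 to 1073741824 at horizon 4, which is doubly exponential in the horizon.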
