In this paper, we study spatially synchronous two-agent navigation on a structured, partially unknown graph. The general edge cost statistics are given, and the agents gather and share exact information on the costs of local edges. The agents' purpose is to traverse the graph as efficiently as possible. In previous work, we formulated the problem as a dynamic program and exploited the structure of an equivalent linear program to compute the optimal value function. Here, we use the optimal policy to formulate a Markov chain with an infinite number of states, whose properties we analyze. We present a method that computes the steady-state probability distribution of the agent separation, exploiting the repetitive structure of the Markov chain as the agent separation goes to infinity. The results confirm and quantify the intuition that the lower the rewards, the more beneficial it is for the agents to spread out. ©2005 AACC.
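The key computational idea, exploiting the repetitive structure of an infinite-state Markov chain to obtain the steady-state distribution, can be illustrated with a minimal sketch. The chain below is a hypothetical stand-in (a simple birth-death chain on agent separation with homogeneous transition probabilities `p` and `q`), not the paper's actual chain; it shows how the repeating structure forces a geometric tail on the stationary distribution, which a truncated numerical solve confirms.

```python
import numpy as np

# Hypothetical illustration (not the paper's actual chain): agent
# separation modeled as a birth-death chain on {0, 1, 2, ...} whose
# transition probabilities repeat for all states >= 1.
p, q = 0.3, 0.5          # P(separation grows), P(separation shrinks); p < q
s = 1.0 - p - q          # P(separation unchanged)
N = 200                  # truncation level for the numerical check

# Build the truncated transition matrix.
P = np.zeros((N, N))
P[0, 0] = 1.0 - p        # reflecting boundary at separation 0
P[0, 1] = p
for i in range(1, N - 1):
    P[i, i - 1] = q
    P[i, i] = s
    P[i, i + 1] = p
P[N - 1, N - 2] = q
P[N - 1, N - 1] = 1.0 - q

# Stationary distribution: solve pi (P - I) = 0 together with the
# normalization sum(pi) = 1, as one least-squares system.
A = np.vstack([(P - np.eye(N)).T, np.ones(N)])
b = np.zeros(N + 1)
b[-1] = 1.0
pi = np.linalg.lstsq(A, b, rcond=None)[0]

# The repetitive structure pays off: the balance equations in the
# repeating part give pi[i+1] / pi[i] = p / q, a geometric tail, so the
# infinite chain is characterized by finitely many parameters.
ratios = pi[2:10] / pi[1:9]
print(ratios)            # all close to p / q = 0.6
```

For the general chains treated in the paper the repeating block is matrix-valued rather than scalar, but the same principle applies: the tail of the stationary distribution is determined by the repeating structure, so only the boundary states need a bespoke solve.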
|Original language||English (US)|
|Title of host publication||Proceedings of the American Control Conference|
|Number of pages||6|
|State||Published - Sep 1 2005|