The Markov Decision Process (MDP)
Last time we stopped at the concept of a map (a plan: wherever the agent starts, it begins developing a map). But is reinforcement learning really that simple? In reality, no, and this is where the Markov Decision Process comes in.
Deterministic Search
Non-Deterministic Search
Markov Process
A stochastic process has the Markov property if the conditional probability distribution of future states of the process (conditional on both past and present states) depends only upon the present state, not on the sequence of events that preceded it. A process with this property is called a Markov process.
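To make the Markov property concrete, here is a minimal sketch of a Markov process in Python. The two weather states and their probabilities are made up for illustration and are not from the notes. The point to notice is that next_state looks only at the current state and never at the earlier history.

```python
import random

# Hypothetical two-state chain; states and probabilities are illustrative only.
TRANSITIONS = {
    "sunny": {"sunny": 0.8, "rainy": 0.2},
    "rainy": {"sunny": 0.4, "rainy": 0.6},
}


def next_state(current, rng):
    """Sample the next state using only the current state (the Markov property)."""
    states = list(TRANSITIONS[current])
    weights = [TRANSITIONS[current][s] for s in states]
    return rng.choices(states, weights=weights, k=1)[0]


def simulate(start, steps, seed=0):
    """Return the visited states: the start state followed by `steps` samples."""
    rng = random.Random(seed)
    path = [start]
    for _ in range(steps):
        # Only path[-1] is passed on; the rest of the history is ignored.
        path.append(next_state(path[-1], rng))
    return path
```

Calling simulate("sunny", 5) returns a list of six states. The exact sequence depends on the seed.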
Markov Decision Processes
Markov decision processes provide a mathematical framework for modeling decision making in situations where outcomes are partly random and partly under the control of the decision maker.
- Wikipedia
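The "partly random, partly under control" idea can be sketched as follows. This is only an illustration under assumed numbers: an agent on an unbounded grid picks an action (the controlled part), the intended move succeeds with probability 0.8, and otherwise the agent slips to one side with probability 0.1 each (the random part). The names MOVES, SLIPS, transition_probs, and step are hypothetical, and rewards are left out.

```python
import random

# Hypothetical grid moves as (dx, dy) offsets.
MOVES = {"up": (0, 1), "down": (0, -1), "left": (-1, 0), "right": (1, 0)}
# Assumed slip directions: the two moves perpendicular to the intended one.
SLIPS = {
    "up": ("left", "right"),
    "down": ("left", "right"),
    "left": ("up", "down"),
    "right": ("up", "down"),
}


def transition_probs(state, action):
    """Return {next_state: probability} for taking `action` in `state`."""
    outcomes = [(action, 0.8), (SLIPS[action][0], 0.1), (SLIPS[action][1], 0.1)]
    probs = {}
    for move, p in outcomes:
        dx, dy = MOVES[move]
        nxt = (state[0] + dx, state[1] + dy)
        probs[nxt] = probs.get(nxt, 0.0) + p
    return probs


def step(state, action, rng):
    """Sample one next state: the agent chooses the action, chance picks the outcome."""
    probs = transition_probs(state, action)
    states = list(probs)
    return rng.choices(states, weights=[probs[s] for s in states], k=1)[0]
```

As in a Markov process, the outcome depends only on the current state, but here it also depends on the action the decision maker chooses.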
Additional Reading
A Survey of Applications of Markov Decision Processes, by D. J. White (1993)