The Markov Decision Process (MDP)

Last time we stopped at the concept of a map (a plan: wherever the agent starts, it begins developing a map of the environment). But is it that simple? Is reinforcement learning this simple in reality? No, and this is where the Markov Decision Process comes in.

Deterministic Search

Non-Deterministic Search

Markov Process

A stochastic process has the Markov property if the conditional probability distribution of future states of the process (conditional on both past and present states) depends only upon the present state, not on the sequence of events that preceded it. A process with this property is called a Markov process.
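To make the Markov property concrete, here is a minimal sketch of a two-state Markov process. The weather states, the probabilities, and the names TRANSITIONS, next_state and simulate are all made up for illustration; they are not from the course. The point is that next_state looks only at the current state and never at the earlier history.

```python
import random

# Hypothetical two-state chain; states and probabilities are illustrative only.
# TRANSITIONS[current][nxt] is the probability of moving from current to nxt.
TRANSITIONS = {
    "sunny": {"sunny": 0.8, "rainy": 0.2},
    "rainy": {"sunny": 0.4, "rainy": 0.6},
}


def next_state(current, rng):
    """Sample the next state from the current state alone (Markov property)."""
    states = list(TRANSITIONS[current])
    weights = [TRANSITIONS[current][s] for s in states]
    return rng.choices(states, weights=weights, k=1)[0]


def simulate(start, steps, seed=0):
    """Return the visited states, starting at start and taking steps transitions."""
    rng = random.Random(seed)
    path = [start]
    for _ in range(steps):
        # Only path[-1] is passed on; the rest of the history is ignored.
        path.append(next_state(path[-1], rng))
    return path
```

Calling simulate("sunny", 10) returns a list of 11 states. Fixing the seed makes the run repeatable, which is handy when checking the notes by hand.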

Markov Decision Processes

Markov decision processes provide a mathematical framework for modeling decision making in situations where outcomes are partly random and partly under the control of the decision maker.
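As a rough sketch of "partly random and partly under the control of the decision maker": the agent picks the action, and the environment picks the outcome by chance. The states, actions, rewards, and the 80/10/10 split below are assumptions made up for illustration, as are the names P, step and expected_reward.

```python
import random

# Hypothetical MDP fragment; every number here is an illustrative assumption.
# P[state][action] is a list of (probability, next_state, reward) outcomes.
P = {
    "start": {
        "up": [(0.8, "goal", 1.0), (0.1, "start", 0.0), (0.1, "pit", -1.0)],
        "stay": [(1.0, "start", 0.0)],
    },
}


def step(state, action, rng):
    """The agent chooses the action; the outcome is sampled at random."""
    outcomes = P[state][action]
    probs = [p for p, _, _ in outcomes]
    _, nxt, reward = rng.choices(outcomes, weights=probs, k=1)[0]
    return nxt, reward


def expected_reward(state, action):
    """Probability-weighted immediate reward for taking action in state."""
    return sum(p * r for p, _, r in P[state][action])
```

In a deterministic search, "up" would always lead to the same next state. Here the same action can land in different states, so the agent has to compare actions by their expected outcome instead of a single guaranteed one.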

Wikipedia

Additional Reading

A Survey of Applications of Markov Decision Processes, by D. J. White (1993)
