Temporal Difference Learning - Is Q-learning a temporal difference method?


Yes. Q-learning is an off-policy temporal-difference (TD) learning method. Here V and Q denote the state-value and action-value functions, respectively. Both V and Q can be learned with a variety of TD and non-TD approaches, some of which are model-based and others model-free.
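To make the TD character of Q-learning concrete, here is a minimal sketch of a single tabular update. It assumes Q is stored in a plain dictionary keyed by (state, action) with unseen entries treated as 0; the function name `q_learning_update` and the parameters `alpha` (step size) and `gamma` (discount factor) are illustrative choices, not taken from any particular library. The update moves Q(s, a) toward a bootstrapped target built from the reward and the current estimate at the next state, which is what makes it a TD method. The max over next actions, taken regardless of which action the agent actually follows, is what makes it off-policy.

```python
def q_learning_update(q, state, action, reward, next_state, actions,
                      alpha=0.1, gamma=0.9, terminal=False):
    """Apply one tabular Q-learning step in place and return the TD error.

    q is a dict mapping (state, action) -> estimated value; missing
    entries are treated as 0.0 (an assumption of this sketch).
    """
    # Bootstrapped estimate of the best value reachable from next_state.
    # A terminal next state contributes no future value.
    if terminal:
        best_next = 0.0
    else:
        best_next = max(q.get((next_state, a), 0.0) for a in actions)

    # TD target: observed reward plus the discounted greedy estimate.
    td_target = reward + gamma * best_next

    # TD error: gap between the target and the current estimate.
    current = q.get((state, action), 0.0)
    td_error = td_target - current

    # Move the current estimate a fraction alpha toward the target.
    q[(state, action)] = current + alpha * td_error
    return td_error


# Usage: a two-action toy problem with hypothetical states 0 and 1.
actions = ["left", "right"]
q = {}
first_error = q_learning_update(q, 0, "right", 1.0, 1, actions,
                                alpha=0.5, gamma=0.9)
```

In this sketch the state-value function can be read off the table as V(s) = max over a of Q(s, a), which is one way the two functions relate under a greedy policy.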