Abstract

S. Koenig and R.G. Simmons. The Effect of Representation and Knowledge on Goal-Directed Exploration with Reinforcement-Learning Algorithms: The Proofs. Technical Report CMU-CS-95-177, School of Computer Science, Carnegie Mellon University, Pittsburgh (Pennsylvania), 1995.

Abstract: This technical report contains the proofs of all theorems that are central to our Machine Learning Journal article 'The Effect of Representation and Knowledge on Goal-Directed Exploration with Reinforcement-Learning Algorithms,' in which we analyze the complexity of on-line reinforcement-learning algorithms that are applied to goal-directed exploration tasks: Previous work had concluded that, even in deterministic state spaces, initially uninformed reinforcement learning was at least exponential for such problems, or that it was of polynomial worst-case time-complexity only if the learning methods were augmented. In the article we prove that, to the contrary, the algorithms are tractable with only a simple change in the reward structure ('penalizing the agent for action executions') or in the initialization of the values that they maintain. In particular, we provide tight complexity bounds for both Watkins' Q-learning and Heger's Q-hat-learning and show how their complexity depends on properties of the state spaces. We also demonstrate how one can decrease the complexity even further by either learning action models or utilizing prior knowledge of the topology of the state spaces. Our results provide guidance for empirical reinforcement-learning researchers on how to distinguish hard reinforcement-learning problems from easy ones and how to represent them in a way that allows them to be solved efficiently.

Download the paper in pdf.

Download the paper in gzipped postscript (large download time).

Many publishers do not want authors to make their papers available electronically after the papers have been published. Please use the electronic versions provided here only if hardcopies are not yet available. If you have comments on any of these papers, please send me an email! Also, please send me your papers if we have common interests.


This page was automatically created by a bibliography maintenance system that was developed as part of an undergraduate research project, advised by Sven Koenig.