Minimax Real-Time Heuristic Search

Abstract

S. Koenig. Minimax Real-Time Heuristic Search. Technical Report GIT-COGSCI-2001/02, College of Computing, Georgia Institute of Technology, Atlanta (Georgia), 2001.

Abstract: Real-time heuristic search methods interleave planning and plan executions and plan only in the part of the domain around the current state of the agents. This is the part of the domain that is immediately relevant for the agents in their current situation. So far, real-time heuristic search methods have mostly been applied to deterministic planning tasks. In this article, we argue that real-time heuristic search methods can efficiently solve nondeterministic planning tasks. Planning in nondeterministic domains can be time-consuming due to the many contingencies. However, real-time heuristic search methods allow agents to gather information early. This information can be used to resolve some of the uncertainty caused by nondeterminism and thus reduce the amount of planning done for unencountered situations. To this end, we introduce Min-Max Learning Real-Time A* (Min-Max LRTA*), a real-time heuristic search method that generalizes Korf's LRTA* to nondeterministic domains. Min-Max LRTA* has the following advantages: First, different from the many existing ad-hoc planning methods that interleave planning and plan executions, it has a solid theoretical foundation and is domain independent. Second, it allows for fine-grained control over how much planning to do between plan executions. Third, it can use heuristic knowledge to guide planning which can reduce planning time without increasing the plan-execution time. Fourth, it can be interrupted at any state and resume execution at a different state. Fifth, it amortizes learning over several planning episodes, which allows it to find a plan with a suboptimal plan-execution time fast and then improve the plan-execution time as it solves similar planning tasks, until the plan-execution time is satisficing or optimal. Thus, it still asymptotically minimizes the plan-execution time in the long run in case similar planning tasks unexpectedly repeat. We apply Min-Max LRTA* to simulated robot-navigation tasks in mazes, where the robots know the maze but do not know their initial position and orientation (pose). These planning tasks can be modeled as planning tasks in nondeterministic domains whose states are sets of poses. We show that Min-Max LRTA* solves the robot-navigation tasks fast, converges quickly, and requires only a small amount of memory. We also discuss how Min-Max LRTA* can be applied to moving-target search, the task of catching a moving target. Planning in nondeterministic domains, including domains where the state of the world cannot be observed completely, is still poorly understood. The main contribution of this article is to provide insight into the behavior of planning methods in nondeterministic domains. However, our analytical results about the properties of Min-Max LRTA* also apply to LRTA* and provide new insight into the behavior of this popular real-time heuristic search method in deterministic domains.

Download the paper in pdf.

Download the paper in gzipped postscript (large download time).

Many publishers do not want authors to make their papers available electronically after the papers have been published. Please use the electronic versions provided here only if hardcopies are not yet available. If you have comments on any of these papers, please send me an email! Also, please send me your papers if we have common interests.

This page was automatically created by a bibliography maintenance system that was developed as part of an undergraduate research project, advised by Sven Koenig.