Bridging the Gap Between AI Planning and Reinforcement Learning (PRL)

ICAPS'24 Workshop
Banff, Alberta, Canada
Date: June 2, 2024

Aim and Scope of the Workshop

Although the AI Planning and Reinforcement Learning communities study similar sequential decision-making problems, each remains somewhat unaware of the other's specific problems, techniques, methodologies, and evaluations.

This workshop aims to encourage discussion and collaboration between researchers in the fields of AI planning and reinforcement learning. We aim to bridge the gap between the two communities, facilitate the discussion of differences and similarities in existing techniques, and encourage collaboration across the fields. We solicit interest from AI researchers who work at the intersection of planning and reinforcement learning, in particular those who focus on intelligent decision-making. This is the seventh edition of the PRL workshop series that started at ICAPS 2020.

Topics of Interest

We invite submissions at the intersection of AI Planning and Reinforcement Learning. The topics of interest include, but are not limited to, the following:

  • Reinforcement learning (model-based, Bayesian, deep, hierarchical, etc.)
  • Safe RL
  • Monte Carlo planning
  • Model representation and learning for planning
  • Planning using approximated/uncertain (learned) models
  • Learning search heuristics for planner guidance
  • Theoretical aspects of planning and reinforcement learning
  • Action policy analysis or certification
  • Reinforcement learning and planning competition(s)
  • Multi-agent planning and learning
  • Applications of both reinforcement learning and planning

Important Dates

Please refer to the PRL workshop website for the latest information.

  • Paper submission deadline: April 7th, AOE (final extension; previously March 22nd, then April 5th)
  • Paper acceptance notification: April 28th, AOE

ICAPS will be in-person this year. Authors of accepted workshop papers are expected to physically attend the conference and present in person.

Schedule

Time (Banff)  Title

8:30  Opening Remarks

8:35  Keynote, Felipe Trevizan: The Next-Generation of Planning Heuristics: GNNs and Beyond

9:35  Poster Session I

10:00  Coffee break

10:30  Talk Session I
  • The Case for Developing a Foundation Model for Planning-like Tasks from Scratch. Biplav Srivastava, Vishal Pallagani.
  • Equivalence-Based Abstractions for Learning General Policies. Dominik Drexler, Simon Ståhlberg, Blai Bonet, Hector Geffner.
  • Comparing State-of-the-art Graph Neural Networks and Transformers for General Policy Learning. Nicola J. Müller, Pablo Sanchez Martin, Jörg Hoffmann, Verena Wolf, Timo P. Gros.
  • Automating the Generation of Prompts for LLM-based Action Choice in PDDL Planning. Katharina Stein, Daniel Fišer, Jörg Hoffmann, Alexander Koller.
  • Planning with Language Models Through The Lens of Efficiency. Michael Katz, Harsha Kokel, Kavitha Srinivas, Shirin Sohrabi.

12:00  Lunch

13:30  Keynote, Forest Agostinelli: Deep Reinforcement Learning and Heuristic Search Algorithms

14:30  Poster Session II

15:00  Coffee break

15:30  Talk Session II
  • Exploring Simultaneity: Learning Earliest-time Semantics for Automated Planning. Ángel Aso-Mollar, Óscar Sapena, Eva Onaindia.
  • ModelDiff: Leveraging Models for Policy Transfer with Value Lower Bounds. Xiaotian Liu, Jihwan Jeong, Ayal Taitler, Michael Gimelfarb, Scott Sanner.
  • Beyond Training: Optimizing Reinforcement Learning Based Job Shop Scheduling Through Adaptive Action Sampling. Constantin Waubert de Puiseau, Christian Dörpelkus, Jannik Peters, Hasan Tercan, Tobias Meisen.
  • Online Planning in MDPs with Stochastic Durative Actions. Tal Berman, Ronen Brafman, Erez Karpas.
  • A New View on Planning in Online Reinforcement Learning. Kevin Roice, Parham Mohammad Panahi, Scott M. Jordan, Adam White, Martha White.

17:00  Closing Remarks

17:05  End

Program

Keynotes

I. Felipe Trevizan: The Next-Generation of Planning Heuristics: GNNs and Beyond

Abstract

Deep learning has been responsible for multiple recent breakthroughs, particularly in image recognition and natural language processing. In this talk, I will focus on a particular deep learning model, Graph Neural Networks (GNNs), and how they have the potential to change heuristic search in automated planning, from the heuristics used to the search methods themselves. I will introduce novel graph representations designed to optimize the application of GNNs to learning both domain-specific and domain-independent heuristics. Additionally, I will present other targets that can be learnt using GNNs, such as rankings between states, and how they can be used during search. Lastly, based on theoretical insights, I will present an alternative approach to GNNs using classical machine learning methods such as SVMs and Gaussian Processes for heuristic learning, offering simplicity and reduced training times.

Biography

Dr. Felipe Trevizan is a Senior Lecturer at the School of Computing, the Australian National University. He previously served as a Senior Research Scientist at NICTA (now Data61/CSIRO). He earned his PhD in Machine Learning from Carnegie Mellon University in 2013. His research interests lie at the intersection of Artificial Intelligence, Operations Research and Machine Learning including automated planning and scheduling, reasoning under uncertainty, heuristic search, and learning for planning. Along with colleagues and students, he is the co-recipient of the 2016 best paper award from the Transport Research Board and the best paper award at the International Conference on Automated Planning and Scheduling (ICAPS) in 2016 and 2017.

II. Forest Agostinelli: Deep Reinforcement Learning and Heuristic Search Algorithms

Abstract

Deep reinforcement learning has been shown to be able to learn domain-specific heuristic functions in a largely domain-independent fashion. As a result, novel variations of A* search, such as batch A* search and Q* search, have been proposed to accommodate the deep neural networks that represent these heuristic functions. In this talk, I will describe how approximate value iteration can be used to learn heuristic functions to guide batch A* search, which can exploit parallelization provided by graphics processing units. Next, I will describe how Q-learning can be used to learn heuristic functions represented by deep Q-networks to guide Q* search, which exploits the structure of deep Q-networks to significantly increase speed and reduce memory during search. Finally, I will describe how model-based reinforcement learning and hindsight experience replay can be used to extend these methods to domains with unknown transition functions. I will give several examples of application domains, including the Rubik’s cube, quantum computing, and reaction mechanism pathway prediction. The code for many of these algorithms is publicly available at https://github.com/forestagostinelli/deepxube.

Biography

Forest Agostinelli is an assistant professor at the University of South Carolina. His research aims to use artificial intelligence to automate the discovery of new knowledge. He looks to apply his research to fields such as puzzle solving, chemical synthesis, robotics, quantum computing, theorem proving, program synthesis, and education. He led the creation of DeepCubeA, an artificial intelligence algorithm capable of solving puzzles such as the Rubik’s cube without human guidance. DeepCubeA has since been applied to problems in quantum computing, chemical reactions, cryptography, and parking lot optimization. He earned his Ph.D. from the University of California, Irvine under the supervision of Professor Pierre Baldi. His homepage is located at https://cse.sc.edu/~foresta/.

Talks

Select accepted papers are given a slot in the program: 15 minutes for content + 3 minutes for questions.

Poster Sessions

The program includes two poster sessions in order to have enough time for discussions. All authors are expected to participate in the poster session.

List of Accepted Papers

Submission Details

We solicit workshop paper submissions relevant to the above call of the following types:

  • Long papers – up to 8 pages + unlimited references / appendices
  • Short papers – up to 4 pages + unlimited references / appendices
  • Extended abstracts – up to 2 pages + unlimited references / appendices

Please format submissions in AAAI style (see instructions in the Author Kit). Authors submitting papers rejected from other conferences should do their utmost to address the comments given by the reviewers. Please do not submit papers that are already accepted for the main ICAPS conference to the workshop.

Some accepted long papers will be invited for contributed talks. All accepted papers (long as well as short) and extended abstracts will be given a slot in the poster presentation session. Extended abstracts are intended as brief summaries of already published papers, preliminary work, position papers, or challenges that might help bridge the gap.

As the main purpose of this workshop is to solicit discussion, the authors are invited to use the appendix of their submissions for that purpose.

Paper submissions should be made through OpenReview.

Organizing Committee

Please send your inquiries to prl.theworkshop@gmail.com