Affiliation:
1. MIT-IBM Watson AI Lab, IBM Research Cambridge
2. IBM Research Tokyo
3. Graduate School of Arts and Sciences, University of Tokyo
4. School of Computing, Queen’s University
Abstract
Symbolic systems require hand-coded symbolic representations as input, resulting in a knowledge acquisition bottleneck. Meanwhile, although deep learning has achieved significant success in many fields, the knowledge it acquires is encoded in a subsymbolic representation that is incompatible with symbolic systems. To bridge the gap between the two fields, one must solve the Symbol Grounding problem: the question of how a machine can generate symbols automatically. We discuss our recent work, Latplan, an unsupervised architecture combining deep learning and classical planning. Given only an unlabeled set of image pairs showing a subset of the transitions allowed in the environment (training inputs), Latplan learns a complete propositional PDDL action model of the environment. Later, when given a pair of images representing the initial and goal states (planning inputs), Latplan finds a plan to the goal state in a symbolic latent space and returns a visualized plan execution. We discuss several key ideas that made Latplan possible, which we hope will extend to many other symbolic paradigms beyond classical planning.
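One of the key ideas teased above, documented in the Latplan literature, is discretizing the autoencoder's continuous latent activations into binary propositional variables via the Gumbel-Softmax / Binary Concrete relaxation, so that learned states can be handed to a classical planner. The sketch below is a minimal, self-contained NumPy illustration of that discretization trick; it is an assumption-laden re-implementation for exposition, not code from the Latplan repository.

```python
# A minimal sketch of the Binary Concrete (two-class Gumbel-Softmax)
# relaxation used to turn continuous encoder outputs into near-binary
# propositional bits. Illustrative only, not Latplan's actual code.

import numpy as np

rng = np.random.default_rng(0)

def binary_concrete(logits: np.ndarray, temperature: float) -> np.ndarray:
    """Sample a relaxed Bernoulli vector from per-bit logits.

    At high temperature the output is a smooth, differentiable relaxation
    (suitable for gradient-based training); as temperature -> 0 the samples
    approach hard 0/1 bits, i.e. propositional variables.
    """
    u = rng.uniform(size=logits.shape)
    logistic_noise = np.log(u) - np.log(1.0 - u)  # Logistic(0, 1) sample
    return 1.0 / (1.0 + np.exp(-(logits + logistic_noise) / temperature))

# Hypothetical encoder outputs for a 4-bit latent state.
logits = np.array([2.0, -1.5, 0.1, -3.0])

# Annealing the temperature pushes the relaxed bits toward {0, 1}.
for tau in [5.0, 1.0, 0.1]:
    print(tau, np.round(binary_concrete(logits, tau), 3))

# At test time the bits are simply rounded, yielding a propositional
# state such as [1, 0, 1, 0] that a symbolic planner can manipulate.
```

In training, the temperature is annealed from high to low so the network first learns with smooth gradients and then converges to discrete, planner-compatible representations.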