Neural Task Success Classifiers for Robotic Manipulation from Few Real Demonstrations

Abstract

Robots that can learn a new manipulation task from a small number of demonstrations are increasingly in demand in different workspaces. A classifier that assesses the quality of actions can predict the successful completion of a task, and intelligent agents can use this prediction for action selection. This paper presents a novel classifier that learns to classify task completion from only a few demonstrations. We carry out a comprehensive comparison of different neural classifiers, e.g. fully connected-based, fully convolutional-based, sequence2sequence-based, and domain adaptation-based classification. We also present a new, publicly available dataset of five robot manipulation tasks. We compare the performance of our novel classifier and the existing models on our dataset and the MIME dataset. The results suggest that domain adaptation and timing-based features improve success prediction. Our novel model, i.e. a fully convolutional neural network with domain adaptation and timing features, achieves average classification accuracies of 97.3% and 95.5% across tasks on the two datasets, whereas state-of-the-art classifiers without domain adaptation and timing features achieve only 82.4% and 90.3%, respectively.

Research Method

Reward functions play a major role in robot learning: they measure task success so that the behaviour of robots can be rewarded numerically. Using only onboard robot sensors, without relying on any external sensors, makes measuring task success even harder. In this context, this paper focuses on training success classifiers (also referred to as 'goal classifiers') that measure the level of success in robotic manipulation tasks from only a few (as opposed to many) human demonstrations. Training a success classifier for a new task with high accuracy from only a few demonstrations remains a challenging research problem. In this work, we study this problem via the scenario illustrated in the figure below.
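As a rough illustration of how a success classifier can stand in for a reward function, the sketch below turns a classifier's predicted success probability into a numerical reward. The function names, the 0.5 threshold, and the toy classifier are assumptions made for illustration only; they are not taken from the paper or its code.

```python
from typing import Callable, Sequence

# Hypothetical type: maps an observation (e.g. image features from an
# onboard camera) to a predicted probability of task success in [0, 1].
SuccessClassifier = Callable[[Sequence[float]], float]


def dense_reward(classifier: SuccessClassifier,
                 observation: Sequence[float]) -> float:
    """Use the predicted success probability directly as the reward."""
    p = classifier(observation)
    return min(1.0, max(0.0, p))  # clamp to [0, 1] for safety


def sparse_reward(classifier: SuccessClassifier,
                  observation: Sequence[float],
                  threshold: float = 0.5) -> float:
    """Return 1.0 if the classifier judges the task successful, else 0.0.

    The threshold value is an assumption, not a value from the paper.
    """
    return 1.0 if classifier(observation) >= threshold else 0.0


def toy_classifier(observation: Sequence[float]) -> float:
    """Stand-in for a trained network: the mean of the observation values."""
    return sum(observation) / len(observation)


if __name__ == "__main__":
    obs = [0.9, 0.7]
    print(dense_reward(toy_classifier, obs))
    print(sparse_reward(toy_classifier, obs))
```

In practice `toy_classifier` would be replaced by a trained network; the dense variant gives a graded signal, while the sparse variant only signals completion.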

Model

CNN-based neural architectures for task success classification

Results

Average performance results of our baseline and proposed neural architectures for task success classification applied to the Kitchen and MIME datasets (notation: ACC=Average Classification Accuracy, AUC=Area Under the Curve)
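For readers unfamiliar with the two metrics in the caption, the following minimal sketch shows one common way to compute them for a binary success classifier. It assumes that AUC refers to the area under the ROC curve and that predictions are thresholded at 0.5 for accuracy; both are assumptions, and the labels and scores are made-up values, not results from the paper.

```python
def accuracy(labels, scores, threshold=0.5):
    """Fraction of samples whose thresholded score matches the label."""
    correct = sum((s >= threshold) == bool(y) for y, s in zip(labels, scores))
    return correct / len(labels)


def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen success sample is scored
    higher than a randomly chosen failure sample (ties count as one half).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one sample of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Made-up example: 1 = task succeeded, 0 = task failed.
    labels = [1, 1, 0, 0, 1]
    scores = [0.9, 0.4, 0.35, 0.6, 0.8]
    print(accuracy(labels, scores))  # 3 of 5 correct at threshold 0.5
    print(roc_auc(labels, scores))   # 5 of 6 success/failure pairs ordered correctly
```

An average over tasks, as reported in the table, would then be the mean of these per-task values.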

Paper

Preprint can be accessed on arXiv.

Code

The code is available at github.com/Mohtasib/RewardLearning.

Data

The data is available at github.com/Mohtasib/RewardLearning.

Citation

To be added later!

Contact

For comments/questions, contact Abdalkarim Mohtasib