RPAL Robotic Personal Assistants Laboratory
Socially-Competent Navigation

Despite the great progress in the field of robotic navigation over the past few decades, navigating in a human environment remains a hard task for a robot due to the lack of formal rules guiding traffic, the lack of explicit communication among agents, and the unpredictability of human behavior. Existing navigational approaches often contribute to a chaotic experience because the resulting robot motion is hard to interpret, causing unpredictable human reactions to which the robot in turn reacts. In this project, we are developing models of pedestrian behavior in crowds based on the structure of braid groups from topology, and planning algorithms to help humans and robots come to consensus about how they can smoothly avoid one another. Our framework is inspired by insights from studies on pedestrian behavior and action interpretation and leverages the power of implicit communication to disambiguate complex human behavior.
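As a rough illustration of the braid-group idea (not the lab's actual model), the sketch below encodes how pedestrians swap left-to-right order over time as a word in braid generators. The trajectory format and the front/behind sign convention are assumptions made for this example.

```python
def braid_word(xs, ys):
    """Encode how agents swap left-to-right order over time as a braid word.

    xs, ys: lists of per-timestep position lists, one entry per agent
    (an illustrative format, not the lab's actual representation).
    Returns a list of signed generators: +i / -i means the strands in
    slots i and i+1 crossed, signed by which agent passed in front.
    """
    word = []
    n_agents = len(xs[0])
    # Track the left-to-right ordering of agents at each timestep.
    order = sorted(range(n_agents), key=lambda a: xs[0][a])
    for t in range(1, len(xs)):
        new_order = sorted(range(n_agents), key=lambda a: xs[t][a])
        # Emit one adjacent transposition at a time until orders agree.
        while order != new_order:
            for i in range(n_agents - 1):
                a, b = order[i], order[i + 1]
                if new_order.index(a) > new_order.index(b):
                    # Agents a and b swapped slots i, i+1: emit a generator.
                    sign = 1 if ys[t][a] > ys[t][b] else -1
                    word.append(sign * (i + 1))
                    order[i], order[i + 1] = b, a
                    break
    return word
```

Two pedestrians who cross paths once produce a single generator, while parallel trajectories produce the empty word; distinct braid words then correspond to topologically distinct joint avoidance strategies.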

Visuomotor Robot Control via Natural Language Commands

In this project, we build a modular, interpretable, deep neural network architecture that explicitly addresses language understanding, mapping, planning, and control, allowing a quadcopter to execute natural language navigation instructions by mapping first-person camera images to velocity commands. The model builds an internal map of the environment by projecting deep image features from the first-person view into the map reference frame and accumulating them over time. The model then predicts possible future trajectories over the same map.
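The geometric core of this projection step can be sketched as follows, assuming a pinhole camera and a flat ground plane. The function and parameter names are illustrative; the actual model performs a learned, differentiable projection rather than this hand-written loop.

```python
import numpy as np

def project_to_map(features, K, R, t, grid_size=32, cell=0.5):
    """Scatter per-pixel features onto a top-down map grid (illustrative).

    features: (H, W, C) deep image features from the first-person view
    K: (3, 3) camera intrinsics; R, t: camera-to-world rotation/translation
    Assumes a flat ground plane at z = 0.
    """
    H, W, C = features.shape
    grid = np.zeros((grid_size, grid_size, C))
    counts = np.zeros((grid_size, grid_size, 1))
    Kinv = np.linalg.inv(K)
    for v in range(H):
        for u in range(W):
            # Ray from camera center through pixel (u, v), in world frame.
            ray = R @ (Kinv @ np.array([u, v, 1.0]))
            if ray[2] >= -1e-6:
                continue  # ray never hits the ground plane
            s = -t[2] / ray[2]           # scale so the ray reaches z = 0
            x, y = (t + s * ray)[:2]
            i, j = int(x / cell), int(y / cell)
            if 0 <= i < grid_size and 0 <= j < grid_size:
                grid[i, j] += features[v, u]
                counts[i, j] += 1
    return grid / np.maximum(counts, 1)  # average features per cell
```

Accumulating these per-frame grids as the quadcopter moves yields a persistent feature map in a fixed reference frame, over which trajectories can be predicted.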

See this video for a demonstration of the model.

Mental Model Formation and Updating in Social Robots

People form first impressions of robots just as they do of people, and those impressions carry similarly outsized influence. Human mental models are typically not well calibrated for social robots, which are often programmed to give the impression that they are more competent than they really are. These poorly calibrated mental models initially lead to overestimates of the robot's capabilities (the so-called novelty effect). Over time, disillusionment may set in as the person realizes that their expectations are not being met, prematurely curtailing use of the robot. It is therefore important for a social robot to accurately set expectations of its capabilities. In this project, we study the mechanisms by which people form and update impressions of robots and investigate algorithms by which the robot can help humans calibrate their mental model of it, resulting in more realistic judgments about robot capabilities and consequent fruitful long-term interactions.

  • Minae Kwon, Melissa Ferguson, Thomas Mann, and Ross A. Knepper. "An exploration of implicit attitudes towards robots using implicit measures". In: Workshop on Explainable Robotic Systems. Proceedings of the ACM/IEEE International Conference on Human-Robot Interaction (HRI). Chicago, USA, March 2018.
  • Minae Kwon, Melissa Ferguson, Thomas Mann, and Ross A. Knepper. "Forming and updating implicit impressions of robot competence". In: Workshop on Morality and Social Trust in Autonomous Robots. Proceedings of the Robotics: Science and Systems Conference (RSS). Cambridge, USA, July 2017.
  • Minae Kwon, Malte F. Jung, and Ross A. Knepper. "Human expectations of social robots". In: Late-Breaking Report at the ACM/IEEE International Conference on Human-Robot Interaction (HRI). Christchurch, New Zealand, March 2016.
  • Minae Kwon, Malte F. Jung, and Ross A. Knepper. "Human expectations of social robots". In: Workshop on Challenges and Best Practices to Study HRI in Natural Interaction Settings. Proceedings of the ACM/IEEE International Conference on Human-Robot Interaction (HRI). Christchurch, New Zealand, March 2016.

Reasoning about Implicit Communication

In a joint activity, team members act together by coordinating their knowledge, goals, and intentions. A great deal of information exchanged by teammates in service of coordinating the joint activity occurs implicitly, through the choice of functional action in context. Such implicit communication is a natural capability in human teams that robots must acquire if they are to function fluidly with people. Since the functional actions are required by the joint activity, it is a matter of efficiency to piggyback extra information over top of them. In this project, we are developing a framework that robots can use to both understand and generate implicit communication with human teammates.

  • Claire Liang, Julia Proft, and Ross A. Knepper. "Implicature-based inference for socially-fluent robotic teammates". In: Workshop on Mathematical Models, Algorithms, and Human-Robot Interaction. Proceedings of the Robotics: Science and Systems Conference (RSS). Cambridge, USA, July 2017.
  • Ross A. Knepper, Christoforos I. Mavrogiannis, Julia Proft, and Claire Liang. "Implicit communication in a joint action". In: Proceedings of the ACM/IEEE International Conference on Human-Robot Interaction (HRI). Best Paper Finalist. Vienna, Austria, March 2017.
  • Ross A. Knepper. "On the communicative aspect of human-robot joint action". In: Workshop Towards a Framework for Joint Action — What about Common Ground? Proceedings of the IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN). New York, USA, August 2016.

Recognizing Unfamiliar Gestures

Human communication is highly multimodal, including speech, gesture, gaze, facial expressions, and body language. Robots serving as human teammates must act on such multimodal communicative inputs from humans. Although individual modalities are frequently ambiguous, multiple modalities serve as context for one another, so understanding them jointly is often easier and more complete than attempting to understand them individually and reconcile their meanings afterwards. In this project, we explore a method for understanding complex, situated communications by leveraging coordinated natural language, gesture, and context.


As part of this project, we collected a dataset of situated gesture and speech. The data were collected by recording participants in an experiment designed to elicit a high volume of coincident gesture and speech. Participants were given a set of instructions for folding a moderately complex piece of origami and told that the instructions had been generated by a machine learning model. They were then asked to use the instructions to teach the study conductor how to fold the piece of origami without ever showing the instructions to the study conductor.

To avoid inducing a bias toward unnatural gesture use, participants were never told to use gesture. Instead, the study conductor told participants that any mode of communication except for showing the instructions or looking at what had been folded thus far was acceptable. However, due to the construction of the origami instructions, participants found it very difficult to complete the exercise (and convey the instructions to the study conductor) using speech alone. Thus, participants resorted to simultaneous speech and gesture to describe the geometry of the origami and the folding actions which it required.

The dataset is provided in raw format, unprocessed except to remove noise. It comprises 27 trials of roughly 20 minutes each of recorded data. NiTE skeleton data and audio are provided for each trial. The audio for a given trial is stored either as a .flac or a .wav file, and the skeleton data is saved as a pickled Python object in the format returned by the NiTE framework. You can use the nite-skeleton-visualizer to create visualizations from the skeleton data or as an example of how to work with the skeleton data.
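A minimal loading sketch is shown below. The per-frame record structure here is hypothetical, invented for illustration; the actual pickled layout is whatever the NiTE framework returned at capture time, so consult the nite-skeleton-visualizer for the true format.

```python
import os
import pickle
import tempfile

# Hypothetical per-frame skeleton record (field names are illustrative,
# not the actual NiTE layout).
frame = {
    "timestamp": 0.033,
    "joints": {
        "head": {"pos": (0.0, 0.4, 2.1), "confidence": 0.9},
        "left_hand": {"pos": (-0.3, 0.0, 1.9), "confidence": 0.7},
    },
}

def load_trial(path):
    """Load one trial's pickled skeleton data."""
    with open(path, "rb") as f:
        return pickle.load(f)

# Round-trip demonstration with the synthetic record above.
path = os.path.join(tempfile.mkdtemp(), "trial01.pkl")
with open(path, "wb") as f:
    pickle.dump([frame], f)

frames = load_trial(path)
```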

If you use this dataset, please cite our ISER paper. You can download the dataset using the link below.

Automated Furniture Assembly

Broadly defined, the goal of this project is to autonomously assemble IKEA furniture using a team of robots. In the long term, the vision is for the robots to open the box, remove the furniture, and follow the directions or otherwise plan a series of steps resulting in the complete assembly of the furniture kit. Major robotics challenges include knowledge representation and understanding, planning with physical and geometric constraints, understanding function from form, dexterous control of parts during attachment operations, use of tools, multi-robot coordination, and interaction with humans. The current implementation of the system is IkeaBot.
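One small slice of the planning problem can be sketched as ordering assembly steps under precedence constraints, here solved with a topological sort. The step names and constraints are invented for illustration; IkeaBot's actual planner must also reason about geometry, physics, and robot coordination.

```python
from graphlib import TopologicalSorter

# Hypothetical precedence constraints for a small kit: each step maps to
# the set of steps that must physically precede it (e.g. all four legs
# must be attached before the table can be flipped upright).
precedes = {
    "attach_leg_1": set(),
    "attach_leg_2": set(),
    "attach_leg_3": set(),
    "attach_leg_4": set(),
    "flip_table": {"attach_leg_1", "attach_leg_2",
                   "attach_leg_3", "attach_leg_4"},
    "insert_shelf_pegs": {"flip_table"},
}

# Any valid assembly order respects every precedence constraint.
order = list(TopologicalSorter(precedes).static_order())
```

In the multi-robot setting, steps with no precedence relation between them (such as the four leg attachments) are candidates for parallel execution by different robots.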

Although IKEA furniture is most often assembled in a home setting, many of these skills apply equally well to problems in manufacturing automation. We are therefore exploring applications of these technologies in factory settings as well.