IEEE TRANSACTIONS ON AUTONOMOUS MENTAL DEVELOPMENT, VOL. XX, NO. XX, XXXXX 2014

Behavior Understanding and Developmental Robotics

Albert Ali Salah¹, Member, IEEE, Pierre-Yves Oudeyer², Member, IEEE, Çetin Meriçli³, Member, IEEE, and Javier Ruiz-del-Solar⁴, Senior Member, IEEE

¹Department of Computer Engineering, Boğaziçi University, Istanbul, 34342 Turkey
²Inria and Ensta ParisTech, France
³National Robotics Engineering Center, Robotics Institute, Carnegie Mellon University, United States
⁴Department of Electrical Engineering & Advanced Mining Technology Center, Universidad de Chile, Av. Tupper 2007, 837-0451 Santiago, Chile

The scientific, technological and application challenges that arise from the mutual interaction of developmental robotics and computational human behavior understanding give rise to two different perspectives. Robots need to be capable of learning, dynamically and incrementally, how to interpret and thus understand multimodal human behavior; in this sense, behavior analysis is performed for developmental robotics. On the other hand, behavior analysis can also be performed through developmental robotics, since developmental social robots offer stimulating opportunities for improving the scientific understanding of human behavior, and especially for a deeper analysis of the semantics and structure of human behavior. The contributions to this Special Issue explore these two perspectives.

Index Terms—Human behavior understanding, developmental learning, affective computing, attention, learning by demonstration, nonverbal communication, activity recognition.

I. THE SCOPE OF THE SPECIAL ISSUE
In order to act in a useful, relevant, and socially acceptable manner, robots will need to understand the behavior of humans at various levels of abstraction, at various time scales, and in the particular context of human-robot interactions. Robots need to be capable of learning, dynamically and incrementally, how to interpret and thus understand multimodal human behavior. This includes, for example, learning the meaning of new linguistic constructs used by a human, learning to interpret the emotional state of particular users from paralinguistic or non-verbal behavior, characterizing properties of the interaction, and learning to guess the intention, and potentially the structure of goals, of a human based on their overt behavior.

Furthermore, robots need in particular to be capable of learning new tasks through interaction with humans, for example using imitation learning or learning by demonstration. This heavily involves the capacity for learning how to decode teaching behavior, including linguistic and non-linguistic cues, feedback and guidance provided by humans, as well as inferring reusable primitives in human behavior.

While some of the existing techniques of multimodal behavior analysis and modeling can be readily re-used for robots, novel scientific and technological challenges arise when one aims to achieve human behavior understanding in the context of natural and life-long human-robot interaction. The first purpose of this special issue is to explore these challenges.

Our second purpose is to understand how behavior analysis can be achieved through developmental robotics. Developmental social robots can offer stimulating opportunities for improving the scientific understanding of human behavior, and especially allow a deeper analysis of the semantics and structure of human behavior. Humans tend to interpret the meaning and the structure of others' behaviors in terms of their own action repertoire, which acts as a strong helping prior for this complex inference problem. Since robots are also embodied and have an action repertoire, they can be used as an experimental and theoretical tool to investigate human behavior, and in particular, the development and change of behavior over time.

II. CONTRIBUTIONS TO THE SPECIAL ISSUE

The special issue incorporates six papers, two of which extend work presented in the Third International Workshop on Human Behavior Understanding [1]. We briefly summarize their highlights here.

Nonverbal signals play a very prominent role in human-human communication [2], [3], especially in coordinating joint actions and in the dance-like precision of the timing of interactions. But what happens if one of the communicating humans is replaced by a robot? Alessandra Sciutti, Laura Patanè, Francesco Nori and Giulio Sandini, in their paper entitled "Understanding object weight from human and humanoid lifting actions," make a humanoid robot produce an informative set of nonverbal signals to communicate the weight of an object to a human partner in an implicit manner. They show that it is not enough for the robot to choose the optimal action (here, correctly lifting a weight); it should also perform the action in a way that allows the humans to modify their own coordinated actions. Consequently, a simple modification in robot action planning can have a significant impact on the efficiency of the human-robot interaction.

The developmental trajectory of motor actions in humans contains a number of skills that are acquired in parallel, including gaze control, body orientation, and reaching and grasping behaviors. In "From Saccades to Grasping: A Model of Coordinated Reaching through Simulated Development on a Humanoid Robot," James Law, Patricia Shaw, Mark Lee, and Michael Sheldon implement these skills on an iCub robot, and show that by mimicking a child's learning trajectory, it is possible to rapidly develop hand/eye coordination with complex kinematics. An interesting observation made by the authors is that motor babbling need not be random, but can relate learned actions to new exploration patterns, in a manner not unlike free play, which balances goal-oriented exploration and social guidance [4].

Social interactions are important sources of learning for children, but also form the complex backdrop of adult behavior. Recent work in computer analysis of human behavior focuses on complex and contextualized social interactions, as opposed to scenarios of a single person performing a single activity [5], [6]. One of the basic requirements of social interaction is the establishment of joint attention between the interacting parties. For natural human-robot interaction, real-time approaches that implement attentional mechanisms in robots are essential. João Filipe Ferreira and Jorge Dias provide an in-depth assessment of this field in their paper "Attentional Mechanisms for Socially Interactive Robots - A Survey". They point out that robots also offer the possibility to study the underlying processes of attention in a detailed fashion and in complex settings, thereby giving cognitive scientists powerful computational and physical platforms on which to test their theories of attention. Research on attention thus demonstrates perfectly the two perspectives of behavior analysis for and through developmental robotics. The paper also shows that attention is primarily tackled in the visual domain in this field; there is ample room for novel approaches based on audio and multimodal information.

It is well known that the human brain performs cross-modal and multimodal sensory integration from very basic, neuronal levels onwards [7]. In their paper "The MEI Robot: Towards Using Motherese to Develop Multimodal Emotional Intelligence," Angelica Lim and Hiroshi G. Okuno implement a developmental robot called MEI that can generalize emotional analysis across modalities, and correctly guess the emotional category of human gait after being trained only on the voice modality. To achieve this, a number of perceptual meta-features are derived and matched from each modality: speed, intensity, irregularity and extent. Lim and Okuno describe the semantic concepts covered by these four features in different domains such as voice, gesture and music, and postulate that during development, a joint perceptual space would be a simple and intuitive explanation for learning to represent emotional expressions across all domains simultaneously.

Flexible and adaptive representations are essential for learning human behavior. Alexandros Andre Chaaraoui and Francisco Flórez-Revuelta propose to adapt a recent computer vision approach to action recognition in their paper "Adaptive Human Action Recognition with an Evolving Bag of Key Poses" by enhancing it with dynamic model updates and evolutionary parameter optimization. Continuous change in the model parameters helps the system adapt easily to new action classes or new interaction partners.

In humans, as opposed to typical computer-based systems, recognition of actions is tied to reproduction of actions, so that improvements in one are reflected in the other to some degree. In their paper "Humanoid Tactile Gesture Production using a Hierarchical SOM-based Encoding," Georgios Pierris and Torbjørn S. Dahl describe a perception-action system in which a robot learns a hierarchical self-organizing map (SOM) based representation of a demonstrated action, and reproduces the action while compensating for perturbations. This approach builds on Cohen's Constructivist Learning Architecture (CLA), a model of learning that reproduces several effects from theories of infant cognitive development [8].

Taken together, these contributions exemplify schemes inspired by human development and behavior that improve robot behavior, as well as robotic systems on which cognitive theories and their predictions can be tested.

ACKNOWLEDGMENT

Albert Ali Salah was partially funded by Boğaziçi University project BAP 6531. Pierre-Yves Oudeyer's contribution was partially funded by ERC Grant EXPLORERS 240007.

REFERENCES

[1] A. A. Salah, J. Ruiz-del-Solar, Ç. Meriçli, and P.-Y. Oudeyer, Human Behavior Understanding: Third International Workshop, HBU 2012, Vilamoura, Portugal, October 7, 2012. Proceedings. Springer, 2012.
[2] A. Mehrabian, Nonverbal Communication. Transaction Publishers, 1977.
[3] M. Knapp, J. Hall, and T. Horgan, Nonverbal Communication in Human Interaction. Cengage Learning, 2013.
[4] S. M. Nguyen and P.-Y. Oudeyer, "Active choice of teachers, learning strategies and goals for a socially guided intrinsic motivation learner," Paladyn, vol. 3, no. 3, pp. 136–146, 2013.
[5] A. A. Salah, T. Gevers, N. Sebe, and A. Vinciarelli, "Challenges of human behavior understanding," in Human Behavior Understanding. Springer, 2010, pp. 1–12.
[6] A. A. Salah, J. Ruiz-del-Solar, Ç. Meriçli, and P.-Y. Oudeyer, "Human behavior understanding for robotics," in Human Behavior Understanding. Springer, 2012, pp. 1–16.
[7] B. E. Stein and M. A. Meredith, The Merging of the Senses. The MIT Press, 1993.
[8] L. B. Cohen, H. H. Chaput, and C. H. Cashon, "A constructivist model of infant cognition," Cognitive Development, vol. 17, no. 3, pp. 1323–1343, 2002.

Manuscript received XXXX; revised XXXX. Corresponding author: A.A. Salah (email: [email protected]).

Albert Ali Salah (M'06) received the Ph.D. degree from the Computer Engineering Department, Boğaziçi University, Istanbul, Turkey. Between 2007 and 2011, he worked at the CWI Institute, Amsterdam, and at the Informatics Institute of the University of Amsterdam. He is currently an Assistant Professor at the Boğaziçi University Computer Engineering Department and the Chair of the Cognitive Science program. His research interests include biologically inspired models of learning and vision, pattern recognition, biometrics, multimodal interaction, and human behavior understanding. He has more than 100 publications in related areas, including an edited book on computer analysis of human behavior. Dr. Salah received the inaugural EBF European Biometrics Research Award in 2006 for his work on facial feature localization. He initiated the International Workshop on Human Behavior Understanding (HBU) in 2010 and has served as its co-chair since then. He is a member of the IEEE, the ACM, the IEEE AMD Technical Committee, the IEEE Biometrics Council, the eNTERFACE Steering Committee, and the RoboCup Turkish National Committee. He is an associate editor of JAISE and IEEE TAMD. Web: http://www.cmpe.boun.edu.tr/~salah.

Pierre-Yves Oudeyer is a Research Director at Inria and head of the Inria and Ensta-ParisTech FLOWERS team (France). Before that, he was a permanent researcher at Sony Computer Science Laboratory for eight years (1999–2007). After working on computational models of language evolution, he is now working on developmental and social robotics, focusing on sensorimotor development, language acquisition and life-long learning in robots, and their application in educational technologies.
Strongly inspired by infant development, the mechanisms he studies include artificial curiosity, intrinsic motivation, the role of morphology in learning motor control, human-robot interfaces, joint attention and joint intentional understanding, and imitation learning. He has published two books and more than 100 papers in international journals and conferences, holds 8 patents, has given several invited keynote lectures at international conferences, and has received several prizes for his work in developmental robotics and on the origins of language. He is a laureate of the ERC Starting Grant EXPLORERS. He is editor of the IEEE CIS Newsletter on Autonomous Mental Development, and associate editor of IEEE TAMD, Frontiers in Neurorobotics, and the International Journal of Social Robotics. Web: http://www.pyoudeyer.com.

Çetin Meriçli (M'04) is a Senior Robotics Engineer at the National Robotics Engineering Center (NREC) of the Robotics Institute, Carnegie Mellon University. He was a post-doctoral fellow at the Computer Science Department, Carnegie Mellon University, prior to joining NREC. He received his Ph.D. in Computer Science from Boğaziçi University, Turkey. His research interests include robot learning from human demonstration and feedback, interactive learning, sliding autonomy through learning, long-term autonomy and lifelong learning, developmental and cognitive robotics, human-robot interaction, robot vision, data-driven high-fidelity robot simulation, probabilistic robotics, robot soccer, multi-robot coordination and planning, and software engineering practices for robot software development. He was the general co-chair of RoboCup 2011, chair of the AAAI 2012 Fall Symposium on Robots Learning Interactively from Human Teachers (RLIHT), co-chair of the 3rd Workshop on Realizing Artificial Intelligence Synergies in Software Engineering (RAISE'14), and co-chair of the Workshop on Humanoid Robots Learning from Human Interaction at IEEE-RAS Humanoids 2010. Dr.
Meriçli is a founding member of the RoboCup Turkey National Committee. He is a member of AAAI, the IEEE, the IEEE Robotics and Automation Society, and the IEEE Computational Intelligence Society. He is an editor of the Journal of Unmanned Systems Technology. Web: http://cetin.mericli.com.

Javier Ruiz-del-Solar (SM'04) received the Diploma degree in electrical engineering and the M.S. degree in electronic engineering from the Technical University Federico Santa María, Valparaíso, Chile, in 1991 and 1992, respectively, and the Doctor Engineer degree from the Technical University of Berlin, Berlin, Germany, in 1997. In 1998, he joined the Department of Electrical Engineering, Universidad de Chile, Santiago, Chile, as an Assistant Professor. In 2005, he became an Associate Professor, and in 2010, Professor. Since 2009, he has been Executive Director of the Advanced Mining Technology Center, Universidad de Chile. His research interests include mobile robotics, computer and robot vision, and the development of automation technologies for mining applications. Dr. Ruiz-del-Solar received the 2004 RoboCup Engineering Challenge Award, the 2003 IEEE RAB Achievement Award, and the RoboCup@Home Innovation Award in 2007 and 2008.