Incremental Acquisition and Reuse of Multimodal Affective Behaviors in a Conversational Agent

To feel novel and engaging over time, an autonomous agent needs a large corpus of potential responses. As the corpus grows in size and spans more domains, however, traditional hand-authoring of dialogue content becomes impractical. Crowdsourcing can help overcome the problem of scale, but a diverse set of authors contributing independently to an agent's language can also introduce inconsistencies in expressed behavior. In terms of affect or mood, for example, incremental authoring can result in an agent that reacts calmly one moment but impatiently the next, with no clear reason for the transition. In contrast, affect in natural conversation develops over time based on both the agent's personality and contextual triggers. To better achieve this dynamic, an autonomous agent needs to (a) have content and behavior available for different desired affective states and (b) be able to predict what affective state a person will perceive for a given behavior. In this proof-of-concept paper, we explore a way to elicit and evaluate affective behavior using crowdsourcing. We show that untrained crowd workers are able to author content for a broad variety of target affect states when given semi-situated narratives as prompts. We also demonstrate that it is possible to strategically combine multimodal affective behavior and voice content from the authored pieces using a predictive model of how the expressed behavior will be perceived.
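The selection step described above can be sketched in miniature: score each authored piece with a model of how its behavior will be perceived, then pick the piece closest to the target affect. The sketch below is an illustrative assumption, not the paper's actual system; the feature names, weights, and pieces are all hypothetical, and the hand-set linear "perception model" stands in for one that would be fit to crowd ratings on the valence/arousal plane of Russell's circumplex model.

```python
import math

# Hypothetical authored pieces: each pairs a voice line with nonverbal
# behavior, described by simple illustrative features in [0, 1].
pieces = [
    {"id": "calm-greeting", "pitch": 0.2, "tempo": 0.3, "gesture_energy": 0.1},
    {"id": "excited-cheer", "pitch": 0.9, "tempo": 0.8, "gesture_energy": 0.9},
    {"id": "weary-sigh",    "pitch": 0.1, "tempo": 0.2, "gesture_energy": 0.2},
]

def predict_affect(piece):
    """Stand-in perception model: a linear map from behavior features to
    a (valence, arousal) point.  In practice these weights would be
    learned from crowd workers' ratings of perceived affect."""
    valence = 0.6 * piece["pitch"] - 0.2 * piece["gesture_energy"] + 0.1
    arousal = 0.5 * piece["tempo"] + 0.5 * piece["gesture_energy"]
    return valence, arousal

def select_piece(target, candidates):
    """Pick the candidate whose predicted perceived affect lies closest
    to the target (valence, arousal) point."""
    def distance(piece):
        v, a = predict_affect(piece)
        return math.hypot(v - target[0], a - target[1])
    return min(candidates, key=distance)

best = select_piece(target=(0.2, 0.2), candidates=pieces)
print(best["id"])  # → calm-greeting
```

With a model fit to perception data rather than hand-set weights, the same nearest-to-target selection lets the agent keep its expressed affect consistent with a desired trajectory even when the underlying content comes from many independent authors.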
