Tutorials

The call for tutorial proposals is now closed, but it remains available for reference here.

Modeling Expressive and Communicative Agents

Catherine Pelachaud, Stacy Marsella and Yu Ding

Abstract: Nonverbal behaviors play a crucial role in communication. They convey a large amount of information linked to communicative intentions, but also to other mental states such as emotions and attitudes. When interacting with humans, Embodied Conversational Agents (ECAs) should be endowed with such a variety of meaningful and expressive nonverbal behaviors.

In this tutorial, we will present computational methods for incorporating expressive behaviors into conversational agents. We will start with an overview of how mental states are expressed multimodally, focusing on communicative acts as well as emotions, social attitudes and metaphors, and on how facial expression, gaze and gesture convey this information. We will then explore a variety of machine learning and automated methods to model which nonverbal behaviors to perform and how to realize them in an embodied agent.

Catherine Pelachaud is a Director of Research at CNRS in the laboratory ISIR, Université Pierre et Marie Curie. Her research interests include embodied conversational agents, nonverbal communication (face, gaze, and gesture), expressive behaviors and socio-emotional agents. She is an associate editor of several journals, among them IEEE Transactions on Affective Computing, ACM Transactions on Interactive Intelligent Systems and the Journal on Multimodal User Interfaces. She has co-edited several books on virtual agents and emotion-oriented systems.

Stacy Marsella is a professor in Computer Science and Psychology at Northeastern University. His research interests include the computational modeling of human cognitive, emotional and social behavior, as well as the use of these models in applications ranging from virtual humans to social simulations.

Yu Ding is a postdoctoral researcher at the University of Houston. He received his Ph.D. degree (2014) in Computer Science from Telecom ParisTech (France), his M.S. degree in Computer Science from Université Pierre et Marie Curie (France), and his B.S. degree in Automation from Xiamen University (China). His research interests include nonverbal behavior synthesis and machine learning methods for multimodal processing. He is a member of the ACM.


Deep Learning for Affective Computing

Björn W. Schuller

Abstract: The tutorial introduces attendees to the principles of Deep Learning for Affective Computing and its application to signal-based and symbolic representations of input such as audio, video, physiological information or text. It will focus on the analysis side, but will also show examples of generating affective output such as text or music. It will include the presenter's recently introduced end-to-end learning approaches, which operate in a fully multimodal manner and allow an affect recognizer to be trained directly from raw material without the need for feature representations. This is an extremely powerful tool, as arbitrary signals and symbolic representations can be used. The principles of Convolutional Neural Networks (CNNs) and (deep) Recurrent Neural Networks (RNNs) with Long Short-Term Memory (LSTM) or Gated Recurrent Units (GRUs), as well as more recent efficient combinations of the two such as Convolutional LSTM models (CLSTMs), will be introduced step by step. For processing textual or other symbolic representations, word embeddings such as those produced by word2vec, and LSTM language models, will be introduced.
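To make the end-to-end idea concrete, the following is a minimal sketch of a convolutional-recurrent (CLSTM-style) affect recognizer that maps a raw audio waveform to continuous affect scores: strided 1-D convolutions learn the feature extraction directly from the signal, and an LSTM models temporal context. It is an illustrative sketch, not the presenter's actual architecture; the class name, layer sizes and the two-dimensional arousal/valence output are all assumptions.

import torch
import torch.nn as nn

class EndToEndAffectNet(nn.Module):
    """Hypothetical CLSTM-style end-to-end affect recognizer over raw audio."""

    def __init__(self, n_targets=2):  # e.g. arousal and valence
        super().__init__()
        # Strided 1-D convolutions over the raw waveform replace
        # hand-crafted acoustic features.
        self.conv = nn.Sequential(
            nn.Conv1d(1, 40, kernel_size=80, stride=4), nn.ReLU(),
            nn.MaxPool1d(10),
            nn.Conv1d(40, 40, kernel_size=40, stride=2), nn.ReLU(),
            nn.MaxPool1d(5),
        )
        # The recurrent layer captures long-range temporal dynamics.
        self.lstm = nn.LSTM(input_size=40, hidden_size=64,
                            num_layers=2, batch_first=True)
        self.head = nn.Linear(64, n_targets)

    def forward(self, wave):                  # wave: (batch, samples)
        x = self.conv(wave.unsqueeze(1))      # -> (batch, channels, time)
        x = x.transpose(1, 2)                 # -> (batch, time, channels)
        out, _ = self.lstm(x)
        return self.head(out[:, -1])          # last time step -> affect scores

# Toy usage: a batch of one-second clips at 16 kHz.
model = EndToEndAffectNet()
pred = model(torch.randn(8, 16000))           # -> (8, 2) arousal/valence

Reading out only the last LSTM time step keeps the sketch short; per-frame prediction or sequence-level pooling are common alternatives for continuous affect tracking.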

Freely available open-source tools from the presenter's groups, such as the GPU LSTM-RNN toolkit CURRENNT (https://sourceforge.net/projects/currennt/) or the TensorFlow scripts (https://github.com/trigeorgis/ComParE2017) for end-to-end learning, will form the basis of demonstration and application. Of particular interest to the Affective Computing community will be alignment algorithms such as Deep Canonical Time Warping, as well as latent variable modelling by Deep Semi-NMF.

The tutorial will also touch upon the crucial issue of data scarcity, given the ever-growing "data hunger" of deep learning methods. This will include recent Transfer Learning methods based on autoencoders, such as the presenter's team's Universum Autoencoder, and other alternatives. It will also touch upon the use of networks pre-trained on large data sets, such as in the image domain, and the handling of other modalities with these nets.
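As a concrete illustration of the autoencoder-based transfer idea, the sketch below pre-trains a generic denoising autoencoder on plentiful unlabelled data and then reuses the encoder as the front end of a small affect classifier. This is a plain denoising autoencoder, not the Universum Autoencoder itself; all dimensions, the noise level and the four-class output are illustrative assumptions.

import torch
import torch.nn as nn

FEAT_DIM, CODE_DIM = 128, 32   # e.g. frame-level acoustic descriptors

encoder = nn.Sequential(nn.Linear(FEAT_DIM, 64), nn.ReLU(),
                        nn.Linear(64, CODE_DIM))
decoder = nn.Sequential(nn.Linear(CODE_DIM, 64), nn.ReLU(),
                        nn.Linear(64, FEAT_DIM))

# --- Stage 1: unsupervised pre-training on unlabelled data ---
opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()))
unlabelled = torch.randn(1024, FEAT_DIM)       # stand-in for real unlabelled data
for _ in range(50):
    noisy = unlabelled + 0.1 * torch.randn_like(unlabelled)   # denoising objective
    loss = nn.functional.mse_loss(decoder(encoder(noisy)), unlabelled)
    opt.zero_grad(); loss.backward(); opt.step()

# --- Stage 2: transfer the encoder to a small labelled affect task ---
classifier = nn.Sequential(encoder, nn.Linear(CODE_DIM, 4))   # e.g. 4 emotion classes
# Optionally freeze the transferred encoder when labels are very scarce:
for p in encoder.parameters():
    p.requires_grad = False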

Björn W. Schuller received his diploma, doctoral degree, habilitation, and Adjunct Teaching Professorship, all in electrical engineering and information technology, from TUM in Munich/Germany. At present, he is Full Professor and head of the Chair of Complex and Intelligent Systems at the University of Passau/Germany and Reader (Associate Professor) in Machine Learning at Imperial College London/UK. Further, he is the co-founding CEO of audEERING, a company focussed on Affective Computing via intelligent audio engineering in the real world. He is also a permanent Visiting Professor at the Harbin Institute of Technology/P.R. China, among further associateships. Previous major stations include Joanneum Research in Graz/Austria and CNRS-LIMSI in Orsay/France. Dr. Schuller is an elected member of the IEEE Speech and Language Processing Technical Committee, a Senior Member of the IEEE, and was President of the Association for the Advancement of Affective Computing. He has (co-)authored 5 books and more than 600 publications (14k citations, h-index = 56). He is the Editor-in-Chief of the IEEE Transactions on Affective Computing, an Associate Editor of Computer Speech and Language, IEEE Signal Processing Letters, IEEE Transactions on Cybernetics, and IEEE Transactions on Neural Networks and Learning Systems, a General Chair of ACII 2019 and ACM ICMI 2014, and a Program Chair of Interspeech 2019, ACII 2015 and 2011, ACM ICMI 2013, and IEEE SocialCom 2012. He has won a range of awards, including being honoured as one of 40 extraordinary scientists under the age of 40 by the World Economic Forum in 2015 and 2016. His research has garnered over 10 million EUR in extramural funding for projects centred on Affective Computing and Intelligent Interaction: he has served as Coordinator or PI in more than 10 European projects, and is a consultant for companies such as Huawei and Samsung. Dr. Schuller is the initiator of the first annual research competitions on speech emotion recognition (Interspeech ComParE, since 2009) and multimodal emotion recognition (AVEC, since 2011, and MEC, since 2016). His teams were the first to use deep learning techniques in Affective Computing, as early as 2006, and also presented the first end-to-end speech emotion recogniser. They provide a range of broadly used toolkits, such as openSMILE (feature extraction), CURRENNT (deep learning), and iHEARu-PLAY (intelligent crowd-sourcing), as well as databases, to the community. Dr. Schuller has given more than 10 tutorials at major international conferences (e.g., ACM Multimedia, IJCAI, ICASSP, Interspeech).