Tuesday, November 21, 2017

Oh, patents! Humanoid voices (1)

Copyright © Françoise Herrmann

A humanoid robot without a voice would really have missed the point of emulating humans!

So each of the Softbank Robotics humanoid robots – Nao, Pepper and Romeo – is not only equipped with a voice (i.e., a speech synthesizer and speech recognition with a natural language interface); they also have actuators generating coordinated body language that animates interactions to make them more mimetic -- plus much more in terms of personalized and synergistic interactive capacity.

The many Softbank Robotics R&D partnerships in human/machine interaction, at major academic robotics research labs in France and Europe, both contribute to the humanoids' development and use the robotics platforms to further their own research agendas.

One such project, led by Devillers (2017) at LIMSI (a lab affiliated with France's National Center for Scientific Research, CNRS), seeks to model not only the verbal components of human/machine interaction, but also the non-verbal or paralinguistic aspects of interaction. The assumption is that the quality of human/machine interactions might increase when more aspects of the interactions are modeled and detected. It is not only what is said that matters, but how it is said, for example with intonation or facial expression. Detecting that an interlocutor might be annoyed, angry, perplexed or unsure would greatly enhance the quality of the interactions.

Ultimately, the goal is for the robot companion to please, so that more satisfying human/machine interactions might arise, perhaps creating conditions for a warm relationship to take root, even if it is going to be a very deceptive one. Thus, the project also includes an ethical component, designed to limit the ways in which such potentially deceptive interactions play out with vulnerable populations, such as children, the elderly and handicapped. 

Finally, the project seeks to define machine humor, also for the purposes of making the machine more likable. If the machine is bound to make mistakes, then the machine’s own error detection might be transformed into humor. Otherwise, humor might also be modeled as a form of response to the detection of certain emotional states. 
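To make the idea concrete, here is a toy sketch of one of the mechanisms described above: turning the machine's own error detection into self-deprecating humor. All names and the confidence heuristic are invented for illustration; the Devillers project's actual models are not public.

```python
# Hypothetical sketch: humor triggered by the robot's own error
# detection. The confidence function is a stand-in for a real
# speech recognizer's score.

def recognition_confidence(utterance: str) -> float:
    """Stand-in scorer: pretend very short utterances are garbled."""
    return 0.3 if len(utterance.split()) < 2 else 0.9

def respond(utterance: str) -> str:
    if recognition_confidence(utterance) < 0.5:
        # Transform the recognition failure into humor instead of
        # a bare "please repeat" request.
        return "My ears are only microphones - could you say that again?"
    return f"You said: {utterance}"

print(respond("blargh"))             # low confidence -> humorous recovery
print(respond("hello there robot"))  # high confidence -> normal reply
```

The same dispatch pattern could route on a detected emotional state instead of a recognition score, which is the project's other proposed trigger for humor.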

The detection and modeling of non-verbal input in human/machine interaction, for the purposes of enhancing human/machine interactions, is a patented Softbank Robotics invention. US2017148434, titled "Method of performing multi-modal dialogue between a humanoid robot and user, computer program product and humanoid robot for implementing said method," discloses the acquisition of input from at least a sound sensor and a motion or image sensor, for the purposes of interpreting such linguistic and paralinguistic aspects of human/machine interaction as utterances, intonation, gestures, facial expressions and body posture. In turn, the interpretation of multi-modal input is designed to invoke a humanoid response that also includes both linguistic and paralinguistic features, such as an utterance, intonation, gestures, facial expression, and body posture!

Thus, the humanoid response is also animated. Patent figure 5b shows the syntactic analysis of the utterance "I agree with you," for the purpose of determining the insertion point(s) of the mechanical actuation that will animate the robot's response.
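A toy sketch of what figure 5b depicts: segmenting an utterance and marking candidate points where an actuator command could be synchronized with speech. The chunking rule and gesture names here are invented for illustration; the patent relies on a real syntactic analysis, not a word lexicon.

```python
# Illustrative only: mark gesture insertion points in an utterance.
# A nod is anchored after the main verb, and a closing gesture is
# appended at the end of the sentence.

def annotate_gestures(utterance: str) -> str:
    words = utterance.rstrip(".").split()
    verbs = {"agree", "like", "think"}   # stand-in verb lexicon
    out = []
    for w in words:
        out.append(w)
        if w.lower() in verbs:
            out.append("<nod>")          # gesture synchronized with the verb
    out.append("<open_palms>")           # closing gesture
    return " ".join(out)

print(annotate_gestures("I agree with you."))
# -> I agree <nod> with you <open_palms>
```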

The abstract of this patent is included below:
A method of performing dialogue between a humanoid robot and user comprises: i) acquiring input signals from respective sensors, at least one being a sound sensor and another being a motion or image sensor; ii) interpreting the signals to recognize events generated by the user, including: the utterance of a word or sentence, an intonation of voice, a gesture, a body posture, a facial expression; iii) determining a response of the humanoid robot, comprising an event such as: the utterance of a word or sentence, an intonation of voice, a gesture, a body posture, a facial expression; iv) generating an event by the humanoid robot; wherein step iii) comprises determining the response from events jointly generated by the user and recognized at step ii), of which at least one is not words uttered by the user. A computer program product and humanoid robot for carrying out the method is provided. [Abstract US2017148434]
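The four claimed steps can be pictured as a dialogue loop. The sketch below simulates one turn; the event types, sensor readings, and response policy are all placeholders invented for illustration (the patent covers the method, not any particular code).

```python
# Hypothetical sketch of steps i-iv from the abstract. Steps i
# (sensor acquisition) and iv (actuation) are hardware-bound, so
# only steps ii and iii are simulated here.

from dataclasses import dataclass

@dataclass
class Event:
    kind: str    # "utterance", "intonation", "gesture", "posture", "expression"
    value: str

def interpret(signals: dict) -> list:
    """Step ii: turn raw sensor signals into recognized events."""
    events = []
    if "audio" in signals:
        events.append(Event("utterance", signals["audio"]))
    if "image" in signals:
        events.append(Event("expression", signals["image"]))
    return events

def determine_response(events: list) -> list:
    """Step iii: choose a multi-modal response from jointly generated
    events, at least one of which is not the user's words."""
    face = next((e.value for e in events if e.kind == "expression"), "neutral")
    reply = "Glad to hear it!" if face == "smiling" else "Tell me more."
    # The response itself is multi-modal: an utterance plus a gesture.
    return [Event("utterance", reply), Event("gesture", "nod")]

signals = {"audio": "I had a great day", "image": "smiling"}
response = determine_response(interpret(signals))
print([(e.kind, e.value) for e in response])
# -> [('utterance', 'Glad to hear it!'), ('gesture', 'nod')]
```

Note how the reply depends on the facial expression, not just the words: this is the claim's requirement that the response be determined from joint events of which at least one is non-verbal.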

This invention is disclosed in a whole family of patents listed, and hyperlinked, below. 
  • US2017148434 (A1) ― 2017-05-25 - Method of performing multi-modal dialogue between a humanoid robot and user, computer program product and humanoid robot for implementing said method 
  • AU2015248713 (A1) ― 2016-11-03 - Method of performing multi-modal dialogue between a humanoid robot and user, computer program product and humanoid robot for implementing said method 
  • CA2946056 (A1) ― 2015-10-22 - Method of performing multi-modal dialogue between a humanoid robot and user, computer program product and humanoid robot for implementing said method 
  • EP2933067 (A1) ― 2015-10-21 - Method of performing multi-modal dialogue between a humanoid robot and user, computer program product and humanoid robot for implementing said method
  • HK1216405 (A1) ― 2016-11-11 - Method of performing multi-modal dialogue between a humanoid robot and user, computer program product and humanoid robot for implementing said method
  • JP2017520782 (A) ― 2017-07-27 - Method of performing multi-modal dialogue between a humanoid robot and user, computer program product and humanoid robot for implementing said method  
  • KR20170003580 (A) ― 2017-01-09 - Method of performing multi-modal dialogue between a humanoid robot and user, computer program product and humanoid robot for implementing said method
  • MX2016013019 (A) ― 2017-05-30 - Method of performing multi-modal dialogue between a humanoid robot and user, computer program product and humanoid robot for implementing said method
  • SG11201608205U (A) ― 2016-10-28 - Method of performing multi-modal dialogue between a humanoid robot and user, computer program product and humanoid robot for implementing said method 
  • WO2015158887 (A2) ― 2015-10-22 - Method of performing multi-modal dialogue between a humanoid robot and user, computer program product and humanoid robot for implementing said method
The following video will give you a glimpse of how well Pepper performs in an interview with a human.


NB. Softbank Robotics was formerly known as Aldebaran Robotics.

References
Devillers, L. (2017) Rire avec les robots pour mieux vivre avec [Laughing with robots to live better with them]. Le Journal du CNRS, 9-02-2017
LIMSI - Laboratoire d'Informatique pour la Mécanique et les Sciences de l'Ingénieur
Projet romeo – Partenaires
Softbank Robotics
