Method of performing multi-modal dialogue between a humanoid robot and user, computer program product and humanoid robot for implementing said method

US10242666B2 · US · B2

Patent metadata
Field	Value
Publication number	US-10242666-B2
Application number	US-201515300226-A
Country	US
Kind code	B2
Filing date	Apr 17, 2015
Priority date	Apr 17, 2014
Publication date	Mar 26, 2019
Grant date	Mar 26, 2019

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title: What the patent document calls the invention.
Abstract: A short plain-language summary of the technical disclosure.
Assignees and inventors: Who owns or filed the patent and who is credited as inventor.
Key dates: Filing, priority, publication, and grant dates set the timeline.
First independent claim: The legal scope of protection. Read this for what is actually claimed.
CPC / IPC classifications: Technology tags used to group this patent with similar filings.
Citations and related patents: Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method of performing dialog between a humanoid robot and user comprises: i) acquiring input signals from respective sensors, at least one being a sound sensor and another being a motion or image sensor; ii) interpreting the signals to recognize events generated by the user, including: the utterance of a word or sentence, an intonation of voice, a gesture, a body posture, a facial expression; iii) determining a response of the humanoid robot, comprising an event such as: the utterance of a word or sentence, an intonation of voice, a gesture, a body posture, a facial expression; iv) generating, an event by the humanoid robot; wherein step iii) comprises determining the response from events jointly generated by the user and recognized at step ii), of which at least one is not words uttered by the user. A computer program product and humanoid robot for carrying out the method is provided.
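As an illustration of step iii) in the abstract, the following minimal Python sketch shows one way a rule set could map jointly recognized user events to a robot response. The `Event` and `Rule` types, the example rules, and the choice to prefer the rule with the most required events are assumptions made for this sketch; the patent text does not specify them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    kind: str   # e.g. "utterance", "gesture", "facial_expression"
    value: str


@dataclass(frozen=True)
class Rule:
    required: frozenset  # user events that must all be recognized together
    response: tuple      # events the robot should generate


def determine_response(recognized, rules):
    """Return the response of the matching rule with the most required events.

    Preferring the most specific rule is an assumption of this sketch.
    """
    recognized = frozenset(recognized)
    matching = [rule for rule in rules if rule.required <= recognized]
    if not matching:
        return ()
    return max(matching, key=lambda rule: len(rule.required)).response


# Hypothetical rules: the second one combines a spoken word with a gesture,
# so its response depends on an event that is not a word uttered by the user.
RULES = [
    Rule(frozenset({Event("utterance", "hello")}),
         (Event("utterance", "Hello!"),)),
    Rule(frozenset({Event("utterance", "hello"), Event("gesture", "wave")}),
         (Event("utterance", "Hello!"), Event("gesture", "wave"))),
]

response = determine_response(
    [Event("utterance", "hello"), Event("gesture", "wave")], RULES
)
```

Here `response` contains both a spoken greeting and a waving gesture, because the two-event rule matches the combined input.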

First claim

Opening claim text (preview).

The invention claimed is:

1. A method of performing a dialogue between a humanoid robot and at least one user comprising the following steps, carried out iteratively by said humanoid robot: i) acquiring a plurality of input signals from respective sensors, at least one said sensor being a sound sensor and at least one other sensor being a motion or image sensor; ii) interpreting the acquired signals to recognize a plurality of events generated by said user, selected from a group comprising: the utterance of at least a word or sentence, an intonation of voice, a gesture, a body posture, a facial expression; iii) determining a response of said humanoid robot, comprising at least one event selected from a group comprising: the utterance of at least a word or sentence, an intonation of voice, a gesture, a body posture, a facial expression, said determining being performed by applying a set of rules, each said rule associating a set of input events to a response of the robot; iv) generating said or each said event; wherein at least some of said rules applied at said step iii) associate a response to a combination of at least two events jointly generated by said user and recognized at said step ii), of which at least one is not a word or sentence uttered by said user, and if the response determined during step iii) is or comprises at least the utterance of a word or sentence, executing a step iii-a) of performing a syntactic analysis of a sentence to be uttered by the robot to determine at least one word to be animated depending on a function of the at least one word within a structure of said sentence and determining an animation accompanying said response as a function of said analysis.

2. The method according to claim 1, wherein at least some of said rules applied at said step iii) determine a response comprising at least two events generated jointly by said humanoid robot, of which at least one is not the utterance of a word or sentence.

3. The method according to claim 1, wherein, at said step iii, said response of humanoid robot is determined based on at least one parameter selected from: a dialogue context, the identity of the user, an internal state of said humanoid robot.

4. The method according to claim 3, further comprising a step of modifying the value of said or of at least one said parameter according to said at least one event recognized at said step ii) or determined in said step iii).

5. The method according to claim 1, wherein said step ii) comprises searching a match between an acquired signal and an event belonging to a list of expected events stored in a memory of said humanoid robot, or accessible by it, said searching being carried out by successively using a plurality of matching methods with increasing complexity until an event is recognized with a confidence score greater than a predetermined value, or after the highest complexity recognition method has been used.

6. The method according to claim 5, wherein the used matching methods are selected depending on a context of dialogue.

7. The method according to claim 5, wherein said matching methods include, by order of increasing complexity: the search for an exact match, the search for an approximate match, the search for a phonetic correspondence—only in the case of voice recognition—and the search for a semantic correspondence.

8. A method of performing a dialogue between a humanoid robot and at least one user comprising the following steps, carried out iteratively by said humanoid robot: i) acquiring a plurality of input signals from respective sensors, at least one said sensor being a sound sensor and at least one other sensor being a motion or image sensor; ii) interpreting the acquired signals to recognize a plurality of events generated by said user, selected from a group comprising: the utterance of at least a word or sentence, an intonation of voice, a gesture, a body posture, a facial expression; iii) determining a response of said humanoid robot, comprising at least one event selected from a group comprising: the utterance of at least a word or sentence, an intonation of voice, a gesture, a body posture, a facial expression, said determining being performed by applying a set of rules, each said rule associating a set of input events to a response of the robot; iv) generating said or each said event; a step of phonetic transcription of a set of sounds acquired by a sound sensor; a step of simplifying and smoothing the resulting phonetic transcription; calculating an edit distance between said simplified and smoothed phonetic transcription and a plurality of entries, obtained by simplifying and smoothing a predefined set of words in natural language; and choosing a natural language word of said predefined set, corresponding to the entry with the lowest edit distance from said simplified and smoothed phonetic transcription, wherein at least some of said rules applied at said step iii) associate a response to a combination of at least two events jointly generated by said user and recognized at said step ii), of which at least one is not a word or sentence uttered by said user, said step ii) comprises searching a match between an acquired signal and an event belonging to a list of expected events stored in a memory of said humanoid robot, or accessible by it, said searching being carried out by successively using a plurality of matching methods with increasing complexity until an event is recognized with a confidence score greater than a predetermined value, or after the highest complexity recognition method has been used, and said matching methods include, by order of increasing complexity: the search for an exact match, the search for an approximate match, the search for a phonetic correspondence—only in the case of voice recognition—and the search for a semantic correspondence.

9. The method according to claim 8 wherein said simplifying and smoothing comprises: replacing phonemes prone to confusion by a single phoneme; removing vowels other than vowels at the beginning of words and nasal vowels, and removing breaks between words.

10. The method according to claim 5, wherein said list of expected events is selected, among a plurality of said lists, depending on a dialogue context.

11. The method according to claim 1 wherein said step iii) comprises determining a response to a set of events, including the absence of words uttered by said user or identified gestures, by applying rules belonging to a predefined subset, called proactive rules.

12. A method of performing a dialogue between a humanoid robot and at least one user comprising the following steps, carried out iteratively by said humanoid robot: i) acquiring a plurality of input signals from respective sensors, at least one said sensor being a sound sensor and at least one other sensor being a motion or image sensor; ii) interpreting the acquired signals to recognize a plurality of events generated by said user, selected from a group comprising: the utterance of at least a word or sentence, an intonation of voice, a gesture, a body posture, a facial expression; iii) determining a response of said humanoid robot, comprising at least one event selected from a group comprising: the utterance of at least a word or sentence, an intonation of voice, a gesture, a body posture, a facial expression, said determining being performed by applying a set of rules, each said rule associating a set of input events to a response of the robot; iv) generating said or each said event; and if the response determined during step iii) is or comprises at least the utterance of a word or sentence, the execution of a step iii-a) of perfo…
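The phonetic matching of claims 8 and 9 and the cascaded search of claim 5 can be sketched in Python as follows. This is a rough illustration under stated assumptions, not the patented implementation: the phoneme symbols, the `CONFUSABLE` groups, the `~` suffix for nasal vowels, the lexicon, and the choice to return the last matcher's result when no confidence score exceeds the threshold are all hypothetical.

```python
# Hypothetical phoneme inventory; the claims do not list actual symbols.
CONFUSABLE = {"b": "p", "d": "t", "g": "k", "v": "f", "z": "s"}
ORAL_VOWELS = {"a", "e", "i", "o", "u"}  # nasal vowels such as "a~" are kept


def simplify(words):
    """Simplify and smooth a transcription given as a list of words,
    each word being a list of phoneme symbols."""
    out = []
    for word in words:
        for position, phoneme in enumerate(word):
            phoneme = CONFUSABLE.get(phoneme, phoneme)  # merge confusable phonemes
            if phoneme in ORAL_VOWELS and position != 0:
                continue  # drop oral vowels unless they start the word
            out.append(phoneme)
    return tuple(out)  # flattening removes the breaks between words


def edit_distance(a, b):
    """Levenshtein distance between two phoneme sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (x != y)))  # substitution
        prev = cur
    return prev[-1]


def choose_word(heard, lexicon):
    """Return the lexicon word whose simplified entry is closest to `heard`."""
    target = simplify(heard)
    return min(lexicon, key=lambda w: edit_distance(target, simplify(lexicon[w])))


def recognize(signal, matchers, threshold):
    """Try matchers in order of increasing complexity and stop at the first
    (event, confidence) result whose confidence exceeds `threshold`.
    Falling back to the last result is an assumption of this sketch."""
    result = (None, 0.0)
    for matcher in matchers:
        result = matcher(signal)
        if result[1] > threshold:
            break
    return result


LEXICON = {
    "hello": [["h", "e", "l", "o"]],
    "yellow": [["y", "e", "l", "o"]],
    "goodbye": [["g", "u", "d"], ["b", "a", "i"]],
}
best = choose_word([["h", "a", "l", "o"]], LEXICON)
```

In this toy lexicon, a misheard vowel does not affect the result, because non-initial oral vowels are removed from both the heard transcription and the lexicon entries before the edit distance is computed.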

Assignees

Softbank Robotics Europe

Inventors

Classifications

G06F40/211
Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars · CPC title
B25J13/003
by means of an audio-responsive input (audible safety signals B25J19/061) · CPC title
B25J9/1694 · Primary
characterised by use of sensors other than normal servo-feedback from position, speed or acceleration sensors, perception control, multi-sensor controlled systems, sensor fusion · CPC title
G10L15/1815 · Primary
Semantic context, e.g. disambiguation of the recognition hypotheses based on word meaning · CPC title
G10L2015/025
Phonemes, fenemes or fenones being the recognition units · CPC title

Patent family

Related publications grouped by family.

View patent family 50628742

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10242666B2 cover?: A method of performing dialog between a humanoid robot and user comprises: i) acquiring input signals from respective sensors, at least one being a sound sensor and another being a motion or image sensor; ii) interpreting the signals to recognize events generated by the user, including: the utterance of a word or sentence, an intonation of voice, a gesture, a body posture, a facial expression; …
Who is the assignee on this patent?: Softbank Robotics Europe
What technology area does this patent fall under?: Primary CPC classification B25J9/1694. Mapped technology areas include Operations & Transport.
When was this patent published?: Publication date: Mar 26, 2019 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).