Synchronized gesture and speech production for humanoid robots using random numbers

US9431027B2 · US · B2

Patent metadata
Field: Value
Publication number: US-9431027-B2
Application number: US-201213352182-A
Country: US
Kind code: B2
Filing date: Jan 17, 2012
Priority date: Jan 26, 2011
Publication date: Aug 30, 2016
Grant date: Aug 30, 2016

Abstract

Official abstract text for this publication.

A system or method for generating gestures in a robot during generation of a speech output by the robot by analyzing a speech text and selecting appropriate gestures from a plurality of candidate gestures. The speech text is analyzed and tagged with information relevant to generating of the gestures. Based on the speech text, the tagged information and other relevant information, a gesture identifier is selected. A gesture template corresponding to the gesture identifier is retrieved and then processed by adding relevant parameter to generate a gesture descriptor representing a gesture to be taken by the robot. A gesture motion is planned based on the gesture descriptor and analysis of timing associated with the speech, wherein the amplitude, frequency or speed of the selected gesture is modified based on random numbers of specific range depending on the status of the robot. Actuator signals for controlling the actuators such as arms and hands are generated based on the planned gesture motion.
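The last modification step in the abstract, scaling a gesture by random numbers whose range depends on the robot's status, can be illustrated with a minimal sketch. Everything named here is an assumption for illustration only: the status labels, the numeric ranges in `STATUS_RANGES`, the `GestureDescriptor` fields, and the choice to scale all three properties at once (the abstract only requires amplitude, frequency or speed to be modified). None of it is taken from the patent's description.

```python
import random
from dataclasses import dataclass, replace

# Hypothetical status-to-range table: the listing does not give any values.
STATUS_RANGES = {
    "calm": (0.8, 1.0),
    "neutral": (0.9, 1.1),
    "excited": (1.1, 1.5),
}


@dataclass(frozen=True)
class GestureDescriptor:
    """Illustrative stand-in for a gesture descriptor."""
    name: str
    amplitude: float
    frequency: float
    speed: float


def modulate(gesture: GestureDescriptor, status: str,
             rng: random.Random) -> GestureDescriptor:
    """Scale each property by a random factor drawn from the status range."""
    low, high = STATUS_RANGES[status]
    return replace(
        gesture,
        amplitude=gesture.amplitude * rng.uniform(low, high),
        frequency=gesture.frequency * rng.uniform(low, high),
        speed=gesture.speed * rng.uniform(low, high),
    )


if __name__ == "__main__":
    base = GestureDescriptor("beat", amplitude=1.0, frequency=2.0, speed=1.0)
    print(modulate(base, "excited", random.Random()))
```

Drawing a fresh factor on every call means the same gesture template never plays back identically, while the status-dependent range keeps the variation within bounds that suit the robot's current state.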

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method of generating a gesture in a robot, comprising: receiving a text of a speech including one or more speech elements; analyzing, by a processor, the speech text by a plurality of pattern modules to identify a plurality of candidate gestures associated with same speech elements in the speech, each of the plurality of pattern modules configured to apply a different set of rules to the speech elements to identify one or more corresponding candidate gestures to be made by the robot during generation of a voice output corresponding to the speech text; selecting a gesture from the plurality of candidate gestures identified by the plurality of pattern modules; generating the selected gesture by controlling actuators in the robot; modifying at least one of amplitude, frequency and speed of the selected gesture based on a random number, wherein the random number is configured to take different ranges of values based on a status of the robot; and generating the voice output by synthesizing the speech text, the voice output synchronized with the gesture generated by the robot.

2. The method of claim 1, further comprising tagging information on the speech text by analyzing the speech text, wherein the plurality of candidate gestures are identified by further analyzing the tagged information.

3. The method of claim 2, wherein the tagged information indicates types of words of the speech elements.

4. The method of claim 1, further comprising planning the selected gesture by adding a preparatory motion before making a motion corresponding to the selected gesture.

5. The method of claim 4, wherein the preparatory motion moves an effector from an end position of a prior gesture or a starting gesture to an initial position of the selected gesture.

6. The method of claim 1, further comprising: analyzing timing of the speech elements in the voice output; and adjusting the selected gesture or the synthesized voice output to synchronize timing of the selected gesture and the timing of the speech elements in the voice output.

7. The method of claim 1, further comprising: receiving an expressivity parameter representing a degree of expressivity to be expressed by the robot; and selecting the gesture based further on the expressivity parameter.

8. The method of claim 1, wherein the plurality of pattern modules comprise a first pattern module configured to apply a first set of rules to detect emblems, a second pattern module configured to apply a second set of rules to detect iconics, a third pattern module configured to apply a third set of rules to detect metaphorics, a fourth pattern module configured to apply a fourth set of rules to detect deictics, and a fifth pattern module configured to apply a fifth set of rules to detect beats.

9. A robot configured to generate a gesture, comprising: a gesture generator configured to: receive a text of a speech including one or more speech elements; analyze the speech text by a plurality of pattern modules to identify a plurality of candidate gestures associated with same speech elements in the speech, each of the plurality of pattern modules configured to apply a different set of rules to the speech elements to identify one or more corresponding candidate gestures to be made by the robot during generation of a voice output corresponding to the speech text; and select a gesture from the plurality of candidate gestures identified by the plurality of pattern modules; a motion generator configured to generate control signals based on the selected gesture and to modify at least one of amplitude, frequency and speed of the selected gesture based on a random number, wherein the random number is configured to take different ranges of values based on a status of the robot; at least one actuator configured to cause relative movements on parts of the robot according to the control signals; and a voice synthesizer configured to generate the voice output by synthesizing the speech text, the voice output synchronized with the gesture generated by the robot.

10. The robot of claim 9, wherein the gesture generator is further configured to tag information on the speech text by analyzing the speech text, wherein the plurality of candidate gestures are identified by further analyzing the tagged information.

11. The robot of claim 10, wherein the tagged information indicates types of words of the speech elements.

12. The robot of claim 9, wherein the motion generator is further configured to analyze timing of the speech elements in the voice output and adjust the selected gesture to synchronize timing of the selected gesture and the timing of the speech elements in the voice output.

13. The robot of claim 9, wherein the gesture generator is further configured to: receive an expressivity parameter representing a degree of expressivity to be expressed by the robot; and select the gesture based further on the expressivity parameter.

14. The robot of claim 9, wherein the motion generator is further configured to plan the selected gesture by adding a preparatory motion before making a motion corresponding to the selected gesture.

15. The robot of claim 9, wherein the plurality of pattern modules comprise a first pattern module configured to apply a first set of rules to detect emblems, a second pattern module configured to apply a second set of rules to detect iconics, a third pattern module configured to apply a third set of rules to detect metaphorics, a fourth pattern module configured to apply a fourth set of rules to detect deictics, and a fifth pattern module configured to apply a fifth set of rules to detect beats.

16. A non-transitory computer readable storage medium for recognizing verbal commands, the computer readable storage medium structured to store instructions, when executed, cause a processor to: receive a text of a speech including one or more speech elements; analyze the speech text by a plurality of pattern modules to identify a plurality of candidate gestures associated with same speech elements in the speech, each of the plurality of pattern modules configured to apply a different set of rules to the speech elements identify one or more corresponding candidate gestures to be made by a robot during generation of a voice output corresponding to the speech text; select a gesture from the plurality of candidate gestures identified by the plurality of pattern modules; generate the selected gesture by controlling actuators in the robot; modify at least one of amplitude, frequency and speed of the selected gesture based on a random number, wherein the random number is configured to take different ranges of values based on a status of the robot; and generate the voice output by synthesizing the speech text, the voice output synchronized with the gesture generated by the robot.
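The analysis and selection steps of claim 1, in which several pattern modules each apply their own rules to the same speech elements and one gesture is then chosen from the resulting candidates, can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the claimed implementation: the word lists, the `score` field, and the "highest score wins per word" selection rule are all hypothetical, and only three of the five gesture types named in claim 8 (emblems, deictics, beats) are modeled.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Candidate:
    gesture: str        # hypothetical gesture identifier
    word_index: int     # which speech element triggered it
    gesture_type: str   # "emblem", "deictic" or "beat"
    score: float        # hypothetical priority used only for selection


def emblem_module(words: List[str]) -> List[Candidate]:
    # Assumed rule set: a small lexicon of words with conventional gestures.
    lexicon = {"hello": "wave", "yes": "nod", "no": "head_shake"}
    return [Candidate(lexicon[w], i, "emblem", 0.9)
            for i, w in enumerate(words) if w in lexicon]


def deictic_module(words: List[str]) -> List[Candidate]:
    # Assumed rule set: pointing words trigger a pointing gesture.
    pointers = {"this", "that", "here", "there", "you"}
    return [Candidate("point", i, "deictic", 0.7)
            for i, w in enumerate(words) if w in pointers]


def beat_module(words: List[str]) -> List[Candidate]:
    # Assumed rule set: every word may carry a low-priority beat gesture.
    return [Candidate("beat", i, "beat", 0.2) for i, _ in enumerate(words)]


PATTERN_MODULES: List[Callable[[List[str]], List[Candidate]]] = [
    emblem_module, deictic_module, beat_module,
]


def select_gestures(text: str) -> List[Candidate]:
    """Run every module on the same words, then keep the best candidate per word."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    best: Dict[int, Candidate] = {}
    for module in PATTERN_MODULES:
        for cand in module(words):
            current = best.get(cand.word_index)
            if current is None or cand.score > current.score:
                best[cand.word_index] = cand
    return [best[i] for i in sorted(best)]


if __name__ == "__main__":
    for cand in select_gestures("Hello, look at that."):
        print(cand)
```

Keeping each rule set in its own function mirrors the claim language, where every module independently proposes candidates for the same speech elements and a separate step resolves the overlap. An expressivity parameter, as in claims 7 and 13, could plausibly be added as a threshold on `score`, though that too would be an assumption.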

Assignees

Inventors

Classifications

  • Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination · CPC title

  • G10L21/10 · Primary

    Transforming into visible information · CPC title

  • Physics · mapped topic

  • Speech synthesis; Text to speech systems · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9431027B2 cover?
A system or method for generating gestures in a robot during generation of a speech output by the robot by analyzing a speech text and selecting appropriate gestures from a plurality of candidate gestures. The speech text is analyzed and tagged with information relevant to generating of the gestures. Based on the speech text, the tagged information and other relevant information, a gesture iden…
Who is the assignee on this patent?
Honda Motor Co Ltd. The individuals listed alongside it, Ng-Thow-Hing Victor and Luo Pengcheng, are the named inventors.
What technology area does this patent fall under?
Primary CPC classification G10L21/10. Mapped technology areas include Physics.
When was this patent published?
Publication date: Aug 30, 2016 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).