What technology area does this patent fall under?

Primary CPC classification G10L15/265. Mapped technology areas include Physics.

When was this patent published?

Publication date: Sep 3, 2019 (B1). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).

Methods and apparatus to define virtual scenes using natural language commands and natural gestures

US10403285B1 · US · B1

Patent metadata
Field	Value
Publication number	US-10403285-B1
Application number	US-201715831617-A
Country	US
Kind code	B1
Filing date	Dec 5, 2017
Priority date	Dec 5, 2016
Publication date	Sep 3, 2019
Grant date	Sep 3, 2019

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The disclosed methods and apparatus allow a lay person to easily and intuitively define virtual scenes using natural language commands and natural gestures. Natural language commands include statements that a person would naturally (e.g., spontaneously, simply, easily, intuitively, etc.) speak without any or little training. Example natural language commands include “put a cat on the box,” or “put a ball in front of the red box.” Natural gestures include gestures that a person would naturally do, perform or carry out (e.g., spontaneously, simply, easily, intuitively, etc.) without any or little training. Example natural gestures include pointing, a distance between hands, gazing, head tilt, kicking, etc. The person can simply speak and gesture how it naturally occurs to them.

First claim

Opening claim text (preview).

What is claimed is:
1. A method comprising: translating words spoken by a user while in a scene into text, the scene capable of including a computer-generated virtual element; parsing the text into a spoken command fragment; identifying a scene definition gesture from gesture information captured for the user in the scene; performing ray tracing to identify an object or a location based on the identified scene definition gesture; combining the spoken command fragment and the scene definition gesture to form a scene building instruction; and performing the scene building instruction to at least partially define the computer-generated element of the scene.
2. The method of claim 1, further comprising time aligning the spoken command fragment and the scene definition gesture.
3. The method of claim 1, further comprising contextually matching the spoken command fragment and the scene definition gesture.
4. The method of claim 1, wherein the scene is modified while the user is in the scene.
5. The method of claim 1, wherein the scene includes a non computer-generated element.
6. The method of claim 1, wherein the computer-generated element comprises an aspect of the computer-generated element.
7. The method of claim 1, wherein the computer-generated element comprises at least one of a 2D object, a 3D object, a sound, a video and/or a picture.
8. The method of claim 1, wherein the gesture of the user includes at least one of pointing, a separation between hands, an eye gaze, and/or a head tilt.
9. The method of claim 1, further comprising: detecting a spoken start command; and recording the words spoken by the user after the spoken start command is detected.
10. An apparatus comprising: a speech-to-text translator to translate words spoken by a user while in a scene into text, the scene capable of including a computer-generated virtual element; a language parser to parse the text into spoken command fragment; a gesture identifier configured to identify a scene definition gesture from gesture information captured for the user in the scene; a ray tracer configured to identify an object or a location based on the identified scene definition gesture; a gesture/language combiner configured to combine the spoken command fragment and the scene definition gesture to form a scene building instruction; and a builder configured to perform the scene building instruction to at least partially define the computer-generated element of the scene.
11. The apparatus of claim 10, wherein the gesture/language is configured to time aligning the spoken command fragment and the scene definition gesture.
12. The apparatus of claim 10, wherein the gesture/language is configured to contextually match the spoken command fragment and the scene definition gesture.
13. The apparatus of claim 10, wherein the builder is configured to modify the scene while the user is in the scene.
14. The apparatus of claim 10, wherein the scene includes a non computer-generated element.
15. The apparatus of claim 10, wherein the computer-generated element comprises an aspect of the computer-generated element.
16. The apparatus of claim 10, wherein the computer-generated element comprises at least one of a 2D object, a 3D object, a sound, a video and/or a picture.
17. The apparatus of claim 10, wherein the gesture of the user includes at least one of pointing, a separation between hands, an eye gaze, and/or a head tilt.
18. The apparatus of claim 10, wherein the speech-to-text engine is configured to detect a spoken start command; and the words spoken by the user are recorded after the spoken start command is detected.
19. A non-transitory machine-readable media storing machine-readable instructions that, when executed, cause a machine to: translate words spoken by a user while in a scene into text, the scene capable of including a computer-generated virtual element; parse the text into a spoken command fragment; identify a scene definition gesture from gesture information captured for the user in the scene; perform ray tracing to identify an object or a location based on the identified scene definition gesture; combine the spoken command fragment and the scene definition gesture to form a scene building instruction; and perform the scene building instruction to at least partially define the computer-generated element of the scene.
20. The non-transitory machine-readable media of claim 19, wherein the machine-readable instructions, when executed, cause a machine to contextually match the spoken command fragment with the scene definition gesture to form the scene building instruction.
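
Illustrative sketch (not part of the patent text). The steps recited in claims 1 and 2 can be pictured with a small Python example: a spoken command fragment is time-aligned with a gesture, a ray is cast along the gesture to pick an object, and the two are combined into a scene building instruction. Every name here (Fragment, Gesture, Sphere, time_align, pick_target, build_instruction), the 0.5-second slack window, and the use of spheres as stand-in scene objects are assumptions made for illustration. Speech-to-text and parsing are assumed to have already produced the fragment. This is one possible reading of the claim language, not the implementation disclosed by the patent.

```python
import math
from dataclasses import dataclass

@dataclass
class Fragment:            # hypothetical: output of speech-to-text plus parsing
    text: str
    start: float           # seconds
    end: float

@dataclass
class Gesture:             # hypothetical: e.g. a pointing gesture
    kind: str
    time: float            # seconds
    origin: tuple          # (x, y, z) of the hand or eye
    direction: tuple       # pointing direction, need not be unit length

@dataclass
class Sphere:              # hypothetical stand-in for a scene object
    name: str
    center: tuple
    radius: float

def ray_hit_distance(origin, direction, sphere):
    """Distance along the ray to the sphere surface, or None if the ray misses."""
    norm = math.sqrt(sum(c * c for c in direction))
    d = [c / norm for c in direction]
    oc = [o - c for o, c in zip(origin, sphere.center)]
    b = sum(x * y for x, y in zip(oc, d))
    c = sum(x * x for x in oc) - sphere.radius ** 2
    disc = b * b - c
    if disc < 0:
        return None
    root = math.sqrt(disc)
    t = -b - root          # nearer intersection
    if t < 0:
        t = -b + root      # ray origin is inside the sphere
    return t if t >= 0 else None

def pick_target(gesture, objects):
    """Return the nearest object hit by the gesture ray, or None."""
    hits = [(ray_hit_distance(gesture.origin, gesture.direction, o), o) for o in objects]
    hits = [(t, o) for t, o in hits if t is not None]
    return min(hits, key=lambda h: h[0])[1] if hits else None

def time_align(fragment, gestures, slack=0.5):
    """Return the gesture closest in time to the fragment, within `slack` seconds."""
    def gap(g):
        if fragment.start <= g.time <= fragment.end:
            return 0.0
        return min(abs(g.time - fragment.start), abs(g.time - fragment.end))
    candidates = [g for g in gestures if gap(g) <= slack]
    return min(candidates, key=gap) if candidates else None

def build_instruction(fragment, gestures, objects):
    """Combine a spoken fragment and an aligned gesture into one instruction."""
    gesture = time_align(fragment, gestures)
    target = pick_target(gesture, objects) if gesture else None
    return {
        "command": fragment.text,
        "gesture": gesture.kind if gesture else None,
        "target": target.name if target else None,
    }

objects = [Sphere("red_box", (0, 0, 5), 1.0), Sphere("blue_box", (3, 0, 5), 1.0)]
gestures = [Gesture("point", 1.2, (0, 0, 0), (0, 0, 1))]
instruction = build_instruction(Fragment("put a cat on that", 1.0, 2.0), gestures, objects)
print(instruction)
```

In this toy scene the pointing ray runs straight along the z axis, so it intersects only the sphere named red_box; a fragment spoken well outside the slack window would yield no gesture and no target.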

Assignees

Google LLC

Inventors

Classifications

G06F3/011
Arrangements for interaction with the human body, e.g. for user immersion in virtual reality (blind teaching G09B21/00) · CPC title
G06F40/205
Parsing · CPC title
G06F40/211
Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars · CPC title
G06F3/017
Gesture based interaction, e.g. based on a set of recognized hand gestures (interaction based on gestures traced on a digitiser G06F3/04883) · CPC title
G06F3/013
Eye tracking input arrangements (G06F3/015 takes precedence) · CPC title

Patent family

Related publications grouped by family.

View patent family 67770033

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10403285B1 cover?: The disclosed methods and apparatus allow a lay person to easily and intuitively define virtual scenes using natural language commands and natural gestures. Natural language commands include statements that a person would naturally (e.g., spontaneously, simply, easily, intuitively, etc.) speak without any or little training. Example natural language commands include “put a cat on the box,” or “…
Who is the assignee on this patent?: Google LLC