Controlling augmented reality effects through multi-modal human interaction

US11960653B2 · US · B2

Patent metadata
Field                Value
Publication number   US-11960653-B2
Application number   US-202217740567-A
Country              US
Kind code            B2
Filing date          May 10, 2022
Priority date        May 10, 2022
Publication date     Apr 16, 2024
Grant date           Apr 16, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Systems and methods herein describe a multi-modal interaction system. The multi-modal interaction system receives a selection of an augmented reality (AR) experience within an application on a computer device, displays a set of AR objects associated with the AR experience on a graphical user interface (GUI) of the computer device, displays textual cues associated with the set of augmented reality objects on the GUI, receives a hand gesture and a voice command, modifies a subset of augmented reality objects of the set of augmented reality objects based on the hand gesture and the voice command, and displays the modified subset of augmented reality objects on the GUI.
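
The flow summarized in the abstract can be pictured with a small Python sketch. This is an illustration under stated assumptions, not the implementation disclosed in the patent: the `ARObject` type, the gesture labels (`open_palm`, `point:<name>`), and the `KEYWORD_EFFECTS` table are all hypothetical names introduced here. The hand gesture decides which objects are affected, and the keywords in the voice command decide how they change.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ARObject:
    """Hypothetical stand-in for an AR object shown on the GUI."""
    name: str
    color: str = "white"
    scale: float = 1.0


# Hypothetical keyword table: each spoken keyword maps to attribute changes.
KEYWORD_EFFECTS = {
    "red": {"color": "red"},
    "blue": {"color": "blue"},
    "bigger": {"scale": 2.0},
}


def selection_from_gesture(gesture, objects):
    """Gesture-derived data: the names of the objects the gesture targets."""
    if gesture == "open_palm":        # assumption: an open palm selects everything
        return {o.name for o in objects}
    if gesture.startswith("point:"):  # assumption: "point:<name>" selects one object
        return {gesture.split(":", 1)[1]}
    return set()


def changes_from_voice(transcript):
    """Voice-derived data: attribute changes collected from known keywords."""
    changes = {}
    for word in transcript.lower().split():
        changes.update(KEYWORD_EFFECTS.get(word, {}))
    return changes


def apply_multimodal(objects, gesture, transcript):
    """Apply the voice-derived changes only to the gesture-selected subset."""
    selected = selection_from_gesture(gesture, objects)
    changes = changes_from_voice(transcript)
    return [replace(o, **changes) if o.name in selected else o for o in objects]
```

For example, calling `apply_multimodal(scene, "point:cube", "make it red and bigger")` on a scene containing a cube and a sphere would change only the cube. Keeping selection and modification as separate steps mirrors the split between the two input modalities described in the abstract.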

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: causing display, by at least one processor of a set of augmented reality objects on a graphical user interface (GUI) of a computer device; causing display of textual cues associated with the set of augmented reality objects on the GUI; in response to the displayed textual cues: receiving, from the computer device, a hand gesture; generating a first set of modification data based on the hand gesture; receiving, from the computer device, a voice command; and generating a second set of modification data based on keywords in the voice command; identifying a subset of augmented reality objects of the set of set of augmented reality objects using the first set of modification data; generating a modified subset of augmented reality objects by applying the second set of modification data to the identified subset of augmented reality objects; and causing display of the modified subset of augmented reality objects on the GUI.

2. The method of claim 1, further comprising: receiving a selection of an augmented reality experience within an application on the computer device.

3. The method of claim 2, wherein the selection is a user input received at the computer device.

4. The method of claim 3, wherein the augmented reality experience is displayed as a selectable user interface element within the application.

5. The method of claim 1, wherein the textual cues are hints associated with the hand gesture and the voice command.

6. The method of claim 1, wherein the textual cues are temporarily displayed on the GUI for a predetermined duration of time.

7. The method of claim 1, wherein receiving the hand gesture further comprises: detecting a user's hand using one or more image sensors of the computer device; identifying a set of joint locations of the user's hand; identifying a pose based on the set of joint locations; and determining the hand gesture based on the pose.

8. The method of claim 7, wherein receiving the voice command further comprises: receiving audio data from one or more microphones of the computer device; and analyzing the audio data using a machine learning model, the machine learning model trained to identify the keywords in the audio data.

9. The method of claim 8, further comprising: causing display of the identified keywords on the GUI.

10. The method of claim 1, wherein identifying the subset of augmented reality objects further comprises: generating a temporary outline of the subset of augmented reality objects; and causing display of the temporary outline of the subset of augmented reality objects.

11. A computing system comprising: a processor; and a memory storing instructions that, when executed by the processor, configure the system to: cause display of a set of augmented reality objects on a graphical user interface (GUI) of a computer device; cause display of textual cues associated with the set of augmented reality objects on the GUI; in response to the displayed textual cues: receive, from the computer device, a hand gesture; generate a first set of modification data based on the hand gesture; receive, from the computer device, a voice command; and generate a second set of modification data based on keywords in the voice command; identify a subset of augmented reality objects of the set of set of augmented reality objects using the first set of modification data; generate a modified subset of augmented reality objects by applying the second set of modification data to the identified subset of augmented reality objects; and cause display of the modified subset of augmented reality objects on the GUI.

12. The computing system of claim 11, wherein the instructions further configure the system to: receive, by the processor, a selection of an augmented reality experience within an application on the computer device.

13. The computing system of claim 12, wherein the selection is a user input received at the computer device, and wherein the augmented reality experience is displayed as a selectable user interface element within the application.

14. The computing system of claim 11, wherein the textual cues are hints associated with the hand gesture and the voice command.

15. The computing system of claim 11, wherein the textual cues are temporarily displayed on the GUI for a predetermined duration of time.

16. The computing system of claim 11, wherein receiving the hand gesture further comprises: detect a user's hand using one or more image sensors of the computer device; identify a set of joint locations of the user's hand; identify a pose based on the set of joint locations; and determine the hand gesture based on the pose.

17. The computing system of claim 16, wherein receiving the voice command further comprises: receive audio data from one or more microphones of the computer device; analyze the audio data using a machine learning model, the machine learning model trained to identify the keywords in the audio data; and generate a second set of modification data using the identified keywords.

18. The computing system of claim 17, wherein the system further configured to: cause display of the identified keywords on the GUI.

19. A non-transitory computer-readable storage medium, the computer-readable storage medium including instructions that when executed by a computer, cause the computer to: cause display of a set of augmented reality objects on a graphical user interface (GUI) of a computer device; cause display of textual cues associated with the set of augmented reality objects on the GUI; in response to the displayed textual cues: receive, from the computer device, a hand gesture; generate a first set of modification data based on the hand gesture; receive, from the computer device, a voice command; generate a second set of modification data based on keywords in the voice command; identify a subset of augmented reality objects of the set of set of augmented reality objects using the first set of modification data; generate a modified subset of augmented reality objects by applying the second set of modification data to the identified subset of augmented reality objects; and cause display of the modified subset of augmented reality objects on the GUI.

20. The computer-readable storage medium of claim 19, wherein the instructions further configure the computer-readable storage medium to: receive a selection of an augmented reality experience within an application on the computer device, wherein the selection is a user input received at the computer device.
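
The hand-gesture steps recited in claims 7 and 16 (joint locations, then a pose, then a gesture) can be sketched as follows. This is a minimal illustration, not the patented method: the joint names (`wrist`, `<finger>_mid`, `<finger>_tip`), the "tip farther from the wrist than the middle joint" rule for an extended finger, and the `GESTURES` lookup table are assumptions made for this example. The hand detection step, which would supply the joint coordinates from an image sensor, is outside the sketch.

```python
import math

FINGERS = ("thumb", "index", "middle", "ring", "pinky")

# Hypothetical pose-to-gesture table; a pose is one boolean per finger.
GESTURES = {
    (True, True, True, True, True): "open_palm",
    (False, True, False, False, False): "point",
    (False, False, False, False, False): "fist",
}


def identify_pose(joints):
    """Return a tuple of booleans, one per finger, True if it looks extended.

    `joints` maps names such as "wrist", "index_mid", "index_tip" to (x, y).
    Assumption: a finger counts as extended when its tip is farther from
    the wrist than its middle joint.
    """
    wrist = joints["wrist"]
    return tuple(
        math.dist(joints[f + "_tip"], wrist) > math.dist(joints[f + "_mid"], wrist)
        for f in FINGERS
    )


def determine_gesture(joints):
    """Map the identified pose to a gesture label, or "unknown"."""
    return GESTURES.get(identify_pose(joints), "unknown")
```

A real system would likely use many more joints per finger and a learned classifier; the lookup table only shows where the pose-to-gesture decision sits in the sequence.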

Assignees

Snap Inc

Inventors

Classifications

  • Mixed reality (object pose determination, tracking or camera calibration for mixed reality G06T7/00) · CPC title

  • based on specific properties of the displayed interaction object or a metaphor-based environment, e.g. interaction with desktop elements like windows or icons, or assisted by a cursor's changing behaviour or appearance · CPC title

  • in augmented reality scenes · CPC title

  • Recognition of static hand signs · CPC title

  • Arrangements for interaction with the human body, e.g. for user immersion in virtual reality (blind teaching G09B21/00) · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11960653B2 cover?
Systems and methods herein describe a multi-modal interaction system. The multi-modal interaction system receives a selection of an augmented reality (AR) experience within an application on a computer device, displays a set of AR objects associated with the AR experience on a graphical user interface (GUI) of the computer device, displays textual cues associated with the set of augmented reali…
Who is the assignee on this patent?
Snap Inc
What technology area does this patent fall under?
Primary CPC classification G06F3/017. Mapped technology areas include Physics.
When was this patent published?
Publication date Apr 16, 2024 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).