Multimodal task execution and text editing for a wearable system

US10768693B2 · US · B2

Patent metadata
Publication number: US-10768693-B2
Application number: US-201815955204-A
Country: US
Kind code: B2
Filing date: Apr 17, 2018
Priority date: Apr 19, 2017
Publication date: Sep 8, 2020
Grant date: Sep 8, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Examples of wearable systems and methods can use multiple inputs (e.g., gesture, head pose, eye gaze, voice, and/or environmental factors (e.g., location)) to determine a command that should be executed and objects in the three-dimensional (3D) environment that should be operated on. The multiple inputs can also be used by the wearable system to permit a user to interact with text, such as, e.g., composing, selecting, or editing text.

First claim

Opening claim text (preview).

What is claimed is:

1. A system comprising: a first sensor of a wearable system configured to acquire first user input data in a first mode of input; a second sensor of the wearable system configured to acquire second user input data in a second mode of input, the second mode of input different from the first mode of input; and a hardware processor in communication with the first and second sensors, the hardware processor programmed to: receive multimodal inputs comprising the first user input data in the first mode of input and the second user input data in the second mode of input; identify a first set of candidate objects for interactions based on the first user input data; identify a second set of candidate objects for interactions based on the second user input data; select a first candidate object within the first set of candidate objects; determine a first confidence score for the first candidate object based on the first user input data, wherein the first confidence score is based on a proportional area for a first portion of the first candidate object that is in the user's field of view with respect to a second portion of the first candidate object that is outside of the user's field of view; determine a second confidence score for the first candidate object based on the second user input data; calculate an aggregated confidence score for the first candidate object based on at least the first confidence score and the second confidence score; determine a target virtual object from the first set of candidate objects and the second set of candidate objects based on a combination of the first user input data, the second user input data, and the aggregated confidence score; determine a user interface operation on the target virtual object based on at least one of the first user input data or the second user input data; and generate a multimodal input command which causes the user interface operation to be performed on the target virtual object.

2. The system of claim 1, wherein the multimodal inputs comprise at least two of the following input modes: head pose, eye gaze, user input device, hand gesture, or voice.

3. The system of claim 1, wherein the user interface operation comprises at least one of selecting, moving, or resizing the target virtual object.

4. The system of claim 1, wherein the hardware processor is further configured to determine at least one of: a target location, orientation, or movement for the target virtual object in the user interface operation.

5. The system of claim 4, wherein to determine the target location for the target virtual object, the hardware processor is programmed to identify a workable surface in a physical environment for putting the target virtual object.

6. The system of claim 5, wherein the workable surface is identified by: calculating a distance function for points of interest (POIs) on a physical object in the physical environment; eliminating one or more of the POIs outside of a planar tolerance; and delineating the workable surface based on remaining POIs.

7. The system of claim 5, wherein the hardware processor is programmed to automatically orient the target virtual object to match an orientation of the target location.

8. The system of claim 1, wherein the user interface operation is determined based on the first user input data in the first input mode, and at least one of a subject or a parameter is determined based on a combination of the first mode of input and the second mode of input.

9. The system of claim 1, wherein the first input mode comprises an indirect input mode based on location information of a user of the wearable system.

10. The system of claim 9, wherein the hardware processor is programmed to identify a virtual object as the target virtual object from the first set of candidate objects and the second set of candidate objects in response to a determination that the object is within a threshold range of the user.

11. The system of claim 1, wherein the user interface operation is associated with a virtual application and the virtual application is programmed to be more responsive to one of the first sensor or the second sensor.

12. The system of claim 1, wherein to determine the target virtual object from the first set of candidate objects and the second set of candidate objects, the hardware processor is programmed to perform a tree-based analysis on the first set of candidate objects and the second set of the candidate objects based on the first user input data and the second user input data.

13. The system of claim 1, wherein to determine the target virtual object, the hardware processor is programmed to calculate a first or second confidence score for a virtual object by calculating at least one of: an evenness of space around the virtual object in the field of view; a proportional area for a first portion of the virtual object that is in the user's field of view with respect to a second portion of the virtual object that is outside of the user's field of view; or a historical analysis of user's interactions with the virtual object.

14. The system of claim 1, wherein the hardware processor is further programmed to: detect an initiation condition for an interaction event which triggers the hardware processor to determine the target virtual object and the user interface operation based on the multimodal inputs.

15. The system of claim 14, wherein the initiation condition comprises a triggering phrase.

16. The system of claim 1, wherein the first mode of input is a primary input mode and the second mode of input is a secondary input mode, and the hardware processor is programmed to: resolve ambiguities in at least one of the target virtual object and the user interface operation based on the second user input data.

17. The system of claim 1, wherein the first user input data comprises a deictic or anaphoric reference to a virtual object and the hardware processor is programmed to identify the virtual object as the target virtual object based on the second user input data.

18. The system of claim 1, wherein the hardware processor is further programmed to automatically enable, disable, or adjust a sensitivity of the first mode of input, the second mode of input, or both based at least in part on a user setting or an environment of the user.

19. The system of claim 1, wherein the hardware processor is programmed to identify that target virtual object outside of a field of view of the user based at least in part on the multimodal inputs; and automatically move the target virtual object inside the field of view for user interaction.

20. The system of claim 1, wherein the first confidence score is based on an evenness of space around the first candidate object in the field of view.

21. A method comprising: under control of a hardware processor of a wearable system in communication with a plurality of sensors configured to acquire user input data: receiving the user input data from the plurality of sensors for an interaction event of a user with an environment; analyzing the user input data to identify multimodal inputs for interacting with the environment wherein the multimodal inputs comprise a first input in a first input channel and a second input in a second input channel; calculating, for a candidate object, a first confidence score correlating to the first input, wherein the first confidence score is based on a proportional area for a first portion of the candidate object that is in the
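
Illustrative sketch (not part of the patent text). Claim 1 scores each candidate object per input mode and then combines the scores into an aggregated confidence score. The Python below is one possible reading, under stated assumptions: objects and the field of view are simplified to axis-aligned 2D rectangles, the "proportional area" is taken as the share of the object's area inside the field of view, and the aggregation is a weighted sum. The claim does not specify any of these choices, and the names Rect, visible_fraction, aggregate, and pick_target are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle in a 2D view plane (a simplifying assumption)."""
    left: float
    bottom: float
    right: float
    top: float

    def area(self) -> float:
        # Degenerate (non-overlapping) rectangles report zero area.
        return max(0.0, self.right - self.left) * max(0.0, self.top - self.bottom)

    def intersection(self, other: "Rect") -> "Rect":
        return Rect(
            max(self.left, other.left),
            max(self.bottom, other.bottom),
            min(self.right, other.right),
            min(self.top, other.top),
        )


def visible_fraction(obj: Rect, fov: Rect) -> float:
    """Share of the object's area that lies inside the field of view, in [0, 1]."""
    total = obj.area()
    if total == 0.0:
        return 0.0
    return obj.intersection(fov).area() / total


def aggregate(scores, weights):
    """Weighted sum of per-mode confidence scores (assumed aggregation rule)."""
    return sum(w * s for w, s in zip(weights, scores))


def pick_target(candidates, fov, second_scores, weights=(0.5, 0.5)):
    """Return (name, score) of the candidate with the highest aggregated score.

    candidates: dict of name -> Rect, scored by visibility (first mode).
    second_scores: dict of name -> confidence from a second mode, e.g. voice.
    """
    best_name, best_score = None, float("-inf")
    for name, rect in candidates.items():
        first = visible_fraction(rect, fov)
        second = second_scores.get(name, 0.0)
        total = aggregate((first, second), weights)
        if total > best_score:
            best_name, best_score = name, total
    return best_name, best_score


fov = Rect(0, 0, 10, 10)
candidates = {
    "lamp": Rect(8, 0, 12, 4),   # half inside the field of view
    "chair": Rect(2, 2, 4, 4),   # fully inside the field of view
}
voice_scores = {"lamp": 0.9, "chair": 0.2}  # made-up second-mode scores
target, score = pick_target(candidates, fov, voice_scores)
```

With these made-up numbers the half-visible "lamp" still wins, because the strong second-mode score outweighs its lower visibility score. Changing the weights shifts how much each input mode counts.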

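Illustrative sketch (not part of the patent text). Claim 6 identifies a workable surface in three steps: compute a distance function for points of interest (POIs), eliminate POIs outside a planar tolerance, and delineate the surface from the remaining POIs. The Python below is a minimal sketch under assumptions the claim does not state: the surface is horizontal, the reference plane height is the low median of the POI heights, the distance function is the vertical distance to that plane, and the surface is delineated as an axis-aligned bounding box. The function name workable_surface and the 2 cm default tolerance are hypothetical.

```python
from statistics import median_low


def workable_surface(pois, tolerance=0.02):
    """Estimate a horizontal workable surface from 3D points of interest.

    pois: list of (x, y, z) tuples in meters, with z pointing up.
    Returns (kept_points, bounds), where bounds is (min_x, min_y, max_x, max_y),
    or ([], None) when there are no points.
    """
    if not pois:
        return [], None

    # Assumed reference plane: the low median height, which is always the
    # height of an actual POI, so at least one point survives the filter.
    plane_z = median_low(p[2] for p in pois)

    # Distance function: vertical distance of each POI from the plane.
    # POIs farther away than the planar tolerance are eliminated.
    kept = [p for p in pois if abs(p[2] - plane_z) <= tolerance]

    # Delineate the surface as the bounding box of the remaining POIs.
    xs = [p[0] for p in kept]
    ys = [p[1] for p in kept]
    return kept, (min(xs), min(ys), max(xs), max(ys))


# Four points on a table top near z = 0.75 m, plus one point on an object
# standing on the table, which falls outside the tolerance.
points = [
    (0.0, 0.0, 0.75),
    (1.0, 0.0, 0.76),
    (1.0, 2.0, 0.74),
    (0.0, 2.0, 0.75),
    (0.5, 1.0, 0.95),
]
kept, bounds = workable_surface(points)
```

A real system would likely fit an arbitrarily oriented plane and use a tighter outline than a bounding box. The sketch only shows the filter-then-delineate order of the claimed steps.
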
Assignees

Magic Leap Inc

Inventors

Classifications

  • Mixed reality (object pose determination, tracking or camera calibration for mixed reality G06T7/00) · CPC title

  • Navigation within 3D models or images · CPC title

  • G06F3/017 · Primary

    Gesture based interaction, e.g. based on a set of recognized hand gestures (interaction based on gestures traced on a digitiser G06F3/04883) · CPC title

  • Eye tracking input arrangements (G06F3/015 takes precedence) · CPC title

  • G06F3/011 · Primary

    Arrangements for interaction with the human body, e.g. for user immersion in virtual reality (blind teaching G09B21/00) · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10768693B2 cover?
Examples of wearable systems and methods can use multiple inputs (e.g., gesture, head pose, eye gaze, voice, and/or environmental factors (e.g., location)) to determine a command that should be executed and objects in the three-dimensional (3D) environment that should be operated on. The multiple inputs can also be used by the wearable system to permit a user to interact with text, such as, e.g…
Who is the assignee on this patent?
Magic Leap Inc
What technology area does this patent fall under?
Primary CPC classification G06F3/017. Mapped technology areas include Physics.
When was this patent published?
Publication date: Sep 8, 2020 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).