Machine learning-based generation of outputs in augmented reality environments

US12586263B2 · US · B2

Patent metadata
Field               Value
Publication number  US-12586263-B2
Application number  US-202418775219-A
Country             US
Kind code           B2
Filing date         Jul 17, 2024
Priority date       Jul 17, 2024
Publication date    Mar 24, 2026
Grant date          Mar 24, 2026

Abstract

Official abstract text for this publication.

An apparatus comprises at least one processing device configured to generate, using a first machine learning model, a first data structure comprising input representations of one or more input components from an augmented reality environment. The at least one processing device is also configured to generate, using a second machine learning model that takes as input at least a portion of the first data structure, a second data structure comprising at least one vector representation characterizing relevance of one or more of the input representations in the first data structure. The at least one processing device is further configured to generate, using a third machine learning model that takes as input at least a portion of the first data structure and at least a portion of the second data structure, an output response, and to present the output response to a user in the augmented reality environment.
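
The data flow in the abstract can be sketched as three chained stages. This is a minimal illustration only: the functions `toy_encode`, `toy_relevance`, and `toy_respond` are hypothetical stand-ins for the first, second, and third machine learning models, and the length-based features and the response string are assumptions made purely to keep the example runnable. None of it is taken from the patent.

```python
from typing import Dict, List

Vector = List[float]


def toy_encode(components: Dict[str, str]) -> Dict[str, Vector]:
    """Stand-in for the first model: one small vector per input component."""
    return {name: [float(len(text)), float(len(text.split()))]
            for name, text in components.items()}


def toy_relevance(first: Dict[str, Vector]) -> Vector:
    """Stand-in for the second model: one relevance weight per representation."""
    totals = [sum(vec) for vec in first.values()]
    norm = sum(totals) or 1.0
    return [t / norm for t in totals]


def toy_respond(first: Dict[str, Vector], second: Vector) -> str:
    """Stand-in for the third model: consumes both data structures."""
    names = list(first)
    best = max(range(len(names)), key=lambda i: second[i])
    return f"Responding with emphasis on: {names[best]}"


def run_pipeline(components: Dict[str, str]) -> str:
    first = toy_encode(components)        # first data structure
    second = toy_relevance(first)         # second data structure (relevance vector)
    return toy_respond(first, second)     # output response shown to the user


# Hypothetical input components from an augmented reality session.
components = {
    "prompt": "what is this part",
    "history": "user opened the server rack panel earlier",
    "scene": "rack unit 4",
}
response = run_pipeline(components)
```

The point of the sketch is the wiring: the second stage sees only the first data structure, while the third stage receives both the representations and the relevance vector.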

First claim

Opening claim text (preview).

What is claimed is:

1. An apparatus comprising: at least one processing device comprising a processor coupled to a memory; the at least one processing device being configured: to generate, using a first machine learning model, a first data structure comprising input representations of one or more input components from an augmented reality environment; to generate, using a second machine learning model that takes as input at least a portion of the first data structure, a second data structure comprising at least one vector representation characterizing relevance of one or more of the input representations in the first data structure; to generate, using a third machine learning model that takes as input at least a portion of the first data structure and at least a portion of the second data structure, an output response; and to present the output response to a user in the augmented reality environment.

2. The apparatus of claim 1 wherein the one or more input components comprise: (i) a user prompt received from the user in the augmented reality environment; (ii) a history of interactions associated with the user in the augmented reality environment; and (iii) visual and spatial information associated with the user in the augmented reality environment.

3. The apparatus of claim 2 wherein the first machine learning model used to generate the first data structure comprises one or more large language models.

4. The apparatus of claim 3 wherein the one or more large language models comprise: at least a first text-based large language model configured for generating input representations of the (i) the user prompt received from the user in the augmented reality environment and (ii) the history of interactions associated with the user in the augmented reality environment; and at least a second vision-based large language model configured for generating input representations of (iii) the visual and spatial information associated with the user in the augmented reality environment.

5. The apparatus of claim 1 wherein the second machine learning model comprises a Continuous Attention Memory Model (CAMM).

6. The apparatus of claim 5 wherein the CAMM comprises: a continuous attention mechanism configured to compute attention weights between the input representations of the first data structure and a query vector; a dynamic memory bank configured to store and update information from the input representations of the first data structure as memory items, each of the memory items comprising a vector representation encoding information from at least one of the input representations of the first data structure; and a context relevance estimator configured to rank the memory items according to a relevance to a current context of the augmented reality environment.

7. The apparatus of claim 6 wherein the query vector is initialized randomly and updated iteratively utilizing a gradient descent algorithm.

8. The apparatus of claim 6 wherein the continuous attention mechanism is configured to utilize a dot product to compute the attention weights and a sigmoid function.

9. The apparatus of claim 6 wherein the dynamic memory bank comprises a set of memory slots, each of the memory slots comprising at least one of the memory items, the dynamic memory bank having a fixed size of memory items and being configured to store the memory items in a chronological order utilizing a first-in, first-out policy for replacing memory items.

10. The apparatus of claim 6 wherein the context relevance estimator comprises a feed-forward neural network configured to compute relevance scores for the memory items in the dynamic memory bank.

11. The apparatus of claim 1 wherein the third machine learning model comprises one or more large language models conditioned on said at least a portion of the second data structure.

12. The apparatus of claim 11 wherein the one or more large language models further incorporates visual and spatial information from the augmented reality environment for customizing the output response based at least in part on a view of the user in the augmented reality environment.

13. The apparatus of claim 1 wherein the at least one processing device is further configured to update at least one of the first machine learning model, the second machine learning model and the third machine learning model according to one or more user preferences of the user in the augmented reality environment.

14. The apparatus of claim 13 wherein the one or more user preferences of the user in the augmented reality environment are determined based at least in part on at least one of: sentiment analysis extracting emotions from text or speech of the user captured in the augmented reality environment; facial expression recognition to detect emotions from one or more images of the user captured in the augmented reality environment; and reinforcement learning to learn from rewards or penalties determined from user interaction in the augmented reality environment.

15. A computer program product comprising a non-transitory processor-readable storage medium having stored therein program code of one or more software programs, wherein the program code when executed by at least one processing device causes the at least one processing device: to generate, using a first machine learning model, a first data structure comprising input representations of one or more input components from an augmented reality environment; to generate, using a second machine learning model that takes as input at least a portion of the first data structure, a second data structure comprising at least one vector representation characterizing relevance of one or more of the input representations in the first data structure; to generate, using a third machine learning model that takes as input at least a portion of the first data structure and at least a portion of the second data structure, an output response; and to present the output response to a user in the augmented reality environment.

16. The computer program product of claim 15 wherein the one or more input components comprise: (i) a user prompt received from the user in the augmented reality environment; (ii) a history of interactions associated with the user in the augmented reality environment; and (iii) visual and spatial information associated with the user in the augmented reality environment.

17. The computer program product of claim 15 wherein the second machine learning model comprises a Continuous Attention Memory Model (CAMM), the CAMM comprising: a continuous attention mechanism configured to compute attention weights between the input representations of the first data structure and a query vector; a dynamic memory bank configured to store and update information from the input representations of the first data structure as memory items, each of the memory items comprising a vector representation encoding information from at least one of the input representations of the first data structure; and a context relevance estimator configured to rank the memory items according to a relevance to a current context of the augmented reality environment.

18. A method comprising: generating, using a first machine learning model, a first data structure comprising input representations of one or more input components from an augmented reality environment; generating, using a second machine learning model that takes as input at least a portion of the first data structure, a second dat
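
The CAMM components recited in claims 6 to 10 can be sketched in a few lines of standard-library Python. This is an illustrative reading under stated assumptions, not the patented implementation: the squared-error loss and the `targets` vector used to update the query, the single ReLU hidden layer, the hand-picked network weights, and all class and function names are hypothetical. The claims themselves specify only a dot product with a sigmoid, a randomly initialized query updated by gradient descent, a fixed-size first-in, first-out memory bank, and a feed-forward relevance scorer.

```python
import math
import random
from collections import deque
from typing import List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def attention_weights(items: List[Vector], query: Vector) -> List[float]:
    """Dot product with the query, squashed by a sigmoid (claim 8)."""
    return [sigmoid(dot(item, query)) for item in items]


def loss(items: List[Vector], query: Vector, targets: List[float]) -> float:
    """Assumed objective: squared error between weights and target weights."""
    weights = attention_weights(items, query)
    return sum((w - t) ** 2 for w, t in zip(weights, targets))


def update_query(items, query, targets, lr=0.2) -> Vector:
    """One gradient-descent step on the assumed squared-error loss (claim 7)."""
    grad = [0.0] * len(query)
    for item, t in zip(items, targets):
        w = sigmoid(dot(item, query))
        coeff = 2.0 * (w - t) * w * (1.0 - w)  # d(loss)/d(dot product)
        for j, x in enumerate(item):
            grad[j] += coeff * x
    return [q - lr * g for q, g in zip(query, grad)]


class MemoryBank:
    """Fixed-size store; the oldest item is replaced first (claim 9)."""

    def __init__(self, size: int) -> None:
        self.slots = deque(maxlen=size)

    def store(self, item: Vector) -> None:
        self.slots.append(list(item))


class RelevanceEstimator:
    """Tiny feed-forward network with hypothetical, untrained weights (claim 10)."""

    def __init__(self, w_hidden: List[Vector], w_out: Vector) -> None:
        self.w_hidden = w_hidden
        self.w_out = w_out

    def score(self, item: Vector, context: Vector) -> float:
        x = item + context  # concatenate memory item and context vector
        hidden = [max(0.0, dot(row, x)) for row in self.w_hidden]  # ReLU layer
        return sigmoid(dot(self.w_out, hidden))

    def rank(self, items: List[Vector], context: Vector) -> List[Vector]:
        return sorted(items, key=lambda it: self.score(it, context), reverse=True)


bank = MemoryBank(size=3)
for vec in ([1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, -1.0]):
    bank.store(vec)  # the fourth store evicts [1.0, 0.0]
items = list(bank.slots)

random.seed(0)
query = [random.uniform(-1.0, 1.0) for _ in range(2)]  # random initialization
targets = [1.0, 1.0, 0.0]  # hypothetical desired attention per memory item
loss_before = loss(items, query, targets)
for _ in range(300):
    query = update_query(items, query, targets)
loss_after = loss(items, query, targets)

estimator = RelevanceEstimator(
    w_hidden=[[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0]], w_out=[1.0, -1.0])
ranked = estimator.rank(items, context=[0.5, 0.0])
```

Because each weight passes through its own sigmoid instead of a shared softmax, the attention weights are independent values between 0 and 1 and need not sum to 1. A `deque` with `maxlen` gives the first-in, first-out replacement without any explicit eviction code.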

Assignees

Dell Products L.P.

Classifications

  • Processing or translation of natural language (natural language analysis G06F40/20; semantic analysis G06F40/30)

  • Semantic analysis

  • Facial expression recognition

  • for estimating an emotional state

  • Generative networks

Frequently asked questions

Answers are generated from the same data shown on this page.

Who is the assignee on this patent?
Dell Products L.P.
What technology area does this patent fall under?
Primary CPC classification G06T11/00. Mapped technology areas include Physics.
When was this patent published?
Publication date: Mar 24, 2026 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).