Dynamically adjusting instructions in an augmented-reality experience

US12254785B2 · US · B2

Patent metadata
Field                 Value
Publication number    US-12254785-B2
Application number    US-202217969303-A
Country               US
Kind code             B2
Filing date           Oct 19, 2022
Priority date         Oct 19, 2022
Publication date      Mar 18, 2025
Grant date            Mar 18, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Systems and methods for augmented-reality tutoring can utilize optical character recognition, natural language processing, and/or augmented-reality rendering for providing real-time notifications for completing a determined task. The systems and methods can include utilizing one or more machine-learned models trained for quantitative reasoning and can include providing a plurality of different user interface elements at different times.
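
The abstract mentions providing different user interface elements at different times. A minimal sketch of that idea follows, assuming the recognized text and the model's multi-part response are already available as plain strings. The `Overlay` type, the three overlay kinds, and the `choose_overlay` function are hypothetical names introduced for illustration; they do not come from the patent, and the OCR and model steps are not implemented here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Overlay:
    """Hypothetical user interface element to superimpose on a camera frame."""
    kind: str  # "prompt", "error", or "complete" (assumed names)
    text: str


def normalize(line: str) -> str:
    """Drop all whitespace so '2x = 8' and '2x=8' compare equal."""
    return "".join(line.split())


def choose_overlay(reference_steps: List[str], recognized_lines: List[str]) -> Overlay:
    """Pick one overlay for the current frame.

    reference_steps stands in for a model-generated multi-part response.
    recognized_lines stands in for OCR output of the user's writing so far.
    """
    for index, line in enumerate(recognized_lines):
        if index >= len(reference_steps):
            break
        if normalize(line) != normalize(reference_steps[index]):
            expected = reference_steps[index]
            return Overlay("error", f"Check step {index + 1}: expected {expected}")
    if len(recognized_lines) >= len(reference_steps):
        return Overlay("complete", "All steps match.")
    next_step = len(recognized_lines) + 1
    return Overlay("prompt", f"Next: step {next_step} of {len(reference_steps)}")


# Usage: the overlay changes as the user writes more lines.
reference = ["2x + 3 = 11", "2x = 8", "x = 4"]
print(choose_overlay(reference, []))
print(choose_overlay(reference, ["2x+3=11", "2x = 9"]))
print(choose_overlay(reference, reference))
```

Exact string matching after whitespace removal is a deliberate simplification. The patent describes machine-learned models for this comparison, so a real system would not rely on literal equality.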

First claim

Opening claim text (preview).

What is claimed is:

1. A computing system, the system comprising: one or more processors; and one or more non-transitory computer-readable media that collectively store instructions that, when executed by the one or more processors, cause the computing system to perform operations, the operations comprising: obtaining image data, wherein the image data is descriptive of one or more images, wherein the one or more images are descriptive of an environment; processing the image data with a machine-learned model to generate semantic data, wherein the semantic data is descriptive of a semantic understanding of at least a portion of the one or more images, wherein the machine-learned model comprises a language model trained for multi-part quantitative reasoning, wherein the language model was trained on a plurality of mathematical proofs; processing the semantic data with the machine-learned model to generate a multi-part response for a detected problem in the one or more images, wherein the multi-part response is descriptive of a proof for the detected problem; determining an error in the one or more images based at least in part on the multi-part response; determining a corrective action based on the multi-part response and the error, wherein the corrective action is descriptive of at least one of a replacement for the error or an action to fix the error; generating one or more augmented images based on the correction action and the one or more images, wherein the one or more augmented images comprise one or more user interface elements rendered into the one or more images, wherein the one or more user interface elements comprise text superimposed over at least a portion of the one or more images, wherein the text of the one or more user interface elements comprise informational data descriptive of the corrective action; providing the one or more augmented images for display based on the corrective action; obtaining additional image data after providing the one or more augmented images for display, wherein the additional image data is descriptive of one or more additional images, wherein the one or more additional images are descriptive of the one or more pages with user-generated text; processing the additional image data and the multi-part response to determine a particular portion of the user-generated text deviates from the multi-part response; and generating one or more second augmented images that indicate the particular portion of the user-generated text that has a determined error.

2. The system of claim 1, wherein determining the error in the one or more images based at least in part on the semantic data, comprises: obtaining a particular machine-learned model based on the semantic data; and processing the image data with the particular machine-learned model to detect the error.

3. The system of claim 1, wherein the error comprises an inconsistency with the semantic understanding.

4. The system of claim 1, wherein the error comprises a deviation from a multi-part process, wherein the multi-part process is associated with the semantic data.

5. The system of claim 1, wherein determining the corrective action based on the semantic data and the error comprises: detecting a position of the error within the environment; determining an errorless dataset associated with the semantic data and the one or more images; and determining replacement data from the errorless dataset based on the position of the error within the environment.

6. The system of claim 1, wherein the error is determined with an error detection model, wherein the error detection model: generates text data based on optical character recognition; parses the text data based on one or more features in the environment; and processes each parsed segment of a plurality of parsed segments to determine the error.

7. The system of claim 6, wherein the error detection model is trained on a plurality of mathematical proofs.

8. The system of claim 6, wherein the error detection model comprises an optical character recognition model and a natural language processing model.

9. The system of claim 1, wherein the image data is generated by one or more image sensors of a mobile computing device, and wherein the one or more user interface elements are provided for display via the mobile computing device.

10. The system of claim 9, wherein the mobile computing device is a smart wearable.

11. A computer-implemented method, the method comprising: obtaining, by a computing system comprising one or more processors, image data with one or more image sensors of a user computing device, wherein the image data is descriptive of one or more images, wherein the one or more images are descriptive of one or more pages; processing, by the computing system, the image data with an optical character recognition model to generate text data, wherein the text data is descriptive of text on the one or more pages; determining, by the computing system, a prompt based on the text data, wherein the prompt is descriptive of a request for a response; determining, by the computing system, text data comprises a problem of a particular problem type; in response to determining the text data comprises the problem of the particular problem type, obtaining, by the computing system, a problem-specific machine-learned model associated with the particular problem type; processing, by the computing system, the prompt with the problem-specific machine-learned model to generate a multi-part response to the prompt, wherein the multi-part response comprises a plurality of individual responses associated with the prompt, wherein the problem-specific machine-learned model comprises a language model trained for multi-part quantitative reasoning, wherein the language model was trained on a plurality of mathematical proofs, wherein the multi-part response is descriptive of a proof for the detected problem; obtaining, by the computing system, additional image data, wherein the additional image data is descriptive of one or more additional images, wherein the one or more additional images are descriptive of the one or more pages with user-generated text; processing, by the computing system, the additional image data with the optical character recognition model to generate additional text data, wherein the additional text data is descriptive of the user-generated text on the one or more pages; determining, by the computing system, the user-generated text deviates from the multi-part response; and providing, by the computing system, a notification rendered in an augmented-reality experience via the user computing device, wherein the notification is descriptive of the user-generated text having an error.

12. The method of claim 11, wherein determining, by the computing system, the user-generated text deviates from the multi-part response comprises: determining the user-generated text contradicts the multi-part response.

13. The method of claim 11, wherein determining, by the computing system, the user-generated text deviates from the multi-part response comprises: determining the user-generated text lacks one or more particular features of the multi-part response.

14. The method of claim 11, wherein the one or more pages comprise one or more questions, and wherein the user-generated text comprises a user response to the one or more questions.

15. The method of claim 11, further comprising: processing, by the computing system, the image data with the machine-learned model to determine the prompt.

16. The method of claim 11, further comprising:
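
Claims 12 and 13 name two ways the user-generated text can deviate from the multi-part response: it can contradict the response, or it can lack features of the response. The sketch below illustrates that distinction on a toy case, assuming both the response and the user text have already been reduced to lines of the form `lhs = rhs`. The functions `parse_assignments` and `find_deviations` and the labels `"contradicts"` and `"missing"` are hypothetical, and simple string comparison stands in for the machine-learned models the claims describe.

```python
from typing import Dict, List, Tuple


def parse_assignments(lines: List[str]) -> Dict[str, str]:
    """Map each 'lhs = rhs' line to {lhs: rhs}; lines of other shapes are skipped."""
    parsed: Dict[str, str] = {}
    for line in lines:
        if line.count("=") != 1:
            continue
        lhs, rhs = (part.replace(" ", "") for part in line.split("="))
        parsed[lhs] = rhs
    return parsed


def find_deviations(reference: List[str], user: List[str]) -> List[Tuple[str, str]]:
    """Compare user lines against reference steps.

    Returns (label, text) pairs, where label is:
      "contradicts": the user wrote the same left-hand side with a different value
      "missing":     a reference step has no counterpart in the user text
    """
    ref = parse_assignments(reference)
    usr = parse_assignments(user)
    deviations: List[Tuple[str, str]] = []
    for lhs, rhs in ref.items():
        if lhs not in usr:
            deviations.append(("missing", f"{lhs} = {rhs}"))
        elif usr[lhs] != rhs:
            deviations.append(("contradicts", f"{lhs} = {usr[lhs]}"))
    return deviations


# Usage: one wrong value, then one omitted step.
reference = ["2x = 8", "x = 4"]
print(find_deviations(reference, ["2x = 8", "x = 5"]))
print(find_deviations(reference, ["2x = 8"]))
```

Keeping the two labels separate lets a renderer choose a different notification for each case, for example highlighting a wrong value versus prompting for an omitted step.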

Assignees

Google LLC

Inventors

Classifications

  • using evolutionary algorithms, e.g. genetic algorithms or genetic programming

  • Backpropagation, e.g. using gradient descent

  • Convolutional networks [CNN, ConvNet]

  • Combinations of networks

  • Recurrent networks, e.g. Hopfield networks

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12254785B2 cover?
Systems and methods for augmented-reality tutoring can utilize optical character recognition, natural language processing, and/or augmented-reality rendering for providing real-time notifications for completing a determined task. The systems and methods can include utilizing one or more machine-learned models trained for quantitative reasoning and can include providing a plurality of different …
Who is the assignee on this patent?
Google LLC
What technology area does this patent fall under?
Primary CPC classification G09B7/04. Mapped technology areas include Physics.
When was this patent published?
Publication date Mar 18, 2025 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 6 related publications on this page (citations in our corpus or others sharing the same primary CPC).