What technology area does this patent fall under?

Primary CPC classification: G06V40/20 (movements or behaviour, e.g. gesture recognition). Mapped technology area: Physics (CPC section G).

When was this patent published?

Publication date: Mar 10, 2026 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).

Displaying a recipe preparation suggestion in an augmented reality element based on a predicted recipe being prepared

US12573158B2 · US · B2

Patent metadata
Field	Value
Publication number	US-12573158-B2
Application number	US-202418753880-A
Country	US
Kind code	B2
Filing date	Jun 25, 2024
Priority date	Jun 25, 2024
Publication date	Mar 10, 2026
Grant date	Mar 10, 2026
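
The publication number appears on this page in two spellings, `US-12573158-B2` and `US12573158B2`. A minimal sketch of splitting either form into country, serial number, and kind code follows. The function name and the regular expression are illustrative assumptions that cover only the two forms shown here, not every publication-number format in use.

```python
import re
from typing import NamedTuple


class PublicationNumber(NamedTuple):
    country: str  # two-letter country code, e.g. "US"
    number: str   # serial digits, kept as a string to preserve leading zeros
    kind: str     # kind code, e.g. "B2"


def parse_publication_number(text: str) -> PublicationNumber:
    """Split 'US-12573158-B2' or 'US12573158B2' into its three parts.

    Hypothetical helper: the pattern is an assumption based on the two
    spellings on this page.
    """
    match = re.fullmatch(r"([A-Z]{2})-?(\d+)-?([A-Z]\d?)", text.strip())
    if match is None:
        raise ValueError(f"unrecognized publication number: {text!r}")
    return PublicationNumber(*match.groups())


parsed = parse_publication_number("US-12573158-B2")
print(parsed.country, parsed.number, parsed.kind)
```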

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A client device, or an online system communicating with the device, receives video data depicting a field of view of a display area of the device and applies machine-learning algorithms to the video data to detect objects, including portions of a body of a user of the device, within the field of view and to determine a series of body poses. The device/system uses machine-learning models to predict an action performed by the user based on the series of poses and to predict a recipe being prepared based on the objects and a predicted series of actions performed by the user. The device/system selects a suggestion associated with preparing the recipe based on candidate suggestions associated with preparing the recipe, the objects, or the predicted series of actions, and generates an augmented reality element describing the suggestion. The augmented reality element is displayed in the display area of the device.
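
The data flow in the abstract (detected objects and predicted actions, then a predicted recipe, then a selected suggestion, then an augmented reality element) can be sketched as below. This is only an illustration under loud assumptions: the patent describes trained machine-learning models, whereas this sketch substitutes a simple overlap score, and every name in it (`Recipe`, `predict_recipe`, `next_suggestion`, `build_ar_element`) and all sample data are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Recipe:
    name: str
    actions: list      # ordered actions associated with preparing the recipe
    objects: set       # ingredients and tools associated with the recipe
    suggestions: dict  # candidate suggestions, keyed here by action (assumption)


def predict_recipe(recipes, observed_actions, observed_objects):
    """Toy stand-in for the recipe-prediction model: highest overlap wins."""
    def score(recipe):
        action_hits = len(set(recipe.actions) & set(observed_actions))
        object_hits = len(recipe.objects & set(observed_objects))
        return action_hits + object_hits
    return max(recipes, key=score)


def next_suggestion(recipe, observed_actions):
    """Return the candidate suggestion for the first action not yet observed."""
    done = set(observed_actions)
    for action in recipe.actions:
        if action not in done:
            return recipe.suggestions.get(action)
    return None


def build_ar_element(recipe, suggestion):
    """Describe the overlay as plain data; rendering is out of scope here."""
    return {"kind": "ar_overlay", "recipe": recipe.name, "text": suggestion}


recipes = [
    Recipe("omelette", ["crack", "whisk", "fry"], {"egg", "pan", "whisk"},
           {"fry": "Heat the pan before adding the eggs."}),
    Recipe("salad", ["chop", "toss"], {"lettuce", "knife", "bowl"},
           {"toss": "Dress the salad just before serving."}),
]

# Pretend outputs of the object-detection and action-prediction stages.
observed_objects = {"egg", "whisk", "hand"}
observed_actions = ["crack", "whisk"]

recipe = predict_recipe(recipes, observed_actions, observed_objects)
element = build_ar_element(recipe, next_suggestion(recipe, observed_actions))
print(element)
```

Keeping each stage as a separate function mirrors the separation between detection, prediction, selection, and display in the abstract, so a real model could replace any single stand-in without touching the others.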

First claim

Opening claim text (preview).

What is claimed is:

1. A method, performed at a computer system comprising a processor and a computer-readable medium, comprising:
receiving video data captured by a camera of a client device, wherein the video data depicts a field of view of a display area of the client device;
applying one or more machine-learning algorithms to the video data to detect one or more objects within the field of view of the display area of the client device, the one or more objects comprising one or more portions of a body of a user associated with the client device;
for each timeframe of a plurality of timeframes of the video data, applying one or more machine-learning algorithms to determine a series of poses of the one or more portions of the body of the user associated with the client device;
accessing a first machine-learning model trained to predict an action being performed by the user;
for each timeframe of the plurality of timeframes, applying the first machine-learning model to predict the action being performed by the user based at least in part on the series of poses;
accessing a second machine-learning model trained to predict a recipe being prepared by the user, wherein the second machine-learning model is trained by: receiving recipe data for a plurality of recipes, wherein a set of recipe data for each recipe of the plurality of recipes describes a series of actions and a set of objects associated with preparing a corresponding recipe, receiving, for each recipe of the plurality of recipes, a label describing a corresponding recipe, and training the second machine-learning model based at least in part on the recipe data and the label for each recipe of the plurality of recipes;
applying the second machine-learning model to predict the recipe being prepared by the user based at least in part on a predicted series of actions being performed by the user during the plurality of timeframes and the one or more objects;
retrieving a set of recipe data for the recipe, the set of recipe data comprising a set of candidate suggestions associated with preparing the recipe;
selecting one or more suggestions associated with preparing the recipe based at least in part on one or more of: the set of candidate suggestions, the one or more objects, or the predicted series of actions;
generating an augmented reality element comprising information describing the one or more suggestions associated with preparing the recipe; and
displaying the augmented reality element in the display area of the client device, wherein the augmented reality element is overlaid onto a portion of the display area.

2. The method of claim 1, wherein applying one or more machine-learning algorithms to the video data to detect one or more objects within the field of view of the display area of the client device comprises applying one or more machine-learning algorithms to the video data to detect one or more ingredients of a recipe.

3. The method of claim 1, wherein applying one or more machine-learning algorithms to the video data to detect one or more objects within the field of view of the display area of the client device comprises applying one or more machine-learning algorithms to the video data to detect one or more tools used to prepare a recipe.

4. The method of claim 1, wherein generating the augmented reality element comprising information describing the one or more suggestions associated with preparing the recipe comprises generating the augmented reality element comprising one or more of: a guide for preparing the recipe, information describing a technique associated with preparing the recipe, or information describing an additional recipe.

5. The method of claim 4, wherein generating the augmented reality element comprising the guide for preparing the recipe comprises:
sending a prompt to the client device to confirm the user is preparing the recipe; and
responsive to receiving a response to the prompt confirming the user is preparing the recipe, generating the augmented reality element comprising a set of step-by-step instructions for preparing the recipe.

6. The method of claim 5, wherein generating the augmented reality element comprising the set of step-by-step instructions for preparing the recipe comprises:
responsive to predicting an action associated with a step included in the set of step-by-step instructions has been completed, generating the augmented reality element comprising a video associated with a subsequent step included in the set of step-by-step instructions.

7. The method of claim 4, wherein generating the augmented reality element comprising information describing the technique associated with preparing the recipe comprises:
accessing an additional machine-learning model trained to predict a measure of deviation of the predicted action being performed by the user from the technique, wherein the additional machine-learning model is trained by: receiving action data for a plurality of actions associated with the technique, receiving, for each action of the plurality of actions, a label describing the measure of deviation of a corresponding action from the technique, and training the additional machine-learning model based at least in part on the action data and the label for each action of the plurality of actions;
applying the additional machine-learning model to predict the measure of deviation of the predicted action being performed by the user from the technique based at least in part on the video data;
determining that the predicted measure of deviation is at least a threshold measure of deviation; and
responsive to determining that the predicted measure of deviation is at least the threshold measure of deviation, generating the augmented reality element comprising a video demonstrating the technique.

8. The method of claim 4, wherein generating the augmented reality element comprising information describing the additional recipe comprises:
identifying the additional recipe based at least in part on the one or more objects;
identifying an additional set of objects associated with the recipe, wherein the additional set of objects is not included among the one or more objects; and
generating the augmented reality element comprising one or more of: information describing the additional recipe, information describing the additional set of objects, and a suggestion to place an order including the additional set of objects with an online system.

9. The method of claim 4, wherein generating the augmented reality element comprising information describing the additional recipe comprises:
receiving a request from the user to suggest one or more additional recipes for the user to prepare; and
responsive to receiving the request, identifying the additional recipe based at least in part on the one or more objects.

10. The method of claim 1, further comprising: sending a prompt to the client device to confirm the user is preparing the recipe.

11. A computer program product comprising a non-transitory computer-readable storage medium having instructions encoded thereon that, when executed by a processor, cause the processor to perform steps comprising:
receiving video data captured by a camera of a client device, wherein the video data depicts a field of view of a display area of the client device;
applying one or more machine-learning algorithms to the video data to detect one or more objects within the field of view of the display area of the client device, the one or more objects comprising one or more portions of a body of a user associated with the client device;
for each timeframe of a plurality of timeframes of the video data, applyin
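
Two of the dependent claims reduce to small decision rules once the model outputs are available. The sketch below illustrates the "at least a threshold" test from claim 7 and the set of objects "not included among" the detected objects from claim 8. The function names, the numeric scale of the deviation measure, and the sample values are assumptions for illustration; the claims do not specify them.

```python
def should_show_technique_video(predicted_deviation: float, threshold: float) -> bool:
    """Claim 7 style check: show a demonstration video when the predicted
    deviation is at least the threshold (so equality also triggers it)."""
    return predicted_deviation >= threshold


def additional_objects(recipe_objects: set, detected_objects: set) -> set:
    """Claim 8 style check: objects the recipe calls for that were not
    detected in the field of view, e.g. candidates for an order suggestion."""
    return set(recipe_objects) - set(detected_objects)


# Hypothetical values standing in for model outputs and recipe data.
print(should_show_technique_video(0.5, threshold=0.5))
print(sorted(additional_objects({"egg", "butter", "chives"}, {"egg", "pan"})))
```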

Assignees

Maplebear Inc

Inventors

Classifications

G06V40/20 (Primary)
Movements or behaviour, e.g. gesture recognition (recognition of facial expressions G06V40/16) · CPC title
G06V2201/07
Target detection · CPC title
G06T19/006 (Primary)
Mixed reality (object pose determination, tracking or camera calibration for mixed reality G06T7/00) · CPC title
G06V20/20
in augmented reality scenes · CPC title

Patent family

Related publications grouped by family.

Patent family ID: 98219700

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12573158B2 cover?

A client device, or an online system communicating with the device, receives video data depicting a field of view of a display area of the device and applies machine-learning algorithms to the video data to detect objects, including portions of a body of a user of the device, within the field of view and to determine a series of body poses. The device/system uses machine-learning models to pred…

Who is the assignee on this patent?

Maplebear Inc