Mitigating reality gap through training a simulation-to-real model using a vision-based robot task model

US12498677B2 · US · B2

Patent metadata
Publication number: US-12498677-B2
Application number: US-202017767675-A
Country: US
Kind code: B2
Filing date: May 15, 2020
Priority date: Nov 15, 2019
Publication date: Dec 16, 2025
Grant date: Dec 16, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Implementations disclosed herein relate to mitigating the reality gap through training a simulation-to-real machine learning model (“Sim2Real” model) using a vision-based robot task machine learning model. The vision-based robot task machine learning model can be, for example, a reinforcement learning (“RL”) neural network model (RL-network), such as an RL-network that represents a Q-function.

First claim

Opening claim text (preview).

What is claimed is:

1. A method implemented by one or more processors, the method comprising: processing a simulated image, using a simulation-to-real generator model, to generate a simulated episode predicted real image, wherein the simulated image is generated by a robotic simulator during a simulated episode of a simulated robot attempting performance of a robotic task; processing the simulated episode predicted real image, using a real-to-simulation generator model, to generate a simulated episode predicted simulation image; processing the simulated image along with a simulated robot action, using a task machine learning model being trained for use in the robotic task, to generate a first predicted value; processing the simulated episode predicted real image along with the simulated robot action, using the task machine learning model, to generate a second predicted value; processing the simulated episode predicted simulated image along with the simulated robot action, using the task machine learning model, to generate a third predicted value; generating a loss as a function of comparisons of the first predicted value, the second predicted value, and the third predicted value; and updating the simulation-to-real generator model based on the generated loss.

2. The method of claim 1, further comprising: processing a real image, using the real-to-simulation generator model, to generate a real episode predicted simulation image, wherein the real image is captured by a real camera, associated with a real robot, during a real episode of the real robot attempting performance of the robotic task; processing the real episode predicted simulation image, using the simulation-to-real generator model, to generate a real episode predicted real image; processing the real image along with a real robot action, using the task machine learning model or an additional task machine learning model being trained for use in the robotic task, to generate a fourth predicted value; processing the real episode predicted simulated image along with the real robot action, using the task machine learning model or the additional task machine learning model, to generate a fifth predicted value; and processing the real episode predicted real image along with the real robot action, using the task machine learning model or the additional task machine learning model, to generate a sixth predicted value, wherein generating the loss is further a function of additional comparisons of the fourth predicted value, the fifth predicted value, and the sixth predicted value.

3. The method of claim 2, wherein the comparisons of the first predicted value, the second predicted value, and the third predicted value comprise three comparisons, each of the three comparisons being between a unique pair of the first predicted value, the second predicted value, and the third predicted value.

4. The method of claim 3, wherein the additional comparisons of the fourth predicted value, the fifth predicted value, and the sixth predicted value comprise three additional comparisons, each of the three additional comparisons being between a unique pair of the fourth predicted value, the fifth predicted value, and the sixth predicted value.

5. The method of claim 2, wherein generating the loss is further a function of an adversarial loss and/or a cycle consistency loss, wherein the adversarial loss and the cycle consistency loss are both generated independent of any outputs generated using the task machine learning model or the additional task machine learning model.

6. The method of claim 5, wherein the adversarial loss is generated based on whether a simulation-to-real discriminator model predicts the predicted real image is an actual real image or the predicted real image generated by the simulation-to-real generator, and wherein the cycle consistency loss is generated based on comparison of the simulated image and the simulated episode predicted simulation image.

7. The method of claim 6, wherein generating the loss is further a function of both the adversarial loss and the cycle consistency loss.

8. The method of claim 1, wherein the task machine learning model represents a Q-function, wherein the task machine learning model is being trained during reinforcement learning based on the simulated episode and additional simulated episodes, and wherein the first predicted value is a first Q-value, the second predicted value is a second Q-value, and the third predicted value is a third Q-value.

9. The method of claim 1, further comprising: generating a task machine learning model loss based on the second predicted value; and updating the task machine learning model based on the task machine learning model loss.

10. The method of claim 9, wherein generating the task machine learning model loss is independent of at least the first predicted value and the third predicted value.

11. The method of claim 1, wherein the simulated episode is an offline episode.

12. The method of claim 1, wherein the simulated episode is an online episode.

13. The method of claim 1, wherein the robotic task is an object manipulation task or a navigation task.

14. The method of claim 13, wherein the robotic task is the object manipulation task, and wherein the object manipulation task is a grasping task.

15. A system comprising: at least one processor; and memory storing instructions that, when executed by the at least one processor, cause the at least one processor to be operable to: process a simulated image, using a simulation-to-real generator model, to generate a simulated episode predicted real image, wherein the simulated image is generated by a robotic simulator during a simulated episode of a simulated robot attempting performance of a robotic task; process the simulated episode predicted real image, using a real-to-simulation generator model, to generate a simulated episode predicted simulation image; process the simulated image along with a simulated robot action, using a task machine learning model being trained for use in the robotic task, to generate a first predicted value; process the simulated episode predicted real image along with the simulated robot action, using the task machine learning model, to generate a second predicted value; process the simulated episode predicted simulated image along with the simulated robot action, using the task machine learning model, to generate a third predicted value; generate a loss as a function of comparisons of the first predicted value, the second predicted value, and the third predicted value; and update the simulation-to-real generator model based on the generated loss.

16. The system of claim 15, wherein the at least one processor is further operable to: process a real image, using the real-to-simulation generator model, to generate a real episode predicted simulation image, wherein the real image is captured by a real camera, associated with a real robot, during a real episode of the real robot attempting performance of the robotic task; process the real episode predicted simulation image, using the simulation-to-real generator model, to generate a real episode predicted real image; process the real image along with a real robot action, using the task machine learning model or an additional task machine learning model being trained for use in the robotic task, to generate a fourth predicted value; process the real episode predicted simulated image along with the real robot action, using the task machine learning model or t
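
The loss computation recited in claim 1 (with the three pairwise comparisons of claim 3) can be sketched roughly as follows. This is an illustrative sketch only, not the patented implementation: the function names, the flat-list image representation, and the toy models are all hypothetical, and the use of a squared difference as the "comparison" is an assumption, since the claim text shown here does not specify the comparison function. The gradient step that updates the simulation-to-real generator is omitted.

```python
from itertools import combinations
from typing import Callable, List, Sequence

Image = List[float]   # hypothetical stand-in for an image tensor
Action = List[float]  # hypothetical stand-in for a robot action vector


def pairwise_consistency_loss(values: Sequence[float]) -> float:
    """Sum a comparison over every unique pair of predicted values.

    Assumption: the comparison is a squared difference.
    """
    return sum((a - b) ** 2 for a, b in combinations(values, 2))


def sim_episode_loss(
    sim_image: Image,
    action: Action,
    sim2real: Callable[[Image], Image],
    real2sim: Callable[[Image], Image],
    task_model: Callable[[Image, Action], float],
) -> float:
    """Loss over one simulated-episode image, following the steps of claim 1."""
    predicted_real = sim2real(sim_image)      # simulated episode predicted real image
    predicted_sim = real2sim(predicted_real)  # simulated episode predicted simulation image
    first = task_model(sim_image, action)        # first predicted value
    second = task_model(predicted_real, action)  # second predicted value
    third = task_model(predicted_sim, action)    # third predicted value
    return pairwise_consistency_loss([first, second, third])


# Toy stand-ins (hypothetical): generators shift pixel values, and the
# "Q-function" is a simple sum of image and action entries.
def toy_sim2real(img: Image) -> Image:
    return [p + 1.0 for p in img]


def toy_real2sim(img: Image) -> Image:
    return [p - 0.5 for p in img]


def toy_q(img: Image, act: Action) -> float:
    return sum(img) + sum(act)


if __name__ == "__main__":
    loss = sim_episode_loss([0.0, 0.0], [1.0], toy_sim2real, toy_real2sim, toy_q)
    print(loss)
```

Under this sketch the loss is zero only when the task model assigns the same value to the original, translated, and cycled images, which is the property the generator update is meant to encourage. Per claim 2, the same three-way comparison would be repeated for a real image passed through the real-to-simulation generator first.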

Assignees

Inventors

Classifications

  • B25J19/023 · Primary

    including video camera means · CPC title

  • Vision controlled systems · CPC title

  • B25J9/163 · Primary

    learning, adaptive, model based, rule based expert control · CPC title

  • Simulation of manipulator lay-out, design, modelling of manipulator · CPC title

  • Generative networks · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12498677B2 cover?
Implementations disclosed herein relate to mitigating the reality gap through training a simulation-to-real machine learning model (“Sim2Real” model) using a vision-based robot task machine learning model. The vision-based robot task machine learning model can be, for example, a reinforcement learning (“RL”) neural network model (RL-network), such as an RL-network that represents a Q-function.
Who is the assignee on this patent?
Google LLC
What technology area does this patent fall under?
Primary CPC classification B25J19/023. Mapped technology areas include Operations & Transport.
When was this patent published?
Publication date: Dec 16, 2025 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 5 related publications on this page (citations in our corpus or others sharing the same primary CPC).