What technology area does this patent fall under?

Primary CPC classification B25J9/163. Mapped technology areas include Operations & Transport.

When was this patent published?

Published Oct 8, 2024 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 9 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Guided uncertainty-aware policy optimization: combining model-free and model-based strategies for sample-efficient learning

US12109701B2 · US · B2

Patent metadata
Field	Value
Publication number	US-12109701-B2
Application number	US-202016780465-A
Country	US
Kind code	B2
Filing date	Feb 3, 2020
Priority date	Nov 20, 2019
Publication date	Oct 8, 2024
Grant date	Oct 8, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A robot is controlled using a combination of model-based and model-free control methods. In some examples, the model-based method uses a physical model of the environment around the robot to guide the robot. The physical model is oriented using a perception system such as a camera. Characteristics of the perception system may be used to determine an uncertainty for the model. Based at least in part on this uncertainty, the system transitions from the model-based method to a model-free method where, in some embodiments, information provided directly from the perception system is used to direct the robot without reliance on the physical model.
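The handoff described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the function names, the weighted-spread uncertainty measure, and the factor `k` are all assumptions chosen for illustration. The sketch treats the perception output as weighted 3-D position hypotheses for a target object and switches to the model-free controller once the robot is within `k` times the spread of those hypotheses.

```python
import math


def weighted_pose_spread(positions, weights):
    """Return the weighted mean position and an RMS spread around it.

    positions: list of (x, y, z) hypotheses from the perception system.
    weights:   non-negative weight per hypothesis (need not sum to 1).
    """
    total = sum(weights)
    mean = tuple(
        sum(w * p[i] for p, w in zip(positions, weights)) / total
        for i in range(3)
    )
    variance = sum(
        w * sum((p[i] - mean[i]) ** 2 for i in range(3))
        for p, w in zip(positions, weights)
    ) / total
    return mean, math.sqrt(variance)


def choose_mode(robot_position, positions, weights, k=2.0):
    """Pick a control mode from the distance to the estimated target.

    Hypothetical rule: far from the target the physical model is trusted;
    within k * spread of the target estimate, control is handed to the
    model-free (learned) policy.
    """
    mean, spread = weighted_pose_spread(positions, weights)
    distance = math.dist(robot_position, mean)
    return "model_free" if distance <= k * spread else "model_based"
```

With two equally weighted hypotheses 2 cm apart, a robot 1 m away stays in the model-based mode and switches only once it comes within about 2 cm of the mean estimate.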

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method, comprising: dividing at least a portion of a physical model created based at least in part on information from a perception system into a plurality of regions; generating estimates of uncertainty for the plurality of regions based at least in part on at least one uncertainty estimation provided by the perception system; using the physical model to control a robot in any of the plurality of regions associated with any of the estimates of uncertainty that indicate the robot is unlikely to interact with its environment; and using at least one reinforcement learning process instead of using the physical model to control the robot in any of the plurality of regions associated with any of the estimates of uncertainty that indicate the robot is likely to interact with its environment.

2. The computer-implemented method of claim 1, wherein: the perception system is a stationary camera; and the at least one reinforcement learning process controls the robot using data collected by a camera mounted on the robot.

3. The computer-implemented method of claim 1, wherein the plurality of regions comprises a first set of regions that are associated with any of the estimates of uncertainty that indicate the robot is unlikely to interact with its environment and a second set of regions that are associated with any of the estimates of uncertainty that indicate the robot is likely to interact with its environment, and the method further comprises: using the physical model to move the robot within at least one region in the first set of regions to at least one region in the second set of regions; and using the at least one reinforcement learning process to control the robot to complete a task in one or more of the second set of regions.

4. The computer-implemented method of claim 1, further comprising: generating the physical model based at least in part on image data collected by at least one camera.

5. The computer-implemented method of claim 1, wherein the at least one uncertainty estimation provided by the perception system comprises a nonparametric distribution of a plurality of poses of a region of the plurality of regions and an associated weights for each of the plurality of poses.

6. The computer-implemented method of claim 1, wherein the at least one uncertainty estimation provided by the perception system comprises a parametric distribution.

7. The computer-implemented method of claim 1, wherein using the physical model to control the robot comprises moving the robot using a controller that uses target attractors defined by motion policies of the robot.

8. The computer-implemented method of claim 1, wherein the reinforcement learning process is performed using an autoencoder that is trained using input from at least one camera mounted on the robot instead of relying on the physical model.

9. A computer system comprising: one or more processors; and computer-readable memory storing executable instructions that, as a result of being executed by the one or more processors, cause the computer system to: divide at least a portion of a physical model created based at least in part on information from a perception system into a plurality of regions; generate estimates of uncertainty for the plurality of regions based at least in part on at least one uncertainty estimation provided by the perception system, a first set of the plurality of regions to comprise any of the plurality of regions associated with any of the estimates of uncertainty that indicate a robot is unlikely to interact with its environment, a second set of the plurality of regions to comprise any of the plurality of regions associated with any of the estimates of uncertainty that indicate the robot is likely to interact with its environment; determine if the robot is positioned in at least one of the first set of regions or at least one of the second set of regions; use the physical model to control the robot if it is determined that the robot is in at least one of the first set of regions; and use at least one reinforcement learning process instead of using the physical model to control the robot if it is determined that the robot is in at least one of the second set of regions.

10. The computer system of claim 9, wherein the at least one reinforcement learning process is to control the robot using data collected by a wrist-mounted camera positioned on the robot.

11. The computer system of claim 9, wherein the instructions, as a result of being executed by the one or more processors, cause the computer system to updates the second set of regions using a result of controlling the robot in at least a portion of the plurality of regions.

12. The computer system of claim 11, wherein the instructions, as a result of being executed by the one or more processors, cause the computer system to: perform a task using the robot, and use a result of the task to modify at least one of the first or second sets of regions.

13. The computer system of claim 9, wherein the instructions, as a result of being executed by the one or more processors, cause the computer system to orient the physical model by at least using a deep object pose estimator to process data obtained by the perception system.

14. The computer system of claim 9, wherein: the perception system comprises a first camera; and the at least one reinforcement learning process controls the robot using data collected by a second camera; and the first and the second cameras are different cameras.

15. The computer system of claim 9, wherein the perception system comprises a camera; and the instructions, as a result of being executed by the one or more processors, cause the computer system to use image data collected by the camera to generate a plurality of possible poses that are consistent with the image data collected by the camera.

16. The computer system of claim 9, wherein the instructions, as a result of being executed by the one or more processors, cause the computer system to cause a controller that uses target attractors defined by motion policies of the robot to move the robot within the first set of regions.

17. A non-transitory computer-readable medium having stored thereon instructions that, which if performed by one or more processors, cause the one or more processors to at least: partition at least a portion of a physical model into a plurality of regions; generate estimates of uncertainty for the plurality of regions; identifying a first set of the plurality of regions based at least in part on any of the estimates of uncertainty associated with the first set of regions; identifying a second set of the plurality of regions based at least in part on any of the estimates of uncertainty associated with the second set of regions; use the physical model to control a robot in any of the first set of regions; and use at least one reinforcement learning process instead of using the physical model to control the robot in any of the second set of regions.

18. The non-transitory computer-readable medium of claim 17, wherein: the physical model is to be created based at least in part on information obtained from a perception system comprising at least one stationary camera; and the at least one reinforcement learning process controls the robot using data collected by at least one camera that moves with the robot.

19. The non-transitory computer-readable medium of claim 17, wherein the instructions, if performed by the one or more proc
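The region-based structure of claims 1 and 5 can be pictured with a small sketch. It is one possible reading, not the claimed method itself: the 2-D grid partition, the cell size, the probability threshold, and every function name are assumptions made for illustration. The perception output is taken to be a nonparametric set of weighted position samples for the object of interest, and a region counts as "likely to interact" when enough of that weight falls inside it.

```python
import math
from collections import defaultdict


def region_of(point, cell=0.1):
    """Map a 2-D point (metres) to the integer index of its grid cell."""
    return (math.floor(point[0] / cell), math.floor(point[1] / cell))


def interaction_probability(samples, weights, cell=0.1):
    """Accumulate normalized sample weight per grid cell.

    samples: weighted position hypotheses from the perception system
             (a nonparametric distribution, as in claim 5).
    Returns {region index: share of total weight inside that region}.
    """
    total = sum(weights)
    prob = defaultdict(float)
    for point, weight in zip(samples, weights):
        prob[region_of(point, cell)] += weight / total
    return dict(prob)


def assign_controllers(regions, prob, threshold=0.05):
    """Label each region with the controller to use inside it.

    Hypothetical rule: regions holding at least `threshold` of the weight
    are treated as likely contact regions and handed to the learned policy;
    all other regions are driven with the physical model.
    """
    return {
        region: "reinforcement_learning"
        if prob.get(region, 0.0) >= threshold
        else "physical_model"
        for region in regions
    }
```

In use, a planner would look up the region containing the robot's current position with `region_of` and dispatch to the controller named in the resulting map, which mirrors the "determine if the robot is positioned in" step of claim 9.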

Assignees

Nvidia Corp

Inventors

Classifications

G06N3/092
Reinforcement learning · CPC title
G06N3/0499
Feedforward networks · CPC title
G06N3/0455
Auto-encoder networks; Encoder-decoder networks · CPC title
G06N3/09
Supervised learning · CPC title
G05B13/04
involving the use of models or simulators · CPC title

Patent family

Related publications grouped by family.

View patent family 75683578

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12109701B2 cover?: A robot is controlled using a combination of model-based and model-free control methods. In some examples, the model-based method uses a physical model of the environment around the robot to guide the robot. The physical model is oriented using a perception system such as a camera. Characteristics of the perception system may be used to determine an uncertainty for the model. Based at least…
Who is the assignee on this patent?: Nvidia Corp