Robotic control using action image(s) and critic network

US11607802B2 · US · B2

Patent metadata
Field: Value
Publication number: US-11607802-B2
Application number: US-202016886545-A
Country: US
Kind code: B2
Filing date: May 28, 2020
Priority date: Sep 15, 2019
Publication date: Mar 21, 2023
Grant date: Mar 21, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Generating and utilizing action image(s) that represent a candidate pose (e.g., a candidate end effector pose), in determining whether to utilize the candidate pose in performance of a robotic task. The action image(s) and corresponding current image(s) can be processed, using a trained critic network, to generate a value that indicates a probability of success of the robotic task if component(s) of the robot are traversed to the particular pose. When the value satisfies one or more conditions (e.g., satisfies a threshold), the robot can be controlled to cause the component(s) to traverse to the particular pose in performing the robotic task.
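
The decision step described in the abstract can be sketched as follows. This is a minimal illustration, not the patent's implementation: `select_pose`, `toy_render`, `toy_critic`, the default threshold of 0.5, and the three-number pose format are all hypothetical names and values chosen for the example. The trained critic network is replaced by a stand-in function that returns a score between 0 and 1.

```python
from typing import Any, Callable, Iterable, Optional, Tuple


def select_pose(
    current_image: Any,
    candidate_poses: Iterable[Any],
    render_action_image: Callable[[Any], Any],
    critic: Callable[[Any, Any], float],
    threshold: float = 0.5,  # assumed value; the source only says "a threshold"
) -> Optional[Tuple[Any, float]]:
    """Score each candidate pose and return the best one if it meets the threshold.

    Returns (pose, value) for the highest-valued candidate, or None when no
    candidate reaches the threshold (including when there are no candidates).
    """
    best_pose, best_value = None, float("-inf")
    for pose in candidate_poses:
        # Represent the candidate pose as an action image, then let the
        # critic score it together with the current image.
        action_image = render_action_image(pose)
        value = critic(current_image, action_image)
        if value > best_value:
            best_pose, best_value = pose, value
    if best_value < threshold:
        return None
    return best_pose, best_value


def toy_render(pose: Tuple[float, float, float]) -> Tuple[float, float, float]:
    """Hypothetical placeholder: a real system would render an action image."""
    return pose


def toy_critic(current_image: Any, action_image: Tuple[float, float, float]) -> float:
    """Hypothetical stand-in for the trained critic: higher near a fixed target."""
    target = (0.4, 0.0, 0.1)
    dist_sq = sum((a - t) ** 2 for a, t in zip(action_image, target))
    return 1.0 / (1.0 + 10.0 * dist_sq)


if __name__ == "__main__":
    poses = [(0.0, 0.0, 0.5), (0.4, 0.0, 0.1), (0.3, 0.1, 0.1)]
    print(select_pose(None, poses, toy_render, toy_critic, threshold=0.9))
```

Returning `None` when no candidate clears the threshold leaves it to the caller to decide whether to resample poses or wait for new vision data; the source does not specify that fallback.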

First claim

Opening claim text (preview).

What is claimed is:

1. A method implemented by one or more processors of a robot in performing a robotic task, the method comprising: identifying a current image that is based on at least part of a current instance of vision data captured by a vision component of the robot; identifying a particular action image that includes projections, for N points of an end effector of the robot for a particular pose of the end effector, onto a vision frame of the vision component, wherein N is an integer greater than one; processing, utilizing a trained critic network that represents a learned value function, the current image and the particular action image, wherein processing the current image utilizing the trained critic network comprises processing current image pixels, of the current image, utilizing the trained critic network; generating, based on the processing, a value for the particular pose, wherein the value for the particular pose indicates a probability of success of the robotic task if the end effector is traversed to the particular pose; in response to determining that the value satisfies one or more conditions: controlling the robot to cause the end effector to traverse to the particular pose in performing the robotic action.

2. The method of claim 1, wherein the action image has N channels, and wherein each of the N channels includes a corresponding one-hot pixel that is a corresponding one of the projections for a corresponding one of the N points of the end effector.

3. The method of claim 1, wherein the current image has a given width and a given height and the particular action image also has the given width and the given height.

4. The method of claim 3, further comprising: generating the particular action image based on cropping an initial action image with a frame that is centered, at a given pixel location, so as to encompass the projections for the N points of the end effector; and generating the current image based on cropping the current instance of vision data with the frame that is centered at the given pixel location.

5. The method of claim 1, further comprising: generating the particular action image, generating the particular action image comprising: determining, for each of the N points, a corresponding three-dimensional location, for the particular pose, relative to a first frame; projecting the three-dimensional locations onto the vision frame using a kinematics based transformation that is from the first frame to the vision frame and that is dependent on a current vision component pose of the vision component; and assigning particular values to the pixels, of the action image, determined to correspond to the three-dimensional locations based on the projecting.

6. The method of claim 1, wherein the one or more conditions comprise the value satisfying a fixed threshold.

7. The method of claim 6, further comprising: identifying a particular additional action image that includes additional projections, for the N points of the end effector of the robot for an additional particular pose of the end effector, onto the vision frame of the vision component; processing, utilizing the trained critic network, an additional current image and the additional particular action image, wherein the additional current image current is based on at least an additional part of the current instance of vision data, and is a crop, of the current instance of vision data, that is based on the additional action image; generating, based on the processing of the additional current image and the additional particular action image, an additional value for the additional particular pose, wherein the additional value for the additional particular pose indicates an additional probability of success of the robotic task if the end effector is traversed to the additional particular pose; wherein the one or more conditions comprise the value being more indicative of success than the additional value.

8. The method of claim 7, further comprising: identifying the additional particular pose and the particular pose based on uniform sampling of end effector poses that are within a workspace corresponding to the current instance of vision data.

9. The method of claim 7, further comprising: detecting an object of interest based on the current instance of vision data; determining a portion of a workspace that corresponds to the object of interest; and identifying the additional particular pose and the particular pose based on sampling of end effector poses that are within the portion of the workspace.

10. The method of claim 7, further comprising: identifying the additional particular pose and the particular pose based on sampling of end effector poses that are within a distribution of a prior particular pose selected based on a prior value generated for the prior particular pose, the prior value being generated based on processing, utilizing the critic network, a prior action image that corresponds to the prior particular pose and a prior current image generated based on the current instance of vision data.

11. The method of claim 1, wherein the robotic task is a grasping task, and further comprising: in response to determining that the end effector has reached the particular pose: controlling the end effector to cause one or more grasping members of the end effector to close in attempting the grasping task.

12. The method of claim 2, wherein the current instance of vision data include a red, green, blue (RGB) image, the current image is an RGB image, and wherein each of the one-hot pixels of the action image is a fixed value.

13. The method of claim 2, wherein the current instance of vision data includes a depth image, the current image is a depth image, and wherein each of the one-hot pixels of the action image is a corresponding value indicative of a corresponding depth of a corresponding one of the projections for a corresponding one of the N points of the end effector.

14. The method of claim 2, wherein the current instance of vision data includes a red, green, blue, depth (RGB-D) image, the current image is an RGB image, and wherein each of the one-hot pixels of the action image is a fixed value, and further comprising: generating a depth current image based on the depth values of the RGB-D image; and identifying an additional particular action image that includes projections, for the N points of the end effector of the robot for the particular pose of the end effector, onto the vision frame of the vision component, wherein the additional particular action image includes additional one-hot pixels that includes a corresponding depth of a corresponding one of the projections for a corresponding one of the N points of the end effector; wherein the processing further comprises processing, utilizing the trained critic network, the depth current image and the additional particular action image, along with the current image and the particular action image.

15. The method of claim 1, wherein processing, utilizing the trained critic network, the current image and the particular action image comprises: processing the current image using a first tower of the critic network to generate a current image embedding; processing the action image using a second tower of the critic network to generate an action image embedding; processing a merged embedding using a post-merger tower of the critic network, the merged embedding including a concatenation of at least the current image embedding and the action image embedding.

16. The method of claim 2, wherein the correspon
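
The action image of claims 2, 5, and 13 (N channels, one projected pixel per channel, with either a fixed value or a depth value) can be illustrated with a short sketch. It rests on several assumptions that are not in the claims: a simple pinhole camera model with hypothetical intrinsics `fx`, `fy`, `cx`, `cy`; end effector points that have already been transformed into the camera frame (the claims describe a kinematics-based transformation for that step, which is omitted here); and nested Python lists as the image representation. The function names are illustrative only.

```python
from typing import List, Optional, Sequence, Tuple

Point3D = Tuple[float, float, float]
Channel = List[List[float]]


def project_point(
    point_cam: Point3D, fx: float, fy: float, cx: float, cy: float
) -> Optional[Tuple[int, int]]:
    """Pinhole projection of a camera-frame point to an integer pixel (u, v).

    Assumes the point is already expressed in the camera frame. Returns None
    for points at or behind the camera plane.
    """
    x, y, z = point_cam
    if z <= 0:
        return None
    u = int(round(fx * x / z + cx))
    v = int(round(fy * y / z + cy))
    return u, v


def make_action_image(
    points_cam: Sequence[Point3D],
    width: int,
    height: int,
    fx: float,
    fy: float,
    cx: float,
    cy: float,
    encode_depth: bool = False,
) -> List[Channel]:
    """Build an N-channel image with one nonzero pixel per end effector point.

    With encode_depth=False the pixel holds a fixed value of 1.0; with
    encode_depth=True it holds the point's depth along the camera z axis.
    Points that project outside the image leave their channel all zeros.
    """
    image: List[Channel] = []
    for point in points_cam:
        channel = [[0.0] * width for _ in range(height)]
        pixel = project_point(point, fx, fy, cx, cy)
        if pixel is not None:
            u, v = pixel
            if 0 <= u < width and 0 <= v < height:
                channel[v][u] = point[2] if encode_depth else 1.0
        image.append(channel)
    return image


if __name__ == "__main__":
    # Two hypothetical gripper points, 1 m in front of the camera.
    points = [(0.0, 0.0, 1.0), (0.1, 0.0, 1.0)]
    img = make_action_image(points, width=32, height=24,
                            fx=100.0, fy=100.0, cx=16.0, cy=12.0)
    print(len(img), img[0][12][16], img[1][12][26])
```

Because the action image has the same width and height as the camera image in this sketch, the two can be cropped with a common frame and fed to the critic side by side, in the manner claims 3 and 4 describe.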

Assignees

Inventors

Classifications

  • Color image · CPC title

  • learning, adaptive, model based, rule based expert control · CPC title

  • Vision controlled systems · CPC title

  • B25J9/161 · Primary

    Hardware, e.g. neural networks, fuzzy logic, interfaces, processor · CPC title

  • characterised by the hand, wrist, grip control · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11607802B2 cover?
Generating and utilizing action image(s) that represent a candidate pose (e.g., a candidate end effector pose), in determining whether to utilize the candidate pose in performance of a robotic task. The action image(s) and corresponding current image(s) can be processed, using a trained critic network, to generate a value that indicates a probability of success of the robotic task if component(…
Who is the assignee on this patent?
X Dev Llc
What technology area does this patent fall under?
Primary CPC classification B25J9/161. Mapped technology areas include Operations & Transport.
When was this patent published?
Publication date Mar 21, 2023 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).