Deep machine learning methods and apparatus for robotic grasping

US11548145B2 · US · B2

Patent metadata
Field | Value
Publication number | US-11548145-B2
Application number | US-202117172666-A
Country | US
Kind code | B2
Filing date | Feb 10, 2021
Priority date | Mar 3, 2016
Publication date | Jan 10, 2023
Grant date | Jan 10, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Deep machine learning methods and apparatus related to manipulation of an object by an end effector of a robot. Some implementations relate to training a deep neural network to predict a measure that candidate motion data for an end effector of a robot will result in a successful grasp of one or more objects by the end effector. Some implementations are directed to utilization of the trained deep neural network to servo a grasping end effector of a robot to achieve a successful grasp of an object by the grasping end effector. For example, the trained deep neural network may be utilized in the iterative updating of motion control commands for one or more actuators of a robot that control the pose of a grasping end effector of the robot, and to determine when to generate grasping control commands to effectuate an attempted grasp by the grasping end effector.
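The iterative servoing described in the abstract can be pictured with a small sketch. This is an illustration only, not the patented implementation: `toy_predict` is a hypothetical stand-in for the trained deep neural network, the observation is a 3-D offset rather than a camera image, and the rule for deciding when to grasp (comparing the "no motion" score against the best candidate score with a `grasp_ratio` parameter) is an assumption, since the abstract does not specify it.

```python
import math
from typing import Callable, Optional, Sequence, Tuple

Motion = Tuple[float, float, float]
Observation = Tuple[float, float, float]  # toy stand-in for a camera image


def toy_predict(obs: Observation, motion: Motion) -> float:
    """Hypothetical stand-in for the trained network.

    Here `obs` is the offset from the gripper to the object, and the score
    grows as the candidate motion closes that offset.
    """
    remaining = math.dist(obs, motion)
    return 1.0 / (1.0 + remaining)


def servo_step(
    predict: Callable[[Observation, Motion], float],
    obs: Observation,
    candidates: Sequence[Motion],
    grasp_ratio: float = 0.9,  # assumed decision parameter, not from the source
) -> Tuple[str, Optional[Motion]]:
    """One servo iteration: return ("grasp", None) or ("move", best_motion).

    `candidates` must be non-empty.
    """
    no_motion: Motion = (0.0, 0.0, 0.0)
    stay_score = predict(obs, no_motion)
    best = max(candidates, key=lambda m: predict(obs, m))
    best_score = predict(obs, best)
    # Assumed rule: grasp when moving would barely improve the predicted measure.
    if best_score <= 0.0 or stay_score >= grasp_ratio * best_score:
        return "grasp", None
    return "move", best


if __name__ == "__main__":
    moves = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
    print(servo_step(toy_predict, (1.0, 0.0, 0.0), moves))  # far from the object
    print(servo_step(toy_predict, (0.0, 0.0, 0.0), moves))  # already at the object
```

In a real system the loop would call `servo_step` repeatedly, capturing a fresh image and issuing the selected motion command each time, until the grasp command is generated.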

First claim

Opening claim text (preview).

What is claimed is:

1. A method implemented by one or more processors, the method comprising: attempting, by a robot, a grasp of an object by actuating an end effector of the robot to an actuated position when the end effector is at a grasping position; subsequent to attempting the grasp, and while maintaining the end effector in the actuated position: moving the end effector to a first position that is away from the grasping position; capturing, when the end effector is in the first position, a first image that captures the grasping position; subsequent to capturing the first image: moving the end effector back toward the grasping position and then actuating the end effector from the actuated position to a drop position; capturing, subsequent to actuating the end effector from the actuated position to the drop position, a second image that captures the grasping position; comparing the first image and the second image; and determining, based on comparing the first image and the second image, whether the grasp of the object was successful.

2. The method of claim 1, wherein comparing the first image and the second image comprises determining a quantity of pixels that are different between the first image and the second image.

3. The method of claim 2, wherein determining whether the grasp of the object is successful is based on whether the quantity satisfies a threshold.

4. The method of claim 1, wherein comparing the first image and the second image comprises performing a first object detection on the first image, performing a second object detection on the second image, and comparing the first results from the first object detection to second results from the second object detection.

5. The method of claim 3, wherein determining whether the grasp of the object is successful is based on whether the second results indicate an additional object that is not indicated by the first results.

6. The method of claim 1, further comprising: generating, based on determining whether the grasp attempt was successful, a corresponding label for robot data generated by the robot, the robot data including robot data generated during traversing the end effector to the grasping position.

7. The method of claim 6, further comprising: training a deep neural network based on the robot data and the corresponding label.

8. The method of claim 7, wherein the robot data comprises images captured in traversing the end effector to the grasping position.

9. The method of claim 8, wherein the robot data further comprises one or more end effector motion vectors utilized in traversing the end effector to the grasping position.

10. The method of claim 6, wherein generating the corresponding label comprises selecting a first value as the corresponding label when it is determined that the grasp attempt was successful, and generating a second value as the corresponding label when it determined that the grasp attempt was not successful.

11. A method implemented by one or more processors, the method comprising: comparing a first image to a second image, wherein the first image captures a grasping position and was captured by a vision sensor of a robot at a first point in time, the first point in time being after a grasp of an object by the robot by actuating an end effector of the robot to an actuated position when the end effector was at the grasping position, and after the end effector was moved away from the grasping position after the attempted grasp and while maintaining the end effector in the actuated position and continuing to grasp the object, and wherein the second image captures the grasping position and was captured after moving the end effector back toward the grasping position and then actuating the end effector to drop the object; determining, based on comparing the first image and the second image, that the grasp of the object was successful; in response to determining that the grasp of the object was successful, assigning a positive label to robot data generated by the robot, the robot data generated in traversing the end effector to the grasping position.

12. The method of claim 11, further comprising: training a deep neural network based on the robot data and the positive label.

13. The method of claim 11, wherein comparing the first image and the second image comprises determining a quantity of pixels that are different between the first image and the second image.

14. The method of claim 11, wherein comparing the first image and the second image comprises performing a first object detection on the first image, performing a second object detection on the second image, and comparing the first results from the first object detection to second results from the second object detection.

15. The method of claim 11, wherein the robot data comprises images captured in traversing the end effector to the grasping position.

16. A robot, comprising: an end effector; actuators controlling movement of the end effector; a vision sensor viewing an environment; at least one processor configured to: capture, with the vision sensor and prior to attempting a grasp of an object, a first image that captures an area that includes the object; attempt a grasp of the object by actuating the end effector to a closed position when the end effector is at a grasping position; subsequent to attempting the grasp, and while maintaining the end effector in the closed position: moving the end effector to an away position that is away from the grasping position; capturing, when the end effector is in the away position, a second image that captures the area; comparing the first image and the second image; and determining, based on comparing the first image and the second image, whether the grasp of the object was successful.

17. The method of claim 16, wherein comparing the first image and the second image comprises determining a quantity of pixels that are different between the first image and the second image.

18. The method of claim 17, wherein comparing the first image and the second image comprises performing a first object detection on the first image, performing a second object detection on the second image, and comparing the first results from the first object detection to second results from the second object detection.
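As a rough illustration of the pixel-comparison test in claims 2 and 3 and the labeling step in claims 6 and 10, the following sketch counts differing pixels between two grayscale images and turns the result into a label. Several details are assumptions not stated in the claims: images are plain nested lists of intensities, a per-pixel `tolerance` absorbs sensor noise, "satisfies a threshold" is read as "at least `min_changed` pixels differ", and the labels are 1 and 0. All function names are hypothetical.

```python
from typing import Sequence

Image = Sequence[Sequence[int]]  # rows of grayscale intensities (assumed format)


def count_changed_pixels(first: Image, second: Image, tolerance: int = 10) -> int:
    """Count pixels whose intensity differs by more than `tolerance`."""
    if len(first) != len(second) or any(
        len(a) != len(b) for a, b in zip(first, second)
    ):
        raise ValueError("images must have the same dimensions")
    return sum(
        1
        for row_a, row_b in zip(first, second)
        for a, b in zip(row_a, row_b)
        if abs(a - b) > tolerance
    )


def grasp_succeeded(
    first: Image, second: Image, min_changed: int = 3, tolerance: int = 10
) -> bool:
    """Assumed reading of the threshold test: enough pixels changed.

    The first image shows the grasping position while the object is held away;
    the second shows it after the drop, so a dropped object changes the scene.
    """
    return count_changed_pixels(first, second, tolerance) >= min_changed


def label_for(success: bool) -> int:
    """Assumed label values: 1 for a successful grasp, 0 otherwise."""
    return 1 if success else 0


if __name__ == "__main__":
    empty = [[0] * 4 for _ in range(4)]
    with_object = [row[:] for row in empty]
    for r in (1, 2):
        for c in (1, 2):
            with_object[r][c] = 200  # a 2x2 "object" appears after the drop
    print(label_for(grasp_succeeded(empty, with_object)))
    print(label_for(grasp_succeeded(empty, empty)))
```

The resulting label would then be attached to the images and end effector motion vectors recorded while approaching the grasping position, forming one training example for the network.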

Assignees

Inventors

Classifications

  • B25J9/163 · Primary

    learning, adaptive, model based, rule based expert control · CPC title

  • characterised by the hand, wrist, grip control · CPC title

  • Vision controlled systems · CPC title

  • Backpropagation, e.g. using gradient descent · CPC title

  • Combinations of networks · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11548145B2 cover?
Deep machine learning methods and apparatus related to manipulation of an object by an end effector of a robot. Some implementations relate to training a deep neural network to predict a measure that candidate motion data for an end effector of a robot will result in a successful grasp of one or more objects by the end effector. Some implementations are directed to utilization of the trained de…
Who is the assignee on this patent?
Google LLC
What technology area does this patent fall under?
Primary CPC classification B25J9/163. Mapped technology areas include Operations & Transport.
When was this patent published?
Publication date: Jan 10, 2023 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).