Deep machine learning methods and apparatus for robotic grasping

US9914213B2 · US · B2

Patent metadata
Field	Value
Publication number	US-9914213-B2
Application number	US-201715448013-A
Country	US
Kind code	B2
Filing date	Mar 2, 2017
Priority date	Mar 3, 2016
Publication date	Mar 13, 2018
Grant date	Mar 13, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Deep machine learning methods and apparatus related to manipulation of an object by an end effector of a robot. Some implementations relate to training a semantic grasping model to predict a measure that indicates whether motion data for an end effector of a robot will result in a successful grasp of an object; and to predict an additional measure that indicates whether the object has desired semantic feature(s). Some implementations are directed to utilization of the trained semantic grasping model to servo a grasping end effector of a robot to achieve a successful grasp of an object having desired semantic feature(s).
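
As a rough illustration of how the two predicted measures in the abstract could drive a servoing decision, the sketch below combines a grasp-success measure and a semantic-feature measure into a single command. The names `Measures` and `choose_command`, the threshold values, and the decision rule itself are assumptions made for illustration; the patent text shown here does not fix any of them.

```python
from dataclasses import dataclass


@dataclass
class Measures:
    """Hypothetical container for the two model outputs, each in [0, 1]."""
    grasp_success: float   # predicted success if the candidate motion is applied
    semantic_match: float  # predicted presence of the desired semantic feature


def choose_command(candidate: Measures,
                   current_grasp_success: float,
                   semantic_threshold: float = 0.5,
                   grasp_ratio: float = 0.9) -> str:
    """Return "grasp" or "move" from the two measures (illustrative rule only).

    Grasp now if the desired feature appears present and grasping without
    further motion is predicted to be nearly as good as moving first.
    Otherwise keep moving along the candidate motion vector.
    """
    feature_present = candidate.semantic_match >= semantic_threshold
    # Compare the no-motion measure against the with-motion measure.
    good_enough_now = current_grasp_success >= grasp_ratio * candidate.grasp_success
    if feature_present and good_enough_now:
        return "grasp"
    return "move"


if __name__ == "__main__":
    print(choose_command(Measures(0.8, 0.9), current_grasp_success=0.75))
    print(choose_command(Measures(0.8, 0.9), current_grasp_success=0.30))
```

In this sketch a low semantic measure never triggers a grasp, however high the grasp-success measure is; a real controller might instead weight the two measures or add a trajectory-correction branch.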

First claim

Opening claim text (preview).

What is claimed is:
1. A method implemented by one or more processors, comprising: generating a candidate end effector motion vector defining motion to move a grasping end effector of a robot from a current pose to an additional pose; identifying a current image captured by a vision sensor associated with the robot, the current image capturing the grasping end effector and at least one object in an environment of the robot; applying the current image and the candidate end effector motion vector as input to a trained grasp convolutional neural network; generating, over the trained grasp convolutional neural network, a measure of successful grasp of the object with application of the motion, the measure being generated based on the application of the image and the end effector motion vector to the trained grasp convolutional neural network; identifying a desired object semantic feature; applying, as input to a semantic convolutional neural network, a spatial transformation of the current image or of an additional image captured by the vision sensor; generating, over the semantic convolutional neural network based on the spatial transformation, an additional measure that indicates whether the desired object semantic feature is present in the spatial transformation; generating an end effector command based on the measure of successful grasp and the additional measure that indicates whether the desired object semantic feature is present; and providing the end effector command to one or more actuators of the robot.
2. The method of claim 1, further comprising: generating, over the trained grasp convolutional neural network based on the application of the image and the end effector motion vector to the trained grasp convolutional neural network, spatial transformation parameters; and generating the spatial transformation over a spatial transformation network based on the spatial transformation parameters.
3. The method of claim 1, wherein the desired object semantic feature defines an object classification.
4. The method of claim 1, further comprising: receiving user interface input from a user interface input device; wherein identifying the desired object semantic feature is based on the user interface input.
5. The method of claim 4, wherein the user interface input device is a microphone of the robot.
6. The method of claim 1, wherein the spatial transformation is of the current image.
7. The method of claim 6, wherein the spatial transformation crops out a portion of the current image.
8. The method of claim 1, further comprising: determining a current measure of successful grasp of the object without application of the motion; wherein generating the end effector command based on the measure comprises generating the end effector command based on comparison of the measure to the current measure.
9. The method of claim 8, wherein the end effector command is a grasp command and wherein generating the grasp command is in response to: determining that the additional measure indicates that the desired object feature is present in the spatial transformation; and determining that comparison of the measure to the current measure satisfies one or more criteria.
10. The method of claim 1, wherein the end effector command is an end effector motion command and wherein generating the end effector motion command comprises generating the end effector motion command to conform to the candidate end effector motion vector.
11. The method of claim 1, wherein the end effector command is an end effector motion command and wherein generating the end effector motion command comprises generating the end effector motion command to effectuate a trajectory correction to the end effector.
12. The method of claim 1, wherein the end effector command is an end effector motion command and conforms to the candidate end effector motion vector, wherein providing the end effector motion command to the one or more actuators moves the end effector to a new pose, and further comprising: generating, by one or more processors, an additional candidate end effector motion vector defining new motion to move the grasping end effector from the new pose to a further additional pose; identifying, by one or more of the processors, a new image captured by a vision sensor associated with the robot, the new image capturing the end effector at the new pose and capturing the objects in the environment; applying, by one or more of the processors, the new image and the additional candidate end effector motion vector as input to the trained grasp convolutional neural network; generating, over the trained grasp convolutional neural network, a new measure of successful grasp of the object with application of the new motion, the new measure being generated based on the application of the new image and the additional end effector motion vector to the trained grasp convolutional neural network; applying, as input to the semantic convolutional neural network, an additional spatial transformation of the new image or a new additional image captured by the vision sensor; generating, over the semantic convolutional neural network based on the additional spatial transformation, a new additional measure that indicates whether the desired object feature is present in the spatial transformation; generating a new end effector command based on the new measure of successful grasp and the new additional measure that indicates whether the desired object feature is present; and providing the new end effector command to one or more actuators of the robot.
13. The method of claim 1, wherein applying the image and the candidate end effector motion vector as input to the trained grasp convolutional neural network comprises: applying the image as input to an initial layer of the trained grasp convolutional neural network; and applying the candidate end effector motion vector to an additional layer of the trained grasp convolutional neural network, the additional layer being downstream of the initial layer.
14. The method of claim 1, wherein generating the candidate end effector motion vector comprises: generating a plurality of candidate end effector motion vectors; and performing one or more iterations of cross-entropy optimization on the plurality of candidate end effector motion vectors to select the candidate end effector motion vector from the plurality of candidate end effector motion vectors.
15. A method implemented by one or more processors, comprising: identifying a current image captured by a vision sensor associated with a robot; generating, over a grasp convolutional neural network based on application of the current image to the grasp convolutional neural network: a measure of successful grasp, by a grasping end effector of the robot, of an object captured in the current image, and spatial transformation parameters; generating, over a spatial transformer network, a spatial transformation based on the spatial transformation parameters, the spatial transformation being of the current image or an additional image captured by the vision sensor; applying the spatial transformation as input to a semantic convolutional neural network; generating, over the semantic convolutional neural network based on the spatial transformation, an additional measure that indicates whether a desired object semantic feature is present in the spatial transformation; generating an end effector command based on the measure and the additional measure; and providing the end effector command to one or more actuators of the robot.
1…
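
Claim 14 selects the candidate end effector motion vector by cross-entropy optimization over several candidates. A minimal sketch of that general technique (sample, rank, refit, repeat) follows. The function names `cem_select` and `toy_grasp_score`, the Gaussian sampling distribution, and the sample and elite counts are all assumptions for illustration. In particular, `toy_grasp_score` is a stand-in for the trained grasp network, which the claims do not describe at this level of detail.

```python
import random
import statistics
from typing import Callable, List, Sequence

Vector = List[float]


def cem_select(score: Callable[[Sequence[float]], float],
               dim: int = 3,
               iterations: int = 3,
               samples: int = 64,
               elites: int = 6,
               seed: int = 0) -> Vector:
    """Pick a motion vector with the cross-entropy method (illustrative)."""
    rng = random.Random(seed)
    mean = [0.0] * dim
    std = [1.0] * dim
    # Start from the zero motion so the result is never worse than not moving.
    best: Vector = list(mean)
    best_score = score(best)
    for _ in range(iterations):
        # Sample candidate motion vectors from the current per-axis Gaussian.
        batch = [[rng.gauss(m, s) for m, s in zip(mean, std)]
                 for _ in range(samples)]
        # Rank candidates by predicted grasp success, highest first.
        batch.sort(key=score, reverse=True)
        top = batch[:elites]
        top_score = score(top[0])
        if top_score > best_score:
            best, best_score = top[0], top_score
        # Refit the sampling distribution to the elite candidates.
        mean = [statistics.fmean(v[i] for v in top) for i in range(dim)]
        std = [statistics.pstdev([v[i] for v in top]) + 1e-6 for i in range(dim)]
    return best


def toy_grasp_score(v: Sequence[float]) -> float:
    """Hypothetical stand-in for the grasp network: peaks at a fixed target."""
    target = (0.3, -0.2, 0.1)
    return 1.0 / (1.0 + sum((a - b) ** 2 for a, b in zip(v, target)))


if __name__ == "__main__":
    chosen = cem_select(toy_grasp_score)
    print(chosen, toy_grasp_score(chosen))
```

In a real system the scoring call would be a forward pass of the grasp network on the current image and each candidate vector, so the number of samples per iteration trades off selection quality against inference cost.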

Assignees

Google Inc, Google LLC

Inventors

Classifications

G06N3/008
based on physical entities controlled by simulated intelligence so as to replicate intelligent life forms, e.g. based on robots replicating pets or humans in their appearance or behaviour · CPC title
G06N3/045 (Primary)
Combinations of networks · CPC title
G05B13/027 (Primary)
using neural networks only · CPC title
G06N3/084
Backpropagation, e.g. using gradient descent · CPC title
B25J9/1697
Vision controlled systems · CPC title

Patent family

Related publications grouped by family.

View patent family 59722666

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9914213B2 cover?: Deep machine learning methods and apparatus related to manipulation of an object by an end effector of a robot. Some implementations relate to training a semantic grasping model to predict a measure that indicates whether motion data for an end effector of a robot will result in a successful grasp of an object; and to predict an additional measure that indicates whether the object has desired s…
Who is the assignee on this patent?: Google Inc, Google LLC
What technology area does this patent fall under?: Primary CPC classification G06N3/045. Mapped technology areas include Physics.
When was this patent published?: Publication date Mar 13, 2018 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 11 related publications on this page (citations in our corpus or others sharing the same primary CPC).