Robot system and workpiece picking method
US-2019091869-A1 · Mar 28, 2019 · US
US11717959B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11717959-B2 |
| Application number | US-201816622309-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jun 28, 2018 |
| Priority date | Jun 28, 2017 |
| Publication date | Aug 8, 2023 |
| Grant date | Aug 8, 2023 |
Official abstract text for this publication.
Deep machine learning methods and apparatus related to semantic robotic grasping are provided. Some implementations relate to training a grasp neural network, a semantic neural network, and a joint neural network of a semantic grasping model. In some of those implementations, the joint network is a deep neural network and can be trained based on both: grasp losses generated based on grasp predictions generated over a grasp neural network, and semantic losses generated based on semantic predictions generated over the semantic neural network. Some implementations are directed to utilization of the trained semantic grasping model to servo, or control, a grasping end effector of a robot to achieve a successful grasp of an object having desired semantic feature(s).
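The training scheme in the abstract, where the joint network receives updates from both losses while each head receives updates only from its own loss, can be illustrated with a small sketch. This is not the patented implementation: it assumes a single linear layer in place of the deep joint network, one sigmoid unit per head, binary cross-entropy losses, and plain SGD. The names `joint_forward`, `head_forward`, `train_step`, and the feature vector `x` (standing in for the image and the end effector motion vector) are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def joint_forward(w_joint, x):
    """Shared trunk (assumed linear here): input features -> joint output."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in w_joint]


def head_forward(w_head, joint_out):
    """A head that sees only the joint output and returns a probability."""
    return sigmoid(sum(w * h for w, h in zip(w_head, joint_out)))


def train_step(params, x, label, head, lr=0.1):
    """One SGD step on the loss of a single head ("grasp" or "semantic").

    The loss updates that head and the joint trunk. The other head is
    left untouched, mirroring the routing described in claims 7 and 8.
    Returns the binary cross-entropy loss before the update.
    """
    h = joint_forward(params["joint"], x)
    p = head_forward(params[head], h)
    loss = -(label * math.log(p) + (1.0 - label) * math.log(1.0 - p))

    err = p - label                              # dLoss/dLogit for sigmoid + BCE
    grad_head = [err * hk for hk in h]           # gradient w.r.t. head weights
    grad_h = [err * wk for wk in params[head]]   # gradient w.r.t. joint output

    params[head] = [w - lr * g for w, g in zip(params[head], grad_head)]
    params["joint"] = [
        [w - lr * gh * xi for w, xi in zip(row, x)]
        for row, gh in zip(params["joint"], grad_h)
    ]
    return loss


# Hypothetical toy parameters: 3 input features, 2 joint outputs.
params = {
    "joint": [[0.1, -0.2, 0.3], [0.0, 0.5, -0.1]],
    "grasp": [0.2, -0.1],
    "semantic": [0.3, 0.4],
}
x = [1.0, 0.5, -1.0]

grasp_loss = train_step(params, x, label=1.0, head="grasp")
semantic_loss = train_step(params, x, label=0.0, head="semantic")
```

Calling `train_step` with `head="grasp"` changes `params["grasp"]` and `params["joint"]` but not `params["semantic"]`, and the reverse holds for `head="semantic"`. In a real system both heads and the trunk would be deep networks trained with an automatic differentiation framework.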
Opening claim text (preview).
What is claimed is:
1. A method implemented by one or more processors, comprising: identifying a desired object semantic feature; generating a candidate end effector motion vector defining motion to move a grasping end effector of a robot from a given pose to an additional pose; identifying an image captured by a vision component of the robot, the image capturing the grasping end effector and an object in an environment of the robot; applying the image and the candidate end effector motion vector as input to a trained joint neural network; generating a joint output based on the application of the image and the end effector motion vector to the trained joint neural network, wherein the trained joint neural network is trained based on: grasp losses generated based on grasp predictions generated over a grasp neural network based on training outputs generated using the joint neural network, and semantic losses generated based on semantic predictions generated over a semantic neural network based on training outputs generated using the joint neural network; applying the joint output to a trained version of the semantic neural network; generating, using the trained version of the semantic neural network based on the joint output, semantic neural network output that indicates whether the object includes the desired object semantic feature; generating a grasp success measure, generating the grasp success measure comprising: generating the grasp success measure based on application of the joint output to a trained version of the grasp neural network, or generating the grasp success measure based on application of the current image and the end effector motion vector to an additional trained grasp neural network; generating an end effector command based on the grasp success measure and the semantic model output that indicates whether the object includes the desired object semantic feature; and providing the end effector command to one or more actuators of the robot.
2. The method of claim 1, wherein generating the grasp success measure comprises generating the grasp success measure based on application of the joint output to the trained version of the grasp neural network.
3. The method of claim 1, wherein generating the grasp success measure comprises generating the grasp success measure based on application of the image and the end effector motion vector to the additional trained grasp neural network.
4. The method of claim 3, wherein the additional trained grasp neural network is trained independently of the grasp neural network, the joint neural network, and the semantic neural network.
5. The method of claim 1, wherein the image is not applied directly as input to the semantic neural network in generating the semantic model output.
6. The method of claim 1, wherein the joint output is the only input applied to the semantic neural network in generating the semantic model output.
7. The method of claim 1, wherein in training the joint neural network based on the grasp losses generated based on the grasp predictions generated over the grasp neural network, the grasp neural network is also trained based on the grasp losses, without training of the semantic neural network based on the grasp losses.
8. The method of claim 1, wherein in training the joint neural network based on the semantic losses generated based on the semantic predictions generated over the semantic neural network, the semantic neural network is also trained based on the semantic losses, without training of the grasp neural network based on the semantic losses.
9. The method of claim 1, wherein the desired object semantic feature defines an object classification.
10. The method of claim 1, wherein the semantic model output indicates, for each of a plurality of object classifications, a likelihood that the object has a corresponding one of the object classifications.
11. The method of claim 1, further comprising: receiving user interface input from a user interface input device; wherein identifying the desired object semantic feature is based on the user interface input.
12. The method of claim 1, further comprising: determining a current grasp success measure of the object without application of the motion; wherein generating the end effector command based on the grasp success measure comprises generating the end effector command based on comparison of the grasp success measure to the current grasp success measure.
13. The method of claim 1, wherein the end effector command is an end effector motion command and wherein generating the end effector motion command comprises generating the end effector motion command to conform to the candidate end effector motion vector.
14. The method of claim 13, wherein generating the end effector command is in response to: determining, based on the semantic neural network output, a likelihood that the object includes the desired object feature; and determining that the likelihood satisfies one or more criteria and that the grasp success measure satisfies one or more criteria.
15. The method of claim 13, wherein generating the end effector command is in response to: determining, based on the semantic neural network output, a likelihood that the object includes the desired object feature; generating a value as a function of the likelihood and the grasp success measure; and determining that the value satisfies a threshold.
16. A method implemented by one or more processors, comprising: identifying a desired object semantic feature; generating a candidate end effector motion vector defining motion to move a grasping end effector of a robot from a given pose to an additional pose; identifying an image captured by a vision component of the robot, the image capturing the grasping end effector and an object in an environment of the robot; applying the image and the candidate end effector motion vector as input to a trained joint neural network; generating a joint output based on the application of the image and the end effector motion vector to the trained joint neural network; applying the joint output to a trained semantic neural network; generating, using the trained semantic neural network based on the joint output, semantic neural network output that indicates whether the object includes the desired object semantic feature; applying the joint output to a trained grasp neural network; generating, using the trained grasp neural network based on the joint output, a grasp success measure; generating an end effector command based on the grasp success measure and the semantic model output that indicates whether the object includes the desired object semantic feature; and providing the end effector command to one or more actuators of the robot.
17. The method of claim 16, wherein during training of the trained grasp neural network, grasp losses generated based on grasp predictions generated over the grasp neural network are utilized to update the grasp neural network and the joint prediction model, without being utilized to update the semantic neural network.
18. The method of claim 16, wherein during training of the trained semantic neural network, grasp losses generated based on semantic predictions generated over the semantic neural network are utilized to update the semantic neural network and the joint prediction model, without being utilized to update the grasp neural network.
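The inference path of claim 16 and the two decision rules of claims 14 and 15 can be sketched as follows. This is an illustrative sketch under stated assumptions, not the claimed implementation: the three networks are passed in as plain callables, the semantic output is assumed to be a dictionary of per-class likelihoods (in the spirit of claim 10), the "function of the likelihood and the grasp success measure" in claim 15 is assumed to be a product, and every threshold value, function name, and the command dictionary format is hypothetical.

```python
from typing import Callable, Dict, Optional, Sequence, Tuple

# Stand-in types for the three trained networks (assumptions for illustration).
JointNet = Callable[[Sequence[float], Sequence[float]], Sequence[float]]
GraspNet = Callable[[Sequence[float]], float]
SemanticNet = Callable[[Sequence[float]], Dict[str, float]]


def evaluate_candidate(
    image: Sequence[float],
    motion: Sequence[float],
    joint_net: JointNet,
    grasp_net: GraspNet,
    semantic_net: SemanticNet,
    desired_class: str,
) -> Tuple[float, float]:
    """Return (grasp success measure, likelihood of the desired class)."""
    joint_out = joint_net(image, motion)         # image + motion -> joint output
    grasp_success = grasp_net(joint_out)         # grasp head sees only joint output
    class_likelihoods = semantic_net(joint_out)  # semantic head sees only joint output
    return grasp_success, class_likelihoods.get(desired_class, 0.0)


def passes_separate_criteria(likelihood, grasp_success,
                             min_likelihood=0.5, min_grasp=0.5):
    """Claim 14 style: each quantity must satisfy its own criterion."""
    return likelihood >= min_likelihood and grasp_success >= min_grasp


def passes_combined_threshold(likelihood, grasp_success, threshold=0.5):
    """Claim 15 style: one value computed from both (a product is assumed here)."""
    return likelihood * grasp_success >= threshold


def make_command(motion, grasp_success, likelihood) -> Optional[dict]:
    """Build a motion command that conforms to the candidate vector, or None."""
    if passes_combined_threshold(likelihood, grasp_success):
        return {"type": "move", "motion_vector": list(motion)}
    return None


# Usage with constant stand-in networks (no real model is involved).
joint_net = lambda image, motion: [sum(image), sum(motion)]
grasp_net = lambda joint_out: 0.8
semantic_net = lambda joint_out: {"mug": 0.9, "pen": 0.1}

grasp_success, likelihood = evaluate_candidate(
    [0.1, 0.2], [0.0, 0.1, 0.0], joint_net, grasp_net, semantic_net, "mug"
)
command = make_command([0.0, 0.1, 0.0], grasp_success, likelihood)
```

With these stand-ins, a request for `"mug"` yields a motion command, while a request for `"pen"` yields `None` because the combined value falls below the assumed threshold. A real controller would typically score many candidate motion vectors per step and send the selected command to the robot's actuators.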
Weakly supervised learning, e.g. semi-supervised or self-supervised learning · CPC title
Supervised learning · CPC title
Transfer learning · CPC title
Convolutional networks [CNN, ConvNet] · CPC title
learning, adaptive, model based, rule based expert control · CPC title