Deep reinforcement learning for robotic manipulation
US-2021237266-A1 · Aug 5, 2021 · US
US11285607B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11285607-B2 |
| Application number | US-201916511492-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jul 15, 2019 |
| Priority date | Jul 13, 2018 |
| Publication date | Mar 29, 2022 |
| Grant date | Mar 29, 2022 |
Official abstract text for this publication.
In some aspects, a system comprises a computer hardware processor and a non-transitory computer-readable storage medium storing processor-executable instructions for receiving, from one or more sensors, sensor data relating to a robot; generating, using a statistical model, based on the sensor data, first control information for the robot to accomplish a task; transmitting, to the robot, the first control information for execution of the task; and receiving, from the robot, a result of execution of the task.
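The four steps in the abstract can be read as one pass of a control loop. The sketch below is only an illustration under stated assumptions: `run_once`, `ControlInfo`, and the three callables are hypothetical names that do not appear in the patent, and the stand-in model is a placeholder rather than a trained statistical model.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple


@dataclass(frozen=True)
class ControlInfo:
    """Hypothetical container for control information sent to the robot."""
    command: Sequence[float]


def run_once(
    read_sensors: Callable[[], Sequence[float]],
    model: Callable[[Sequence[float]], ControlInfo],
    execute_on_robot: Callable[[ControlInfo], bool],
) -> Tuple[ControlInfo, bool]:
    """One pass: receive sensor data, generate control, transmit, receive result."""
    sensor_data = read_sensors()           # receive sensor data relating to the robot
    control = model(sensor_data)           # generate first control information
    succeeded = execute_on_robot(control)  # transmit it; the robot reports the result
    return control, succeeded


# Stand-ins for illustration only (assumptions, not the patented components).
def fake_sensors() -> Sequence[float]:
    return [1.0, 2.0, 3.0]


def fake_model(data: Sequence[float]) -> ControlInfo:
    return ControlInfo(command=[2 * x for x in data])


def fake_robot(control: ControlInfo) -> bool:
    # Pretend the task succeeds whenever a three-element command arrives.
    return len(control.command) == 3


control, ok = run_once(fake_sensors, fake_model, fake_robot)
```

Passing the sensor reader, model, and robot interface as callables keeps the loop independent of any particular hardware or model, which is convenient for testing with stubs like the ones above.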
Opening claim text (preview).
What is claimed is:
1. A system, comprising: at least one computer hardware processor; and at least one non-transitory computer-readable storage medium storing processor-executable instructions that, when executed by the at least one computer hardware processor, cause the at least one computer hardware processor to perform: receiving, from one or more sensors, sensor data relating to a robot, wherein the sensor data comprises a voxel grid relating to the robot, wherein each voxel in the voxel grid is either occupied or not occupied, wherein the voxel grid is generated by capturing a three-dimensional point cloud relating to the robot, segmenting the three-dimensional point cloud into one or more object point clouds, and converting the one or more object point clouds into the voxel grid; generating, using a statistical model, based on the sensor data, first control information for the robot to accomplish a task; transmitting, to the robot, the first control information for execution of the task; and receiving, from the robot, a result of execution of the task.
2. The system of claim 1, wherein the processor-executable instructions cause the at least one computer hardware processor to further perform: in response to the result of execution of the task being unsuccessful: receiving, from a user, input relating to second control information for the robot to accomplish the task; transmitting, to the robot, the second control information for execution of the task; receiving, from the robot, the result of execution of the task; and updating the statistical model based on the sensor data, the second control information, and the result of execution of the task.
3. The system of claim 2, wherein the processor-executable instructions cause the at least one computer hardware processor to further perform: in response to the result of execution of the task being unsuccessful: updating a count of unsuccessful executions of tasks; and in response to the count of unsuccessful executions exceeding a threshold, receiving, from the user, the input relating to the second control information for the robot to accomplish the task.
4. The system of claim 2, wherein the processor-executable instructions cause the at least one computer hardware processor to further perform: generating, using the statistical model, a confidence value for the first control information; in response to the confidence value not exceeding a confidence threshold, receiving, from the user, the input relating to the second control information for the robot to accomplish the task; and in response to the confidence value exceeding the confidence threshold, transmitting, to the robot, the first control information for execution of the task.
5. The system of claim 1, wherein the first control information relates to a grasp pose for an end effector of the robot.
6. The system of claim 5, wherein the grasp pose comprises a position vector and an orientation vector for the end effector of the robot.
7. The system of claim 1, wherein the statistical model comprises a convolutional neural network including an input layer, one or more convolution layers, one or more pooling layers, one or more dense layers, and an output layer.
8. The system of claim 1, wherein the result of execution of the task indicates whether execution of the task was successful or unsuccessful.
9. The system of claim 8, wherein the result of execution of the task is based on an indication from a user regarding whether the execution of the task was successful or unsuccessful.
10. The system of claim 8, wherein the task relates to a grasp pose, wherein a torque across an end effector of the robot is measured, and wherein the result of execution of the task is successful or unsuccessful based on whether the measured torque exceeds or does not exceed a torque threshold.
11. At least one non-transitory computer-readable storage medium storing processor-executable instructions that, when executed by at least one computer hardware processor, cause the at least one computer hardware processor to perform: receiving, from one or more sensors, sensor data relating to a robot, wherein the sensor data comprises a voxel grid relating to the robot, wherein each voxel in the voxel grid is either occupied or not occupied, wherein the voxel grid is generated by capturing a three-dimensional point cloud relating to the robot, segmenting the three-dimensional point cloud into one or more object point clouds, and converting the one or more object point clouds into the voxel grid; generating, using a statistical model, based on the sensor data, first control information for the robot to accomplish a task; transmitting, to the robot, the first control information for execution of the task; and receiving, from the robot, a result of execution of the task.
12. The computer-readable storage medium of claim 11, wherein the processor-executable instructions cause the at least one computer hardware processor to further perform: in response to the result of execution of the task being unsuccessful: receiving, from a user, input relating to second control information for the robot to accomplish the task; transmitting, to the robot, the second control information for execution of the task; receiving, from the robot, the result of execution of the task; and updating the statistical model based on the sensor data, the second control information, and the result of execution of the task.
13. The computer-readable storage medium of claim 12, wherein the processor-executable instructions cause the at least one computer hardware processor to further perform: in response to the result of execution of the task being unsuccessful: updating a count of unsuccessful executions of tasks; and in response to the count of unsuccessful executions exceeding a threshold, receiving, from the user, the input relating to the second control information for the robot to accomplish the task.
14. The computer-readable storage medium of claim 12, wherein the processor-executable instructions cause the at least one computer hardware processor to further perform: generating, using the statistical model, a confidence value for the first control information; in response to the confidence value not exceeding a confidence threshold, receiving, from the user, the input relating to the second control information for the robot to accomplish the task; and in response to the confidence value exceeding the confidence threshold, transmitting, to the robot, the first control information for execution of the task.
15. The computer-readable storage medium of claim 11, wherein the first control information relates to a grasp pose for an end effector of the robot, wherein the grasp pose comprises a position vector and an orientation vector for the end effector of the robot.
16. The computer-readable storage medium of claim 11, wherein the result of execution of the task indicates whether execution of the task was successful or unsuccessful, wherein the result of execution of the task is based on an indication from a user regarding whether the execution of the task was successful or unsuccessful.
17. The computer-readable storage medium of claim 11, wherein the result of execution of the task indicates whether execution of the task was successful or unsuccessful, wherein the task relates to a grasp pose, wherein a torque across an end effector of the robot is measured, and wherein the result of execution of the task is successful or unsuccessful based on whether the measured torque
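The independent claims describe the sensor data as a voxel grid in which each voxel is either occupied or not occupied, obtained by converting segmented object point clouds. A minimal sketch of that final conversion step follows. The voxel size, the grid origin, the set-of-indices representation, and the name `voxelize` are assumptions made for illustration; the claims do not specify them, and segmentation is assumed to have been done already.

```python
import math
from typing import Iterable, Sequence, Set, Tuple

Point = Tuple[float, float, float]
Voxel = Tuple[int, int, int]


def voxelize(
    object_clouds: Iterable[Sequence[Point]],
    voxel_size: float,
    origin: Point = (0.0, 0.0, 0.0),
) -> Set[Voxel]:
    """Return indices of occupied voxels; any index not in the set is unoccupied."""
    if voxel_size <= 0:
        raise ValueError("voxel_size must be positive")
    occupied: Set[Voxel] = set()
    for cloud in object_clouds:
        for x, y, z in cloud:
            # Map each point to the integer index of the voxel that contains it.
            occupied.add((
                math.floor((x - origin[0]) / voxel_size),
                math.floor((y - origin[1]) / voxel_size),
                math.floor((z - origin[2]) / voxel_size),
            ))
    return occupied


# Two already-segmented object point clouds (made-up coordinates, in metres).
clouds = [
    [(0.01, 0.02, 0.03), (0.04, 0.01, 0.02)],  # both points share one voxel
    [(0.26, 0.01, 0.12)],                      # a second object, elsewhere
]
grid = voxelize(clouds, voxel_size=0.1)
```

A sparse set of occupied indices is one of several possible representations. A convolutional network such as the one recited in claim 7 would more typically take a dense fixed-size array, which could be filled from this set once the workspace bounds are fixed.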
Multilayer, MNN, four layer perceptron, sigmoidal neural network · CPC title
Virtual reality control, programming of manipulator · CPC title
Teleoperation · CPC title
learning, adaptive, model based, rule based expert control · CPC title
Planning of hand motion, grasping · CPC title