Robotic control using deep learning
US-2021252698-A1 · Aug 19, 2021 · US
US11534913B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11534913-B2 |
| Application number | US-202016880857-A |
| Country | US |
| Kind code | B2 |
| Filing date | May 21, 2020 |
| Priority date | May 21, 2020 |
| Publication date | Dec 27, 2022 |
| Grant date | Dec 27, 2022 |
Official abstract text for this publication.
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for integrating sensor streams for robotic demonstration learning. One of the methods includes selecting, by a learning system for a robot, a base update rate for combining multiple sensor streams into a task state representation. The learning system repeatedly generates the task state representation at the base update rate, including combining, during each time period defined by the update rate, the task state representation from most recently updated sensor data processed by the plurality of neural networks. The learning system repeatedly uses the task state representations to generate commands for the robot at the base update rate.
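The multi-rate combination described in the abstract can be illustrated with a minimal Python sketch. It is not the patent's implementation: the names `SensorSlot`, `select_base_rate`, and `run` are hypothetical, plain functions stand in for the neural networks, and time is simulated as integer ticks at the base rate. Taking the larger of the fastest sensor rate and the robot's minimum rate is an assumption made here for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class SensorSlot:
    """One sensor stream plus the latest output of its stand-in network."""
    name: str
    rate_hz: int                     # update rate of this sensor
    encode: Callable[[int], float]   # stand-in for a neural network (assumption)
    latest: float = 0.0              # most recently written portion of the state


def select_base_rate(slots: List[SensorSlot], robot_min_rate_hz: int) -> int:
    """Assumed rule: fastest sensor rate, but never below the robot's minimum."""
    return max(max(s.rate_hz for s in slots), robot_min_rate_hz)


def run(slots: List[SensorSlot], robot_min_rate_hz: int,
        n_ticks: int) -> Tuple[int, List[Tuple[float, ...]]]:
    """Emit one task state per base-rate tick from the freshest portions."""
    base = select_base_rate(slots, robot_min_rate_hz)
    produced = {s.name: -1 for s in slots}   # index of last sample consumed
    states = []
    for tick in range(n_ticks):
        for s in slots:
            # Index of the newest sample this sensor has produced by this tick.
            sample_idx = tick * s.rate_hz // base
            if sample_idx != produced[s.name]:
                s.latest = s.encode(sample_idx)   # slow sensors update rarely
                produced[s.name] = sample_idx
        # Combine whatever each slot holds right now, stale or fresh.
        states.append(tuple(s.latest for s in slots))
    return base, states


if __name__ == "__main__":
    demo = [
        SensorSlot("joints", 100, lambda i: float(i)),
        SensorSlot("camera", 25, lambda i: 10.0 * i),
    ]
    base_rate, task_states = run(demo, robot_min_rate_hz=100, n_ticks=8)
    print(base_rate, task_states)
```

In this sketch the slower stream simply keeps its last value between updates, so every tick still yields a complete task state without waiting on the slowest sensor.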
Opening claim text (preview).
What is claimed is:
1. A method performed by one or more computers, the method comprising: selecting, by a learning system for a robot, a base update rate for combining multiple sensor streams into a task state representation representing a state of a subtask being performed by the robot, wherein selecting the base update rate comprises selecting the base update rate based on a minimum update rate of the robot; generating, by each neural network of a plurality of neural networks, a respective portion of the task state representation at a respective update rate of a respective sensor of a plurality of sensors, wherein the plurality of sensors include one or more robot state sensors that generate sensor data representing physical characteristics of a robot and one or more perceptual sensors that generate sensor data representing visual characteristics of a workcell of the robot; repeatedly generating, by the learning system, the task state representation at the base update rate, including combining, during each time period defined by the update rate, the task state representation from most recently updated sensor data processed by the plurality of neural networks; and repeatedly using the task state representations to generate commands for the robot at the base update rate.
2. The method of claim 1, wherein selecting the base update rate comprises: obtaining information representing respective update rates for each of the plurality of sensors; and selecting the base update rate to be a highest update rate among the plurality of sensors.
3. The method of claim 1, wherein repeatedly generating the task state representation at the base update rate comprises reading different portions of the task state representation from multiple independent memory devices or memory partitions.
4. The method of claim 3, wherein reading the different portions of the task state representation occurs at a rate different than the rate at which a particular neural network updates a particular portion of the task state representation in one of the independent memory devices or memory partitions.
5. The method of claim 1, wherein repeatedly using the task state representation to generate commands for the robot at the base update rate comprises: providing the task state representations to a plurality of independent tuned control policy systems; generating, by each of the plurality of independent tuned control policy systems, a respective robot subcommand; and combining the robot subcommands to generate each command for the robot at the base update rate.
6. The method of claim 5, wherein at least two of the plurality of independently tuned control policy systems operate at different update rates, and wherein combining the robot subcommands comprises combining each most recently generated robot subcommand at the base update rate.
7. The method of claim 6, wherein each of the independently tuned control policy systems implements a different respective control algorithm.
8. The method of claim 1, wherein repeatedly using the task state representation to generate commands for the robot comprises generating a corrective action to be combined with a base action generated by a base control policy.
9. The method of claim 8, further comprising: generating a robot command including combining the corrective action with the base action generated by the base control policy; and providing the robot command to the robot, thereby causing the robot to perform actions according to the base update rate.
10. The method of claim 1, wherein the minimum update rate of the robot is dependent on a real-time control cycle of the robot.
11. A system comprising: one or more computers and one or more storage devices storing instructions that are operable, when executed by the one or more computers, to cause the one or more computers to perform operations comprising: selecting, by a learning system for a robot, a base update rate for combining multiple sensor streams into a task state representation representing a state of a subtask being performed by the robot, wherein selecting the base update rate comprises selecting the base update rate based on a minimum update rate of the robot; generating, by each neural network of a plurality of neural networks, a respective portion of the task state representation at a respective update rate of a respective sensor of a plurality of sensors, wherein the plurality of sensors include one or more robot state sensors that generate sensor data representing physical characteristics of a robot and one or more perceptual sensors that generate sensor data representing visual characteristics of a workcell of the robot; repeatedly generating, by the learning system, the task state representation at the base update rate, including combining, during each time period defined by the update rate, the task state representation from most recently updated sensor data processed by the plurality of neural networks; and repeatedly using the task state representations to generate commands for the robot at the base update rate.
12. The system of claim 11, wherein selecting the base update rate comprises: obtaining information representing respective update rates for each of the plurality of sensors; and selecting the base update rate to be a highest update rate among the plurality of sensors.
13. The system of claim 11, wherein repeatedly generating the task state representation at the base update rate comprises reading different portions of the task state representation from multiple independent memory devices or memory partitions.
14. The system of claim 13, wherein reading the different portions of the task state representation occurs at a rate different than the rate at which a particular neural network updates a particular portion of the task state representation in one of the independent memory devices or memory partitions.
15. The system of claim 11, wherein repeatedly using the task state representation to generate commands for the robot at the base update rate comprises: providing the task state representations to a plurality of independent tuned control policy systems; generating, by each of the plurality of independent tuned control policy systems, a respective robot subcommand; and combining the robot subcommands to generate each command for the robot at the base update rate.
16. The system of claim 15, wherein at least two of the plurality of independently tuned control policy systems operate at different update rates, and wherein combining the robot subcommands comprises combining each most recently generated robot subcommand at the base update rate.
17. The system of claim 16, wherein each of the independently tuned control policy systems implements a different respective control algorithm.
18. The system of claim 11, wherein repeatedly using the task state representation to generate commands for the robot comprises generating a corrective action to be combined with a base action generated by a base control policy.
19. The system of claim 11, wherein the minimum update rate of the robot is dependent on a real-time control cycle of the robot.
20. One or more non-transitory computer storage media encoded with computer program instructions that when executed by one or more computers cause the one or more computers to perform operations comprising: selecting, by a learning system for a robot, a base update rate for combining multiple sensor streams into a task state representation representing a state of a subt
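Claims 5, 6, 8, and 9 describe control policies that run at different rates, with their most recent subcommands combined at the base update rate and applied as a correction to a base action. The Python sketch below is one hedged reading of that arrangement, not the claimed implementation: `PolicySlot` and `step_command` are hypothetical names, each policy is a plain function, and summing the subcommands into a single scalar correction is an assumption chosen only to keep the example small.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class PolicySlot:
    """A stand-in control policy that runs once every `period_ticks` base ticks."""
    period_ticks: int
    policy: Callable[[Sequence[float]], float]   # hypothetical tuned policy
    latest: float = 0.0                          # most recent subcommand


def step_command(policies: List[PolicySlot], task_state: Sequence[float],
                 tick: int, base_action: float) -> float:
    """Produce one robot command for this base-rate tick."""
    for p in policies:
        # Only policies whose period has elapsed recompute their subcommand.
        if tick % p.period_ticks == 0:
            p.latest = p.policy(task_state)
    # Assumed combiner: add up the most recent subcommands as the correction.
    corrective = sum(p.latest for p in policies)
    # The correction is applied on top of the base control policy's action.
    return base_action + corrective


if __name__ == "__main__":
    slots = [
        PolicySlot(1, lambda s: 0.1 * s[0]),   # fast policy, every tick
        PolicySlot(4, lambda s: s[1]),         # slow policy, every fourth tick
    ]
    for t, state in enumerate([(1.0, 2.0), (2.0, 9.0)]):
        print(t, step_command(slots, state, t, base_action=5.0))
```

Because a slow policy's last subcommand is held between its updates, a command is still available on every base-rate tick.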
Generic motion control operations, primitive skills each for special task · CPC title
Different sample rates, multiple sample rates for the different loops · CPC title
characterised by task planning, object-oriented languages · CPC title
Record actions of human expert, teach by showing · CPC title
Teleoperation · CPC title