Machine learning device, robot controller, robot system, and machine learning method for learning action pattern of human
US-10807235-B2 · Oct 20, 2020 · US
US12539605B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12539605-B2 |
| Application number | US-202318514574-A |
| Country | US |
| Kind code | B2 |
| Filing date | Nov 20, 2023 |
| Priority date | May 21, 2020 |
| Publication date | Feb 3, 2026 |
| Grant date | Feb 3, 2026 |
Official abstract text for this publication.
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for using simulated local demonstration data for robotic demonstration learning. One of the methods includes receiving perceptual data of a workcell of a robot to be configured to execute a task according to a skill template, wherein the skill template specifies one or more subtasks required to perform the skill, wherein at least one of the subtasks is a demonstration subtask that relies on learning visual characteristics of the workcell. A virtual model is generated of a portion of the workcell. A training system generates simulated local demonstration data from the virtual model of the portion of the workcell and tunes a base control policy for the demonstration subtask using the simulated local demonstration data generated from the virtual model of the portion of the workcell.
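
The flow in the abstract can be sketched roughly as follows: a skill template lists subtasks, one of them is flagged as a demonstration subtask, simulated demonstration samples are drawn from a stand-in for the virtual model, and a base policy is tuned toward them. This is a minimal sketch under stated assumptions, not the publication's method. The names `Subtask`, `SkillTemplate`, `simulate_demonstrations`, and `tune_policy`, the one-parameter linear policy, the skill and subtask names, and the gradient-descent update are all hypothetical and do not come from the publication.

```python
import random
from dataclasses import dataclass


@dataclass
class Subtask:
    name: str
    is_demonstration: bool = False  # True if the subtask relies on local demonstration data


@dataclass
class SkillTemplate:
    skill: str
    subtasks: list

    def demonstration_subtasks(self):
        """Return the subtasks that must be learned from demonstration data."""
        return [s for s in self.subtasks if s.is_demonstration]


def simulate_demonstrations(target_gain, n, seed=0):
    """Hypothetical stand-in for sampling (state, action) pairs from a virtual model."""
    rng = random.Random(seed)
    states = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    return [(s, target_gain * s) for s in states]


def tune_policy(base_gain, demos, lr=0.5, epochs=200):
    """Tune a toy policy `action = gain * state` by gradient descent on squared error."""
    gain = base_gain
    for _ in range(epochs):
        grad = sum(2.0 * (gain * s - a) * s for s, a in demos) / len(demos)
        gain -= lr * grad
    return gain


# Assumed example skill with one demonstration subtask.
template = SkillTemplate(
    skill="connector_insertion",
    subtasks=[Subtask("move_to_socket"), Subtask("insert", is_demonstration=True)],
)

demos = simulate_demonstrations(target_gain=0.8, n=50)
tuned_gain = tune_policy(base_gain=0.5, demos=demos)
```

In this toy setup the tuned gain moves from the base value of 0.5 toward the 0.8 implied by the simulated demonstrations. A real control policy would have many parameters and a richer state.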
Opening claim text (preview).
What is claimed is:

1. A computer-implemented method comprising: receiving perceptual data of a workcell of a robot configured to execute a task according to a skill template, wherein the skill template specifies a plurality of subtasks required to perform a skill, wherein the plurality of subtasks includes at least one demonstration subtask that relies on visual characteristics of the workcell; generating actual local demonstration data from a workcell demonstration of the demonstration subtask by operating the robot in the workcell while capturing sensor data generated by a plurality of sensors in the workcell of the robot, wherein the actual local demonstration data comprises task state representations, and each task state representation is generated by processing sensor output of each sensor of the plurality of sensors by a different respective neural network of a plurality of neural networks to generate, by each neural network, a different portion of the task state representation; obtaining a base control policy for the demonstration subtask; and tuning, using the actual local demonstration data from the workcell demonstration, the base control policy of the robot to follow an updated control policy for the demonstration subtask.

2. The method of claim 1, wherein generating the actual local demonstration data comprises combining the task state representations into a single task state representation.

3. The method of claim 2, wherein the sensor data has a first dimension, and the single task state representation has a second dimension, the second dimension being lower than the first dimension.

4. The method of claim 1, wherein tuning the base control policy comprises: generating, based on the actual local demonstration data, one or more modifications to adjust the robot from the base control policy to the updated control policy.

5. The method of claim 1, wherein tuning the base control policy to the updated control policy comprises: generating one or more commands for execution by the robot, wherein the one or more commands, when executed by the robot, causes the robot to follow the updated control policy.

6. The method of claim 1, further comprising: generating, by the updated control policy, a proposed command for execution by the robot; generating, using the proposed command and the actual local demonstration data of the workcell demonstration as an input to a training system, one or more parameter corrections for the updated control policy; and adjusting, using the one or more parameter corrections and the training system, one or more parameters of the updated control policy to obtain a new control policy.

7. The method of claim 6, wherein the training system is trained by one or more machine learning techniques to generate the one or more parameter corrections.

8. The method of claim 1, wherein tuning the base control policy at least partially overlaps with a user collecting the actual local demonstration data from the robot.

9. The method of claim 1, wherein the perceptual data comprises a camera image, depth camera data, lidar scan data, or laser scan data of the workcell.

10. The method of claim 1, wherein the actual local demonstration data includes joint data of the robot representing joint angles of the robot during the workcell demonstration.

11. A system comprising: one or more computers and one or more storage devices storing instructions that are operable, when executed by the one or more computers, to cause the one or more computers to perform operations comprising: receiving perceptual data of a workcell of a robot configured to execute a task according to a skill template, wherein the skill template specifies a plurality of subtasks required to perform a skill, wherein the plurality of subtasks includes at least one demonstration subtask that relies on visual characteristics of the workcell; generating actual local demonstration data from a workcell demonstration of the demonstration subtask by operating the robot in the workcell while capturing sensor data generated by a plurality of sensors in the workcell of the robot, wherein the actual local demonstration data comprises task state representations, and each task state representation is generated by processing sensor output of each sensor of the plurality of sensors by a different respective neural network of a plurality of neural networks to generate, by each neural network, a different portion of the task state representation; obtaining a base control policy for the demonstration subtask; and tuning, using the actual local demonstration data from the workcell demonstration, the base control policy of the robot to follow an updated control policy for the demonstration subtask.

12. The system of claim 11, wherein generating the actual local demonstration data comprises combining the task state representations into a single task state representation.

13. The system of claim 12, wherein the sensor data has a first dimension, and the single task state representation has a second dimension, the second dimension being lower than the first dimension.

14. The system of claim 11, wherein tuning the base control policy comprises: generating, based on the actual local demonstration data, one or more modifications to adjust the robot from the base control policy to the updated control policy.

15. The system of claim 11, wherein tuning the base control policy to the updated control policy comprises: generating one or more commands for execution by the robot, wherein the one or more commands, when executed by the robot, causes the robot to follow the updated control policy.

16. The system of claim 11, the operations further comprising: generating, by the updated control policy, a proposed command for execution by the robot; generating, using the proposed command and the actual local demonstration data of the workcell demonstration as an input to a training system, one or more parameter corrections for the updated control policy; and adjusting, using the one or more parameter corrections and the training system, one or more parameters of the updated control policy to obtain a new control policy.

17. One or more non-transitory computer storage media encoded with computer program instructions that when executed by one or more computers cause the one or more computers to perform operations comprising: receiving perceptual data of a workcell of a robot configured to execute a task according to a skill template, wherein the skill template specifies a plurality of subtasks required to perform a skill, wherein the plurality of subtasks includes at least one demonstration subtask that relies on visual characteristics of the workcell; generating actual local demonstration data from a workcell demonstration of the demonstration subtask by operating the robot in the workcell while capturing sensor data generated by a plurality of sensors in the workcell of the robot, wherein the actual local demonstration data comprises task state representations, and each task state representation is generated by processing sensor output of each sensor of the plurality of sensors by a different respective neural network of a plurality of neural networks to generate, by each neural network, a different portion of the task state representation; obtaining a base control policy for the demonstration subtask; and tuning, using the actual local demonstration data from the workcell demonstration, the base control policy of the robot to follow an updated control po
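
Claims 1 to 3 describe a task state representation assembled from per-sensor neural networks, each contributing a different portion, and then combined into a single representation of lower dimension than the raw sensor data. The sketch below illustrates that structure only. The sensor names, the raw and encoded sizes, and the helpers `make_encoder` and `task_state` are hypothetical, and the encoders are untrained one-layer networks with fixed random weights standing in for whatever trained networks an implementation would use.

```python
import math
import random


def make_encoder(in_dim, out_dim, seed):
    """Build a tiny one-layer tanh network with fixed random weights (placeholder, untrained)."""
    rng = random.Random(seed)
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)] for _ in range(out_dim)]

    def encode(x):
        if len(x) != in_dim:
            raise ValueError(f"expected {in_dim} values, got {len(x)}")
        return [math.tanh(sum(w * v for w, v in zip(row, x))) for row in weights]

    return encode


# Assumed sensors and raw output sizes; none of these values come from the claims.
SENSOR_DIMS = {"camera": 64, "depth_camera": 64, "force_torque": 6}
PORTION_DIM = 4  # size of the portion each network contributes

# One distinct network per sensor.
ENCODERS = {
    name: make_encoder(dim, PORTION_DIM, seed=i)
    for i, (name, dim) in enumerate(SENSOR_DIMS.items())
}


def task_state(sensor_outputs):
    """Encode each sensor with its own network and concatenate the portions."""
    state = []
    for name in SENSOR_DIMS:  # fixed order keeps each portion at a stable offset
        state.extend(ENCODERS[name](sensor_outputs[name]))
    return state


readings = {name: [0.5] * dim for name, dim in SENSOR_DIMS.items()}
state = task_state(readings)
raw_dim = sum(SENSOR_DIMS.values())
```

With these assumed sizes, 134 raw sensor values are reduced to a 12-value state, which matches the lower-dimension relationship in claim 3. Concatenation is one simple way to combine the portions; the claims do not restrict the combination to that choice.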
Hardware, e.g. neural networks, fuzzy logic, interfaces, processor · CPC title
Cellular, reconfigurable manipulator, e.g. cebot · CPC title
Simulation of manipulator lay-out, design, modelling of manipulator · CPC title
learning, adaptive, model based, rule based expert control · CPC title
Generic motion control operations, primitive skills each for special task · CPC title