Model-based robust deep learning
US-2022101627-A1 · Mar 31, 2022 · US
US12561550B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12561550-B2 |
| Application number | US-202218092256-A |
| Country | US |
| Kind code | B2 |
| Filing date | Dec 31, 2022 |
| Priority date | Sep 14, 2021 |
| Publication date | Feb 24, 2026 |
| Grant date | Feb 24, 2026 |
Official abstract text for this publication.
Methods and systems for using a teleoperation system to train a robot to perform tasks using machine learning are described herein. A teleoperation system may be used to record actions of a robot as used by a human teleoperator. The teleoperation system may provide a teleoperator insight into the state of the robot and may provide feedback to the teleoperator allowing the teleoperator to feel what the robot is feeling. For example, sensor information from the robot may be sent to the teleoperation system and output to the teleoperator in various forms including vibrations, video, visual cues, or sound. The teleoperation system may output visual guides to the teleoperator so that the teleoperator may know how to control the robot to complete a task in a desired manner.
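The abstract describes turning robot sensor information into feedback for the teleoperator (vibrations, visual cues, sound). The following is a minimal sketch of how such a mapping might look, not the patent's implementation. The names `RobotState`, `Cue`, and `cues_for_state`, the two sensor fields, and all threshold values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RobotState:
    """Hypothetical robot state derived from sensor outputs."""
    contact_force_n: float      # force measured at the arm, in newtons (assumed sensor)
    distance_to_limit_m: float  # distance to a movement restriction, in metres (assumed)


@dataclass
class Cue:
    """Hypothetical cue sent to the teleoperation system."""
    channel: str      # "haptic" or "visual"
    message: str
    intensity: float  # 0.0 .. 1.0


def cues_for_state(state: RobotState,
                   contact_threshold_n: float = 1.0,
                   max_force_n: float = 20.0,
                   limit_warning_m: float = 0.05) -> List[Cue]:
    """Map a robot state to teleoperator cues (all thresholds are assumptions)."""
    cues: List[Cue] = []
    # Contact with an object: haptic feedback scaled by the measured force.
    if state.contact_force_n >= contact_threshold_n:
        intensity = min(state.contact_force_n / max_force_n, 1.0)
        cues.append(Cue("haptic", "arm contact", intensity))
    # Close to a movement restriction: show a visual indication.
    if state.distance_to_limit_m <= limit_warning_m:
        cues.append(Cue("visual", "movement restricted", 1.0))
    return cues
```

For example, `cues_for_state(RobotState(contact_force_n=10.0, distance_to_limit_m=1.0))` yields a single haptic cue at half intensity under these assumed thresholds.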
Opening claim text (preview).
The invention claimed is:
1. A method of training robots, the method comprising: with a robot operating under the control of a teleoperator, receiving sensor outputs from one or more sensors coupled to the robot; receiving teleoperation inputs generated by the teleoperator; generating commands for the robot to perform a first task based on the sensor outputs and the teleoperation inputs; determining a state of the robot within an environment based on the sensor outputs; outputting one or more cues to the teleoperator based on the state of the robot within the environment; generating a first training dataset comprising at least a portion of the sensor outputs and at least a portion of the commands; first training a machine learning model to receive robot sensor data and output robot commands using at least a portion of the first training dataset; and controlling the robot to perform a second task using the first trained machine learning model.
2. The method of claim 1, further comprising: generating a second training dataset from actions determined by the first trained machine learning model during controlling the robot to perform the second task; and second training the first trained machine learning model to receive robot sensor data and output robot commands using at least a portion of the second training dataset.
3. The method of claim 2, wherein second training the first trained machine learning model using at least a portion of the second training dataset comprises training the first trained machine learning model based on a reinforcement policy.
4. The method of claim 2, wherein the first trained machine learning model comprises a reinforcement learning model, and wherein second training the first trained machine learning model comprises: determining, based on a reinforcement policy, a first action that is different from actions indicated by the first training dataset; causing the robot to perform the first action; and in response to causing the robot to perform the first action, adjusting one or more weights of the reinforcement learning model.
5. The method of claim 2, wherein the second task is the same as the first task.
6. The method of claim 2, wherein the second task is different from the first task.
7. The method of claim 2, further comprising: controlling one or more other robots to perform tasks using the second trained machine learning model.
8. The method of claim 1, wherein the commands comprise causing movement of an arm of the robot, and wherein outputting one or more cues to the teleoperator based on the state of the robot within an environment comprises: detecting contact of the arm of the robot with an object; and in response to detecting contact of the arm of the robot with an object, outputting haptic feedback to the teleoperator.
9. The method of claim 1, wherein the one or more cues comprise a visual representation of the environment.
10. The method of claim 1, wherein the one or more cues comprise a haptic feedback indicating interaction of the robot with an object in the environment.
11. The method of claim 1, wherein the one or more cues comprise an indication of a restriction on movement of the robot.
12. A method of training robots, the method comprising: receiving sensor outputs from one or more sensors coupled to one or more robots; receiving teleoperation inputs generated by a teleoperator; generating commands for the robot to perform a first task based on the sensor outputs and the teleoperation inputs; determining a state of the one or more robots within an environment based on the sensor outputs; outputting one or more cues to the teleoperator based on the state of the robot within the environment; generating a first training dataset comprising at least a portion of the sensor outputs and at least a portion of the commands; first training a machine learning model to receive robot sensor data and output robot commands using at least a portion of the first training dataset; and controlling the one or more robots to perform a second task using the first trained machine learning model.
13. A system comprising: a robot system comprising a robot and one or more sensors coupled to the robot; a teleoperation system communicatively coupled to the robot system and operable to control the robot to perform one or more tasks; a computing system comprising one or more processing units coupled to memory and one or more computer-readable storage media storing instructions that when executed by the one or more processing units cause the computing system to perform operations comprising: receiving sensor outputs from the one or more sensors; receiving teleoperation inputs from the teleoperation system; generating commands for the robot to perform a first task based on the sensor outputs and the teleoperation inputs; determining a state of the robot within an environment based on the sensor outputs; outputting one or more teleoperation cues based on the state of the robot within the environment; generating a first training dataset comprising at least a portion of the sensor outputs and at least a portion of the commands; first training a machine learning model to receive robot sensor data and output robot commands using at least a portion of the first training dataset; and controlling the robot to perform a second task using the first trained machine learning model.
14. The system of claim 13, wherein the teleoperation system comprises a headset, and wherein outputting one or more teleoperation cues based on the state of the robot within the environment comprises presenting one or more visual cues on a display of the headset.
15. The system of claim 13, wherein the teleoperation system comprises a headset, and wherein outputting one or more teleoperation cues based on the state of the robot within the environment comprises outputting one or more haptic feedbacks to the glove.
16. The system of claim 13, wherein the operations further comprise: generating a second training dataset from actions determined by the first trained machine learning model during controlling the robot to perform the second task; and second training the first trained machine learning model to receive robot sensor data and output robot commands using at least a portion of the second training dataset.
17. The system of claim 16, wherein second training the first trained machine learning model using at least a portion of the second training dataset comprises training the first trained machine learning model based on a reinforcement policy.
18. The system of claim 16, wherein the first trained machine learning model comprises a reinforcement learning model, and wherein second training the first trained machine learning model comprises: determining, based on a reinforcement policy, an action that is different from an action indicated by the first training dataset; causing the robot to perform the action; and in response to causing the robot to perform the action, adjusting one or more weights of the reinforcement learning model.
19. The system of claim 13, wherein the operations further comprise: controlling one or more other robots to perform tasks using the second trained machine learning model.
20. The system of claim 13, wherein the robot comprises an arm, wherein the commands comprise causing movement of the arm, wherein the teleoperation system comprises a haptic feedback receptor, and
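The data flow recited in claim 1 (pair sensor outputs with teleoperator-derived commands, train a model on those pairs, then let the trained model issue commands) can be sketched in a toy setting. This is an illustrative assumption, not the claimed implementation: the sensor output and command are reduced to single numbers, the model is a linear map fitted by gradient descent, and the names `record_teleoperation`, `train_linear_model`, and `control` are hypothetical. The reinforcement-based second training of claims 2 to 4 is not covered.

```python
from typing import Callable, List, Tuple

Sample = Tuple[float, float]  # (sensor output, command), both scalars in this toy setting
Model = Tuple[float, float]   # (weight, bias) of a linear model


def record_teleoperation(sensor_values: List[float],
                         teleop_command: Callable[[float], float]) -> List[Sample]:
    """Build the first training dataset from sensor outputs and the
    commands generated while a (simulated) teleoperator is in control."""
    return [(s, teleop_command(s)) for s in sensor_values]


def train_linear_model(dataset: List[Sample],
                       epochs: int = 500,
                       lr: float = 0.1) -> Model:
    """Fit command = w * sensor + b by gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(dataset)
    for _ in range(epochs):
        grad_w = sum(2 * (w * s + b - c) * s for s, c in dataset) / n
        grad_b = sum(2 * (w * s + b - c) for s, c in dataset) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def control(model: Model, sensor_value: float) -> float:
    """Use the trained model to output a command for a new sensor reading."""
    w, b = model
    return w * sensor_value + b


# Simulated teleoperator whose commands happen to follow 2 * sensor + 1 (assumption).
dataset = record_teleoperation([-1.0, -0.5, 0.0, 0.5, 1.0], lambda s: 2.0 * s + 1.0)
model = train_linear_model(dataset)
command = control(model, 0.25)  # sensor reading not present in the dataset
```

In a real system the sensor outputs would be high-dimensional (camera frames, joint states) and the model would typically be a neural network; the linear model here only keeps the example self-contained.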
Machine learning · CPC title
Backpropagation, e.g. using gradient descent · CPC title
Reinforcement learning · CPC title
based on physical entities controlled by simulated intelligence so as to replicate intelligent life forms, e.g. based on robots replicating pets or humans in their appearance or behaviour · CPC title
Combinations of networks · CPC title