Method, computer system, and non-transitory computer-readable record medium for learning robot skill
US-2025010466-A1 · Jan 9, 2025 · US
US12583103B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12583103-B2 |
| Application number | US-202318490242-A |
| Country | US |
| Kind code | B2 |
| Filing date | Oct 19, 2023 |
| Priority date | Oct 19, 2022 |
| Publication date | Mar 24, 2026 |
| Grant date | Mar 24, 2026 |
Official abstract text for this publication.
A robot arm control device is proposed. The robot arm control device may include a memory. The device may also include an acquisition unit acquiring an arbitrary target image, a virtual canvas image in which a virtual drawing operation of a robot arm for the target image is reflected, and a virtual nib image of the robot arm whose position is changed as the virtual drawing operation is reflected in the virtual canvas image. The device may further include a processor configured to input the target image to the pre-trained learning model, determine a position change amount of the virtual nib image in the virtual canvas image using the pre-trained learning model, and output a joint angle change amount for driving the robot arm on the basis of the position change amount.
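The data flow in the abstract (position change of the nib first, joint angle change second) can be illustrated with a small sketch. It is not the patented model: `commander_stub` and `stroker_stub` are hypothetical stand-ins for the pre-trained learning model, the arm is assumed to be a planar two-link arm with made-up link lengths, the target is reduced to a single goal point instead of an image, and the stroker is replaced by an analytic inverse Jacobian rather than a learned network.

```python
import math

L1_LEN, L2_LEN = 1.0, 1.0  # hypothetical link lengths, not from the patent


def forward_kinematics(q):
    """Nib position (x, y) of a planar two-link arm with joint angles q."""
    x = L1_LEN * math.cos(q[0]) + L2_LEN * math.cos(q[0] + q[1])
    y = L1_LEN * math.sin(q[0]) + L2_LEN * math.sin(q[0] + q[1])
    return (x, y)


def commander_stub(nib, goal, max_step=0.05):
    """Stand-in for the learned commander: a position change toward the goal,
    clipped to at most max_step in length."""
    dx, dy = goal[0] - nib[0], goal[1] - nib[1]
    dist = math.hypot(dx, dy)
    if dist <= max_step:
        return (dx, dy)
    scale = max_step / dist
    return (dx * scale, dy * scale)


def stroker_stub(q, dpos):
    """Stand-in for the learned stroker: maps a position change and the current
    joint angles to a joint angle change via the inverse Jacobian."""
    s1, c1 = math.sin(q[0]), math.cos(q[0])
    s12, c12 = math.sin(q[0] + q[1]), math.cos(q[0] + q[1])
    j11 = -L1_LEN * s1 - L2_LEN * s12
    j12 = -L2_LEN * s12
    j21 = L1_LEN * c1 + L2_LEN * c12
    j22 = L2_LEN * c12
    det = j11 * j22 - j12 * j21
    if abs(det) < 1e-9:
        raise ValueError("arm is at a singular configuration")
    dq1 = (j22 * dpos[0] - j12 * dpos[1]) / det
    dq2 = (-j21 * dpos[0] + j11 * dpos[1]) / det
    return (dq1, dq2)


def control_loop(q, goal, steps=200):
    """Repeat: read nib position, get position change, convert to joint change."""
    for _ in range(steps):
        nib = forward_kinematics(q)
        dpos = commander_stub(nib, goal)
        if math.hypot(dpos[0], dpos[1]) < 1e-6:
            break
        dq = stroker_stub(q, dpos)
        q = (q[0] + dq[0], q[1] + dq[1])  # "drive" the simulated arm
    return q
```

Calling `control_loop((0.3, 1.2), (1.2, 0.8))` moves the simulated nib to the goal point in small steps. In the device described above, the two stubs would be replaced by the trained model, and the goal point by the target and virtual canvas images.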
Opening claim text (preview).
What is claimed is:
1. A robot arm control device comprising: a memory storing one or more instructions configured to process a pre-trained learning model for controlling a robot arm; an acquisition unit configured to acquire an arbitrary target image, a virtual canvas image in which a virtual drawing operation of the robot arm for the target image is reflected, and a virtual nib image of the robot arm whose position is changed as the virtual drawing operation is reflected in the virtual canvas image; and a processor configured to execute the one or more instructions to: input the target image to the pre-trained learning model, determine a position change amount of the virtual nib image in the virtual canvas image using the pre-trained learning model, output a joint angle change amount for driving the robot arm on the basis of the position change amount, and drive the robot arm according to the joint angle change amount.
2. The robot arm control device of claim 1, wherein the learning model includes a commander configured to determine the position change amount, and a stroker configured to output the joint angle change amount.
3. The robot arm control device of claim 2, wherein the commander is configured to be trained to output a position change amount that satisfies a target position change amount of the robot arm when the target image is input, and wherein the stroker is configured to be trained to output the joint angle change amount of the robot arm when the position change amount and a joint angle of the robot arm are input.
4. The robot arm control device of claim 1, wherein a robot arm driver is configured to measure position information of an actual nib mounted on the robot arm, and wherein the processor is configured to receive the measured position information from the robot arm driver and reflect the received position information of the actual nib in the virtual nib image.
5. A hierarchical reinforcement learning method of a hierarchical reinforcement learning device including a first learning model and a second learning model for controlling a robot arm, the hierarchical reinforcement learning method comprising: acquiring an arbitrary target image, a virtual canvas image in which a virtual drawing operation of the robot arm for the target image is reflected, and a virtual nib image of the robot arm whose position is changed as the virtual drawing operation is reflected in the virtual canvas image; training the first learning model such that the virtual canvas image corresponds to the target image on the basis of a position of the virtual nib image in the virtual canvas image; training the second learning model such that a joint angle change amount of a virtual robot arm corresponds to a target position change amount of the virtual nib image; and controlling the robot arm according to the joint angle change amount.
6. The hierarchical reinforcement learning method of claim 5, wherein the training of the first learning model includes: determining a position change amount for changing the position of the virtual nib image within the virtual canvas image; and determining an amount of change in similarity between the target image and the virtual canvas image in which the virtual drawing operation is reflected step by step on the basis of the position change amount, and determining a compensation value according to the amount of change in similarity.
7. The hierarchical reinforcement learning method of claim 6, wherein the determining of the position change amount includes performing reinforcement learning on the basis of the compensation value.
8. The hierarchical reinforcement learning method of claim 5, wherein the training of the second learning model includes receiving a joint angle of the virtual robot arm and the target position change amount, and determining the joint angle change amount of the virtual robot arm.
9. The hierarchical reinforcement learning method of claim 8, further comprising: determining similarity between a position change amount of the virtual nib changed according to the joint angle change amount and the target position change amount; and performing reinforcement learning on the basis of a compensation value according to the similarity.
10. The hierarchical reinforcement learning method of claim 9, wherein the target position change amount is updated each time the joint angle change amount is determined.
11. A non-transitory computer-readable storage medium storing computer executable instructions, wherein the instructions, when executed by a processor, cause the processor to perform a robot arm control method using a hierarchical reinforcement learning model, the method comprising: acquiring an arbitrary target image, a virtual canvas image in which a virtual drawing operation of the robot arm for the target image is reflected, and a virtual nib image of the robot arm whose position is changed as the virtual drawing operation is reflected within the virtual canvas image; performing processing such that the hierarchical reinforcement learning model determines a position change amount of the virtual nib image in the virtual canvas image when the target image is input to the hierarchical reinforcement learning model; performing processing such that a joint angle change amount for driving the robot arm is output on the basis of the position change amount; and driving the robot arm according to the joint angle change amount.
12. The non-transitory computer-readable storage medium of claim 11, wherein the learning model is trained to output a position change amount that satisfies a target position change amount of the robot arm when the target image is input, and trained to output a joint angle change amount of the robot arm when the position change amount and a joint angle of the robot arm are input.
13. The non-transitory computer-readable storage medium of claim 11, further comprising: measuring position information of an actual nib mounted on the robot arm when the robot arm is driven on the basis of the joint angle change amount; and reflecting the position information of the actual nib in the virtual nib image.
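Claims 6 and 9 each tie a "compensation value" to a similarity measure, which reads like what reinforcement learning usually calls a reward. The sketch below shows one possible way to compute both values. The claims do not specify the similarity metrics, so the negative mean squared pixel difference and the negative Euclidean distance used here are assumptions, as are the function names and the representation of images as nested lists of pixel intensities.

```python
import math


def similarity(canvas, target):
    """Assumed image similarity: negative mean squared pixel difference.
    Higher values mean the canvas is closer to the target."""
    total = 0.0
    count = 0
    for canvas_row, target_row in zip(canvas, target):
        for c, t in zip(canvas_row, target_row):
            total += (c - t) ** 2
            count += 1
    return -total / count


def commander_reward(canvas_before, canvas_after, target):
    """Claim 6 style value: the change in similarity to the target image
    caused by one drawing step on the virtual canvas."""
    return similarity(canvas_after, target) - similarity(canvas_before, target)


def stroker_reward(achieved_dpos, target_dpos):
    """Claim 9 style value: how closely the nib's achieved position change
    matches the target position change (0.0 is a perfect match)."""
    return -math.dist(achieved_dpos, target_dpos)
```

With a 2x2 target `[[0, 1], [0, 1]]` and an empty canvas, drawing one correct pixel yields a positive `commander_reward`, while drawing a pixel where the target is blank yields a negative one. A learning loop would feed these values to whichever reinforcement learning algorithm is used; the claims do not name one.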
characterised by motion, path, trajectory planning · CPC title
characterised by simulation, either to verify existing program or to create and verify new program, CAD/CAM oriented, graphic oriented programming systems · CPC title
learning, adaptive, model based, rule based expert control · CPC title
Learning methods · CPC title
Combinations of networks · CPC title