Robot arm control device, method for training hierarchical reinforcement learning model for robot arm control, and storage medium storing instructions to perform method training hierarchical reinforcement learning model for robot arm control

US12583103B2 · US · B2

Patent metadata
Field               Value
Publication number  US-12583103-B2
Application number  US-202318490242-A
Country             US
Kind code           B2
Filing date         Oct 19, 2023
Priority date       Oct 19, 2022
Publication date    Mar 24, 2026
Grant date          Mar 24, 2026

Abstract

Official abstract text for this publication.

A robot arm control device is proposed. The robot arm control device may include a memory. The device may also include an acquisition unit acquiring an arbitrary target image, a virtual canvas image in which a virtual drawing operation of a robot arm for the target image is reflected, and a virtual nib image of the robot arm whose position is changed as the virtual drawing operation is reflected in the virtual canvas image. The device may further include a processor configured to input the target image to the pre-trained learning model, determine a position change amount of the virtual nib image in the virtual canvas image using the pre-trained learning model, and output a joint angle change amount for driving the robot arm on the basis of the position change amount.
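
The two-stage flow in the abstract (a position change amount for the nib, then a joint angle change amount for the arm) can be illustrated with a minimal sketch. Everything here is an assumption for illustration, not the patented implementation: the arm is a hypothetical planar two-link arm, `commander` is a hand-written bounded step toward a goal point standing in for the trained model, and `stroker` solves the arm's Jacobian analytically instead of using a learned policy.

```python
import math

# Hypothetical link lengths of a planar two-link arm (not from the patent).
L1, L2 = 1.0, 1.0


def forward_kinematics(q):
    """Return the nib position (x, y) for joint angles q = (q1, q2)."""
    q1, q2 = q
    x = L1 * math.cos(q1) + L2 * math.cos(q1 + q2)
    y = L1 * math.sin(q1) + L2 * math.sin(q1 + q2)
    return x, y


def commander(nib, goal, max_step=0.05):
    """Stand-in for the trained commander: a bounded nib step toward goal."""
    dx, dy = goal[0] - nib[0], goal[1] - nib[1]
    dist = math.hypot(dx, dy)
    if dist <= max_step:
        return dx, dy
    scale = max_step / dist
    return dx * scale, dy * scale


def stroker(q, delta):
    """Stand-in for the trained stroker: solve J(q) * dq = delta (2x2 case)."""
    q1, q2 = q
    s1, c1 = math.sin(q1), math.cos(q1)
    s12, c12 = math.sin(q1 + q2), math.cos(q1 + q2)
    j11, j12 = -L1 * s1 - L2 * s12, -L2 * s12
    j21, j22 = L1 * c1 + L2 * c12, L2 * c12
    det = j11 * j22 - j12 * j21
    if abs(det) < 1e-9:
        raise ValueError("arm is at a singular configuration")
    dq1 = (j22 * delta[0] - j12 * delta[1]) / det
    dq2 = (-j21 * delta[0] + j11 * delta[1]) / det
    return dq1, dq2


def draw_to(q, goal, steps=200):
    """Alternate commander and stroker until the nib reaches goal."""
    for _ in range(steps):
        nib = forward_kinematics(q)      # plays the role of the measured nib
        delta = commander(nib, goal)     # position change amount
        if math.hypot(*delta) < 1e-6:
            break
        dq = stroker(q, delta)           # joint angle change amount
        q = (q[0] + dq[0], q[1] + dq[1])
    return q
```

Calling `draw_to((0.3, 1.2), (1.2, 1.0))` returns joint angles whose forward kinematics lie close to the goal point. Recomputing the nib position at every step mirrors the feedback of the measured nib position described in claim 4.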

First claim

Opening claim text (preview).

What is claimed is:

1. A robot arm control device comprising: a memory storing one or more instructions configured to process a pre-trained learning model for controlling a robot arm; an acquisition unit configured to acquire an arbitrary target image, a virtual canvas image in which a virtual drawing operation of the robot arm for the target image is reflected, and a virtual nib image of the robot arm whose position is changed as the virtual drawing operation is reflected in the virtual canvas image; and a processor configured to execute the one or more instructions to: input the target image to the pre-trained learning model, determine a position change amount of the virtual nib image in the virtual canvas image using the pre-trained learning model, output a joint angle change amount for driving the robot arm on the basis of the position change amount, and drive the robot arm according to the joint angle change amount.

2. The robot arm control device of claim 1, wherein the learning model includes a commander configured to determine the position change amount, and a stroker configured to output the joint angle change amount.

3. The robot arm control device of claim 2, wherein the commander is configured to be trained to output a position change amount that satisfies a target position change amount of the robot arm when the target image is input, and wherein the stroker is configured to be trained to output the joint angle change amount of the robot arm when the position change amount and a joint angle of the robot arm are input.

4. The robot arm control device of claim 1, wherein a robot arm driver is configured to measure position information of an actual nib mounted on the robot arm, and wherein the processor is configured to receive the measured position information from the robot arm driver and reflect the received position information of the actual nib in the virtual nib image.

5. A hierarchical reinforcement learning method of a hierarchical reinforcement learning device including a first learning model and a second learning model for controlling a robot arm, the hierarchical reinforcement learning method comprising: acquiring an arbitrary target image, a virtual canvas image in which a virtual drawing operation of the robot arm for the target image is reflected, and a virtual nib image of the robot arm whose position is changed as the virtual drawing operation is reflected in the virtual canvas image; training the first learning model such that the virtual canvas image corresponds to the target image on the basis of a position of the virtual nib image in the virtual canvas image; training the second learning model such that a joint angle change amount of a virtual robot arm corresponds to a target position change amount of the virtual nib image; and controlling the robot arm according to the joint angle change amount.

6. The hierarchical reinforcement learning method of claim 5, wherein the training of the first learning model includes: determining a position change amount for changing the position of the virtual nib image within the virtual canvas image; and determining an amount of change in similarity between the target image and the virtual canvas image in which the virtual drawing operation is reflected step by step on the basis of the position change amount, and determining a compensation value according to the amount of change in similarity.

7. The hierarchical reinforcement learning method of claim 6, wherein the determining of the position change amount includes performing reinforcement learning on the basis of the compensation value.

8. The hierarchical reinforcement learning method of claim 5, wherein the training of the second learning model includes receiving a joint angle of the virtual robot arm and the target position change amount, and determining the joint angle change amount of the virtual robot arm.

9. The hierarchical reinforcement learning method of claim 8, further comprising: determining similarity between a position change amount of the virtual nib changed according to the joint angle change amount and the target position change amount; and performing reinforcement learning on the basis of a compensation value according to the similarity.

10. The hierarchical reinforcement learning method of claim 9, wherein the target position change amount is updated each time the joint angle change amount is determined.

11. A non-transitory computer-readable storage medium storing computer executable instructions, wherein the instructions, when executed by a processor, cause the processor to perform a robot arm control method using a hierarchical reinforcement learning model, the method comprising: acquiring an arbitrary target image, a virtual canvas image in which a virtual drawing operation of the robot arm for the target image is reflected, and a virtual nib image of the robot arm whose position is changed as the virtual drawing operation is reflected within the virtual canvas image; performing processing such that the hierarchical reinforcement learning model determines a position change amount of the virtual nib image in the virtual canvas image when the target image is input to the hierarchical reinforcement learning model; performing processing such that a joint angle change amount for driving the robot arm is output on the basis of the position change amount; and driving the robot arm according to the joint angle change amount.

12. The non-transitory computer-readable storage medium of claim 11, wherein the learning model is trained to output a position change amount that satisfies a target position change amount of the robot arm when the target image is input, and trained to output a joint angle change amount of the robot arm when the position change amount and a joint angle of the robot arm are input.

13. The non-transitory computer-readable storage medium of claim 11, further comprising: measuring position information of an actual nib mounted on the robot arm when the robot arm is driven on the basis of the joint angle change amount; and reflecting the position information of the actual nib in the virtual nib image.
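
Claims 6 and 9 describe two "compensation values" (rewards, in the usual reinforcement-learning terminology): one from the change in similarity between the target image and the virtual canvas, and one from the similarity between the achieved and the target nib position change. The claims do not name a similarity measure, so the sketch below assumes negative mean squared pixel difference for images and negative Euclidean distance for position changes. All function names are hypothetical.

```python
import math


def similarity(target, canvas):
    """Assumed image similarity: negative mean squared pixel difference.

    Images are equally sized lists of rows of numbers; higher means closer.
    """
    total = 0.0
    count = 0
    for t_row, c_row in zip(target, canvas):
        for t, c in zip(t_row, c_row):
            total += (t - c) ** 2
            count += 1
    return -total / count


def commander_reward(target, canvas_before, canvas_after):
    """Reward for the first model: change in similarity after one stroke step."""
    return similarity(target, canvas_after) - similarity(target, canvas_before)


def stroker_reward(achieved_delta, target_delta):
    """Reward for the second model: assumed negative distance between the
    nib position change actually produced and the one that was requested."""
    dx = achieved_delta[0] - target_delta[0]
    dy = achieved_delta[1] - target_delta[1]
    return -math.hypot(dx, dy)
```

With these assumptions, a stroke that makes the canvas look more like the target yields a positive commander reward, and the stroker reward reaches its maximum of zero when the joint angle change reproduces the requested nib motion exactly.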

Assignees

Inventors

Classifications

  • characterised by motion, path, trajectory planning · CPC title

  • characterised by simulation, either to verify existing program or to create and verify new program, CAD/CAM oriented, graphic oriented programming systems · CPC title

  • B25J9/163 (Primary)

    learning, adaptive, model based, rule based expert control · CPC title

  • Learning methods · CPC title

  • Combinations of networks · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12583103B2 cover?
A robot arm control device is proposed. The robot arm control device may include a memory. The device may also include an acquisition unit acquiring an arbitrary target image, a virtual canvas image in which a virtual drawing operation of a robot arm for the target image is reflected, and a virtual nib image of the robot arm whose position is changed as the virtual drawing operation is reflecte…
Who is the assignee on this patent?
Agency Defense Dev, Seoul Nat Univ R&Db Foundation
What technology area does this patent fall under?
Primary CPC classification B25J9/163. Mapped technology areas include Operations & Transport.
When was this patent published?
Publication date Mar 24, 2026 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).