Motor control apparatus
US-2020096955-A1 · Mar 26, 2020 · US
US11565408B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11565408-B2 |
| Application number | US-202016803536-A |
| Country | US |
| Kind code | B2 |
| Filing date | Feb 27, 2020 |
| Priority date | Sep 18, 2019 |
| Publication date | Jan 31, 2023 |
| Grant date | Jan 31, 2023 |
Official abstract text for this publication.
An object manipulation apparatus according to an embodiment of the present disclosure includes a memory and a hardware processor coupled to the memory. The hardware processor is configured to: calculate, based on an image in which one or more objects to be grasped are contained, an evaluation value of a first behavior manner of grasping the one or more objects; generate information representing a second behavior manner based on the image and a plurality of evaluation values of the first behavior manner; and control actuation of grasping the object to be grasped in accordance with the information being generated.
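The data flow in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the `GraspCandidate` type, the tool names, and the selection rule in `generate_behavior` (weighting each evaluation value by a per-tool preference) are hypothetical stand-ins for the image-based evaluation and the learned behavior generation that the abstract describes.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class GraspCandidate:
    """One 'first behavior manner': a tool, a grasp pose, and its evaluation value."""
    tool_id: str                      # hypothetical tool identifier, e.g. "suction"
    pose: Tuple[float, float, float]  # hypothetical (x, y, angle) grasp pose
    evaluation: float                 # evaluation value; higher means easier to grasp


def generate_behavior(candidates: List[GraspCandidate],
                      tool_weight: Dict[str, float]) -> GraspCandidate:
    """Choose the 'second behavior manner' from several evaluated candidates.

    Assumption: a fixed per-tool weight stands in for the learned policy;
    tools missing from tool_weight get a neutral weight of 1.0.
    """
    if not candidates:
        raise ValueError("no grasp candidates to choose from")
    return max(candidates,
               key=lambda c: c.evaluation * tool_weight.get(c.tool_id, 1.0))
```

The returned candidate carries both the tool identifier and the grasp pose, which is the information a controller would need to actuate the grasp.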
Opening claim text (preview).
What is claimed is:

1. An object manipulation apparatus comprising: a memory; and a hardware processor coupled to the memory and configured to: calculate, based on an image in which one or more objects to be grasped are contained, an evaluation value of a first behavior manner of grasping the one or more objects; generate, as a second behavior manner, a behavior manner for obtaining an expected value of a larger accumulative reward by using a deep Q-network (DQN) based on a current observation state determined from the image and a plurality of evaluation values of the first behavior manner; control actuation of grasping the object to be grasped in accordance with the information being generated; and update the second behavior manner by updating a parameter of the DQN such that the expected value of the accumulative reward becomes larger, wherein the accumulative reward is an accumulation of rewards in consideration of at least one of a number of objects that can be picked at the same time, a time taken for picking, and a success rate of the picking.

2. The apparatus according to claim 1, wherein the hardware processor calculates, from the image, an object area of the object to be grasped, and calculates the evaluation value by a score representing grasping easiness of the object to be grasped indicated by the object area.

3. The apparatus according to claim 1, further comprising a sensor configured to acquire the image, wherein the hardware processor converts an image format of the image acquired by the sensor into an image format used in the calculation of the evaluation value.

4. The apparatus according to claim 1, wherein the information representing the second behavior manner includes identification information used for identifying a picking tool, and a grasping position/posture by the picking tool, and the hardware processor carries out the control of the actuation of grasping the object to be grasped by using the picking tool identified by the identification information in accordance with the grasping position/posture.

5. The apparatus according to claim 4, wherein the hardware processor calculates the evaluation value by using a convolutional neural network (CNN), and updates an evaluation manner of the evaluation value by updating a parameter of the CNN such that a value of a loss function of the CNN becomes smaller.

6. The apparatus according to claim 5, wherein the hardware processor calculates, from the image, an object area of the object to be grasped from the image, samples candidates of a posture for grasping the object to be grasped indicated by the object area for each pixel of the object area, calculates, for each picking tool, a score representing grasping easiness in the candidates of the posture, selects a posture whose evaluation value becomes larger from the candidates of the posture, generates, as teaching data, a heatmap representing the selected posture and the evaluation value of the selected posture for each pixel of the object area, and stores, in the memory, a learning data set in which the teaching data and the image are associated with each other.

7. The apparatus according to claim 6, wherein the hardware processor calculates the object area and the heatmap by the CNN using the learning data set, and updates the parameter of the CNN such that the value of the loss function of the CNN becomes smaller.

8. A handling method implemented by a computer, the method comprising: calculating, based on an image in which one or more objects to be grasped are contained, an evaluation value of a first behavior manner of grasping the one or more objects; generating as a second behavior manner, a behavior manner for obtaining an expected value of a larger accumulative reward by using a deep Q-network (DQN) based on a current observation state determined from the image and a plurality of evaluation values of the first behavior manner; controlling actuation of grasping the object to be grasped in accordance with the information being generated; and updating the second behavior manner by updating a parameter of the DQN such that the expected value of the accumulative reward becomes larger, wherein the accumulative reward is an accumulation of rewards in consideration of at least one of a number of objects that can be picked at the same time, a time taken for picking, and a success rate of the picking.

9. A computer program product comprising a non-transitory computer-readable recording medium on which an executable program is recorded, the program instructing a computer to: calculate, based on an image in which one or more objects to be grasped are contained, an evaluation value of a first behavior manner of grasping the one or more objects; generate, as a second behavior manner, a behavior manner for obtaining an expected value of a larger accumulative reward by using a deep Q-network (DQN) based on a current observation state determined from the image and a plurality of evaluation values of the first behavior manner; control actuation of grasping the object to be grasped in accordance with the information being generated; and update the second behavior manner by updating a parameter of the DQN such that the expected value of the accumulative reward becomes larger, wherein the accumulative reward is an accumulation of rewards in consideration of at least one of a number of objects that can be picked at the same time, a time taken for picking, and a success rate of the picking.
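The reward and update steps recited in the independent claims can be illustrated with a small sketch. It rests on several assumptions that are not in the claims: the reward weights in `picking_reward`, the discount factor `gamma`, the learning rate `alpha`, and above all the use of a lookup table in `q_update` in place of a deep Q-network. A real DQN would adjust network parameters by gradient descent toward the same kind of target value; the table only shows the direction of the update.

```python
from typing import Dict, Hashable, List, Sequence, Tuple

QTable = Dict[Tuple[Hashable, Hashable], float]


def picking_reward(num_picked: int, seconds: float, succeeded: bool,
                   w_count: float = 1.0, w_time: float = 0.1,
                   w_success: float = 1.0) -> float:
    """Reward for one pick attempt (all weights are hypothetical).

    More objects picked at once and a successful pick raise the reward;
    a longer picking time lowers it. A failed pick is penalized.
    """
    if not succeeded:
        return -w_success
    return w_count * num_picked - w_time * seconds + w_success


def discounted_return(rewards: Sequence[float], gamma: float = 0.9) -> float:
    """Accumulate rewards over an episode; gamma = 1.0 gives a plain sum."""
    total = 0.0
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


def q_update(q: QTable, state: Hashable, action: Hashable, reward: float,
             next_state: Hashable, actions: List[Hashable],
             alpha: float = 0.5, gamma: float = 0.9) -> float:
    """One tabular Q-learning step, used here as a stand-in for a DQN update.

    Moves Q(state, action) toward reward + gamma * max_a Q(next_state, a).
    Unseen state-action pairs are treated as having value 0.0.
    """
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    target = reward + gamma * best_next
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (target - old)
    return q[(state, action)]
```

Under these assumptions, repeatedly applying `q_update` with rewards from `picking_reward` steers action selection toward tools and poses that pick more objects, faster, and with fewer failures.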
having finger members (B25J15/02, B25J15/04 take precedence) · CPC title
Terrestrial scenes (scenes under surveillance with static cameras G06V20/52; scenes perceived from the exterior of a vehicle G06V20/56; scenes perceived from the interior of a vehicle G06V20/59) · CPC title
with vacuum · CPC title
characterised by motion, path, trajectory planning · CPC title
learning, adaptive, model based, rule based expert control · CPC title