Object manipulation apparatus, handling method, and program product

US11565408B2 · US · B2

Patent metadata
Field: Value
Publication number: US-11565408-B2
Application number: US-202016803536-A
Country: US
Kind code: B2
Filing date: Feb 27, 2020
Priority date: Sep 18, 2019
Publication date: Jan 31, 2023
Grant date: Jan 31, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An object manipulation apparatus according to an embodiment of the present disclosure includes a memory and a hardware processor coupled to the memory. The hardware processor is configured to: calculate, based on an image in which one or more objects to be grasped are contained, an evaluation value of a first behavior manner of grasping the one or more objects; generate information representing a second behavior manner based on the image and a plurality of evaluation values of the first behavior manner; and control actuation of grasping the object to be grasped in accordance with the information being generated.
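The flow in the abstract can be illustrated with a minimal sketch. It assumes the evaluation values arrive as one score map per picking tool (the claims mention per-tool scores and per-pixel heatmaps), and it uses a plain highest-score pick as a stand-in for the learned second behavior manner. The names `GraspCandidate`, `list_candidates`, and `select_grasp`, the tool labels, and the score values are all hypothetical and do not come from the patent.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class GraspCandidate:
    tool_id: str             # hypothetical picking-tool identifier
    pixel: Tuple[int, int]   # (row, col) grasp position in the image
    score: float             # evaluation value ("grasping easiness")


def list_candidates(heatmaps: Dict[str, List[List[float]]]) -> List[GraspCandidate]:
    """Flatten per-tool score maps into one list of grasp candidates."""
    candidates = []
    for tool_id, heatmap in heatmaps.items():
        for r, row in enumerate(heatmap):
            for c, score in enumerate(row):
                candidates.append(GraspCandidate(tool_id, (r, c), score))
    return candidates


def select_grasp(candidates: List[GraspCandidate]) -> GraspCandidate:
    """Greedy stand-in for the learned policy: take the highest score."""
    if not candidates:
        raise ValueError("no grasp candidates")
    return max(candidates, key=lambda cand: cand.score)


# Made-up 2x2 score maps for two hypothetical tools.
heatmaps = {
    "suction": [[0.1, 0.8], [0.3, 0.2]],
    "pinch": [[0.5, 0.4], [0.9, 0.0]],
}
best = select_grasp(list_candidates(heatmaps))
print(best.tool_id, best.pixel, best.score)
```

The greedy choice only shows the shape of the data (tool identifier plus grasp position). The claimed apparatus instead derives the second behavior manner with a deep Q-network that takes these evaluation values as input.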

First claim

Opening claim text (preview).

What is claimed is:

1. An object manipulation apparatus comprising: a memory; and a hardware processor coupled to the memory and configured to: calculate, based on an image in which one or more objects to be grasped are contained, an evaluation value of a first behavior manner of grasping the one or more objects; generate, as a second behavior manner, a behavior manner for obtaining an expected value of a larger accumulative reward by using a deep Q-network (DQN) based on a current observation state determined from the image and a plurality of evaluation values of the first behavior manner; control actuation of grasping the object to be grasped in accordance with the information being generated; and update the second behavior manner by updating a parameter of the DQN such that the expected value of the accumulative reward becomes larger, wherein the accumulative reward is an accumulation of rewards in consideration of at least one of a number of objects that can be picked at the same time, a time taken for picking, and a success rate of the picking.

2. The apparatus according to claim 1, wherein the hardware processor calculates, from the image, an object area of the object to be grasped, and calculates the evaluation value by a score representing grasping easiness of the object to be grasped indicated by the object area.

3. The apparatus according to claim 1, further comprising a sensor configured to acquire the image, wherein the hardware processor converts an image format of the image acquired by the sensor into an image format used in the calculation of the evaluation value.

4. The apparatus according to claim 1, wherein the information representing the second behavior manner includes identification information used for identifying a picking tool, and a grasping position/posture by the picking tool, and the hardware processor carries out the control of the actuation of grasping the object to be grasped by using the picking tool identified by the identification information in accordance with the grasping position/posture.

5. The apparatus according to claim 4, wherein the hardware processor calculates the evaluation value by using a convolutional neural network (CNN), and updates an evaluation manner of the evaluation value by updating a parameter of the CNN such that a value of a loss function of the CNN becomes smaller.

6. The apparatus according to claim 5, wherein the hardware processor calculates, from the image, an object area of the object to be grasped from the image, samples candidates of a posture for grasping the object to be grasped indicated by the object area for each pixel of the object area, calculates, for each picking tool, a score representing grasping easiness in the candidates of the posture, selects a posture whose evaluation value becomes larger from the candidates of the posture, generates, as teaching data, a heatmap representing the selected posture and the evaluation value of the selected posture for each pixel of the object area, and stores, in the memory, a learning data set in which the teaching data and the image are associated with each other.

7. The apparatus according to claim 6, wherein the hardware processor calculates the object area and the heatmap by the CNN using the learning data set, and updates the parameter of the CNN such that the value of the loss function of the CNN becomes smaller.

8. A handling method implemented by a computer, the method comprising: calculating, based on an image in which one or more objects to be grasped are contained, an evaluation value of a first behavior manner of grasping the one or more objects; generating as a second behavior manner, a behavior manner for obtaining an expected value of a larger accumulative reward by using a deep Q-network (DQN) based on a current observation state determined from the image and a plurality of evaluation values of the first behavior manner; controlling actuation of grasping the object to be grasped in accordance with the information being generated; and updating the second behavior manner by updating a parameter of the DQN such that the expected value of the accumulative reward becomes larger, wherein the accumulative reward is an accumulation of rewards in consideration of at least one of a number of objects that can be picked at the same time, a time taken for picking, and a success rate of the picking.

9. A computer program product comprising a non-transitory computer-readable recording medium on which an executable program is recorded, the program instructing a computer to: calculate, based on an image in which one or more objects to be grasped are contained, an evaluation value of a first behavior manner of grasping the one or more objects; generate, as a second behavior manner, a behavior manner for obtaining an expected value of a larger accumulative reward by using a deep Q-network (DQN) based on a current observation state determined from the image and a plurality of evaluation values of the first behavior manner; control actuation of grasping the object to be grasped in accordance with the information being generated; and update the second behavior manner by updating a parameter of the DQN such that the expected value of the accumulative reward becomes larger, wherein the accumulative reward is an accumulation of rewards in consideration of at least one of a number of objects that can be picked at the same time, a time taken for picking, and a success rate of the picking.
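The independent claims tie the DQN update to an accumulative reward built from the number of objects picked at once, the time taken, and picking success. The sketch below shows one way such a reward and the standard Q-learning target could be computed. The weights `w_count` and `w_time`, the discount factor `gamma`, and all function names are assumptions chosen for illustration; the claims do not specify a reward formula or these values.

```python
from typing import List


def step_reward(num_picked: int, seconds: float, success: bool,
                w_count: float = 1.0, w_time: float = 0.1) -> float:
    """Hypothetical per-pick reward: more objects is better, more time is worse."""
    time_penalty = w_time * seconds
    if not success:
        return -time_penalty          # a failed pick only costs time
    return w_count * num_picked - time_penalty


def discounted_return(rewards: List[float], gamma: float = 0.9) -> float:
    """Accumulate rewards over an episode with discount factor gamma."""
    total = 0.0
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


def td_target(reward: float, next_q_values: List[float],
              gamma: float = 0.9, terminal: bool = False) -> float:
    """Standard Q-learning target: r + gamma * max over next-state Q-values."""
    if terminal or not next_q_values:
        return reward
    return reward + gamma * max(next_q_values)


# Two picks: a successful double pick in 5 s, then a failed pick in 4 s.
episode = [step_reward(2, 5.0, True), step_reward(0, 4.0, False)]
print(discounted_return(episode))
print(td_target(episode[0], next_q_values=[0.2, 0.5]))
```

In a full DQN, the network parameters would be adjusted so that the predicted Q-value for the chosen action moves toward `td_target`, which is one common way to make the expected accumulative reward larger over training.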

Assignees

  • Toshiba Kk

Inventors

Classifications

  • having finger members (B25J15/02, B25J15/04 take precedence) · CPC title

  • Terrestrial scenes (scenes under surveillance with static cameras G06V20/52; scenes perceived from the exterior of a vehicle G06V20/56; scenes perceived from the interior of a vehicle G06V20/59) · CPC title

  • with vacuum · CPC title

  • characterised by motion, path, trajectory planning · CPC title

  • B25J9/163 · Primary

    learning, adaptive, model based, rule based expert control · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11565408B2 cover?
An object manipulation apparatus according to an embodiment of the present disclosure includes a memory and a hardware processor coupled to the memory. The hardware processor is configured to: calculate, based on an image in which one or more objects to be grasped are contained, an evaluation value of a first behavior manner of grasping the one or more objects; generate information representing…
Who is the assignee on this patent?
Toshiba Kk
What technology area does this patent fall under?
Primary CPC classification B25J9/163. Mapped technology areas include Operations & Transport.
When was this patent published?
Publication date: Jan 31, 2023 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).