Inter-object relation recognition apparatus, learned model, recognition method and non-transitory computer readable medium

US10762329B2 · US · B2

Patent metadata
Field: Value
Publication number: US-10762329-B2
Application number: US-201816209116-A
Country: US
Kind code: B2
Filing date: Dec 4, 2018
Priority date: Dec 6, 2017
Publication date: Sep 1, 2020
Grant date: Sep 1, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An inter-object relation recognition apparatus includes: a first learning device receiving an image, and outputting a first feature amount of the image; a second learning device receiving the first feature amount, outputting a second feature amount, and having a plurality of storage units holding internal states for predetermined steps; a triplet unit having a plurality of triplet-units, the triplet-units receiving the second feature amount, recognizing first to third elements, and constituted of first to third recognition units outputting probability information. The triplet unit selects at least one combination of the first to third elements, based on the probability information of the first to third elements output from the first to third recognition units of each of the triplet units, from the combinations of the first to third elements output from each of the triplet units, and recognizes and outputs the combination as a relation between objects.
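The selection step in the abstract can be illustrated with a minimal sketch. The abstract only says that combinations are selected "based on the probability information"; the scoring rule used here (product of the three top probabilities, filtered by a threshold) is an assumption made for illustration, and the names `select_triplets`, `threshold`, and `example_outputs` are hypothetical, not taken from the patent.

```python
from math import prod

def argmax(dist):
    """Return (label, probability) for the most likely entry of a label->prob dict."""
    label = max(dist, key=dist.get)
    return label, dist[label]

def select_triplets(unit_outputs, threshold=0.2):
    """Pick (first, second, third) combinations from per-unit probability outputs.

    unit_outputs: one item per triplet-unit, each a tuple of three
    label->probability dicts (first, second, third element).
    Assumption: a combination is scored by the product of its three top
    probabilities and kept when the score reaches `threshold`.
    """
    selected = []
    for dists in unit_outputs:
        picks = [argmax(d) for d in dists]
        labels = tuple(label for label, _ in picks)
        score = prod(p for _, p in picks)
        if score >= threshold:
            selected.append((labels, score))
    # Highest-scoring combinations first.
    selected.sort(key=lambda item: item[1], reverse=True)
    return selected

# Hypothetical outputs of two triplet-units.
example_outputs = [
    ({"man": 0.9, "dog": 0.1}, {"rides": 0.8, "near": 0.2}, {"horse": 0.7, "bike": 0.3}),
    ({"man": 0.5, "dog": 0.5}, {"rides": 0.4, "near": 0.6}, {"horse": 0.5, "bike": 0.5}),
]

for labels, score in select_triplets(example_outputs):
    print(labels, round(score, 3))
```

With these made-up numbers only the first triplet-unit's combination passes the threshold; a real system could use a different scoring or selection rule.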

First claim

Opening claim text (preview).

What is claimed is:

1. An inter-object relation recognition apparatus configured to perform learning based on learning data associated with an image, and a plurality of relations between objects included in the image, and to recognize and output the relations between objects included in the image using a result of the learning, the apparatus comprising: a first learning device configured to receive the image and output a first feature amount indicating a feature of the image; a second learning device configured to receive the first feature amount output from the first learning device and output a second feature amount of a lower dimension number than the first feature amount, and the second learning device including a plurality of storage units holding internal states for predetermined steps; and a triplet unit including a plurality of triplet-units, the triplet-units connected to each storage unit of the second learning device, receiving the second feature amount output from each of the storage units, constituted of first, second, and third recognition units, and respectively outputting combinations of the first, second, and third elements, the first, second, and third recognition units respectively recognizing the first, second, and third elements based on the received second feature amount, outputting probability information of the first, second, and third elements, wherein the triplet unit selects at least one combination of the first, second, and third elements, based on the probability information of the first, second, and third elements output from the first, second, and third recognition units of each of the triplet-units, from the combinations of the first, second, and third elements output from each of the triplet-units, and recognizes and outputs the selected combination of the first, second, and third elements as the relation between objects included in the image.

2. The inter-object relation recognition apparatus according to claim 1, wherein the second learning device is a recurrent neural network, and the second learning device and the triplet unit perform the learning by optimizing parameters of a predetermined function, based on learning data associating the image and the relations between objects and hold the optimized parameter as the result of the learning.

3. The inter-object relation recognition apparatus according to claim 1, wherein the first and third recognition units respectively output the probability information of the first and third elements to the second recognition unit, and the second recognition unit recognizes the second element, based on the first and third elements output from the first and third recognition units, and the second feature amount output from the corresponding storage unit of the second learning device, and outputs the probability information of the second element.

4. The inter-object relation recognition apparatus according to claim 1, wherein the storage unit of the second learning device is an LSTM (Long Short-Term Memory).

5. The inter-object relation recognition apparatus according to claim 1, wherein the first learning device is configured as a convolution type neural network.

6. A learned model for making a computer function comprising: a first learning device configured to receive an image, output a first feature amount indicating a feature of the image; a second learning device configured to receive the first feature amount output from the first learning device, output a second feature amount of a lower dimension number than the first feature amount, and have a plurality of storage units holding internal states for predetermined steps; and a triplet unit including a plurality of triplet-units, the triplet-units connected to each storage unit of the second learning device, receiving the second feature amount output from each of the storage units, constituted of first, second, and third recognition units, and outputting combinations of the first, second, and third elements, the first, second, and third recognition units respectively recognizing the first, second, and third elements based on the received second feature amount, outputting probability information of the first, second, and third elements, wherein, the triplet unit is configured to select at least one combination of the first, second, and third elements, based on the probability information of the first, second, and third elements output from the first, second, and third recognition units of each of the triplet-units, from the combinations of the first, second, and third elements output from each of the triplet-units, and recognize the selected combination of the first, second, and third elements as the relation between objects included in the image, a weighting coefficient of the first learning device, the second learning device, and the triplet unit is learned, based on learning data associating the image and the relations between objects included in the image, and when an image of a recognition target is input, it is intended that the first learning device, the second learning device, and the triplet unit perform calculations based on the learned weighting coefficients, and each relation between objects included in the image of the recognition target is recognized.

7. A recognition method of an inter-object relation recognition apparatus comprising: a first learning device configured to receive an image, output a first feature amount indicating a feature of the image; a second learning device configured to receive the first feature amount output from the first learning device, output a second feature amount of a lower dimension number than the first feature amount, and have a plurality of storage units holding internal states for predetermined steps; and a triplet unit including a plurality of triplet-units, the triplet-units connected to each storage unit of the second learning device, receiving the second feature amount output from each of the storage units, constituted of first, second, and third recognition units, and respectively outputting combinations of the first, second, and third elements, the first, second, and third recognition units respectively recognizing the first, second, and third elements based on the received second feature amount, outputting probability information of the first, second, and third elements, wherein the triplet unit selects at least one combination of the first, second, and third elements, based on the probability information of the first, second, and third elements output from the first, second, and third recognition units of each of the triplet-units, from the combinations of the first, second, and third elements output from each of the triplet-units, and recognizes and outputs the selected combination of the first, second, and third elements as the relation between objects included in the image.

8. A non-transitory computer readable medium storing a program of an inter-object relation recognition apparatus comprising: a first learning device configured to receive an image, output a first feature amount indicating a feature of the image; a second learning device configured to receive the first feature amount output from the first learning device, output a second feature amount of a lower dimension number than the first feature amount, and have a plurality of storage units holding internal states for predetermined steps; and a triplet unit including a plurality of triplet-units, the triplet-units connected to each storage unit of the second learning device, receiving the second feature amount output from each of the storage units, constituted of first, second, and third recognition units, and respectively outputting combinations of the first, second, and third elements, the first
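The dependency described in claim 3, where the second recognition unit receives the probability information of the first and third elements together with the second feature amount, can be sketched as follows. This is an illustrative assumption, not the patented implementation: the claims do not specify how the inputs are combined, so the sketch simply concatenates them and applies a linear layer with softmax. The names `recognize_triplet`, `example_params`, and all weight values are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def linear(weights, bias, x):
    """Dense layer: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]

def recognize_triplet(feature, params):
    """Return probability lists for the first, second, and third elements.

    Assumption: the second recognizer sees the feature vector concatenated
    with the first and third probability vectors.
    """
    p_first = softmax(linear(*params["first"], feature))
    p_third = softmax(linear(*params["third"], feature))
    joint = feature + p_first + p_third  # list concatenation
    p_second = softmax(linear(*params["second"], joint))
    return p_first, p_second, p_third

# Hypothetical parameters: 2-dimensional feature, 2 classes per element.
example_params = {
    "first": ([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]),
    "third": ([[0.5, -0.5], [-0.5, 0.5]], [0.1, -0.1]),
    "second": ([[0.2, 0.1, 1.0, 0.0, 0.0, 1.0],
                [0.1, 0.2, 0.0, 1.0, 1.0, 0.0]], [0.0, 0.0]),
}

p_first, p_second, p_third = recognize_triplet([0.5, -1.0], example_params)
print(p_first, p_second, p_third)
```

In the apparatus of claim 1 the feature passed to each triplet-unit would come from one storage unit of the second learning device; here it is a fixed list for brevity.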

Assignees

Inventors

Classifications

  • G06N3/008 (Primary)

    based on physical entities controlled by simulated intelligence so as to replicate intelligent life forms, e.g. based on robots replicating pets or humans in their appearance or behaviour · CPC title

  • Labelling scene content, e.g. deriving syntactic or semantic representations · CPC title

  • using neural networks · CPC title

  • using classification, e.g. of video objects · CPC title

  • Three-dimensional [3D] objects · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10762329B2 cover?
An inter-object relation recognition apparatus includes: a first learning device receiving an image, and outputting a first feature amount of the image; a second learning device receiving the first feature amount, outputting a second feature amount, and having a plurality of storage units holding internal states for predetermined steps; a triplet unit having a plurality of triplet-units, the tr…
Who is the assignee on this patent?
Nakayama Hideki, Masui Kento, Yoshizawa Shintaro, and 3 more
What technology area does this patent fall under?
Primary CPC classification G06N3/008. Mapped technology areas include Physics.
When was this patent published?
Publication date Sep 1, 2020 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).