What technology area does this patent fall under?

The primary CPC classification is G06V10/426 (graphical representations), which maps to the Physics technology area.

When was this patent published?

Published on Jun 24, 2021 (kind code A1). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Visual relationship detection method and system based on adaptive clustering learning

US2021192274A1 · US · A1

Patent metadata
Field	Value
Publication number	US-2021192274-A1
Application number	US-202017007213-A
Country	US
Kind code	A1
Filing date	Aug 31, 2020
Priority date	Dec 23, 2019
Publication date	Jun 24, 2021
Grant date	—

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The present disclosure discloses a visual relationship detection method based on adaptive clustering learning, including: detecting visual objects from an input image and recognizing the visual objects to obtain context representation; embedding the context representation of pair-wise visual objects into a low-dimensional joint subspace to obtain a visual relationship sharing representation; embedding the context representation into a plurality of low-dimensional clustering subspaces, respectively, to obtain a plurality of preliminary visual relationship enhancing representation; and then performing regularization by clustering-driven attention mechanism; fusing the visual relationship sharing representations and regularized visual relationship enhancing representations with a prior distribution over the category label of visual relationship predicate, to predict visual relationship predicates by synthetic relational reasoning. The method is capable of fine-grained recognizing visual relationships of different subclasses by mining latent relationships in-between, which improves the accuracy of visual relationship detection.
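The "prior distribution over the category label of visual relationship predicate" in the abstract is described in claim 2 as an empirical distribution computed from training samples. The sketch below is one possible reading of that step, not the patent's implementation: it assumes the prior is the relative frequency of each predicate for a given (subject label, object label) pair, and that unseen pairs fall back to a uniform distribution. The function name `build_relationship_prior` and the toy training triples are hypothetical.

```python
from collections import Counter, defaultdict

def build_relationship_prior(triples, predicates):
    """Estimate P(predicate | subject label, object label) by counting.

    `triples` is an iterable of (subject_label, predicate, object_label).
    Returns a function prior(subject_label, object_label) that yields a list
    of probabilities in the same order as `predicates`.
    """
    counts = defaultdict(Counter)
    for subj, pred, obj in triples:
        counts[(subj, obj)][pred] += 1

    def prior(subj, obj):
        seen = counts.get((subj, obj))
        if not seen:
            # Assumption: a label pair never seen in training gets a uniform prior.
            return [1.0 / len(predicates)] * len(predicates)
        total = sum(seen.values())
        return [seen[p] / total for p in predicates]

    return prior

# Toy training set, invented for illustration only.
train = [
    ("person", "riding", "horse"),
    ("person", "riding", "horse"),
    ("person", "next to", "horse"),
    ("cup", "on", "table"),
]
predicates = ["on", "riding", "next to"]
prior = build_relationship_prior(train, predicates)
print(prior("person", "horse"))  # two of the three person/horse samples are "riding"
```

Counting per label pair keeps the prior a simple lookup at inference time, which matches the claims' description of feeding the predicted subject and object labels into a prior function.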

First claim

Opening claim text (preview).

1. A visual relationship detection method based on adaptive clustering learning, comprising, executed by a processor, the following steps: detecting visual objects from an input image and recognizing the visual objects by contextual message passing mechanism to obtain context representations of the visual objects; embedding the context representations of pair-wise visual objects into a low-dimensional joint subspace to obtain visual relationship sharing representations; embedding the context representations of pair-wise visual objects into a plurality of low-dimensional clustering subspaces, respectively, to obtain a plurality of preliminary visual relationship enhancing representations; and then performing regularization to the preliminary visual relationship enhancing representations by clustering-driven attention mechanisms; and fusing the visual relationship sharing representations, the regularized visual relationship enhancing representations and a prior distribution over the category labels of visual relationship predicates, to predict visual relationship predicates by synthetic relational reasoning.

2. The visual relationship detection method based on adaptive clustering learning according to claim 1, wherein the method further comprises: calculating empirical distribution of the visual relationships from training set samples of a visual relationship data set to obtain a visual relationship prior function.

3. The visual relationship detection method based on adaptive clustering learning according to claim 1, wherein the method further comprises: constructing an initialized visual relationship detection model, and training the model by the training data of the visual relationship data set.

4. The visual relationship detection method based on adaptive clustering learning according to claim 1, wherein the step of obtaining the visual relationship sharing representations is specifically: obtaining a first product of a joint subject mapping matrix and the context representations of the visual object of the subject, obtaining a second product of a joint object mapping matrix and the context representations of the visual object of the object; subtracting the second product from the first product, and dot-multiplying the difference value and convolutional features of a visual relationship candidate region; wherein, the joint subject mapping matrix and the joint object mapping matrix are mapping matrices that map the visual objects context representations to a joint subspace; and the visual relationship candidate region is the minimum rectangle box that can fully cover the corresponding visual object candidate regions of the subject and object; the convolutional features are extracted from the visual relationship candidate region by any convolutional neural network.

5. The visual relationship detection method based on adaptive clustering learning according to claim 4, wherein the step of obtaining a plurality of preliminary visual relationship enhancing representation is specifically: obtaining a third product of a k-th clustering subject mapping matrix and the context representation of the visual object of the subject, obtaining a fourth product of a k-th clustering object mapping matrix and the context representation of the visual object of the object; subtracting the fourth product from the third product, and dot-multiplying the difference value and convolutional features of a visual relationship candidate region to obtain a k-th preliminary visual relationship enhancing representation; wherein the k-th clustering subject mapping matrix and the k-th clustering object mapping matrix are mapping matrices that map the visual objects context representation to the k-th clustering subspace.

6. The visual relationship detection method based on adaptive clustering learning according to claim 5, wherein the step of “performing regularization to the preliminary visual relationship enhancing representations of different subspaces by clustering-driven attention mechanisms” is specifically: obtaining attentive scores of the clustering subspaces; obtaining a sixth product of the k-th preliminary visual relationship enhancing representations and the k-th regularized mapping matrix, and performing weighted sum operation to the sixth products of different clustering subspaces by using the attentive scores of the clustering subspace as the clustering weight; wherein, the k-th regularized mapping matrix is the k-th mapping matrix that transforms the preliminary visual relationship enhancing representation.

7. The visual relationship detection method based on adaptive clustering learning according to claim 6, wherein the step of “obtaining attentive scores of the clustering subspaces” is specifically: inputting a predicted category label of visual object of subject and a predicted category label of visual object of object into the visual relationship prior function to obtain a prior distribution over the category label of visual relationship predicate; obtaining a fifth product of the prior distribution over the category label of visual relationship predicate and the k-th attention mapping matrix, and substituting the fifth product into the softmax function for normalization; wherein, the k-th attention mapping matrix is the mapping matrix that transforms the prior distribution over the category label of visual relationship predicate.

8. The visual relationship detection method based on adaptive clustering learning according to claim 6, wherein the step of “fusing the visual relationship sharing representations and the regularized visual relationship enhancing representations with a prior distribution over the category labels of visual relationship predicates, to predict visual relationship predicates by synthetic relational reasoning” is specifically: inputting a predicted category label of visual object of subject and a predicted category label of visual object of object into the visual relationship prior function to obtain a prior distribution over the category label of visual relationship predicate; and obtaining a seventh product of the visual relationship sharing mapping matrix and the visual relationship sharing representations, obtaining an eighth product of the visual relationship enhancing mapping matrix and the regularized visual relationship enhancing representations; summing the seventh product, the eighth product and the prior distribution over the category label of visual relationship predicate, and then substituting the result into the softmax function.

9. A system for a visual relationship detection method based on adaptive clustering learning, the system comprising: a processor configured for: detecting visual objects from an input image and recognizing the visual objects by contextual message passing mechanism to obtain context representations of the visual objects; embedding the context representations of pair-wise visual objects into a low-dimensional joint subspace to obtain visual relationship sharing representations; embedding the context representations of pair-wise visual objects into a plurality of low-dimensional clustering subspaces, respectively, to obtain a plurality of preliminary visual relationship enhancing representations; and then performing regularization to the preliminary visual relationship enhancing representations by clustering-driven attention mechanisms; and fusing the visual relationship sharing representations, the regularized visual relationship enhancing representations and a prior distribution over the category labels of visual relationship predicates, to predict visual relationship predicates by synthetic relational reasoning.

10.
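The arithmetic in claims 4 through 8 can be traced in a few lines of code. The following is a minimal sketch under stated assumptions, not the patented implementation: "dot-multiplying" is read as an element-wise product, each k-th attention mapping matrix is taken to be a single row so that it yields one score per clustering subspace, and the softmax in claim 7 is taken over the K subspaces. All names (`pair_embedding`, `predict_predicate`, the `params` keys), the dimensions, and the random weights are hypothetical.

```python
import math
import random

def matvec(M, v):
    """Multiply matrix M (a list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def softmax(z):
    """Numerically stable softmax over a list of floats."""
    peak = max(z)
    exps = [math.exp(x - peak) for x in z]
    total = sum(exps)
    return [e / total for e in exps]

def pair_embedding(W_subj, W_obj, subj_ctx, obj_ctx, region_feat):
    """(W_subj s - W_obj o), multiplied element-wise by the region features."""
    a = matvec(W_subj, subj_ctx)
    b = matvec(W_obj, obj_ctx)
    return [(x - y) * f for x, y, f in zip(a, b, region_feat)]

def predict_predicate(subj_ctx, obj_ctx, region_feat, prior, p):
    # Claim 4: sharing representation in the joint subspace.
    shared = pair_embedding(p["W_s"], p["W_o"], subj_ctx, obj_ctx, region_feat)
    # Claim 5: one preliminary enhancing representation per clustering subspace.
    prelim = [pair_embedding(Ws, Wo, subj_ctx, obj_ctx, region_feat)
              for Ws, Wo in zip(p["W_s_k"], p["W_o_k"])]
    # Claim 7: attentive scores from the prior, normalized across subspaces.
    scores = softmax([sum(a * q for a, q in zip(a_k, prior)) for a_k in p["A_k"]])
    # Claim 6: attention-weighted sum of the transformed preliminary representations.
    enhanced = [0.0] * len(region_feat)
    for alpha, R, e in zip(scores, p["R_k"], prelim):
        for i, val in enumerate(matvec(R, e)):
            enhanced[i] += alpha * val
    # Claim 8: sum both projections with the prior, then apply softmax.
    logits = [u + v + q for u, v, q in zip(matvec(p["W_share"], shared),
                                           matvec(p["W_enh"], enhanced), prior)]
    return softmax(logits), scores

def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

# Hypothetical sizes: context dim D, subspace dim M, K subspaces, P predicates.
rng = random.Random(0)
D, M, K, P = 4, 3, 2, 3
params = {
    "W_s": rand_matrix(M, D, rng), "W_o": rand_matrix(M, D, rng),
    "W_s_k": [rand_matrix(M, D, rng) for _ in range(K)],
    "W_o_k": [rand_matrix(M, D, rng) for _ in range(K)],
    "A_k": [[rng.uniform(-1, 1) for _ in range(P)] for _ in range(K)],
    "R_k": [rand_matrix(M, M, rng) for _ in range(K)],
    "W_share": rand_matrix(P, M, rng), "W_enh": rand_matrix(P, M, rng),
}
subj_ctx = [rng.uniform(-1, 1) for _ in range(D)]
obj_ctx = [rng.uniform(-1, 1) for _ in range(D)]
region_feat = [rng.uniform(-1, 1) for _ in range(M)]
prior = [0.0, 2 / 3, 1 / 3]  # toy prior over P predicates
probs, scores = predict_predicate(subj_ctx, obj_ctx, region_feat, prior, params)
print(probs, scores)
```

In a trained model the matrices would be learned (claim 3) and the region features would come from a convolutional network over the box enclosing both objects; the random values here only demonstrate the data flow and shapes.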

Assignees

Univ Tianjin

Inventors

Classifications

G06V10/426 · Primary
Graphical representations · CPC title
G06V10/82
using neural networks · CPC title
G06F18/23211
with adaptive number of clusters · CPC title
G06F18/23213
with fixed number of clusters, e.g. K-means clustering · CPC title
G06F18/251
of input or preprocessed data · CPC title

Patent family

Related publications grouped by family.

View patent family 70501453

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2021192274A1 cover? The present disclosure discloses a visual relationship detection method based on adaptive clustering learning, including: detecting visual objects from an input image and recognizing the visual objects to obtain context representation; embedding the context representation of pair-wise visual objects into a low-dimensional joint subspace to obtain a visual relationship sharing representation; em…
Who is the assignee on this patent? Univ Tianjin