What technology area does this patent fall under?

Primary CPC classification G06N20/20. Mapped technology areas include Physics.

When was this patent published?

Publication date: April 6, 2021 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 5 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Object recognition and description using multimodal recurrent neural network

US10970603B2 · US · B2

Patent metadata
Field	Value
Publication number	US-10970603-B2
Application number	US-201816205768-A
Country	US
Kind code	B2
Filing date	Nov 30, 2018
Priority date	Nov 30, 2018
Publication date	Apr 6, 2021
Grant date	Apr 6, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An embodiment of the invention may include a method, computer program product and computer system for image identification and classification. The method, computer program product and computer system may include a computing device which may receive one or more images of a first object from at least two angles and linguistic data associated with the first object. The computing device may input the one or more images of the first object into one or more first neural networks and the linguistic data of the first object into one or more second neural networks. The computing device may combine the output of the one or more first neural networks and the one or more second neural networks and generate an identification model based on the combined output of the one or more first neural networks and the one or more second neural networks.
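The combining step in the abstract can be illustrated with a small sketch. The abstract does not say how the two network outputs are combined, so the mean pooling across viewing angles and the concatenation used here are assumptions, and the feature vectors are made-up stand-ins for real encoder outputs. All function and variable names are hypothetical.

```python
from typing import List

Vector = List[float]


def mean_pool(views: List[Vector]) -> Vector:
    """Average per-view image feature vectors (one vector per viewing angle)."""
    count = len(views)
    width = len(views[0])
    return [sum(view[i] for view in views) / count for i in range(width)]


def fuse(image_features: Vector, text_features: Vector) -> Vector:
    """Combine the two encoder outputs by concatenation (an assumed choice)."""
    return image_features + text_features


# Hypothetical outputs of an image network for two angles of the same object.
front_view = [0.2, 0.8, 0.0]
side_view = [0.4, 0.6, 0.2]

# Hypothetical output of a text network for the object's description.
text_state = [1.0, -1.0]

pooled = mean_pool([front_view, side_view])
fused = fuse(pooled, text_state)
print(len(fused))  # 3 image dimensions + 2 text dimensions
```

Concatenation keeps both feature sets intact and leaves any weighting to a later layer. Element-wise addition or a learned projection would be equally plausible readings of "combine".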

First claim

Opening claim text (preview).

What is claimed is:

1. A method for image identification and classification, the method comprising: receiving, by a computer device, one or more images of a first object from at least two angles; receiving, by the computing device, linguistic data associated with the first object, wherein the linguistic data of the first object describes the artist, art medium, age, color, symbol, pattern, function, and motif of the first object; inputting, by the computing device, the one or more images of the first object into one or more first neural networks; inputting, by the computing device, the linguistic data of the first object into one or more second neural networks; combining, by the computing device, an output of the one or more first neural networks and the one or more second neural networks; generating, by the computing device, an identification model based on the combined output of the one or more first neural networks and the one or more second neural networks, wherein the identification model generates a linguistic description for an unknown object; receiving, by the computer device, at least one image of an unknown second object, wherein the second object is the unknown object (multiple images from different angles); inputting, by the computer device, the at least one image of the unknown second object into the identification model to generate a linguistic description of the unknown second object; analyzing, by the computer device, the at least one image of the unknown second object to identify different features of the unknown second object; generating, by the computer device, a novel linguistic description identifying the unknown second object based on the identified different feature of the unknown second object, wherein the linguistic description includes a novel description of the unknown second object describing the unknown second object and the identified features of the unknown second object, wherein the generated linguistic description is based on a probability distribution of generating a word given previous linguistic data on the second neural networks and the one or more images on the first neural networks; and displaying, by the computer device, the novel linguistic description identifying the unknown second object to a user.

2. A method as in claim 1, wherein the first object is a known piece of art.

3. A method as in claim 1, wherein the one or more first neural networks are deep convolutional neural networks.

4. A method as in claim 1, wherein the one or more second neural networks are deep recurrent neural networks.

5. A method as in claim 1, wherein the first object and the second object may comprise one of the group consisting of: a painting, a mural, graffiti, a drawing, a photograph, a tapestry, a stained glass, a glasswork piece, a metalwork piece, a sculpture, a pottery piece, a porcelain piece, a ceramic piece, jewelry, clothing, furniture, and architecture.

6. A computer program product for image identification and classification, the computer program product comprising: a computer-readable storage medium having program instructions embodied therewith, wherein the computer readable storage medium is not a transitory signal per se, the program instructions comprising: program instructions to receive, by a computer device, one or more images of a first object from at least two angles; program instructions to receive, by the computing device, linguistic data associated with the first object, wherein the linguistic data of the first object describes the artist, art medium, age, color, symbol, pattern, function, and motif of the first object: program instructions to input, by the computing device, the one or more images of the first object into one or more first neural networks; program instructions to input, by the computing device, the linguistic data of the first object into one or more second neural networks; program instructions to combine, by the computing device, an output of the one or more first neural networks and the one or more second neural networks; program instructions to generate, by the computing device, an identification model based on the combined output of the one or more first neural networks and the one or more second neural networks, wherein the identification model generates a linguistic description for an unknown object; program instructions to receive, by the computer device, at least one image of an unknown second object, wherein the second object is the unknown object (multiple images from different angles); program instructions to input, by the computer device, the at least one image of the unknown second object into the identification model to generate a linguistic description of the unknown second object; program instructions to analyze, by the computer device, the at least one image of the unknown second object to identify different features of the unknown second object; program instructions to generate, by the computer device, a novel linguistic description identifying the unknown second object based on the identified different feature of the unknown second object, wherein the linguistic description includes a novel description of the unknown second object describing the unknown second object and the identified features of the unknown second object, wherein the generated linguistic description is based on a probability distribution of generating a word given previous linguistic data on the second neural networks and the one or more images on the first neural networks; and program instructions to display by the computer device the novel linguistic description identifying the unknown second object to a user.

7. A computer program product as in claim 6, wherein the first object is a known piece of art.

8. A computer program product as in claim 6, wherein the one or more first neural networks are deep convolutional neural networks.

9. A computer program product as in claim 6, wherein the one or more second neural networks are deep recurrent neural networks.

10. A computer system for image identification and classification, the system comprising: one or more computer processors, one or more computer-readable storage media, and program instructions stored on one or more of the computer-readable storage media for execution by at least one of the one or more processors, the program instructions comprising: program instructions to receive, by a computer device, one or more images of a first object from at least two angles; program instructions to receive, by the computing device, linguistic data associated with the first object; program instructions to input, by the computing device, the one or more images of the first object into one or more first neural networks; program instructions to input, by the computing device, the linguistic data of the first object into one or more second neural networks, wherein the linguistic data of the first object describes the artist, art medium, age, color, symbol, pattern, function, and motif of the first object; program instructions to combine, by the computing device, an output of the one or more first neural networks and the one or more second neural networks; program instructions to generate, by the computing device, an identification model based on the combined output of the one or more first neural networks and the one or more second neural networks, wherein the identification model generates a linguistic description for an unknown object; program instructions to receive, by the computer device, at least one image of an unknown second object wherein the second object is the unknown object (multiple images from different angles); program instructions to input, by the computer device, the at leas
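The claims describe generating a description word by word from "a probability distribution of generating a word given previous linguistic data ... and the one or more images". A minimal sketch of that idea follows, under loud assumptions: the vocabulary and all scores are invented, the language side looks only at the single previous word (a simplification standing in for a recurrent network), the two score sources are combined by addition, and decoding is greedy. None of these choices come from the patent text.

```python
import math
from typing import Dict, List

VOCAB = ["<end>", "bronze", "sculpture", "painting"]

# Hypothetical image-side scores: how strongly the image supports each word.
IMAGE_SCORES: Dict[str, float] = {
    "<end>": 0.0, "bronze": 2.0, "sculpture": 1.5, "painting": -1.0,
}

# Hypothetical text-side scores: score of each next word given the previous one.
TEXT_SCORES: Dict[str, Dict[str, float]] = {
    "<start>": {"<end>": -2.0, "bronze": 1.0, "sculpture": 0.0, "painting": 0.0},
    "bronze": {"<end>": 0.0, "bronze": -3.0, "sculpture": 2.0, "painting": 0.0},
    "sculpture": {"<end>": 3.0, "bronze": -3.0, "sculpture": -3.0, "painting": -1.0},
    "painting": {"<end>": 3.0, "bronze": -3.0, "sculpture": -1.0, "painting": -3.0},
}


def softmax(scores: List[float]) -> List[float]:
    """Turn raw scores into probabilities that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def next_word_distribution(prev: str, image_scores: Dict[str, float]) -> Dict[str, float]:
    """P(next word | previous word, image), with additive score fusion (assumed)."""
    logits = [TEXT_SCORES[prev][w] + image_scores[w] for w in VOCAB]
    return dict(zip(VOCAB, softmax(logits)))


def describe(image_scores: Dict[str, float], max_len: int = 5) -> List[str]:
    """Greedy decoding: pick the most probable word until <end> or max_len."""
    words: List[str] = []
    prev = "<start>"
    for _ in range(max_len):
        dist = next_word_distribution(prev, image_scores)
        word = max(dist, key=dist.get)
        if word == "<end>":
            break
        words.append(word)
        prev = word
    return words


print(describe(IMAGE_SCORES))  # ['bronze', 'sculpture']
```

Sampling from the distribution instead of taking the most probable word would give more varied descriptions at the cost of repeatability.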

Assignees

IBM

Inventors

Classifications

G06N20/20 · Primary
Ensemble learning · CPC title
G06V20/10
Terrestrial scenes (scenes under surveillance with static cameras G06V20/52; scenes perceived from the exterior of a vehicle G06V20/56; scenes perceived from the interior of a vehicle G06V20/59) · CPC title
G06V10/82
using neural networks · CPC title
G06V10/764
using classification, e.g. of video objects · CPC title
G06F18/25
Fusion techniques · CPC title

Patent family

Related publications grouped by family.

View patent family 70850108

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10970603B2 cover?: An embodiment of the invention may include a method, computer program product and computer system for image identification and classification. The method, computer program product and computer system may include a computing device which may receive one or more images of a first object from at least two angles linguistic data associated with the first object. The computing device may input the o…
Who is the assignee on this patent?: IBM

Related patents

Font recognition using triplet loss neural network training

Systems and methods for fast novel visual concept learning from sentence descriptions of images

Neural network combined image and text evaluator and classifier

Intelligent image captioning

Method and system for joint training of hybrid neural networks for acoustic modeling in automatic speech recognition