What technology area does this patent fall under?

Primary CPC classification G06V10/50. Mapped technology areas include Physics.

When was this patent published?

Publication date Jul 16, 2024 (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 10 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Identifying a type of object in a digital image based on overlapping areas of sub-images

US12039769B2 · US · B2

Patent metadata
Field	Value
Publication number	US-12039769-B2
Application number	US-202117492485-A
Country	US
Kind code	B2
Filing date	Oct 1, 2021
Priority date	Sep 26, 2018
Publication date	Jul 16, 2024
Grant date	Jul 16, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method identifies a type of object in a digital image. A user and/or one or more processors selects, from a plurality of partially overlapping sub-images of a digital image, a first sub-image and a second sub-image that overlap one another. The user/processors input the first sub-image into a neural network to create a first inference result that includes an overlapping inference result, for the overlapping area, that recognizes a partial portion of a specific type of object based on the overlapping area. The user/processors infer that the second sub-image creates a second inference result that recognizes a second portion of the specific type of object in the second sub-image based on the second sub-image and the overlapping inference result. The neural network identifies the specific type of object in the digital image based on the first and second sub-images being sub-images of a same type of object.
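The reuse of an overlapping inference result described in the abstract can be illustrated with a minimal Python sketch. Everything named here is hypothetical: `tile_spans`, `CachedColumnInference`, and `bright_column` are illustration-only helpers, and the per-column scoring function is a toy stand-in for the neural network. The abstract does not say how results for the overlapping area are stored or keyed, so the column-indexed cache below is an assumption, and a real convolutional network has receptive fields that cross column boundaries.

```python
from typing import Callable, Dict, List, Tuple

Image = List[List[int]]  # rows of pixel intensities


def tile_spans(width: int, tile_w: int, stride: int) -> List[Tuple[int, int]]:
    """Return [start, end) column spans of partially overlapping tiles."""
    if not 0 < stride < tile_w <= width:
        raise ValueError("need 0 < stride < tile_w <= width for overlap")
    starts = list(range(0, width - tile_w + 1, stride))
    if starts[-1] + tile_w < width:  # make sure the right edge is covered
        starts.append(width - tile_w)
    return [(s, s + tile_w) for s in starts]


class CachedColumnInference:
    """Toy stand-in for the network: scores one pixel column at a time."""

    def __init__(self, image: Image, score: Callable[[List[int]], bool]):
        self.image = image
        self.score = score
        self.cache: Dict[int, bool] = {}  # column index -> cached result
        self.evaluations = 0              # how often `score` actually ran

    def infer_tile(self, span: Tuple[int, int]) -> List[bool]:
        start, end = span
        result = []
        for col in range(start, end):
            if col not in self.cache:  # only compute columns not seen before
                column = [row[col] for row in self.image]
                self.cache[col] = self.score(column)
                self.evaluations += 1
            result.append(self.cache[col])
        return result


def bright_column(column: List[int]) -> bool:
    """Hypothetical detector: a column is 'object' if its mean is bright."""
    return sum(column) / len(column) > 128


# An 8-column image with a bright object in columns 2 to 4.
image = [[0, 0, 200, 220, 210, 0, 0, 0] for _ in range(4)]
spans = tile_spans(width=8, tile_w=5, stride=3)  # two tiles sharing columns 3 and 4

engine = CachedColumnInference(image, bright_column)
first = engine.infer_tile(spans[0])   # computes columns 0 to 4
second = engine.infer_tile(spans[1])  # reuses columns 3 and 4 from the cache

# Both tiles report part of the object, so they are treated as the same object.
same_object = any(first) and any(second)
print(spans, engine.evaluations, same_object)
```

In this toy run the second tile triggers only three new evaluations because two of its five columns were already scored for the first tile, giving 8 evaluations in total instead of 10.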

First claim

Opening claim text (preview).

What is claimed is:
1. A method comprising: selecting by cloud-based computers, from a plurality of partially overlapping sub-images of a digital image, a first sub-image and a second sub-image that overlap one another in an overlapping area; inputting by a digital camera the first sub-image into a convolutional neural network in order to create a first inference result that comprises an overlapping inference result for the overlapping area that recognizes a partial portion of a specific type of object based on the overlapping area; caching the overlapping inference result; using the cached overlapping inference result to infer by the cloud-based computers that the second sub-image creates a second inference result that recognizes a second portion of the specific type of object in the second sub-image; and identifying by the digital camera, by the convolutional neural network, the specific type of object in the digital image based on recognizing the first and second sub-images as being sub-images of a same type of object, wherein: the digital camera captures the digital image of the specific type of object, and the convolutional neural network is a component of the digital camera.
2. The method of claim 1, wherein the convolutional neural network has been trained to recognize the specific type of object.
3. The method of claim 1, wherein the first inference result describes a first portion of the specific type of object based on the first sub-image.
4. The method of claim 1, wherein the digital image is a graph of electronic transmissions of speech, wherein the graph has a time axis, wherein the graph has a frequency axis that is visually coded to create a visually coded graph that indicates an intensity of signals in the electronic transmissions at each time and frequency on the graph, and wherein the method further comprises: sliding, by the convolutional neural network, a window over the visually coded graph in order to perform speech recognition of the speech in the electronic transmissions by the convolutional neural network.
5. The method of claim 1, wherein the digital image is a full resolution image of the specific type of object.
6. The method of claim 1, wherein the digital image is a graph of a stream of sound.
7. The method of claim 1, wherein the digital image is a graph of electronic signal transmissions for a specific sound, and wherein the specific type of object is identified based on inferring that the first sub-image and the second sub-image are parts of the specific sound.
8. A computer program product comprising a computer readable storage medium having program code embodied therewith, wherein the computer readable storage medium is not a transitory signal per se, and wherein the program code is readable and executable by a processor to perform a method comprising: selecting by cloud-based computers, from a plurality of partially overlapping sub-images of a digital image, a first sub-image and a second sub-image that overlap one another in an overlapping area; inputting by a digital camera the first sub-image into a convolutional neural network in order to create a first inference result that comprises an overlapping inference result for the overlapping area that recognizes a partial portion of a specific type of object based on the overlapping area; caching the overlapping inference result; using the cached overlapping inference result to infer by the cloud-based computers that the second sub-image creates a second inference result that recognizes a second portion of the specific type of object in the second sub-image; and directing the convolutional neural network to identify by the digital camera the specific type of object in the digital image based on recognizing the first and second sub-images as being sub-images of a same type of object, wherein: the digital camera captures the digital image of the specific type of object, and the convolutional neural network is a component of the digital camera.
9. The computer program product of claim 8, wherein the convolutional neural network has been trained to recognize the specific type of object.
10. The computer program product of claim 8, wherein the first inference result describes a first portion of the specific type of object based on the first sub-image.
11. The computer program product of claim 8, wherein the digital image is a graph of electronic transmissions of speech, wherein the graph has a time axis, wherein the graph has a frequency axis that is visually coded to create a visually coded graph that indicates an intensity of signals in the electronic transmissions at each time and frequency on the graph, and wherein the method further comprises: sliding, by the convolutional neural network, a window over the visually coded graph in order to perform speech recognition of the speech in the electronic transmissions by the convolutional neural network.
12. The computer program product of claim 8, wherein the digital image is a full resolution image of the specific type of object.
13. The computer program product of claim 8, wherein the digital image is a graph of a stream of sound.
14. The computer program product of claim 8, wherein the digital image is a graph of electronic signal transmissions for a specific sound, and wherein the specific type of object is identified based on inferring that the first sub-image and the second sub-image are parts of the specific sound.
15. The computer program product of claim 8, wherein the program code is provided as a service in a cloud environment.
16. A computer system comprising one or more processors, one or more computer readable memories, and one or more computer readable non-transitory storage mediums, and program instructions stored on at least one of the one or more computer readable non-transitory storage mediums for execution by at least one of the one or more processors via at least one of the one or more computer readable memories, the stored program instructions executed to perform a method comprising: selecting by cloud-based computers, from a plurality of partially overlapping sub-images of a digital image, a first sub-image and a second sub-image that overlap one another in an overlapping area; inputting by a digital camera the first sub-image into a convolutional neural network in order to create a first inference result that comprises an overlapping inference result for the overlapping area that recognizes a partial portion of a specific type of object based on the overlapping area; caching the overlapping inference result; using the cached overlapping inference result to infer by the cloud-based computers that the second sub-image creates a second inference result that recognizes a second portion of the specific type of object in the second sub-image; and directing the convolutional neural network to identify by the digital camera the specific type of object in the digital image based on recognizing the first and second sub-images as being sub-images of a same type of object, wherein: the digital camera captures the digital image of the specific type of object, and the convolutional neural network is a component of the digital camera.
17. The computer system of claim 16, wherein the convolutional neural network has been trained to recognize the specific type of object.
18. The computer system of claim 16, wherein the program code is provided as a service in a cloud environment.
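The speech-related dependent claims describe a graph with a time axis and a frequency axis whose coding indicates signal intensity, over which a window is slid. A rough Python sketch of that idea follows. It is an illustration under stated assumptions, not the claimed method: the claims do not say how the graph is computed, so the naive discrete Fourier transform, the frame and hop sizes, and the helper names `dft_magnitudes`, `spectrogram`, and `sliding_windows` are all hypothetical, and no recognition network is included.

```python
import cmath
import math
from typing import Iterator, List, Tuple

Spectrogram = List[List[float]]  # spec[time_frame][frequency_bin] = intensity


def dft_magnitudes(frame: List[float]) -> List[float]:
    """Naive DFT magnitudes for the non-negative frequency bins of one frame."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(frame)))
        for k in range(n // 2)
    ]


def spectrogram(signal: List[float], frame_len: int, hop: int) -> Spectrogram:
    """Intensity at each (time, frequency) cell: the 'graph' in this sketch."""
    return [
        dft_magnitudes(signal[start:start + frame_len])
        for start in range(0, len(signal) - frame_len + 1, hop)
    ]


def sliding_windows(spec: Spectrogram, width: int,
                    step: int) -> Iterator[Tuple[int, Spectrogram]]:
    """Yield (start_frame, sub_image) pairs; step < width makes them overlap."""
    for start in range(0, len(spec) - width + 1, step):
        yield start, spec[start:start + width]


# Assumed test signal: an 8 Hz tone sampled at 64 Hz for 4 seconds.
sample_rate = 64
signal = [math.sin(2 * math.pi * 8 * t / sample_rate) for t in range(256)]

spec = spectrogram(signal, frame_len=32, hop=16)

# Strongest frequency bin in each time frame.
peak_bins = [max(range(len(frame)), key=lambda k: frame[k]) for frame in spec]

# Overlapping sub-images of the graph along the time axis.
windows = list(sliding_windows(spec, width=4, step=2))
window_starts = [start for start, _ in windows]
print(len(spec), set(peak_bins), window_starts)
```

Each window here is a sub-image of the spectrogram that shares two time frames with its neighbour, which is the kind of overlap the independent claims rely on; a recognizer would be applied to each window in turn.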

Assignees

IBM

Inventors

Classifications

G06V10/82
using neural networks · CPC title
G06V30/19173
Classification techniques · CPC title
G06V20/41
Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items (segmenting video sequences G06V20/49) · CPC title
G06V10/50 · Primary
by performing operations within image blocks; by using histograms, e.g. histogram of oriented gradients [HoG]; by summing image-intensity values; Projection analysis · CPC title

Patent family

Related publications grouped by family.

View patent family 69883240

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12039769B2 cover?: A method identifies a type of object in a digital image. A user and/or one or more processors selects, from a plurality of partially overlapping sub-images of a digital image, a first sub-image and a second sub-image that overlap one another. The user/processors input the first sub-image into a neural network to create a first inference result that includes an overlapping inference result, for …
Who is the assignee on this patent?: IBM