Region proposal for image regions that include objects of interest using feature maps from multiple layers of a convolutional neural network model
US-2019073553-A1 · Mar 7, 2019 · US
US12039769B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12039769-B2 |
| Application number | US-202117492485-A |
| Country | US |
| Kind code | B2 |
| Filing date | Oct 1, 2021 |
| Priority date | Sep 26, 2018 |
| Publication date | Jul 16, 2024 |
| Grant date | Jul 16, 2024 |
Official abstract text for this publication.
A method identifies a type of object in a digital image. A user and/or one or more processors selects, from a plurality of partially overlapping sub-images of a digital image, a first sub-image and a second sub-image that overlap one another. The user/processors input the first sub-image into a neural network to create a first inference result that includes an overlapping inference result, for the overlapping area, that recognizes a partial portion of a specific type of object based on the overlapping area. The user/processors infer that the second sub-image creates a second inference result that recognizes a second portion of the specific type of object in the second sub-image based on the second sub-image and the overlapping inference result. The neural network identifies the specific type of object in the digital image based on the first and second sub-images being sub-images of a same type of object.
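The selection of two partially overlapping sub-images can be illustrated with a minimal sketch. It is not taken from the patent: the names `tile_boxes` and `overlap`, the box convention, and the fixed tile size and stride are all assumptions chosen for illustration.

```python
from typing import List, Optional, Tuple

# Hypothetical box convention: (left, top, right, bottom), right/bottom exclusive.
Box = Tuple[int, int, int, int]


def tile_boxes(width: int, height: int, tile: int, stride: int) -> List[Box]:
    """Return tile boxes on a regular grid over a width x height image.

    With stride < tile, neighbouring tiles partially overlap. Any remainder
    at the right or bottom edge that does not fit the grid is ignored here.
    """
    if not 0 < stride <= tile:
        raise ValueError("stride must be in (0, tile]")
    boxes = []
    for top in range(0, max(height - tile, 0) + 1, stride):
        for left in range(0, max(width - tile, 0) + 1, stride):
            boxes.append((left, top, min(left + tile, width), min(top + tile, height)))
    return boxes


def overlap(a: Box, b: Box) -> Optional[Box]:
    """Return the overlapping area of two boxes, or None if they are disjoint."""
    left, top = max(a[0], b[0]), max(a[1], b[1])
    right, bottom = min(a[2], b[2]), min(a[3], b[3])
    if left < right and top < bottom:
        return (left, top, right, bottom)
    return None


# A 6x4 image cut into 4x4 tiles with a stride of 2 gives two tiles
# that share a 2-pixel-wide strip.
boxes = tile_boxes(6, 4, tile=4, stride=2)
first_box, second_box = boxes[0], boxes[1]
shared = overlap(first_box, second_box)
```

Here `shared` is the area whose inference result the abstract describes as being carried from the first sub-image to the second.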
Opening claim text (preview).
What is claimed is:
1. A method comprising: selecting by cloud-based computers, from a plurality of partially overlapping sub-images of a digital image, a first sub-image and a second sub-image that overlap one another in an overlapping area; inputting by a digital camera the first sub-image into a convolutional neural network in order to create a first inference result that comprises an overlapping inference result for the overlapping area that recognizes a partial portion of a specific type of object based on the overlapping area; caching the overlapping inference result; using the cached overlapping inference result to infer by the cloud-based computers that the second sub-image creates a second inference result that recognizes a second portion of the specific type of object in the second sub-image; and identifying by the digital camera, by the convolutional neural network, the specific type of object in the digital image based on recognizing the first and second sub-images as being sub-images of a same type of object, wherein: the digital camera captures the digital image of the specific type of object, and the convolutional neural network is a component of the digital camera.
2. The method of claim 1, wherein the convolutional neural network has been trained to recognize the specific type of object.
3. The method of claim 1, wherein the first inference result describes a first portion of the specific type of object based on the first sub-image.
4. The method of claim 1, wherein the digital image is a graph of electronic transmissions of speech, wherein the graph has a time axis, wherein the graph has a frequency axis that is visually coded to create a visually coded graph that indicates an intensity of signals in the electronic transmissions at each time and frequency on the graph, and wherein the method further comprises: sliding, by the convolutional neural network, a window over the visually coded graph in order to perform speech recognition of the speech in the electronic transmissions by the convolutional neural network.
5. The method of claim 1, wherein the digital image is a full resolution image of the specific type of object.
6. The method of claim 1, wherein the digital image is a graph of a stream of sound.
7. The method of claim 1, wherein the digital image is a graph of electronic signal transmissions for a specific sound, and wherein the specific type of object is identified based on inferring that the first sub-image and the second sub-image are parts of the specific sound.
8. A computer program product comprising a computer readable storage medium having program code embodied therewith, wherein the computer readable storage medium is not a transitory signal per se, and wherein the program code is readable and executable by a processor to perform a method comprising: selecting by cloud-based computers, from a plurality of partially overlapping sub-images of a digital image, a first sub-image and a second sub-image that overlap one another in an overlapping area; inputting by a digital camera the first sub-image into a convolutional neural network in order to create a first inference result that comprises an overlapping inference result for the overlapping area that recognizes a partial portion of a specific type of object based on the overlapping area; caching the overlapping inference result; using the cached overlapping inference result to infer by the cloud-based computers that the second sub-image creates a second inference result that recognizes a second portion of the specific type of object in the second sub-image; and directing the convolutional neural network to identify by the digital camera the specific type of object in the digital image based on recognizing the first and second sub-images as being sub-images of a same type of object, wherein: the digital camera captures the digital image of the specific type of object, and the convolutional neural network is a component of the digital camera.
9. The computer program product of claim 8, wherein the convolutional neural network has been trained to recognize the specific type of object.
10. The computer program product of claim 8, wherein the first inference result describes a first portion of the specific type of object based on the first sub-image.
11. The computer program product of claim 8, wherein the digital image is a graph of electronic transmissions of speech, wherein the graph has a time axis, wherein the graph has a frequency axis that is visually coded to create a visually coded graph that indicates an intensity of signals in the electronic transmissions at each time and frequency on the graph, and wherein the method further comprises: sliding, by the convolutional neural network, a window over the visually coded graph in order to perform speech recognition of the speech in the electronic transmissions by the convolutional neural network.
12. The computer program product of claim 8, wherein the digital image is a full resolution image of the specific type of object.
13. The computer program product of claim 8, wherein the digital image is a graph of a stream of sound.
14. The computer program product of claim 8, wherein the digital image is a graph of electronic signal transmissions for a specific sound, and wherein the specific type of object is identified based on inferring that the first sub-image and the second sub-image are parts of the specific sound.
15. The computer program product of claim 8, wherein the program code is provided as a service in a cloud environment.
16. A computer system comprising one or more processors, one or more computer readable memories, and one or more computer readable non-transitory storage mediums, and program instructions stored on at least one of the one or more computer readable non-transitory storage mediums for execution by at least one of the one or more processors via at least one of the one or more computer readable memories, the stored program instructions executed to perform a method comprising: selecting by cloud-based computers, from a plurality of partially overlapping sub-images of a digital image, a first sub-image and a second sub-image that overlap one another in an overlapping area; inputting by a digital camera the first sub-image into a convolutional neural network in order to create a first inference result that comprises an overlapping inference result for the overlapping area that recognizes a partial portion of a specific type of object based on the overlapping area; caching the overlapping inference result; using the cached overlapping inference result to infer by the cloud-based computers that the second sub-image creates a second inference result that recognizes a second portion of the specific type of object in the second sub-image; and directing the convolutional neural network to identify by the digital camera the specific type of object in the digital image based on recognizing the first and second sub-images as being sub-images of a same type of object, wherein: the digital camera captures the digital image of the specific type of object, and the convolutional neural network is a component of the digital camera.
17. The computer system of claim 16, wherein the convolutional neural network has been trained to recognize the specific type of object.
18. The computer system of claim 16, wherein the program code is provided as a service in a cloud environment.
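The caching step recited in the independent claims can be sketched as follows. This is a toy illustration under stated assumptions, not the patented implementation: `infer_tile` and `toy_score` are hypothetical names, the "inference result" is reduced to one number per image column, and `toy_score` is a stand-in for a convolutional neural network. A real network's output near a tile border also depends on neighbouring pixels, which this sketch ignores.

```python
from typing import Callable, Dict, List, Sequence

Column = Sequence[float]


def infer_tile(
    image_columns: Sequence[Column],
    start: int,
    stop: int,
    score_column: Callable[[Column], float],
    cache: Dict[int, float],
) -> List[float]:
    """Return per-column results for columns [start, stop), reusing cached ones."""
    results = []
    for x in range(start, stop):
        if x not in cache:
            # Only columns outside the cached overlap reach the scorer.
            cache[x] = score_column(image_columns[x])
        results.append(cache[x])
    return results


calls: List[int] = []  # records which columns were actually scored


def toy_score(column: Column) -> float:
    """Stand-in for the network (assumption): mean intensity of one column."""
    calls.append(int(column[0]))
    return sum(column) / len(column)


# Six columns of three rows each; column x holds the value x.
image = [[float(x)] * 3 for x in range(6)]
cache: Dict[int, float] = {}

first = infer_tile(image, 0, 4, toy_score, cache)   # columns 0..3
second = infer_tile(image, 2, 6, toy_score, cache)  # columns 2..5, 2..3 cached
```

In this sketch the second call scores only columns 4 and 5, because the results for the shared columns 2 and 3 are read from the cache populated by the first call.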
using neural networks · CPC title
Classification techniques · CPC title
Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items (segmenting video sequences G06V20/49) · CPC title
by performing operations within image blocks; by using histograms, e.g. histogram of oriented gradients [HoG]; by summing image-intensity values; Projection analysis · CPC title