Machine-vision method to classify input data based on object components

US11023789B2 · US · B2

Patent metadata
Publication number: US-11023789-B2
Application number: US-201815936403-A
Country: US
Kind code: B2
Filing date: Mar 26, 2018
Priority date: Mar 28, 2017
Publication date: Jun 1, 2021
Grant date: Jun 1, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Described is a system for classifying objects and scenes in images. The system identifies salient regions of an image based on activation patterns of a convolutional neural network (CNN). Multi-scale features for the salient regions are generated by probing the activation patterns of the CNN at different layers. Using an unsupervised clustering technique, the multi-scale features are clustered to identify key attributes captured by the CNN. The system maps from a histogram of the key attributes onto probabilities for a set of object categories. Using the probabilities, an object or scene in the image is classified as belonging to an object category, and a vehicle component is controlled based on the object category causing the vehicle component to perform an automated action.
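
The last two steps of the abstract (histogram of key attributes, then probabilities over object categories) can be illustrated with a minimal sketch. The abstract does not say what form the mapping takes, so the linear scoring followed by a softmax below is an assumption, and the function names, weights, and toy cluster memberships are all hypothetical.

```python
import math
from collections import Counter


def key_attribute_histogram(memberships, num_attributes):
    """Normalized frequency of each key attribute (cluster id) over the salient regions."""
    if not memberships:
        return [0.0] * num_attributes
    counts = Counter(memberships)
    total = len(memberships)
    return [counts.get(k, 0) / total for k in range(num_attributes)]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    norm = sum(exps)
    return [e / norm for e in exps]


def category_probabilities(histogram, weights, biases):
    """ASSUMPTION: a linear map plus softmax stands in for the unspecified mapping."""
    scores = [
        sum(w * h for w, h in zip(row, histogram)) + b
        for row, b in zip(weights, biases)
    ]
    return softmax(scores)


# Toy input: five salient regions assigned to three key attributes (hypothetical).
memberships = [0, 2, 2, 1, 2]
hist = key_attribute_histogram(memberships, num_attributes=3)

# Hypothetical weights for two object categories.
weights = [[2.0, 0.0, -1.0], [-1.0, 0.0, 2.0]]
biases = [0.0, 0.0]
probs = category_probabilities(hist, weights, biases)

# The image is assigned to the most probable category.
predicted = max(range(len(probs)), key=probs.__getitem__)
```

In this toy case the third key attribute dominates the histogram, so the second category, whose weights favor that attribute, receives the higher probability.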

First claim

Opening claim text (preview).

What is claimed is:

1. A system for classifying intensity image data, the system comprising: one or more processors and a non-transitory computer-readable medium having executable instructions encoded thereon such that when executed, the one or more processors perform operations of:
    identifying salient regions of an intensity image based on activation patterns of a convolutional neural network (CNN) having a plurality of layers;
    generating multi-scale features for the salient regions by probing the activation patterns of the CNN at different layers;
    using an unsupervised clustering technique, clustering the multi-scale features to identify key attributes captured by the CNN, wherein the unsupervised clustering technique is an unsupervised deep embedding method, and wherein the output of the unsupervised deep embedding method is an embedding mapping that embeds the multi-scale features into a subspace with the key attributes;
    obtaining a cluster membership for each salient region using the embedding mapping;
    using the cluster memberships, generating a histogram of key attributes, wherein the histogram of key attributes encodes a normalized frequency of key attribute occurrences;
    mapping from the histogram of the key attributes onto probabilities for a set of object categories;
    classifying at least one object or scene in the intensity image as belonging to an object category using the probabilities; and
    controlling a vehicle component based on the object category causing the vehicle component to perform an automated action.

2. The system as set forth in claim 1, wherein each salient region is extracted from the intensity image and identified based on the CNN's activation for each image pixel or group of image pixels in the salient region.

3. The system as set forth in claim 1, wherein for generating the multi-scale features, the one or more processors perform general average pooling (GAP) at each layer of the CNN.

4. The system as set forth in claim 1, wherein the one or more processors further perform operations of:
    calculating a bag of key attributes (BoKA) feature for the intensity image using deep embedding for clustering;
    concatenating the BoKA feature with an output feature of the CNN, resulting in concatenated features; and
    using the concatenated features for classifying the at least one object or scene in the intensity image.

5. A computer implemented method for classifying intensity image data, the method comprising an act of: causing one or more processors to execute instructions encoded on a non-transitory computer-readable medium, such that upon execution, the one or more processors perform operations of:
    identifying salient regions of an intensity image based on activation patterns of a convolutional neural network (CNN) having a plurality of layers;
    generating multi-scale features for the salient regions by probing the activation patterns of the CNN at different layers;
    using an unsupervised clustering technique, clustering the multi-scale features to identify key attributes captured by the CNN, wherein the unsupervised clustering technique is an unsupervised deep embedding method, and wherein the output of the unsupervised deep embedding method is an embedding mapping that embeds the multi-scale features into a subspace with the key attributes;
    obtaining a cluster membership for each salient region using the embedding mapping;
    using the cluster memberships, generating a histogram of key attributes, wherein the histogram of key attributes encodes a normalized frequency of key attribute occurrences;
    mapping from the histogram of the key attributes onto probabilities for a set of object categories;
    classifying at least one object or scene in the intensity image as belonging to an object category using the probabilities; and
    controlling a vehicle component based on the object category causing the vehicle component to perform an automated action.

6. The method as set forth in claim 5, wherein each salient region is extracted from the intensity image and identified based on the CNN's activation for each image pixel or group of image pixels in the salient region.

7. The method as set forth in claim 5, wherein for generating the multi-scale features, the one or more processors perform general average pooling (GAP) at each layer of the CNN.

8. The method as set forth in claim 5, wherein the one or more processors further perform operations of:
    calculating a bag of key attributes (BoKA) feature for the intensity image using deep embedding for clustering;
    concatenating the BoKA feature with an output feature of the CNN, resulting in concatenated features; and
    using the concatenated features for classifying the at least one object or scene in the intensity image.

9. A computer program product for classifying intensity image data, the computer program product comprising: computer-readable instructions stored on a non-transitory computer-readable medium that are executable by a computer having one or more processors for causing the processor to perform operations of:
    identifying salient regions of an intensity image based on activation patterns of a convolutional neural network (CNN) having a plurality of layers;
    generating multi-scale features for the salient regions by probing the activation patterns of the CNN at different layers;
    using an unsupervised clustering technique, clustering the multi-scale features to identify key attributes captured by the CNN, wherein the unsupervised clustering technique is an unsupervised deep embedding method, and wherein the output of the unsupervised deep embedding method is an embedding mapping that embeds the multi-scale features into a subspace with the key attributes;
    obtaining a cluster membership for each salient region using the embedding mapping;
    using the cluster memberships, generating a histogram of key attributes, wherein the histogram of key attributes encodes a normalized frequency of key attribute occurrences;
    mapping from the histogram of the key attributes onto probabilities for a set of object categories;
    classifying at least one object or scene in the intensity image as belonging to an object category using the probabilities; and
    controlling a vehicle component based on the object category causing the vehicle component to perform an automated action.

10. The computer program product as set forth in claim 9, wherein each salient region is extracted from the intensity image and identified based on the CNN's activation for each image pixel or group of image pixels in the salient region.

11. The computer program product as set forth in claim 9, wherein for generating the multi-scale features, the one or more processors perform general average pooling (GAP) at each layer of the CNN.

12. The computer program product as set forth in claim 9, further comprising instructions for causing the one or more processors to further perform operations of:
    calculating a bag of key attributes (BoKA) feature for the intensity image using deep embedding for clustering;
    concatenating the BoKA feature with an output feature of the CNN, resulting in concatenated features; and
    using the concatenated features for classifying the at least one object or scene in the intensity image.
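
The dependent claims on average pooling and on BoKA concatenation can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented implementation: activations are plain nested lists, each layer's region is given directly in that layer's own coordinates (how a salient region is projected between layers is not described on this page), and every function name and toy value is hypothetical.

```python
def average_pool(feature_map, region):
    """Average each channel of one layer's activations over a rectangular region.

    feature_map: list of channels, each a 2D list indexed [row][col].
    region: (top, left, bottom, right) in this layer's coordinates, end-exclusive.
    """
    top, left, bottom, right = region
    area = (bottom - top) * (right - left)
    if area <= 0:
        raise ValueError("region must cover at least one activation")
    pooled = []
    for channel in feature_map:
        total = sum(
            channel[r][c] for r in range(top, bottom) for c in range(left, right)
        )
        pooled.append(total / area)
    return pooled


def multi_scale_feature(layer_maps, layer_regions):
    """Concatenate the pooled activations from several layers into one vector."""
    feature = []
    for fmap, region in zip(layer_maps, layer_regions):
        feature.extend(average_pool(fmap, region))
    return feature


def concatenate_features(boka_feature, cnn_output_feature):
    """Join the BoKA histogram with the CNN output feature for the final classifier."""
    return list(boka_feature) + list(cnn_output_feature)


# Toy activations (hypothetical): a 2-channel 4x4 layer and a 1-channel 2x2 layer.
layer1 = [
    [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]],
    [[2, 4, 0, 0], [6, 8, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]],
]
layer2 = [[[3, 0], [0, 0]]]

# The same salient region expressed in each layer's coordinates (assumed mapping).
regions = [(0, 0, 2, 2), (0, 0, 1, 1)]
region_feature = multi_scale_feature([layer1, layer2], regions)

# Hypothetical BoKA histogram and CNN output feature for one image.
combined = concatenate_features([0.2, 0.2, 0.6], [0.9, 0.1])
```

Each salient region yields one such multi-scale vector; in the claimed system those vectors are what the unsupervised deep embedding method clusters, a step this sketch leaves out.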

Assignees

Inventors

Classifications

  • G06V20/70 (Primary)

    Labelling scene content, e.g. deriving syntactic or semantic representations · CPC title

  • Classification techniques · CPC title

  • using neural networks · CPC title

  • involving differential geometry, e.g. embedding of pattern manifold · CPC title

  • Distances to neighbourhood prototypes, e.g. restricted Coulomb energy networks [RCEN] · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11023789B2 cover?
Described is a system for classifying objects and scenes in images. The system identifies salient regions of an image based on activation patterns of a convolutional neural network (CNN). Multi-scale features for the salient regions are generated by probing the activation patterns of the CNN at different layers. Using an unsupervised clustering technique, the multi-scale features are clustered …
Who is the assignee on this patent?
HRL Lab LLC
What technology area does this patent fall under?
Primary CPC classification G06V20/70. Mapped technology areas include Physics.
When was this patent published?
Publication date Jun 1, 2021 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).