Attribute aware zero shot machine vision system via joint sparse representations

US10908616B2 · US · B2

Patent metadata
Field: Value
Publication number: US-10908616-B2
Application number: US-201816033638-A
Country: US
Kind code: B2
Filing date: Jul 12, 2018
Priority date: May 5, 2017
Publication date: Feb 2, 2021
Grant date: Feb 2, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Described is a system for object recognition. The system generates a training image set of object images from multiple image classes. Using a training image set and annotated semantic attributes, a model is trained that maps visual features from known images to the annotated semantic attributes using joint sparse representations with respect to dictionaries of visual features and semantic attributes. The trained model is used for mapping visual features of an unseen input image to its semantic attributes. The unseen input image is classified as belonging to an image class, and a device is controlled based on the classification of the unseen input image.
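
The inference path in the abstract (sparse-code the visual feature, map the code into attribute space, then pick a class) can be illustrated with a minimal sketch. This page does not reproduce the patent's equations, so everything below is an assumption made for illustration: the dictionaries `D_visual` and `D_attr` are taken as already learned, the sparse code is found with plain ISTA on an L1-penalized least-squares problem, and the names `sparse_code`, `predict_attributes`, and `classify` are hypothetical rather than the patent's own.

```python
import numpy as np

def soft_threshold(v, t):
    """Elementwise soft-thresholding: the proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def sparse_code(x, D, lam=0.1, n_iter=200):
    """Approximately solve min_a 0.5*||x - D a||^2 + lam*||a||_1 using ISTA.

    Assumption: a generic lasso solver stands in for whatever solver
    the patent actually uses.
    """
    # Step size = 1 / Lipschitz constant of the gradient (squared spectral norm of D).
    step = 1.0 / (np.linalg.norm(D, 2) ** 2)
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ a - x)
        a = soft_threshold(a - step * grad, step * lam)
    return a

def predict_attributes(x, D_visual, D_attr, lam=0.1):
    """Map a visual feature to attribute space through a shared sparse code.

    The same code `a` is applied to both dictionaries, which is the
    'joint sparse representation' idea: x ~ D_visual @ a and z ~ D_attr @ a.
    """
    a = sparse_code(x, D_visual, lam)
    return D_attr @ a, a

def classify(z_pred, class_attrs):
    """Return the label whose attribute vector is closest to the prediction.

    class_attrs: dict mapping class label -> attribute vector (hypothetical layout).
    """
    labels = list(class_attrs)
    dists = [np.linalg.norm(z_pred - class_attrs[k]) for k in labels]
    return labels[int(np.argmin(dists))]
```

In use, `predict_attributes(x, D_visual, D_attr)` returns the predicted attribute vector together with the sparse code, and `classify` compares that prediction against the attribute descriptions of candidate classes, which may include classes with no training images.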

First claim

Opening claim text (preview).

What is claimed is:

1. A system for object recognition, the system comprising: one or more processors and a non-transitory computer-readable medium having executable instructions encoded thereon such that when executed, the one or more processors perform operations of: using a training image set and annotated semantic attributes, training a model that maps visual features from known images to the annotated semantic attributes using joint sparse representations with respect to dictionaries of visual features and semantic attributes; using the trained model for mapping visual features of an unseen input image to its semantic attributes; classifying the unseen input image as belonging to an image class; and controlling a device based on the classification of the unseen input image, wherein the device is a vehicle component, and controlling the device results in a vehicle maneuver.

2. The system as set forth in claim 1, wherein the one or more processors further perform an operation of generating a training image set comprising object images from a plurality of image classes, wherein each object image in the training image set has been annotated with a class label and semantic attributes describing the object image.

3. The system as set forth in claim 1, wherein for training the model, a visual feature space and a semantic attribute space are modeled as nonlinear spaces that provide an identical sparse representation for visual features and their corresponding semantic attributes.

4. The system as set forth in claim 1, wherein the one or more processors further perform operations of: finding a sparse representation for a visual feature extracted from the unseen input image; and generating a semantic attribute prediction that is resolved in the semantic attribute space of the model, wherein a soft-assignment probability vector identifies a probability of the semantic attribute prediction belonging to a class of unseen images.

5. The system as set forth in claim 4, wherein a regularization parameter is used to regulate entropy of the soft-assignment probability vector.

6. The system as set forth in claim 4, wherein, given the semantic attribute prediction, the unseen input image is labeled using a class label of a closest semantic attribute in the semantic attribute space of the model.

7. A computer implemented method for object recognition, the method comprising an act of: causing one or more processors to execute instructions encoded on a non-transitory computer-readable medium, such that upon execution, the one or more processors perform operations of: using a training image set and annotated semantic attributes, training a model that maps visual features from known images to the annotated semantic attributes using joint sparse representations with respect to dictionaries of visual features and semantic attributes; using the trained model for mapping visual features of an unseen input image to its semantic attributes; classifying the unseen input image as belonging to an image class; and controlling a device based on the classification of the unseen input image, wherein the device is a vehicle component, and controlling the device results in a vehicle maneuver.

8. The method as set forth in claim 7, wherein the one or more processors further perform an operation of generating a training image set comprising object images from a plurality of image classes, wherein each object image in the training image set has been annotated with a class label and semantic attributes describing the object image.

9. The method as set forth in claim 7, wherein for training the model, a visual feature space and a semantic attribute space are modeled as nonlinear spaces that provide an identical sparse representation for visual features and their corresponding semantic attributes.

10. The method as set forth in claim 7, wherein the one or more processors further perform operations of: finding a sparse representation for a visual feature extracted from the unseen input image; and generating a semantic attribute prediction that is resolved in the semantic attribute space of the model, wherein a soft-assignment probability vector identifies a probability of the semantic attribute prediction belonging to a class of unseen images.

11. The method as set forth in claim 10, wherein a regularization parameter is used to regulate entropy of the soft-assignment probability vector.

12. The method as set forth in claim 10, wherein, given the semantic attribute prediction, the unseen input image is labeled using a class label of a closest semantic attribute in the semantic attribute space of the model.

13. A computer program product for object recognition, the computer program product comprising: a non-transitory computer-readable medium having executable instructions encoded thereon, such that upon execution of the instructions by one or more processors, the one or more processors perform operations of: using a training image set and annotated semantic attributes, training a model that maps visual features from known images to the annotated semantic attributes using joint sparse representations with respect to dictionaries of visual features and semantic attributes; using the trained model for mapping visual features of an unseen input image to its semantic attributes; classifying the unseen input image as belonging to an image class; and controlling a device based on the classification of the unseen input image, wherein the device is a vehicle component, and controlling the device results in a vehicle maneuver.

14. The computer program product as set forth in claim 13, further comprising instructions for causing the one or more processors to further perform an operation of generating a training image set comprising object images from a plurality of image classes, wherein each object image in the training image set has been annotated with a class label and semantic attributes describing the object image.

15. The computer program product as set forth in claim 13, wherein for training the model, a visual feature space and a semantic attribute space are modeled as nonlinear spaces that provide an identical sparse representation for visual features and their corresponding semantic attributes.

16. The computer program product as set forth in claim 13, further comprising instructions for causing the one or more processors to further perform operations of: finding a sparse representation for a visual feature extracted from the unseen input image; and generating a semantic attribute prediction that is resolved in the semantic attribute space of the model, wherein a soft-assignment probability vector identifies a probability of the semantic attribute prediction belonging to a class of unseen images.

17. The computer program product as set forth in claim 16, wherein a regularization parameter is used to regulate entropy of the soft-assignment probability vector.

18. The computer program product as set forth in claim 16, wherein, given the semantic attribute prediction, the unseen input image is labeled using a class label of a closest semantic attribute in the semantic attribute space of the model.

19. The system as set forth in claim 1, wherein the vehicle maneuver is a collision avoidance maneuver.

20. The system as set forth in claim 1, wherein the unseen input image is an image of an avoidance object, and wherein an alert is generated when the avoidance object is classified.
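
The dependent claims on soft assignment, entropy regulation, and closest-attribute labeling (claims 4 to 6 and their method and product counterparts) can be pictured with a small sketch. The claim text gives no formula, so this is only one plausible reading under stated assumptions: the soft-assignment vector is modeled as a softmax over negative squared distances to each unseen class's attribute vector, and a hypothetical `temperature` argument stands in for the claimed regularization parameter. The function names are illustrative and do not come from the patent.

```python
import numpy as np

def soft_assignment(z_pred, unseen_attrs, temperature=1.0):
    """Probability that the predicted attributes belong to each unseen class.

    Assumption: softmax over negative squared Euclidean distances.
    unseen_attrs has one row per unseen class.
    """
    d2 = np.sum((unseen_attrs - z_pred) ** 2, axis=1)
    logits = -d2 / temperature
    logits -= logits.max()  # shift for numerical stability; softmax is unchanged
    p = np.exp(logits)
    return p / p.sum()

def entropy(p, eps=1e-12):
    """Shannon entropy (in nats) of a probability vector."""
    return float(-np.sum(p * np.log(p + eps)))

def hard_label(z_pred, unseen_attrs, labels):
    """Label the input with the class of the closest attribute vector."""
    d2 = np.sum((unseen_attrs - z_pred) ** 2, axis=1)
    return labels[int(np.argmin(d2))]
```

Under this reading, a small `temperature` concentrates the probability mass on the nearest class (low entropy), while a large value spreads it toward uniform (high entropy). `hard_label` corresponds to the nearest-attribute labeling and agrees with the arg-max of `soft_assignment` for any positive temperature.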

Assignees

Inventors

Classifications

  • Feature extraction, e.g. by transforming the feature space, e.g. multi-dimensional scaling [MDS]; Mappings, e.g. subspace methods

  • characterised by the incorporation of unlabelled data, e.g. multiple instance learning [MIL], semi-supervised techniques using expectation-maximisation [EM] or naïve labelling

  • based on sparsity criteria, e.g. with an overcomplete basis

  • Validation; Performance evaluation; Active pattern learning techniques

  • based on distances to training or reference patterns

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10908616B2 cover?
Described is a system for object recognition. The system generates a training image set of object images from multiple image classes. Using a training image set and annotated semantic attributes, a model is trained that maps visual features from known images to the annotated semantic attributes using joint sparse representations with respect to dictionaries of visual features and semantic attri…
Who is the assignee on this patent?
HRL Laboratories, LLC
What technology area does this patent fall under?
Primary CPC classification G06V10/7715. Mapped technology areas include Physics.
When was this patent published?
Publication date Feb 2, 2021 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).