Multi-view embedding with soft-max based compatibility function for zero-shot learning

US2018225548A1 · US · A1

Patent metadata
Publication number: US-2018225548-A1
Application number: US-201815874856-A
Country: US
Kind code: A1
Filing date: Jan 18, 2018
Priority date: Jan 19, 2017
Publication date: Aug 9, 2018
Grant date: (not listed)

Abstract

Official abstract text for this publication.

Described is a system for multi-view embedding for object recognition. During operation, the system receives an input image and transforms raw data of objects in the image with corresponding labels into low-level features and high-level semantic representations of the labels. A trained object recognition model is generated by embedding the low-level features with multiple high-level semantic representations. The system then receives data of an unknown object and assigns a label to the unknown object using the trained object recognition model. Finally, a device can be controlled based on the label.
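
The recognition step summarized in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: this page does not give the formulas, so the linear projections, the dot-product compatibility summed over views, and every name and number below (`classify`, `W_FEATURE`, `W_VIEWS`, `CLASS_VIEWS`, the toy labels) are assumptions chosen only to show how a feature vector and several semantic views of each label might be compared in a common space and turned into soft-max confidences.

```python
import math

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def softmax(scores):
    """Numerically stable soft-max over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def classify(feature, w_feature, class_views, w_views):
    """Return (label, confidences) for one low-level feature vector.

    feature     : m_0-dimensional feature vector
    w_feature   : m x m_0 projection into the common space (assumed linear)
    class_views : {label: [semantic vector for view 1, view 2, ...]}
    w_views     : one m x m_v projection per view (assumed linear)
    """
    z = matvec(w_feature, feature)
    labels = list(class_views)
    scores = []
    for label in labels:
        score = 0.0
        for w_v, semantic in zip(w_views, class_views[label]):
            y = matvec(w_v, semantic)
            # Assumed compatibility: dot product in the common space,
            # summed over all semantic views of the label.
            score += sum(a * b for a, b in zip(z, y))
        scores.append(score)
    confidences = softmax(scores)
    best = max(range(len(labels)), key=lambda i: confidences[i])
    return labels[best], dict(zip(labels, confidences))

# Hypothetical toy data: m_0 = 3, m = 2, two views with m_1 = 2 and m_2 = 4.
W_FEATURE = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
W_VIEWS = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]],
]
CLASS_VIEWS = {
    "stop_sign": [[1.0, 0.0], [1.0, 0.0, 0.0, 0.0]],
    "bicycle": [[0.0, 1.0], [0.0, 1.0, 0.0, 0.0]],
}

label, confidences = classify([2.0, 0.5, 0.0], W_FEATURE, CLASS_VIEWS, W_VIEWS)
print(label, confidences)
```

In a zero-shot setting, the labels in `CLASS_VIEWS` would be classes with no training images; only their semantic vectors are needed at recognition time. The resulting label could then drive a downstream action, such as the device control mentioned in the abstract.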

First claim

Opening claim text (preview).

What is claimed is:

1. A system for multi-view embedding, the system comprising: one or more processors and a memory, the memory being a non-transitory computer-readable medium having executable instructions encoded thereon, such that upon execution of the instructions, the one or more processors perform operations of: transforming raw data of objects with corresponding labels into low-level features and high-level semantic representations of the labels; and generate a trained object recognition model by embedding the low-level features with multiple high-level semantic representations; receiving data of an unknown object and assigning a label to the unknown object using the trained object recognition model; and controlling a device based on the label.

2. The system as set forth in claim 1, wherein embedding the low-level features with multiple high-level representations includes mapping vectors from distinct spaces into a common space.

3. The system as set forth in claim 2, wherein mapping vectors from distinct spaces into a common space includes mapping feature vectors from a m_0-dimensional space to a m-dimensional space and mapping v-th view semantic vectors from a m_v-dimensional space to a common m-dimensional space.

4. The system as set forth in claim 1, wherein the unknown object is a navigation object and controlling a device includes causing a vehicle to navigate based on the navigation object.

5. The system as set forth in claim 1, wherein a label is assigned to the unknown object if the unknown object matches a label in unknown object classes, thereby resulting in an object recognition.

6. The system as set forth in claim 1, wherein a soft-max formulation provides a confidence of each possible classification outcome, and the assignment of the label to the unknown object is based on the soft-max formulation.

7. The system as set forth in claim 1, wherein a soft-max function is used to provide a degree of compatibility of a pattern's low-level features and corresponding high-level semantic representations, and the assignment of the label to the unknown object is based on the soft-max formulation.

8. The system as set forth in claim 1, wherein embedding the low-level features with multiple high-level semantic representations unitizes information from multiple views of a label's semantic representation.

9. The system as set forth in claim 1, wherein generating the trained object recognition model further comprises an operation of maximizing a compatibility function value of a feature vector and its matched class high-level representation, while suppressing compatibilities of the feature vector and un-matched class high-level representations.

10. The system as set forth in claim 9, wherein generating the trained object recognition model includes improving inference accuracy by maximizing the margins between compatible pairs and incompatible pairs.

11. A computer program product for multi-view embedding, the computer program product comprising: a non-transitory computer-readable medium having executable instructions encoded thereon, such that upon execution of the instructions by one or more processors, the one or more processors perform operations of: transforming raw data of objects with corresponding labels into low-level features and high-level semantic representations of the labels; and generate a trained object recognition model by embedding the low-level features with multiple high-level semantic representations; receiving data of an unknown object and assigning a label to the unknown object using the trained object recognition model; and controlling a device based on the label.

12. The computer program product as set forth in claim 11, wherein embedding the low-level features with multiple high-level representations includes mapping vectors from distinct spaces into a common space.

13. The computer program product as set forth in claim 12, wherein mapping vectors from distinct spaces into a common space includes mapping feature vectors from a m_0-dimensional space to a m-dimensional space and mapping v-th view semantic vectors from a m_v-dimensional space to a common m-dimensional space.

14. The computer program product as set forth in claim 11, wherein the unknown object is a navigation object and controlling a device includes causing a vehicle to navigate based on the navigation object.

15. The computer program product as set forth in claim 11, wherein a label is assigned to the unknown object if the unknown object matches a label in unknown object classes, thereby resulting in an object recognition.

16. The computer program product as set forth in claim 11, wherein a soft-max formulation provides a confidence of each possible classification outcome, and the assignment of the label to the unknown object is based on the soft-max formulation.

17. The computer program product as set forth in claim 1, wherein a soft-max function is used to provide a degree of compatibility of a pattern's low-level features and corresponding high-level semantic representations, and the assignment of the label to the unknown object is based on the soft-max formulation.

18. The computer program product as set forth in claim 11, wherein embedding the low-level features with multiple high-level semantic representations unitizes information from multiple views of a label's semantic representation.

19. The computer program product as set forth in claim 11, wherein generating the trained object recognition model further comprises an operation of maximizing a compatibility function value of a feature vector and its matched class high-level representation, while suppressing compatibilities of the feature vector and un-matched class high-level representations.

20. A computer implemented method for multi-view embedding, the method comprising an act of: causing one or more processors to execute instructions encoded on a non-transitory computer-readable medium, such that upon execution, the one or more processors perform operations of: transforming raw data of objects with corresponding labels into low-level features and high-level semantic representations of the labels; and embedding the low-level features with multiple high-level semantic representations to generate a trained object recognition model; receiving data of an unknown object and assigning a label to the unknown object; and controlling a device based on the label.

21. The method as set forth in claim 20, wherein embedding the low-level features with multiple high-level representations includes mapping vectors from distinct spaces into a common space.

22. The method as set forth in claim 21, wherein mapping vectors from distinct spaces into a common space includes mapping feature vectors from a m_0-dimensional space to a m-dimensional space and mapping v-th view semantic vectors from a m_v-dimensional space to a common m-dimensional space.

23. The method as set forth in claim 20, wherein the unknown object is a navigation object and controlling a device includes causing a vehicle to navigate based on the navigation object.

24. The method as set forth in claim 20, wherein a label is assigned to the unknown object if the unknown object matches a label in unknown object classes, thereby resulting in an object recognition.
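
The training idea in claims 9 and 19 (raise the compatibility of a feature vector with its matched class while suppressing the un-matched ones) can be sketched with a soft-max loss. This is an illustration under stated assumptions, not the claimed method: the page does not disclose the objective, so the negative log soft-max (cross-entropy) loss, the plain gradient step applied directly to the scores, and the names `softmax_loss` and `loss_gradient` are all hypothetical.

```python
import math

def softmax_loss(scores, matched_index):
    """Negative log soft-max probability of the matched class (assumed objective)."""
    peak = max(scores)
    log_total = peak + math.log(sum(math.exp(s - peak) for s in scores))
    return log_total - scores[matched_index]

def loss_gradient(scores, matched_index):
    """Gradient of the loss with respect to each compatibility score."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [
        e / total - (1.0 if i == matched_index else 0.0)
        for i, e in enumerate(exps)
    ]

# Hypothetical compatibility scores of one feature vector against three classes;
# index 0 is the matched (true) class.
scores = [1.0, 0.5, -0.2]
matched = 0

grad = loss_gradient(scores, matched)
step = 0.5  # illustrative step size
updated = [s - step * g for s, g in zip(scores, grad)]

loss_before = softmax_loss(scores, matched)
loss_after = softmax_loss(updated, matched)
print(grad, loss_before, loss_after)
```

The gradient is negative for the matched class and positive for every other class, so one descent step raises the matched compatibility and lowers the rest. In a full model the step would update the projection matrices that produce the scores rather than the scores themselves.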

Assignees

Inventors

Classifications

  • Categorising the entire scene, e.g. birthday party or wedding scene

  • Classification techniques

  • Obtaining sets of training patterns; Bootstrap methods, e.g. bagging or boosting

  • Feature extraction, e.g. by transforming the feature space, e.g. multi-dimensional scaling [MDS]; Mappings, e.g. subspace methods

  • Extraction of image or video features

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2018225548A1 cover?
Described is a system for multi-view embedding for object recognition. During operation, the system receives an input image and transforms raw data of objects in the image with corresponding labels into low-level features and high-level semantic representations of the labels. A trained object recognition model is generated by embedding the low-level features with multiple high-level semantic re…
Who is the assignee on this patent?
HRL Lab LLC
What technology area does this patent fall under?
Primary CPC classification G06K9/6232. Mapped technology areas include Physics.
When was this patent published?
Published Aug 9, 2018 (kind code A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).