Multi-view embedding with soft-max based compatibility function for zero-shot learning

US2018225548A1 · US · A1

Patent metadata
Field	Value
Publication number	US-2018225548-A1
Application number	US-201815874856-A
Country	US
Kind code	A1
Filing date	Jan 18, 2018
Priority date	Jan 19, 2017
Publication date	Aug 9, 2018
Grant date	—

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title: What the patent document calls the invention.
Abstract: A short plain-language summary of the technical disclosure.
Assignees and inventors: Who owns or filed the patent and who is credited as inventor.
Key dates: Filing, priority, publication, and grant dates set the timeline.
First independent claim: The legal scope of protection; read this for what is actually claimed.
CPC / IPC classifications: Technology tags used to group this patent with similar filings.
Citations and related patents: Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Described is a system for multi-view embedding for object recognition. During operation, the system receives an input image and transforms raw data of objects in the image with corresponding labels into low-level features and high-level semantic representations of the labels. A trained object recognition model is generated by embedding the low-level features with multiple high-level semantic representations. The system then receives data of an unknown object and assigns a label to the unknown object using the trained object recognition model. Finally, a device can be controlled based on the label.
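The label-assignment step described in the abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the patent's implementation: the projection matrices `W0` and `view_weights`, the dot-product compatibility, the summation across views, and the toy class names are all hypothetical choices made for the example.

```python
import numpy as np

def softmax(scores):
    """Numerically stable soft-max over a 1-D score vector."""
    shifted = scores - np.max(scores)
    exp = np.exp(shifted)
    return exp / exp.sum()

def assign_label(x, W0, view_weights, class_views, class_names):
    """Assign a label to a low-level feature vector x.

    x            : (m0,) low-level features of the unknown object
    W0           : (m, m0) projection of features into the common space (assumed linear)
    view_weights : list of (m, m_v) projections, one per semantic view (assumed linear)
    class_views  : list of (n_classes, m_v) semantic vectors, one array per view
    class_names  : list of n_classes label strings
    """
    fx = W0 @ x  # feature vector embedded in the common m-dimensional space
    scores = np.zeros(len(class_names))
    for W_v, S_v in zip(view_weights, class_views):
        # Embed each class's v-th view semantic vector, then take its dot
        # product with the embedded feature; views are summed (an assumption).
        scores += (S_v @ W_v.T) @ fx
    probs = softmax(scores)  # confidence for each candidate label
    return class_names[int(np.argmax(probs))], probs

# Toy example: 3-D features, two semantic views (2-D and 3-D), common space m = 2.
W0 = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
view_weights = [np.eye(2), np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])]
class_views = [
    np.array([[1.0, 0.0], [0.0, 1.0]]),            # view 1, one row per class
    np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]),  # view 2, one row per class
]
class_names = ["zebra", "truck"]

x = np.array([0.9, 0.1, 0.0])
label, probs = assign_label(x, W0, view_weights, class_views, class_names)
print(label, probs)
```

Because a class enters the computation only through its semantic vectors, a class with no training images can still be scored, which is the sense in which such a setup supports zero-shot recognition.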

First claim

Opening claim text (preview).

What is claimed is:
1. A system for multi-view embedding, the system comprising: one or more processors and a memory, the memory being a non-transitory computer-readable medium having executable instructions encoded thereon, such that upon execution of the instructions, the one or more processors perform operations of: transforming raw data of objects with corresponding labels into low-level features and high-level semantic representations of the labels; and generate a trained object recognition model by embedding the low-level features with multiple high-level semantic representations; receiving data of an unknown object and assigning a label to the unknown object using the trained object recognition model; and controlling a device based on the label.
2. The system as set forth in claim 1, wherein embedding the low-level features with multiple high-level representations includes mapping vectors from distinct spaces into a common space.
3. The system as set forth in claim 2, wherein mapping vectors from distinct spaces into a common space includes mapping feature vectors from a m_0-dimensional space to a m-dimensional space and mapping v-th view semantic vectors from a m_v-dimensional space to a common m-dimensional space.
4. The system as set forth in claim 1, wherein the unknown object is a navigation object and controlling a device includes causing a vehicle to navigate based on the navigation object.
5. The system as set forth in claim 1, wherein a label is assigned to the unknown object if the unknown object matches a label in unknown object classes, thereby resulting in an object recognition.
6. The system as set forth in claim 1, wherein a soft-max formulation provides a confidence of each possible classification outcome, and the assignment of the label to the unknown object is based on the soft-max formulation.
7. The system as set forth in claim 1, wherein a soft-max function is used to provide a degree of compatibility of a pattern's low-level features and corresponding high-level semantic representations, and the assignment of the label to the unknown object is based on the soft-max formulation.
8. The system as set forth in claim 1, wherein embedding the low-level features with multiple high-level semantic representations unitizes information from multiple views of a label's semantic representation.
9. The system as set forth in claim 1, wherein generating the trained object recognition model further comprises an operation of maximizing a compatibility function value of a feature vector and its matched class high-level representation, while suppressing compatibilities of the feature vector and un-matched class high-level representations.
10. The system as set forth in claim 9, wherein generating the trained object recognition model includes improving inference accuracy by maximizing the margins between compatible pairs and incompatible pairs.
11. A computer program product for multi-view embedding, the computer program product comprising: a non-transitory computer-readable medium having executable instructions encoded thereon, such that upon execution of the instructions by one or more processors, the one or more processors perform operations of: transforming raw data of objects with corresponding labels into low-level features and high-level semantic representations of the labels; and generate a trained object recognition model by embedding the low-level features with multiple high-level semantic representations; receiving data of an unknown object and assigning a label to the unknown object using the trained object recognition model; and controlling a device based on the label.
12. The computer program product as set forth in claim 11, wherein embedding the low-level features with multiple high-level representations includes mapping vectors from distinct spaces into a common space.
13. The computer program product as set forth in claim 12, wherein mapping vectors from distinct spaces into a common space includes mapping feature vectors from a m_0-dimensional space to a m-dimensional space and mapping v-th view semantic vectors from a m_v-dimensional space to a common m-dimensional space.
14. The computer program product as set forth in claim 11, wherein the unknown object is a navigation object and controlling a device includes causing a vehicle to navigate based on the navigation object.
15. The computer program product as set forth in claim 11, wherein a label is assigned to the unknown object if the unknown object matches a label in unknown object classes, thereby resulting in an object recognition.
16. The computer program product as set forth in claim 11, wherein a soft-max formulation provides a confidence of each possible classification outcome, and the assignment of the label to the unknown object is based on the soft-max formulation.
17. The computer program product as set forth in claim 1, wherein a soft-max function is used to provide a degree of compatibility of a pattern's low-level features and corresponding high-level semantic representations, and the assignment of the label to the unknown object is based on the soft-max formulation.
18. The computer program product as set forth in claim 11, wherein embedding the low-level features with multiple high-level semantic representations unitizes information from multiple views of a label's semantic representation.
19. The computer program product as set forth in claim 11, wherein generating the trained object recognition model further comprises an operation of maximizing a compatibility function value of a feature vector and its matched class high-level representation, while suppressing compatibilities of the feature vector and un-matched class high-level representations.
20. A computer implemented method for multi-view embedding, the method comprising an act of: causing one or more processors to execute instructions encoded on a non-transitory computer-readable medium, such that upon execution, the one or more processors perform operations of: transforming raw data of objects with corresponding labels into low-level features and high-level semantic representations of the labels; and embedding the low-level features with multiple high-level semantic representations to generate a trained object recognition model; receiving data of an unknown object and assigning a label to the unknown object; and controlling a device based on the label.
21. The method as set forth in claim 20, wherein embedding the low-level features with multiple high-level representations includes mapping vectors from distinct spaces into a common space.
22. The method as set forth in claim 21, wherein mapping vectors from distinct spaces into a common space includes mapping feature vectors from a m_0-dimensional space to a m-dimensional space and mapping v-th view semantic vectors from a m_v-dimensional space to a common m-dimensional space.
23. The method as set forth in claim 20, wherein the unknown object is a navigation object and controlling a device includes causing a vehicle to navigate based on the navigation object.
24. The method as set forth in claim 20, wherein a label is assigned to the unknown object if the unknown object matches a label in unknown object classes, thereby resulting in an object recognition.
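One way to read the dependent claims on mapping into a common space, the soft-max formulation, and the maximize/suppress training objective is the following formulation. It is a sketch under assumptions, not the patent's own equations, which are not shown on this page: the linear maps $W_0$ and $W_v$, the dot-product compatibility $F$, the sum over views, and the log-likelihood objective are all illustrative choices.

```latex
% Mapping into a common m-dimensional space (assumed linear):
%   feature vector x in R^{m_0}, v-th view semantic vector s_c^{(v)} in R^{m_v}
\phi(x) = W_0\, x, \qquad \psi_v\big(s_c^{(v)}\big) = W_v\, s_c^{(v)},
\qquad W_0 \in \mathbb{R}^{m \times m_0},\; W_v \in \mathbb{R}^{m \times m_v}

% Compatibility of a feature vector with class c, combined over V views (assumed sum):
F(x, c) = \sum_{v=1}^{V} \phi(x)^{\top} \psi_v\big(s_c^{(v)}\big)

% Soft-max confidence of each possible classification outcome:
p(c \mid x) = \frac{\exp F(x, c)}{\sum_{c'} \exp F(x, c')}

% Training on labeled pairs (x_i, c_i): raising p(c_i | x_i) increases the matched
% compatibility F(x_i, c_i) and lowers the unmatched F(x_i, c'), c' \neq c_i.
\max_{W_0,\, W_1, \dots, W_V} \; \sum_{i} \log p(c_i \mid x_i)

% Label assignment for an unknown object with features x:
\hat{c} = \arg\max_{c} \; p(c \mid x)
```

Under this reading, the soft-max denominator is what couples matched and unmatched classes: increasing the probability of the matched class necessarily decreases the total probability assigned to the others.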

Assignees

HRL Lab LLC

Inventors

Classifications

G06V20/35: Categorising the entire scene, e.g. birthday party or wedding scene
G06V30/19173: Classification techniques
G06V30/19147: Obtaining sets of training patterns; Bootstrap methods, e.g. bagging or boosting
G06V10/7715: Feature extraction, e.g. by transforming the feature space, e.g. multi-dimensional scaling [MDS]; Mappings, e.g. subspace methods
G06V10/40: Extraction of image or video features

Patent family

Related publications grouped by family.

View patent family 62908346

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2018225548A1 cover?: Described is a system for multi-view embedding for object recognition. During operation, the system receives an input image and transforms raw data of objects in the image with corresponding labels into low-level features and high-level semantic representations of the labels. A trained object recognition model is generated by embedding the low-level features with multiple high-level semantic re…
Who is the assignee on this patent?: HRL Lab LLC
What technology area does this patent fall under?: Primary CPC classification G06K9/6232. Mapped technology areas include Physics.
When was this patent published?: Publication date Aug 9, 2018 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).