US2018225548A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2018225548-A1 |
| Application number | US-201815874856-A |
| Country | US |
| Kind code | A1 |
| Filing date | Jan 18, 2018 |
| Priority date | Jan 19, 2017 |
| Publication date | Aug 9, 2018 |
| Grant date | — |
Official abstract text for this publication.
Described is a system for multi-view embedding for object recognition. During operation, the system receives an input image and transforms raw data of objects in the image with corresponding labels into low-level features and high-level semantic representations of the labels. A trained object recognition model is generated by embedding the low-level features with multiple high-level semantic representations. The system then receives data of an unknown object and assigns a label to the unknown object using the trained object recognition model. Finally, a device can be controlled based on the label.
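The recognition step in the abstract can be illustrated with a minimal sketch. It assumes a dot-product compatibility in the common space, summed over semantic views and passed through a soft-max; the excerpt does not give the exact compatibility function, so that choice, the projection matrices `W_feat` and `W_views`, and all function names are hypothetical.

```python
import math

def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def softmax(scores):
    """Numerically stable soft-max over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def classify(feature, class_semantics, W_feat, W_views):
    """Assign a label via soft-max over compatibility scores.

    feature:         low-level feature vector of length m_0
    class_semantics: {label: [semantic vector for view 1, view 2, ...]}
    W_feat:          m x m_0 projection into the common space (assumed)
    W_views[v]:      m x m_v projection for the v-th view (assumed)
    """
    z = matvec(W_feat, feature)  # feature mapped into the common m-dim space
    labels = list(class_semantics)
    scores = []
    for label in labels:
        score = 0.0
        for W_v, sem in zip(W_views, class_semantics[label]):
            e = matvec(W_v, sem)  # semantic vector mapped into the common space
            score += sum(a * b for a, b in zip(z, e))  # assumed dot-product compatibility
        scores.append(score)
    probs = softmax(scores)
    best = max(range(len(labels)), key=probs.__getitem__)
    return labels[best], dict(zip(labels, probs))

# Toy data: m = 2, m_0 = 3, view 1 has m_1 = 2, view 2 has m_2 = 1.
W_feat = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
W_views = [[[1.0, 0.0], [0.0, 1.0]], [[1.0], [0.5]]]
class_semantics = {
    "stop_sign": [[1.0, 0.0], [1.0]],
    "yield_sign": [[0.0, 1.0], [-1.0]],
}
label, confidence = classify([2.0, 0.1, 0.5], class_semantics, W_feat, W_views)
```

In a deployed system the returned label would drive the device-control step; here the soft-max output simply serves as a per-class confidence.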
Opening claim text (preview).
What is claimed is:

1. A system for multi-view embedding, the system comprising: one or more processors and a memory, the memory being a non-transitory computer-readable medium having executable instructions encoded thereon, such that upon execution of the instructions, the one or more processors perform operations of: transforming raw data of objects with corresponding labels into low-level features and high-level semantic representations of the labels; and generate a trained object recognition model by embedding the low-level features with multiple high-level semantic representations; receiving data of an unknown object and assigning a label to the unknown object using the trained object recognition model; and controlling a device based on the label.
2. The system as set forth in claim 1, wherein embedding the low-level features with multiple high-level representations includes mapping vectors from distinct spaces into a common space.
3. The system as set forth in claim 2, wherein mapping vectors from distinct spaces into a common space includes mapping feature vectors from a m_0-dimensional space to a m-dimensional space and mapping v-th view semantic vectors from a m_v-dimensional space to a common m-dimensional space.
4. The system as set forth in claim 1, wherein the unknown object is a navigation object and controlling a device includes causing a vehicle to navigate based on the navigation object.
5. The system as set forth in claim 1, wherein a label is assigned to the unknown object if the unknown object matches a label in unknown object classes, thereby resulting in an object recognition.
6. The system as set forth in claim 1, wherein a soft-max formulation provides a confidence of each possible classification outcome, and the assignment of the label to the unknown object is based on the soft-max formulation.
7. The system as set forth in claim 1, wherein a soft-max function is used to provide a degree of compatibility of a pattern's low-level features and corresponding high-level semantic representations, and the assignment of the label to the unknown object is based on the soft-max formulation.
8. The system as set forth in claim 1, wherein embedding the low-level features with multiple high-level semantic representations unitizes information from multiple views of a label's semantic representation.
9. The system as set forth in claim 1, wherein generating the trained object recognition model further comprises an operation of maximizing a compatibility function value of a feature vector and its matched class high-level representation, while suppressing compatibilities of the feature vector and un-matched class high-level representations.
10. The system as set forth in claim 9, wherein generating the trained object recognition model includes improving inference accuracy by maximizing the margins between compatible pairs and incompatible pairs.
11. A computer program product for multi-view embedding, the computer program product comprising: a non-transitory computer-readable medium having executable instructions encoded thereon, such that upon execution of the instructions by one or more processors, the one or more processors perform operations of: transforming raw data of objects with corresponding labels into low-level features and high-level semantic representations of the labels; and generate a trained object recognition model by embedding the low-level features with multiple high-level semantic representations; receiving data of an unknown object and assigning a label to the unknown object using the trained object recognition model; and controlling a device based on the label.
12. The computer program product as set forth in claim 11, wherein embedding the low-level features with multiple high-level representations includes mapping vectors from distinct spaces into a common space.
13. The computer program product as set forth in claim 12, wherein mapping vectors from distinct spaces into a common space includes mapping feature vectors from a m_0-dimensional space to a m-dimensional space and mapping v-th view semantic vectors from a m_v-dimensional space to a common m-dimensional space.
14. The computer program product as set forth in claim 11, wherein the unknown object is a navigation object and controlling a device includes causing a vehicle to navigate based on the navigation object.
15. The computer program product as set forth in claim 11, wherein a label is assigned to the unknown object if the unknown object matches a label in unknown object classes, thereby resulting in an object recognition.
16. The computer program product as set forth in claim 11, wherein a soft-max formulation provides a confidence of each possible classification outcome, and the assignment of the label to the unknown object is based on the soft-max formulation.
17. The computer program product as set forth in claim 1, wherein a soft-max function is used to provide a degree of compatibility of a pattern's low-level features and corresponding high-level semantic representations, and the assignment of the label to the unknown object is based on the soft-max formulation.
18. The computer program product as set forth in claim 11, wherein embedding the low-level features with multiple high-level semantic representations unitizes information from multiple views of a label's semantic representation.
19. The computer program product as set forth in claim 11, wherein generating the trained object recognition model further comprises an operation of maximizing a compatibility function value of a feature vector and its matched class high-level representation, while suppressing compatibilities of the feature vector and un-matched class high-level representations.
20. A computer implemented method for multi-view embedding, the method comprising an act of: causing one or more processors to execute instructions encoded on a non-transitory computer-readable medium, such that upon execution, the one or more processors perform operations of: transforming raw data of objects with corresponding labels into low-level features and high-level semantic representations of the labels; and embedding the low-level features with multiple high-level semantic representations to generate a trained object recognition model; receiving data of an unknown object and assigning a label to the unknown object; and controlling a device based on the label.
21. The method as set forth in claim 20, wherein embedding the low-level features with multiple high-level representations includes mapping vectors from distinct spaces into a common space.
22. The method as set forth in claim 21, wherein mapping vectors from distinct spaces into a common space includes mapping feature vectors from a m_0-dimensional space to a m-dimensional space and mapping v-th view semantic vectors from a m_v-dimensional space to a common m-dimensional space.
23. The method as set forth in claim 20, wherein the unknown object is a navigation object and controlling a device includes causing a vehicle to navigate based on the navigation object.
24. The method as set forth in claim 20, wherein a label is assigned to the unknown object if the unknown object matches a label in unknown object classes, thereby resulting in an object recognition.
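
Claims 9 and 19 describe training that raises the compatibility of a feature vector with its matched class while suppressing compatibility with un-matched classes. One common way to get that behavior is a negative log soft-max loss over compatibility scores; the sketch below assumes that loss, which is not confirmed by this excerpt, and the function names are hypothetical. It does not attempt the margin maximization of claim 10.

```python
import math

def softmax_compat_loss(scores, matched_index):
    """Negative log soft-max probability of the matched class.

    scores:        compatibility of one feature vector with every class
    matched_index: position of the class the feature vector belongs to
    """
    peak = max(scores)
    log_norm = peak + math.log(sum(math.exp(s - peak) for s in scores))
    return log_norm - scores[matched_index]

def loss_gradient(scores, matched_index):
    """Gradient of the loss with respect to each compatibility score.

    The matched entry is negative and every un-matched entry is positive,
    so a gradient-descent step raises the matched score and lowers the rest.
    """
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total - (1.0 if i == matched_index else 0.0)
            for i, e in enumerate(exps)]

# The loss falls as the matched score pulls ahead of the others.
uninformed = softmax_compat_loss([0.0, 0.0, 0.0], matched_index=0)
separated = softmax_compat_loss([2.0, 0.0, 0.0], matched_index=0)
grad = loss_gradient([0.0, 0.0, 0.0], matched_index=0)
```

With three equally scored classes the loss equals log 3; once the matched score leads by 2, the loss drops to roughly 0.24.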
Categorising the entire scene, e.g. birthday party or wedding scene · CPC title
Classification techniques · CPC title
Obtaining sets of training patterns; Bootstrap methods, e.g. bagging or boosting · CPC title
Feature extraction, e.g. by transforming the feature space, e.g. multi-dimensional scaling [MDS]; Mappings, e.g. subspace methods · CPC title
Extraction of image or video features · CPC title