Attention based feature compression and localization for autonomous devices

US11449713B2 · US · B2

Patent metadata
Publication number: US-11449713-B2
Application number: US-201916598629-A
Country: US
Kind code: B2
Filing date: Oct 10, 2019
Priority date: Nov 16, 2018
Publication date: Sep 20, 2022
Grant date: Sep 20, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Systems, methods, tangible non-transitory computer-readable media, and devices associated with object localization and generation of compressed feature representations are provided. For example, a computing system can access training data including a target feature representation and a source feature representation. An attention feature representation can be generated based on the target feature representation and a machine-learned attention model. An attended target feature representation can be generated based on masking the target feature representation with the attention feature representation. A matching score for the source feature representation and the target feature representation can be determined. A loss associated with the matching score and a ground-truth matching score for the source feature representation and the target feature representation can be determined. Furthermore, parameters of the machine-learned attention model can be adjusted based on the loss.
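
The training procedure summarized in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the per-feature sigmoid gate standing in for the machine-learned attention model, the inner-product matching function, the squared-error loss, and all names (`attention`, `attend`, `matching_score`, `train_step`, `train`, `w`, `b`, `lr`) are assumptions made for this example.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def attention(target, w, b):
    """Hypothetical attention model: one gate in (0, 1) per target feature."""
    return [sigmoid(w * t + b) for t in target]

def attend(target, mask):
    """Mask the target features element-wise with the attention features."""
    return [t * m for t, m in zip(target, mask)]

def matching_score(source, attended):
    """Assumed matching function: a plain inner product."""
    return sum(s * a for s, a in zip(source, attended))

def train_step(source, target, gt_score, w, b, lr=0.1):
    """One gradient-descent update of the attention parameters (w, b)."""
    mask = attention(target, w, b)
    score = matching_score(source, attend(target, mask))
    err = score - gt_score
    loss = err ** 2  # assumed loss: squared error against the ground-truth score
    # Derivative of the score with respect to each gate's pre-activation.
    gate_grads = [s * t * m * (1.0 - m) for s, t, m in zip(source, target, mask)]
    grad_w = 2.0 * err * sum(g * t for g, t in zip(gate_grads, target))
    grad_b = 2.0 * err * sum(gate_grads)
    return w - lr * grad_w, b - lr * grad_b, loss

def train(source, target, gt_score, steps=200):
    """Repeat the update and record the loss seen at each step."""
    w, b, losses = 0.0, 0.0, []
    for _ in range(steps):
        w, b, loss = train_step(source, target, gt_score, w, b)
        losses.append(loss)
    return w, b, losses

# Toy data, invented for illustration only.
source = [1.0, 0.5, -0.5]
target = [1.0, 2.0, 0.5]
w, b, losses = train(source, target, gt_score=1.5)
print(losses[0], losses[-1])
```

Only the attention parameters are updated here, mirroring the abstract's statement that the parameters of the attention model are adjusted based on the loss; the source and target features are treated as fixed inputs.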

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method for training machine-learned models, the computer-implemented method comprising: accessing training data comprising a target feature representation and a source feature representation; generating an attention feature representation based at least in part on the target feature representation and a machine-learned attention model; generating an attended target feature representation based at least in part on masking the target feature representation with the attention feature representation; determining a matching score based at least in part on application of a matching function to the source feature representation and the attended target feature representation; determining a loss associated with the matching score and a ground-truth matching score for the source feature representation and the target feature representation; and adjusting one or more parameters of the machine-learned attention model based at least in part on the loss.

2. The computer-implemented method of claim 1, further comprising: generating the training data comprising at least one of the source feature representation and the target feature representation based at least in part on one or more machine-learned feature extraction models.

3. The computer-implemented method of claim 1, wherein the generating the attended target feature representation based at least in part on masking the target feature representation with the attention feature representation comprises: performing one or more content-aware band pass filtering operations that mask one or more portions of the attended target feature representation based at least in part on attention to specific bands in a frequency domain.

4. The computer-implemented method of claim 1, wherein the determining the matching score based at least in part on application of a matching function to the attended target feature representation and the source feature representation comprises: determining an estimated position of a source object in an environment based at least in part on one or more comparisons of the source feature representation to the attended target feature representation.

5. The computer-implemented method of claim 4, wherein the determining the loss associated with the matching score and the ground-truth matching score for the source feature representation and the target feature representation comprises: determining the loss based at least in part on one or more comparisons of the estimated position of the source object relative to a ground-truth position of the source object.

6. A computing system comprising: one or more processors; and one or more tangible non-transitory computer-readable media storing computer-readable instructions that are executable by the one or more processors to cause the one or more processors to perform operations, the operations comprising: accessing target data comprising a target feature representation of an environment; accessing a machine-learned attention model configured to generate an attention feature representation of the target feature representation of the environment based at least in part on evaluation of a loss associated with a matching score for a source feature representation and an attended target feature representation relative to a ground-truth matching score for the source feature representation and the target feature representation; generating the attention feature representation based at least in part on the target feature representation and the machine-learned attention model; and generating the attended target feature representation based at least in part on masking the target feature representation with the attention feature representation.

7. The computing system of claim 6, wherein generating the attended target feature representation based at least in part on masking the target feature representation with the attention feature representation comprises: performing one or more hard attention operations to increase sparsity of the attended target feature representation.

8. The computing system of claim 7, wherein the performing the one or more hard attention operations on the target feature representation to increase sparsity of the attended target feature representation comprises determining the sparsity of the attended target feature representation based at least in part on evaluation of the attended target feature representation with respect to a sparsity threshold.

9. The computing system of claim 8, wherein the sparsity threshold is based in part on at least one of a predetermined accuracy of the attended target feature representation with respect to the target feature representation and a predetermined data size of the attended target feature representation.

10. The computing system of claim 6, wherein the generating the attended target feature representation based at least in part on masking the target feature representation with the attention feature representation comprises: performing one or more compression operations on the attended target feature representation.

11. The computing system of claim 10, wherein the one or more compression operations comprise a plurality of lossless binary compression operations that reconstruct the attended target feature representation without loss of information encoded in the attended target feature representation.

12. The computing system of claim 10, wherein the one or more compression operations comprise one or more Huffman encoding operations performed prior to one or more Run-Length-Encoding operations.

13. The computing system of claim 6, wherein the machine-learned attention model is a convolutional neural network that is trained end-to-end.

14. The computing system of claim 6, further comprising: storing the attended target feature representation in a storage device of an autonomous vehicle associated with the computing system.

15. The computing system of claim 6, further comprising: operating, based at least in part on the attended target feature representation, one or more vehicle localization systems or one or more mapping systems, wherein the attended target feature representation is used to determine a location in an environment based at least in part on one or more comparisons to another representation of the environment.

16. A vehicle comprising: one or more processors; a memory comprising one or more computer-readable media, the memory storing computer-readable instructions that are executable by the one or more processors to cause the one or more processors to perform operations comprising: accessing target data comprising a target feature representation of an environment; generating an attention feature representation of the target feature representation based at least in part on a machine-learned attention model that is trained by evaluating a loss associated with a matching score for the attention feature representation and a source feature representation compared to a ground-truth matching score for the target feature representation and the source feature representation, wherein the loss is based at least in part on at least one of a matching loss and a sparsity-inducing loss, the sparsity-inducing loss associated with increasing a sparsity of the attention feature representation; and generating an attended feature representation based at least in part on masking the target feature representation with the attention feature representation.

17. The vehicle of claim 16, further comprising: storing the attended feature representation in the
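
The hard attention, sparsity, and lossless compression steps named in claims 7 through 12 can be illustrated with a small sketch. It is only a toy under stated assumptions: the attention cutoff of 0.5, the definition of sparsity as the fraction of exact zeros, and the helper names (`hard_attention`, `sparsity`, `rle_encode`, `rle_decode`) are hypothetical, and only a run-length stage is shown. The Huffman stage that claim 12 places before run-length encoding is omitted for brevity.

```python
def hard_attention(target, attention, cutoff=0.5):
    """Keep a feature only if its attention weight reaches the cutoff (assumed rule)."""
    return [t if a >= cutoff else 0.0 for t, a in zip(target, attention)]

def sparsity(values):
    """Fraction of entries that are exactly zero (assumed sparsity measure)."""
    if not values:
        return 0.0
    return sum(1 for v in values if v == 0.0) / len(values)

def rle_encode(values):
    """Run-length encode a sequence into (value, run_length) pairs."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return [(v, n) for v, n in runs]

def rle_decode(runs):
    """Invert rle_encode, reconstructing the original sequence exactly."""
    out = []
    for v, n in runs:
        out.extend([v] * n)
    return out

# Toy feature values and attention weights, invented for illustration only.
target_features = [0.9, 0.1, 0.2, 0.3, 0.8, 0.7, 0.05, 0.4]
attention_weights = [0.95, 0.1, 0.2, 0.3, 0.9, 0.6, 0.1, 0.2]

attended = hard_attention(target_features, attention_weights)
encoded = rle_encode(attended)
print(sparsity(attended), encoded)
```

Masking produces long runs of zeros, which is why a run-length stage shortens the attended representation; decoding with `rle_decode` returns the attended features unchanged, matching the lossless requirement in claim 11.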

Assignees

Inventors

Classifications

  • Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads

  • using neural networks

  • Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries

  • Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]

  • Validation; Performance evaluation; Active pattern learning techniques

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11449713B2 cover?
Systems, methods, tangible non-transitory computer-readable media, and devices associated with object localization and generation of compressed feature representations are provided. For example, a computing system can access training data including a target feature representation and a source feature representation. An attention feature representation can be generated based on the target featur…
Who is the assignee on this patent?
UATC, LLC
What technology area does this patent fall under?
Primary CPC classification H04N19/91. Mapped technology areas include Electricity.
When was this patent published?
Publication date: Sep 20, 2022 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 5 related publications on this page (citations in our corpus or others sharing the same primary CPC).