Attention Based Feature Compression and Localization for Autonomous Devices
US-2020160117-A1 · May 21, 2020 · US
US11461583B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11461583-B2 |
| Application number | US-201916598579-A |
| Country | US |
| Kind code | B2 |
| Filing date | Oct 10, 2019 |
| Priority date | Nov 16, 2018 |
| Publication date | Oct 4, 2022 |
| Grant date | Oct 4, 2022 |
Official abstract text for this publication.
Systems, methods, tangible non-transitory computer-readable media, and devices associated with object localization and generation of compressed feature representations are provided. For example, a computing system can access training data including a source feature representation and a target feature representation. An encoded target feature representation can be generated based on the target feature representation and a machine-learned encoding model. A binarized target feature representation can be generated based on the encoded target feature representation and lossless binarization operations. A reconstructed target feature representation can be generated based on the binarized target feature representation and a machine-learned decoding model. A matching score for the source feature representation and the reconstructed target feature representation can be determined. A loss associated with the matching score can be determined. Parameters of the machine-learned encoding model and the machine-learned decoding model can be adjusted based on the loss.
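The training loop in the abstract (encode, binarize, reconstruct, score, compute a loss, adjust parameters) can be sketched as below. This is an illustration under loud assumptions, not the patented implementation: the "models" are single scalar weights, the matching function is assumed to be a dot product, the loss is assumed to be squared error against the ground-truth score, and every name (`binarize`, `train_step`, `w_enc`, `w_dec`, `lr`) is hypothetical. The backward pass treats the binarization step as an identity function, which is the straight-through estimation that claim 5 describes.

```python
# Hypothetical sketch of one training step; scalar weights stand in for the
# machine-learned encoding and decoding models. Nothing here is from the patent text.

def binarize(z):
    """Forward pass: hard-threshold each encoded value to 0.0 or 1.0."""
    return [1.0 if v > 0.0 else 0.0 for v in z]


def train_step(x_target, x_source, gt_score, w_enc, w_dec, lr=0.1):
    """Run encode -> binarize -> decode -> match, then update both weights."""
    # Forward pass.
    z = [w_enc * v for v in x_target]        # encoded target representation
    b = binarize(z)                          # binarized target representation
    recon = [w_dec * v for v in b]           # reconstructed target representation

    # Matching function (assumed: dot product) and loss (assumed: squared error).
    score = sum(s * r for s, r in zip(x_source, recon))
    loss = (score - gt_score) ** 2

    # Backward pass through the decoder.
    d_score = 2.0 * (score - gt_score)
    d_recon = [d_score * s for s in x_source]
    g_dec = sum(dr * bi for dr, bi in zip(d_recon, b))

    # Straight-through estimation: the gradient of binarize() is replaced by
    # the gradient of an identity function, so d_b passes to d_z unchanged.
    d_b = [dr * w_dec for dr in d_recon]
    d_z = d_b
    g_enc = sum(dz * v for dz, v in zip(d_z, x_target))

    # Gradient-descent update of encoder and decoder parameters.
    return loss, w_enc - lr * g_enc, w_dec - lr * g_dec


loss0, w_enc1, w_dec1 = train_step([1.0, -2.0, 3.0], [0.5, 0.5, 0.5], 2.0, 1.0, 1.0)
loss1, _, _ = train_step([1.0, -2.0, 3.0], [0.5, 0.5, 0.5], 2.0, w_enc1, w_dec1)
```

Without the identity substitution the hard threshold would have zero gradient almost everywhere, and the encoder weight would never receive an update.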
Opening claim text (preview).
What is claimed is:
1. A computer-implemented method for training machine-learned models, the computer-implemented method comprising: accessing training data comprising a source feature representation of a training environment and a target feature representation of the training environment, wherein the target feature representation of the training environment comprises at least one of: one or more aerial images of the training environment, one or more satellite images of the training environment, or one or more maps of the training environment; generating an encoded target feature representation based at least in part on the target feature representation and a machine-learned encoding model; generating a binarized target feature representation based at least in part on the encoded target feature representation; generating a reconstructed target feature representation based at least in part on the binarized target feature representation and a machine-learned decoding model; determining a matching score based at least in part on application of a matching function to the source feature representation and the reconstructed target feature representation; determining a loss associated with the matching score for the source feature representation and the reconstructed target feature representation relative to a ground-truth matching score; and adjusting one or more parameters of at least one of the machine-learned encoding model or the machine-learned decoding model based at least in part on the loss.
2. The computer-implemented method of claim 1, further comprising: generating at least one of the source feature representation and the target feature representation based at least in part on one or more machine-learned feature extraction models.
3. The computer-implemented method of claim 1, wherein the determining the matching score based at least in part on the application of the matching function to the source feature representation and the reconstructed target feature representation comprises: determining a localized state of a source object in the training environment based at least in part on one or more comparisons of the source feature representation to the reconstructed target feature representation.
4. The computer-implemented method of claim 1, wherein the determining the loss associated with the matching score for the source feature representation and the reconstructed target feature representation relative to the ground-truth matching score comprises: determining the loss based at least in part on one or more comparisons of the matching score to the ground-truth score.
5. The computer-implemented method of claim 1, wherein the adjusting one or more parameters of at least one of the machine-learned encoding model or the machine-learned decoding model based at least in part on the loss comprises: backpropagating the loss through the machine-learned decoding model, wherein backpropagating the loss comprises straight through estimation that bypasses the generating the binarized target feature representation of the encoded target feature representation, and wherein the straight through estimation comprises substituting generating the binarized target feature representation with use of an identity function; and backpropagating the loss through the machine-learned encoding model.
6. The computer-implemented method of claim 1, wherein the loss is based in part on at least one of a matching loss and a compression loss, and wherein the matching loss is based at least in part on an accuracy of the matching score with respect to the ground-truth matching score, and wherein the compression loss is based at least in part on a data size of the reconstructed target feature representation.
7. A computing system comprising: one or more processors; a machine-learned encoding model configured to access a target feature representation and generate an encoded target feature representation based at least in part on the target feature representation; and one or more tangible non-transitory computer-readable media storing computer-readable instructions that when executed by one or more processors cause the one or more processors to perform operations, the operations comprising: accessing target data comprising a target feature representation of an environment; generating an encoded target feature representation based at least in part on the target feature representation and the machine-learned encoding model; generating a binarized target feature representation based at least in part on performance of one or more binary encoding operations on the encoded target feature representation; generating a compressed target feature representation of the encoded target feature representation based at least in part on performance of one or more compression operations on the binarized target feature representation; generating one or more maps based at least in part on the compressed target feature representation; and storing the one or more maps in a storage device of an autonomous vehicle associated with the computing system.
8. The computing system of claim 7, wherein the machine-learned encoding model is configured based at least in part on joint training with a machine-learned decoding model configured to generate a reconstructed target feature representation based at least in part on the binarized target feature representation, wherein the reconstructed target feature representation is a reconstruction of the target feature representation.
9. The computing system of any of claim 8, wherein at least one of the machine-learned encoding model or the machine-learned decoding model are trained based at least in part on a matching score based at least in part on application of a matching function to a source feature representation of the environment and the reconstructed target feature representation relative to a ground-truth matching score associated with the source feature representation and the target feature representation.
10. The computing system of claim 9, wherein a compression loss function comprises a regularization term that is used to increase sparsity of the binarized target feature representation.
11. The computing system of claim 7, wherein the one or more binary encoding operations reconstruct the compressed target feature representation without loss of information encoded in the target feature representation.
12. The computing system of claim 7, wherein the one or more binary encoding operations are based at least in part on a frequency of occurrence of one or more portions of the target feature representation, and one or more subsequent encoding operations are based at least in part on one or more redundancies in one or more portions of the target feature representation.
13. The computing system of claim 7, wherein the one or more compression operations comprise one or more Huffman encoding operations performed prior to one or more Run-Length-Encoding operations.
14. A computing device comprising: one or more processors; a memory comprising one or more computer-readable media, the memory storing computer-readable instructions that when executed by the one or more processors cause the one or more processors to perform operations comprising: accessing target data comprising a target feature representation of an environment; generating an encoded target feature representation of the target feature representation based at least in part on a machine-learned encoding model, wherein the encoded target feature representation has a smaller data size than the target feature representation; generating a binarized target feature
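Claims 10 through 13 tie a sparse binarized representation to lossless compression operations such as Run-Length Encoding. The sketch below is a hedged illustration of why sparsity helps: long runs of identical bits collapse into a few (bit, run length) pairs that decode back to the original exactly. It covers only a run-length stage; the Huffman stage named in claim 13 is omitted, and the function names and sample bit pattern are hypothetical.

```python
# Hypothetical run-length encoder/decoder for a binarized feature vector.
# Illustrative only; the Huffman stage from claim 13 is not implemented here.

def rle_encode(bits):
    """Collapse a bit sequence into [bit, run_length] pairs."""
    runs = []
    for bit in bits:
        if runs and runs[-1][0] == bit:
            runs[-1][1] += 1          # extend the current run
        else:
            runs.append([bit, 1])     # start a new run
    return runs


def rle_decode(runs):
    """Expand [bit, run_length] pairs back into the original bit sequence."""
    bits = []
    for bit, length in runs:
        bits.extend([bit] * length)
    return bits


sparse_code = [0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0]   # made-up sparse binary code
runs = rle_encode(sparse_code)
restored = rle_decode(runs)
```

Here twelve bits reduce to three runs, and decoding restores the input exactly, which is the lossless property claim 11 refers to. A denser code with frequent bit flips would produce many short runs and compress poorly, which is one plausible reading of why a sparsity-promoting regularization term appears in claim 10.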
Extracting features by transforming the feature space, e.g. multidimensional scaling; Mappings, e.g. subspace methods · CPC title
Validation; Performance evaluation · CPC title
Obtaining sets of training patterns; Bootstrap methods, e.g. bagging or boosting · CPC title
exterior to a vehicle by using sensors mounted on the vehicle · CPC title
Backpropagation, e.g. using gradient descent · CPC title