Object pose recognition
US-9818195-B2 · Nov 14, 2017 · US
US10402448B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10402448-B2 |
| Application number | US-201715635387-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jun 28, 2017 |
| Priority date | Jun 28, 2017 |
| Publication date | Sep 3, 2019 |
| Grant date | Sep 3, 2019 |
Official abstract text for this publication.
Systems and methods of the present disclosure can use machine-learned image descriptor models for image retrieval applications and other applications. A trained image descriptor model can be used to analyze a plurality of database images to create a large-scale index of keypoint descriptors associated with the database images. An image retrieval application can provide a query image as input to the trained image descriptor model, resulting in receipt of a set of keypoint descriptors associated with the query image. Keypoint descriptors associated with the query image can be analyzed relative to the index to determine matching descriptors (e.g., by implementing a nearest neighbor search). Matching descriptors can then be geometrically verified and used to identify one or more matching images from the plurality of database images to retrieve and provide as output (e.g., by providing for display) within the image retrieval application.
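The matching step in the abstract (comparing query keypoint descriptors against an index by nearest neighbor search, then identifying matching database images) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names `match_descriptors` and `retrieve`, the brute-force Euclidean search, the `max_distance` threshold, and the vote-count ranking are not taken from the patent, and geometric verification is omitted.

```python
import numpy as np


def match_descriptors(query_desc, index_desc, index_image_ids, max_distance=0.8):
    """Count, per database image, how many query descriptors matched it.

    Hypothetical helper: brute-force nearest neighbor search with a
    distance cutoff (an assumed parameter, not specified in the source).
    """
    query_desc = np.asarray(query_desc, dtype=float)
    index_desc = np.asarray(index_desc, dtype=float)
    index_image_ids = np.asarray(index_image_ids, dtype=int)

    # Pairwise Euclidean distances, shape (num_query, num_index).
    diffs = query_desc[:, None, :] - index_desc[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=2))

    # Nearest index descriptor for every query descriptor.
    nearest = dists.argmin(axis=1)
    nearest_dist = dists[np.arange(len(query_desc)), nearest]

    # Discard matches that are too far away to be trusted.
    keep = nearest_dist <= max_distance

    # One vote per surviving match, credited to the owning database image.
    num_images = int(index_image_ids.max()) + 1
    return np.bincount(index_image_ids[nearest[keep]], minlength=num_images)


def retrieve(query_desc, index_desc, index_image_ids, top_n=1):
    """Return up to top_n (image_id, votes) pairs, best first."""
    votes = match_descriptors(query_desc, index_desc, index_image_ids)
    ranked = np.argsort(-votes, kind="stable")[:top_n]
    return [(int(i), int(votes[i])) for i in ranked if votes[i] > 0]
```

A large-scale index would typically replace the brute-force distance matrix with an approximate nearest neighbor structure, and the surviving matches would be passed to a geometric verification step before the final ranking.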
Opening claim text (preview).
What is claimed is:
1. A computer-implemented method of image retrieval, comprising: receiving, by a computing system comprising one or more computing devices, a query image; determining, by the computing system, a plurality of local feature descriptors from the query image; determining, by the computing system, an attention score for each local feature descriptor; determining, by the computing system, a set of keypoint descriptors for the query image based at least in part on the attention scores, the set of keypoint descriptors corresponding to a subset of the local feature descriptors; reducing, by the computing system, a spatial dimensionality of the set of keypoint descriptors for the query image; and retrieving, by the computing system, one or more images corresponding to the query image, based at least in part on the set of keypoint descriptors for the query image.
2. The computer-implemented method of image retrieval of claim 1, wherein the set of keypoint descriptors comprises a predetermined number of local feature descriptors having the highest attention scores for the query image.
3. The computer-implemented method of image retrieval of claim 1, further comprising: receiving, by the computing system, a plurality of database images; determining, by the computing system, a plurality of local feature descriptors for each database image; determining, by the computing system, an attention score for the local feature descriptors associated with each database image; and determining, by the computing system, a set of keypoint descriptors for each database image based at least in part on the attention scores, the set of keypoint descriptors corresponding to a subset of the local feature descriptors for that database image; and wherein retrieving, by the computing system, one or more images corresponding to the query image comprises retrieving, by the computing system, one or more images from the plurality of database images based at least in part on the set of keypoint descriptors for the query image and the set of keypoint descriptors for each database image.
4. The computer-implemented method of image retrieval of claim 3, further comprising determining a set of matching features by comparing the keypoint descriptors associated with the query image with the keypoint descriptors associated with the plurality of database images, and wherein the set of matching features is used to retrieve the one or more matching images from the plurality of database images.
5. The computer-implemented method of image retrieval of claim 4, wherein determining a set of matching features comprises implementing a nearest neighbor search among keypoint descriptors associated with the query image and keypoint descriptors associated with the plurality of database images.
6. The computer-implemented method of image retrieval of claim 4, further comprising performing, by the computing system, geometric verification to evaluate the set of matching features across the query image and the one or more matching images.
7. The computer-implemented method of image retrieval of claim 1, further comprising: constructing, by the computing system, an image pyramid based at least in part on the query image, the image pyramid comprising a plurality of image levels; and inputting each of the plurality of image levels into the machine-learned image descriptor model, independently.
8. One or more tangible, non-transitory computer-readable media storing computer-readable instructions that when executed by one or more processors cause the one or more processors to perform operations, the operations comprising: obtaining data descriptive of a machine-learned image descriptor model, wherein the machine-learned image descriptor model has been trained to receive one or more input images and, in response to receipt of the one or more input images, determine one or more local feature descriptors in the one or more input images, determine an attention score for each of the one or more local feature descriptors, and provide a set of keypoint descriptors based at least in part on the attention score for each of the one or more local feature descriptors, each keypoint descriptor describing a selected local feature determined from the one or more input images such that the set of keypoint descriptors corresponds to a subset of the local feature descriptors; obtaining a query image; inputting the query image into the machine-learned image descriptor model; receiving, as an output of the machine-learned image descriptor model, a set of keypoint descriptors, each keypoint descriptor describing a selected local feature determined from the query image and selected based on a respective attention score generated for the selected local feature by the machine-learned image descriptor model; and providing the set of keypoint descriptors as to an image processing application.
9. The one or more tangible, non-transitory computer-readable media of claim 8, wherein the machine-learned image descriptor model has been trained based on a set of training data that includes a first portion of training data corresponding to a plurality of training images and a second portion of training data corresponding to image-level labels associated with the plurality of training images.
10. The one or more tangible, non-transitory computer-readable media of claim 9, wherein the image-level labels included within the second portion of training data comprise one or more of a visual feature label and a geographic position label.
11. The one or more tangible, non-transitory computer-readable media of claim 10, wherein one or more of the plurality of training images do not contain a visual feature.
12. The one or more tangible, non-transitory computer-readable media of claim 8, wherein the machine-learned image descriptor model comprises a convolutional neural network.
13. The one or more tangible, non-transitory computer-readable media of claim 8, wherein the machine-learned image descriptor model has been trained based on a first training process to learn determination of the one or more local feature descriptors and a second training process to learn determination of the attention score for each of the one or more local feature descriptors given the determined local feature descriptors.
14. The one or more tangible, non-transitory computer-readable media of claim 13, wherein the machine-learned image descriptor model has been trained based on a set of training data that includes a plurality of training images, and wherein the plurality of training images are randomly rescaled during the second training process.
15. The one or more tangible, non-transitory computer-readable media of claim 8, wherein the machine-learned image descriptor model comprises a plurality of shared layers that are used at least in part for both determining the one or more local feature descriptors and for determining an attention score for each of the one or more local feature descriptors.
16. The one or more tangible, non-transitory computer-readable media of claim 8, the operations further comprising: obtaining a plurality of database images; inputting the plurality of database images into the machine-learned image descriptor model; receiving, as an output of the machine-learned image descriptor model, a set of keypoint descriptors, each keypoint descriptor describing a selected local feature identified from the plurality of database images; determining a set of matching features by comparing the keypoint descriptors associated with the query image with the keypoint descri
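Claims 1 and 2 describe keeping a predetermined number of local feature descriptors with the highest attention scores and then reducing their dimensionality. The sketch below shows one way those two steps might look. It is illustrative only: the helper names `select_keypoints`, `fit_pca`, and `reduce_dim` are hypothetical, and the use of PCA followed by L2 normalization is an assumption, since the claim text does not name a reduction method.

```python
import numpy as np


def select_keypoints(local_desc, attention, k):
    """Keep the k local descriptors with the highest attention scores.

    Returns the selected descriptors and their row indices, ordered
    from highest to lowest score.
    """
    local_desc = np.asarray(local_desc, dtype=float)
    attention = np.asarray(attention, dtype=float)
    k = min(k, len(attention))
    order = np.argsort(-attention, kind="stable")[:k]
    return local_desc[order], order


def fit_pca(train_desc, out_dim):
    """Estimate a PCA projection (assumed reduction method) from sample descriptors."""
    train_desc = np.asarray(train_desc, dtype=float)
    mean = train_desc.mean(axis=0)
    # Rows of vt are principal directions, sorted by decreasing variance.
    _, _, vt = np.linalg.svd(train_desc - mean, full_matrices=False)
    return mean, vt[:out_dim]


def reduce_dim(desc, mean, components):
    """Project descriptors to the PCA subspace and L2-normalize each row."""
    reduced = (np.asarray(desc, dtype=float) - mean) @ components.T
    norms = np.linalg.norm(reduced, axis=1, keepdims=True)
    # Guard against division by zero for degenerate descriptors.
    return reduced / np.maximum(norms, 1e-12)
```

In this sketch the projection is fitted once on a sample of descriptors and then applied identically to query and database descriptors, so that distances computed during matching remain comparable.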
Salient features, e.g. scale invariant feature transforms [SIFT] · CPC title
using shape and object relationship · CPC title
Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN] · CPC title
Matching configurations of points or features · CPC title
Physics · mapped topic