What technology area does this patent fall under?

Primary CPC classification G06K9/00228. Mapped technology areas include Physics.

When was this patent published?

Publication date May 19, 2020 (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Generating object embeddings from images

US10657359B2 · US · B2

Patent metadata
Field	Value
Publication number	US-10657359-B2
Application number	US-201715818124-A
Country	US
Kind code	B2
Filing date	Nov 20, 2017
Priority date	Nov 20, 2017
Publication date	May 19, 2020
Grant date	May 19, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an object embedding system. In one aspect, a method comprises providing selected images as input to the object embedding system and generating corresponding embeddings, wherein the object embedding system comprises a thumbnailing neural network and an embedding neural network. The method further comprises backpropagating gradients based on a loss function to reduce the distance between embeddings for same instances of objects, and to increase the distance between embeddings for different instances of objects.
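The training objective in the abstract (pull embeddings of the same object instance together, push embeddings of different instances apart) can be illustrated with a pairwise contrastive loss. This is only a sketch: the abstract does not name a specific loss function, so the contrastive form, the `margin` parameter, and the helper names `contrastive_loss` and `batch_loss` are assumptions made for illustration. Gradient computation and backpropagation are left out.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two embedding vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(emb_a, emb_b, same_instance, margin=1.0):
    """Pairwise contrastive loss (an assumed choice, not taken from the patent).

    Same instance: the loss grows with distance, so minimizing it pulls
    the two embeddings together.
    Different instances: the loss is positive only while the embeddings are
    closer than `margin`, so minimizing it pushes them apart.
    """
    d = euclidean(emb_a, emb_b)
    if same_instance:
        return d ** 2
    return max(0.0, margin - d) ** 2


def batch_loss(embeddings, instance_ids, margin=1.0):
    """Average the pairwise loss over every unordered pair in a batch."""
    total, pairs = 0.0, 0
    for i in range(len(embeddings)):
        for j in range(i + 1, len(embeddings)):
            same = instance_ids[i] == instance_ids[j]
            total += contrastive_loss(embeddings[i], embeddings[j], same, margin)
            pairs += 1
    return total / pairs if pairs else 0.0
```

In a real training loop, a loss of this kind would be differentiated with respect to the parameters of both networks, so that one gradient step adjusts the thumbnailing network and the embedding network together.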

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method for end-to-end training of an object embedding system, the method comprising: iteratively training the object embedding system on a plurality of images, each of the images depicting an object of a particular type, each iteration of the training comprising: providing selected images as input to the object embedding system and generating corresponding embeddings, wherein the object embedding system comprises a thumbnailing neural network and an embedding neural network, wherein each neural network comprises a plurality of consecutive layers that are exclusive of each other, and wherein generating an embedding for an object depicted in an image using the object embedding system comprises: generating a thumbnail representation of the object depicted in the image as output of the thumbnailing neural network, wherein the thumbnailing neural network processes an input in accordance with values of a set of thumbnailing neural network parameters to: determine values of parameters of a spatial transformation that defines a correspondence between pixels of the thumbnail representation and pixels of the image; and generate as output the thumbnail representation using the spatial transformation and the image; generating an embedding by providing the thumbnail representation as input to the embedding neural network that is configured to process the thumbnail representation in accordance with values of a set of embedding neural network parameters to generate an embedding as output; determining gradients based on a loss function to reduce a distance between embeddings for same instances of objects, and to increase the distance between embeddings for different instances of objects; and adjusting the values of the set of thumbnailing neural network parameters and the values of the set of embedding neural network parameters using the gradients.

2. The computer-implemented method of claim 1, wherein the object embedding system additionally comprises a detection neural network comprising a plurality of consecutive layers, and generating an embedding for an object depicted in an image using the object embedding system additionally comprises: generating an encoded representation of the image by providing the image as input to the detection neural network, wherein the detection neural network is configured to process the image in accordance with values of a set of detection neural network parameters to generate an encoded representation of the image; and providing the encoded representation of the image as input to the thumbnailing neural network.

3. The computer-implemented method of claim 2, wherein the detection neural network is pre-trained to generate encoded representations of images comprising data identifying predicted locations of objects of the particular type in the image.

4. The computer-implemented method of claim 1, wherein the embedding neural network is pre-trained based on thumbnail representations of objects of the particular type that are not generated by the thumbnailing neural network.

5. The computer-implemented method of claim 1, wherein determining gradients based on the loss function additionally comprises, for each selected image: determining positions of key points of the thumbnail representation generated by the thumbnailing neural network; determining positions of the key points of the thumbnail representation in a frame of reference of the image; and reducing an error measure between positions of key points of the object of the particular type depicted in the image and the positions of the key points of the thumbnail representation in the frame of reference of the image.

6. The computer-implemented method of claim 5, wherein the key points of the object of the particular type depicted in the image comprise vertices of a bounding box around the object of the particular type depicted in the image, and wherein the key points of the thumbnail representation comprise bounding vertices of the thumbnail representation.

7. The computer-implemented method of claim 5, wherein: the error measure is a sum of errors between the positions of the key points of the object of the particular type depicted in the image and the positions of the key points of the thumbnail representation in the frame of reference of the image; and the error between a position of a key point of the object of the particular type depicted in the image and a corresponding position of a key point of the thumbnail representation in the frame of reference of the image is zero if a distance between them is less than a tolerance radius.

8. The computer-implemented method of claim 7, wherein the tolerance radius is increased over the training iterations until it reaches a maximum threshold.

9. The computer-implemented method of claim 1, wherein the spatial transformation of the thumbnailing neural network includes an image warping spatial transformation that defines a correspondence between the pixels of the thumbnail representation and the pixels of the image according to a displacement vector at each pixel of the thumbnail representation.

10. The computer-implemented method of claim 9, wherein the spatial transformation of the thumbnailing neural network is a composition of an affine spatial transformation and the image warping spatial transformation.

11. The computer-implemented method of claim 1, wherein the objects of the particular type are faces.

12. A computer-implemented method for identifying objects in images, the method comprising: providing an image as input to an object embedding system trained using the computer-implemented method of claim 1; and receiving as output an embedding vector which is indicative of an object in the image.

13. The computer-implemented method of claim 12, wherein the object embedding system is trained to generate embeddings of faces and wherein the object in the image is a face, the method further comprising: comparing the embedding vector to one or more reference embedding vectors, each associated with a different face, thereby to identify the face in the input image.

14. One or more non-transitory computer storage media storing instructions that when executed by one or more computers cause the one or more computers to perform operations for end-to-end training of an object embedding system, the operations comprising: iteratively training the object embedding system on a plurality of images, each of the images depicting an object of a particular type, each iteration of the training comprising: providing selected images as input to the object embedding system and generating corresponding embeddings, wherein the object embedding system comprises a thumbnailing neural network and an embedding neural network, wherein each neural network comprises a plurality of consecutive layers that are exclusive of each other, and wherein generating an embedding for an object depicted in an image using the object embedding system comprises: generating a thumbnail representation of the object depicted in the image as output of the thumbnailing neural network, wherein the thumbnailing neural network processes an input in accordance with values of a set of thumbnailing neural network parameters to: determine values of parameters of a spatial transformation that defines a correspondence between pixels of the thumbnail representation and pixels of the image; and generate as output the thumbnail representation using the spatial transformation and the image; generating an embedding by providing the thumbnail representation as input to the embedding neural network that is con
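The key-point error described in claims 5 to 8 can be sketched in a few lines. The example below assumes a purely affine spatial transformation (claim 10 names affine as one component), a Euclidean distance, a penalty of `distance - tolerance` outside the tolerance radius, and a linear tolerance schedule. The claims fix none of these details beyond "zero inside the radius" and "increased until it reaches a maximum threshold", so every function name and default value here is hypothetical.

```python
import math


def apply_affine(theta, point):
    """Map a thumbnail-frame point (x, y) into the image frame.

    `theta` is a hypothetical 2x3 affine matrix [[a, b, tx], [c, d, ty]].
    """
    (a, b, tx), (c, d, ty) = theta
    x, y = point
    return (a * x + b * y + tx, c * x + d * y + ty)


def keypoint_error(theta, thumb_keypoints, object_keypoints, tolerance):
    """Sum of per-key-point errors, zero within the tolerance radius.

    Outside the radius the penalty is (distance - tolerance); that exact
    form is an assumption, since the claims only state the zero region.
    """
    total = 0.0
    for thumb_pt, obj_pt in zip(thumb_keypoints, object_keypoints):
        px, py = apply_affine(theta, thumb_pt)
        dist = math.hypot(px - obj_pt[0], py - obj_pt[1])
        total += max(0.0, dist - tolerance)
    return total


def tolerance_at(step, start=0.0, growth=0.5, maximum=5.0):
    """Tolerance radius that grows with the training step up to a cap.

    The linear schedule and its default values are assumptions.
    """
    return min(maximum, start + growth * step)


# Thumbnail corners in a unit square, and annotated bounding-box
# vertices in image pixels (claim 6 uses these as the key points).
thumb_corners = [(0, 0), (1, 0), (1, 1), (0, 1)]
bbox_vertices = [(50, 20), (153, 24), (150, 120), (50, 120)]
theta = [[100, 0, 50], [0, 100, 20]]  # scale by 100, shift by (50, 20)

early_error = keypoint_error(theta, thumb_corners, bbox_vertices, tolerance_at(4))
late_error = keypoint_error(theta, thumb_corners, bbox_vertices, tolerance_at(100))
```

Here only the second corner misses its target, by 5 pixels. With the early tolerance of 2 pixels it contributes an error of 3, and once the tolerance has grown to its cap of 5 pixels the error drops to 0.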

Assignees

Google LLC

Inventors

Classifications

G06N3/084
Backpropagation, e.g. using gradient descent · CPC title
G06K9/6248
Physics · mapped topic
G06K9/00228 (Primary)
Physics · mapped topic
G06K9/6262
Physics · mapped topic
G06K9/4628
Physics · mapped topic

Patent family

Related publications grouped by family.

View patent family 64362692

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10657359B2 cover?: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an object embedding system. In one aspect, a method comprises providing selected images as input to the object embedding system and generating corresponding embeddings, wherein the object embedding system comprises a thumbnailing neural network and an embedding neural network. The met…
Who is the assignee on this patent?: Google LLC