Who is the assignee on this patent?

Microsoft Technology Licensing, LLC

What technology area does this patent fall under?

Primary CPC classification G06F16/532. Mapped technology areas include Physics.

When was this patent published?

Publication date: Nov 12, 2020 (kind code A1). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Techniques for Modifying a Query Image

US2020356591A1 · US · A1

Patent metadata
Field	Value
Publication number	US-2020356591-A1
Application number	US-201916408192-A
Country	US
Kind code	A1
Filing date	May 9, 2019
Priority date	May 9, 2019
Publication date	Nov 12, 2020
Grant date	—

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title: What the patent document calls the invention.
Abstract: A short plain-language summary of the technical disclosure.
Assignees and inventors: Who owns or filed the patent and who is credited as inventor.
Key dates: Filing, priority, publication, and grant dates set the timeline.
First independent claim: The legal scope of protection. Read this for what is actually claimed.
CPC / IPC classifications: Technology tags used to group this patent with similar filings.
Citations and related patents: Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A computer-implemented technique is described herein for performing an image-based search that allows a user to create a custom query image that expresses the user's search intent. The technique generates the query image based on one or more input images and/or one or more information items that describe at least one desired characteristic of the query image. The technique then submits the query image to a search engine, and, in response, receives a set of candidate images that match the query image. In one implementation, the technique constructs the query image using a decoder neural network that operates on a mixed latent variable vector. In one approach, the technique uses a generative adversarial network (GAN) to produce the decoder neural network.
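The flow summarized in the abstract (encode input images, mix their latent vectors, decode a query image, then search) can be illustrated with a minimal sketch. Everything here is a hypothetical stand-in rather than the patent's implementation: `encode`, `decode`, `mix_latents`, and `search` are illustrative names, the encoder and decoder are trivial identity-style functions instead of trained neural networks, and the search engine is a nearest-neighbour lookup over a four-entry toy corpus.

```python
import math

def encode(image):
    """Toy encoder (assumption): the flat pixel list doubles as the latent vector."""
    return list(image)

def decode(latent):
    """Toy decoder (assumption): inverse of the toy encoder, clamped to [0, 1]."""
    return [min(1.0, max(0.0, v)) for v in latent]

def mix_latents(latents, weights):
    """Weighted average of latent vectors; weights are normalised by their sum."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    dim = len(latents[0])
    return [
        sum(w * z[i] for w, z in zip(weights, latents)) / total
        for i in range(dim)
    ]

def search(query_image, corpus, top_k=1):
    """Toy search engine (assumption): rank corpus images by Euclidean distance."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    ranked = sorted(corpus.items(), key=lambda kv: dist(query_image, kv[1]))
    return [name for name, _ in ranked[:top_k]]

# Two user-selected input images, represented as 4-pixel toy images.
first_image = [1.0, 0.0, 0.0, 0.0]
second_image = [0.0, 0.0, 1.0, 0.0]

# Hypothetical image index held by the search engine.
corpus = {
    "a": [1.0, 0.0, 0.0, 0.0],
    "b": [0.5, 0.0, 0.5, 0.0],
    "c": [0.0, 0.0, 1.0, 0.0],
    "d": [0.0, 1.0, 0.0, 1.0],
}

# Mix the two latent vectors with equal weight, decode, and search.
mixed = mix_latents([encode(first_image), encode(second_image)], [0.5, 0.5])
query_image = decode(mixed)
results = search(query_image, corpus, top_k=1)
print(query_image, results)
```

Changing the weights passed to `mix_latents` shifts the query image toward one input or the other, which is roughly the role the claims give to user-controlled weighting values.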

First claim

Opening claim text (preview).

What is claimed is:

1. One or more computing devices for performing an image-based search, comprising: hardware logic circuitry including: (a) one or more hardware processors that perform operations by executing machine-readable instructions stored in a memory, and/or (b) one or more other hardware logic units that perform operations using a task-specific collection of logic gates, the operations including: receiving a selection of an input image from a user in response to manipulation of an input device by the user; extracting a first information item from the input image, the first information item representing at least one existing characteristic of the input image; providing a second information item that specifies at least one desired image characteristic; generating a query image based on the first information item and the second information item, the query image containing content that represents a combination of said at least one existing characteristic of the input image and said at least one desired image characteristic; submitting the query image to a computer-implemented search engine; receiving a set of candidate images that match the query image, as assessed by the search engine; and presenting the set of candidate images to the user using an output device.

2. The one or more computing devices of claim 1, wherein the input image corresponds to a first input image, wherein the first information item corresponds to a first latent variable vector associated with the first input image, and the second information item corresponds to a second latent variable vector associated with a received second input image, wherein said extracting comprises using an encoder, implemented by the hardware logic circuitry, to produce the first latent variable vector based on the first input image, wherein said providing comprises using an encoder, implemented by the hardware logic circuitry, to produce the second latent variable vector based on the second input image, wherein the operations further include combining the first latent variable vector and at least the second latent variable vector to produce a mixed latent variable vector, and wherein said generating comprises using a decoder neural network, implemented by the hardware logic circuitry, to produce the query image based on the mixed latent variable vector, the decoder neural network operating based on parameter values provided by a generative machine-trained model.

3. The one or more computing devices of claim 2, wherein the operations further include: receiving textual information from the user that describes the second input image; and retrieving the second input image by performing a search based on the textual information.

4. The one or more computing devices of claim 2, wherein the first input image shows a product, and the second input image shows a desired characteristic of the product.

5. The one or more computing devices of claim 2, wherein said combining includes combining the first latent variable vector and plural supplemental latent variable vectors, to produce the mixed latent variable vector, the plural supplemental latent variable vectors being associated with plural input images retrieved by performing a text-based image search, the plural supplemental latent variable vectors including the second latent variable vector.

6. The one or more computing devices of claim 2, wherein the operations further comprise: receiving one or more weighting values in response to one or more selections made by the user by manipulating a graphical control provided by a user interface presentation; and modifying one or more latent variable vectors associated with one or more respective input images based on said one or more weighting values.

7. The one or more computing devices of claim 6, wherein the graphical control includes at least one slider bar.

8. The one or more computing devices of claim 6, wherein the graphical control includes a cursor navigation space, wherein different reference points on a periphery of the cursor navigation space correspond to respective input images, and wherein a weighting value to be applied to an input image is based on a position of a cursor in the cursor navigation space with respect to the reference points.

9. The one or more computing devices of claim 2, wherein the operations further include: changing one or more weighting values that are applied to one or more respective latent variable vectors associated with one or more respective input images; in response to said changing, displaying a changing representation of a generated image produced based on said one or more latent variable vectors; and receiving an instruction from the user to save a set of weighting values, the user making the instruction upon observing a desired state of the generated image.

10. The one or more computing devices of claim 2, wherein a training system produces the decoder neural network by training a generator component in a generative adversarial network.

11. The one or more computing devices of claim 2, wherein each encoder operates by: (a) converting a given input image into a feature-space representation of the given input item; (b) using the decoder neural network to convert a candidate latent variable vector associated with the given input image into a candidate output image; (c) converting the candidate output image into a feature-space representation of the candidate output image; (d) determining a distance between the feature-space representation of the given input image and the feature-space representation of the candidate output image; (e) adjusting the candidate latent variable vector based on the distance; and repeating operations (a) through (e) plural times until an optimization objective is achieved.

12. The one or more computing devices of claim 2, wherein each encoder is implemented using a feed-forward neural network that approximates results of a process for iteratively finding a latent variable vector.

13. The one or more computing devices of claim 2, wherein each encoder operates by down-sampling a given input image into a reduced-size input image.

14. The one or more computing devices of claim 2, wherein a training system produces each encoder and the decoder neural network by training an encoder component and a decoder component, respectively, of a variational autoencoder (VAE).

15. The one or more computing devices of claim 2, wherein a training system produces each encoder and the decoder neural network by training an encoder component and a decoder component, respectively, of a flow-based neural network system in which the decoder component implements an inverse of a function provided by the encoder component.

16. A method performing an image-based search, comprising: receiving at least a first input image and a second input image in response to selection of the first input image and the second input image by a user; using an encoder to produce a first latent variable vector based on the first input image; using an encoder to produce a second latent variable vector based on the second input image; combining the first latent variable vector and at least the second latent variable vector to produce a mixed latent variable vector; using a decoder neural network to produce a query image based on the mixed latent variable vector; submitting the query image to a computer-implemented search engine; receiving a set of candidate images that match the query image, as assessed by the search engine;
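The iterative encoder recited in claim 11 (steps (a) through (e)) can be sketched as a small optimization loop. This is a toy illustration under stated assumptions, not the patent's implementation: `decode` and `features` are hypothetical linear functions standing in for a trained decoder neural network and a feature extractor, the distance is a squared Euclidean distance, and the latent vector is adjusted by plain gradient descent with numerically estimated gradients. The claim itself does not specify any of these choices.

```python
def decode(z):
    """Hypothetical toy decoder: maps a 2-D latent vector to a 3-pixel image."""
    return [z[0], z[1], z[0] + z[1]]

def features(image):
    """Hypothetical toy feature extractor (a real system might use a CNN)."""
    return [image[0] + image[1], image[1] + image[2], image[2]]

def sq_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def invert(image, steps=2000, lr=0.05, eps=1e-4, tol=1e-12):
    """Search for a latent vector whose decoded image matches `image` in feature space."""
    # Step (a): features of the input image. The input does not change between
    # iterations, so this sketch computes them once instead of on every pass.
    target = features(image)
    z = [0.0, 0.0]  # initial candidate latent vector (arbitrary assumption)

    def loss_at(candidate):
        # Steps (b)-(d): decode the candidate, extract features, measure distance.
        return sq_distance(features(decode(candidate)), target)

    for _ in range(steps):
        if loss_at(z) < tol:  # optimization objective reached
            break
        # Estimate the gradient of the distance by central differences.
        grad = []
        for i in range(len(z)):
            plus, minus = list(z), list(z)
            plus[i] += eps
            minus[i] -= eps
            grad.append((loss_at(plus) - loss_at(minus)) / (2 * eps))
        # Step (e): adjust the candidate latent vector based on the distance.
        z = [zi - lr * gi for zi, gi in zip(z, grad)]
    return z

# Recover the latent vector behind an image produced by the toy decoder.
true_latent = [0.3, 0.6]
recovered = invert(decode(true_latent))
print(recovered)
```

Claim 12 describes replacing a loop of this kind with a feed-forward network that approximates its result, which would trade the per-image iteration cost for a one-time training cost.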

Assignees

Microsoft Technology Licensing, LLC

Inventors

Classifications

G06N3/094 · Adversarial learning
G06N3/0464 · Convolutional networks [CNN, ConvNet]
G06N3/0475 · Generative networks
G06N3/09 · Supervised learning
G06N3/0455 · Auto-encoder networks; Encoder-decoder networks

Patent family

Related publications grouped by family.

View patent family 70289452
