Generating multimodal image edits

US2020175322A1 · US · A1

Patent metadata
Publication number: US-2020175322-A1
Application number: US-202016784989-A
Country: US
Kind code: A1
Filing date: Feb 7, 2020
Priority date: Feb 8, 2017
Publication date: Jun 4, 2020
Grant date: (none listed)

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The present disclosure is directed towards methods and systems for determining multimodal image edits for a digital image. The systems and methods receive a digital image and analyze the digital image. The systems and methods further generate a feature vector of the digital image, wherein each value of the feature vector represents a respective feature of the digital image. Additionally, based on the feature vector and determined latent variables, the systems and methods generate a plurality of determined image edits for the digital image, which includes determining a plurality of set of potential image attribute values and selecting a plurality of sets of determined image attribute values from the plurality of sets of potential image attribute values wherein each set of determined image attribute values comprises a determined image edit of the plurality of image edits.
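The sampling step the abstract describes (a feature vector combined with latent variables to produce many candidate edits) can be sketched as follows. This is a minimal illustration, not the patented implementation: the feature vector is a hard-coded placeholder for what claim 7 below extracts with a convolutional neural network, and `make_toy_decoder` is a hypothetical, untrained two-layer network standing in for the conditional variational autoencoder of claim 5. The attribute names are a subset of those listed in claim 3; every function name, dimension, and value range here is an assumption.

```python
import math
import random

# Subset of the attributes named in claim 3 (chosen for illustration).
ATTRIBUTES = ["contrast", "exposure", "saturation"]


def make_toy_decoder(feature_dim, latent_dim, hidden_dim, out_dim, seed=0):
    """Build a tiny network with fixed random weights (untrained stand-in)."""
    rng = random.Random(seed)
    in_dim = feature_dim + latent_dim
    w1 = [[rng.gauss(0, 1 / math.sqrt(in_dim)) for _ in range(in_dim)]
          for _ in range(hidden_dim)]
    w2 = [[rng.gauss(0, 1 / math.sqrt(hidden_dim)) for _ in range(hidden_dim)]
          for _ in range(out_dim)]

    def decode(feature_vector, latent):
        # Concatenate the image features with the latent variable.
        x = list(feature_vector) + list(latent)
        hidden = [math.tanh(sum(w * v for w, v in zip(row, x))) for row in w1]
        # tanh bounds each predicted edit value to [-1, 1] (assumed range).
        return [math.tanh(sum(w * h for w, h in zip(row, hidden))) for row in w2]

    return decode


def sample_potential_edits(decode, feature_vector, latent_dim, num_samples, seed=1):
    """Draw one latent variable from noise per sample and decode it."""
    rng = random.Random(seed)
    edits = []
    for _ in range(num_samples):
        latent = [rng.gauss(0.0, 1.0) for _ in range(latent_dim)]
        edits.append(decode(feature_vector, latent))
    return edits


# Placeholder for CNN-extracted features of one image.
feature_vector = [0.2, -0.5, 0.9, 0.1]
decode = make_toy_decoder(feature_dim=4, latent_dim=2, hidden_dim=8,
                          out_dim=len(ATTRIBUTES))
candidates = sample_potential_edits(decode, feature_vector,
                                    latent_dim=2, num_samples=50)
print(dict(zip(ATTRIBUTES, candidates[0])))
```

Because the same feature vector is decoded with a different latent sample each time, one image yields many distinct candidate edits, which is what makes the output multimodal rather than a single "best" edit.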

First claim

Opening claim text (preview).

We claim:

1. In a digital medium environment for editing digital images, a method of generating multimodal image edit values, the method comprising: generating a feature vector representing a digital image; determining at least one latent variable; determining a plurality of potential image edit values based on the feature vector of the digital image and the at least one latent variable by processing the feature vector and the at least one latent variable utilizing a neural network; identifying sets of image edit values from the plurality of potential image edit values; and generating a set of differently edited versions of the digital image by modifying copies of the digital image using the sets of image edit values.

2. The method of claim 1, wherein identifying sets of image edit values from the plurality of potential image edit values comprises: receiving data representing a user's previous edits performed on other digital images; clustering, based at least partially on the data representing the user's previous edits, the plurality of potential image edit values to determine a plurality of clusters; and selecting a cluster center of each cluster of the plurality of clusters, the cluster center representing an image edit value.

3. The method of claim 1, wherein each image edit value corresponds to an image attribute value for contrast, exposure, saturation, temperature, tint, highlights, shadows, whites, blacks, lights, darks, clarity, or vibrance.

4. The method of claim 1, wherein identifying the sets of image edit values from the plurality of potential image edit values comprises: clustering the plurality of potential image edit values to determine a plurality of clusters; and selecting a cluster center of each cluster of the plurality of clusters, the cluster center representing an image edit value of a set of image edit values.

5. The method of claim 1, wherein: the neural network comprises a conditional variational autoencoder; and determining the plurality of potential image edit values comprises determining a mean distribution for each of the potential image edit values by processing a concatenation of the feature vector and the at least one latent variable utilizing the conditional variational autoencoder.

6. The method of claim 5, wherein identifying the sets of image edit values from the plurality of potential image edit values comprises: k-means clustering the mean distributions to determine a plurality of clusters; and selecting a cluster center from each cluster of the plurality of clusters, the cluster center representing an image edit value.

7. The method of claim 1, wherein generating the feature vector representing the digital image comprises extracting feature values from the digital image utilizing a convolutional neural network.

8. The method of claim 1, wherein determining the at least one latent variable comprises determining the at least one latent variable from noise data.

9. The method of claim 1, wherein determining the at least one latent variable comprises determining a variable that is not directly measurable from noise data.

10. A system for generating multimodal image edit values, the system comprising: one or more memory devices storing a digital image and a neural network; one or more computing devices configured to cause the system to: generate a feature vector representing the digital image; determine at least one latent variable; determine a plurality of potential image edit values based on the feature vector of the digital image and the at least one latent variable by processing, utilizing the neural network, a combination of the feature vector and the at least one latent variable; identify sets of image edit values from the plurality of potential image edit values; and generate a set of differently edited versions of the digital image by changing, for each edited version of the digital image, image attribute values of the digital image to match a set of image edit values from the sets of image edit values.

11. The system of claim 10, wherein the one or more computing devices are configured to cause the system to identify the sets of image edit values from the plurality of potential image edit values by: receiving data representing a user's previous edits performed on other digital images; clustering, based at least partially on the data representing the user's previous edits, the plurality of potential image edit values to determine a plurality of clusters; and selecting a cluster center of each cluster of the plurality of clusters, the cluster center representing an image edit value of a set of image edit values.

12. The system of claim 10, wherein the one or more computing devices are configured to cause the system to: determine a weight value of each feature of the digital image represented in the feature vector; and determine, based at least in further part on the weight value of each feature of the digital image represented in the feature vector, the plurality of potential image edit values.

13. The system of claim 10, wherein: the neural network comprises a conditional variational autoencoder; and the one or more computing devices are configured to cause the system to determine the plurality of potential image edit values by determining a mean distribution for each of the potential image edit values by processing a concatenation of the feature vector and the at least one latent variable utilizing the conditional variational autoencoder.

14. The system of claim 13, wherein the one or more computing devices are configured to cause the system to identify the sets of image edit values from the plurality of potential image edit values by: k-means clustering the mean distributions to determine a plurality of clusters; and selecting a cluster center from each cluster of the plurality of clusters, the cluster center representing an image edit value.

15. The system of claim 10, wherein the one or more computing devices are configured to cause the system to generate the feature vector representing the digital image by extracting feature values from the digital image utilizing a convolutional neural network.

16. A non-transitory computer readable medium comprising instructions that, when executed by at least one processor, cause a computing device to: receive a digital image from a client device; generate a feature vector of the digital image, wherein each value of the feature vector represents a respective feature of the digital image; determine at least one latent variable; determine a plurality of potential image edit values based on the feature vector of the digital image and the at least one latent variable; identify sets of image edit values from the plurality of potential image edit values; and generate a set of differently edited versions of the digital image by changing, for each edited version of the digital image, image attribute values of the digital image to match a set of image edit values from the sets of image edit values.

17. The non-transitory computer readable medium of claim 16, further comprising instructions that, when executed by the at least one processor, cause the computing device to identify the sets of image edit values from the plurality of potential image edit values by: determining a mean distribution of the plurality of potential image edit values utilizing a conditional variational autoencoder; k-means clustering the mean distribution to determine a plurality of clusters; and selecting a c
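The selection and editing steps recited in the claims (k-means clustering of candidate edit values, picking each cluster center as one edit, then modifying copies of the image) can be sketched as below. This is an illustration under stated assumptions, not the claimed implementation: `kmeans` is a plain Lloyd's-algorithm loop written for this example, the candidate (exposure, contrast) pairs are synthetic, the image is a tiny grayscale grid with pixel values in [0, 1], and the pixel formulas in `apply_edit` are hypothetical, since the claims do not say how attribute values map to pixel changes.

```python
import random


def kmeans(points, k, iterations=20, seed=0):
    """Plain Lloyd's algorithm; returns the k cluster centers."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            # Assign each point to its nearest center (squared distance).
            nearest = min(
                range(k),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centers[i])),
            )
            clusters[nearest].append(p)
        for i, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous center
                centers[i] = [sum(col) / len(members) for col in zip(*members)]
    return centers


def apply_edit(image, exposure, contrast):
    """Return an edited copy of the image (hypothetical pixel formulas)."""
    def edit_pixel(p):
        p = p * (2.0 ** exposure)               # exposure in stops (assumed)
        p = (p - 0.5) * (1.0 + contrast) + 0.5  # contrast about mid-gray
        return min(1.0, max(0.0, p))            # clamp to the valid range
    return [[edit_pixel(p) for p in row] for row in image]


# Synthetic candidate (exposure, contrast) pairs in two rough groups.
rng = random.Random(42)
candidates = [[rng.gauss(-0.5, 0.05), rng.gauss(0.3, 0.05)] for _ in range(30)]
candidates += [[rng.gauss(0.5, 0.05), rng.gauss(-0.2, 0.05)] for _ in range(30)]

# Each cluster center becomes one set of edit values.
edit_sets = kmeans(candidates, k=2)

# One differently edited copy of the image per set of edit values.
image = [[0.2, 0.4], [0.6, 0.8]]
edited_versions = [apply_edit(image, exposure, contrast)
                   for exposure, contrast in edit_sets]
```

Clustering reduces many sampled candidates to a handful of representative edits, so the user sees a few distinct styles rather than dozens of near-duplicates. Claims 2 and 11 additionally condition the clustering on the user's previous edits, which this sketch does not model.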

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2020175322A1 cover?
The present disclosure is directed towards methods and systems for determining multimodal image edits for a digital image. The systems and methods receive a digital image and analyze the digital image. The systems and methods further generate a feature vector of the digital image, wherein each value of the feature vector represents a respective feature of the digital image. Additionally, based …
Who is the assignee on this patent?
Adobe Inc
What technology area does this patent fall under?
Primary CPC classification G06K9/6223. Mapped technology areas include Physics.
When was this patent published?
Publication date Jun 4, 2020 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).