Generating multimodal image edits for a digital image

US10592776B2 · US · B2

Patent metadata

Publication number: US-10592776-B2
Application number: US-201715427598-A
Country: US
Kind code: B2
Filing date: Feb 8, 2017
Priority date: Feb 8, 2017
Publication date: Mar 17, 2020
Grant date: Mar 17, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The present disclosure is directed towards methods and systems for determining multimodal image edits for a digital image. The systems and methods receive a digital image and analyze the digital image. The systems and methods further generate a feature vector of the digital image, wherein each value of the feature vector represents a respective feature of the digital image. Additionally, based on the feature vector and determined latent variables, the systems and methods generate a plurality of determined image edits for the digital image, which includes determining a plurality of sets of potential image attribute values and selecting a plurality of sets of determined image attribute values from the plurality of sets of potential image attribute values, wherein each set of determined image attribute values comprises a determined image edit of the plurality of image edits.
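The generation step described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the decoder is a stand-in with random weights rather than a trained conditional variational autoencoder, and the names `make_toy_decoder` and `sample_potential_edits`, the four-value feature vector, and the latent size are all hypothetical. The count of eleven attribute values is borrowed from claim 13.

```python
import math
import random

NUM_ATTRIBUTES = 11  # assumption: claim 13 mentions at least eleven attribute values
LATENT_DIM = 2       # assumption: illustrative latent size, not from the patent


def make_toy_decoder(feature_dim, latent_dim, num_attributes, seed=0):
    """Return a stand-in decoder: one random linear layer followed by tanh.

    A real system would use the trained decoder of a conditional variational
    autoencoder; the weights here are random and purely illustrative.
    """
    rng = random.Random(seed)
    in_dim = feature_dim + latent_dim
    weights = [[rng.gauss(0.0, 0.5) for _ in range(in_dim)]
               for _ in range(num_attributes)]

    def decode(features, latent):
        # Condition on the image features; the latent variable supplies variation.
        x = list(features) + list(latent)
        return [math.tanh(sum(w * v for w, v in zip(row, x))) for row in weights]

    return decode


def sample_potential_edits(features, decode, num_samples,
                           latent_dim=LATENT_DIM, seed=1):
    """Draw latent variables from Gaussian noise and decode one edit per draw."""
    rng = random.Random(seed)
    edits = []
    for _ in range(num_samples):
        latent = [rng.gauss(0.0, 1.0) for _ in range(latent_dim)]
        edits.append(decode(features, latent))
    return edits


features = [0.2, -0.7, 1.1, 0.4]  # hypothetical feature vector for one image
decode = make_toy_decoder(len(features), LATENT_DIM, NUM_ATTRIBUTES)
potential_edits = sample_potential_edits(features, decode, num_samples=100)
print(len(potential_edits), len(potential_edits[0]))
```

Because the feature vector is held fixed while the latent variable changes from draw to draw, one image yields many different sets of attribute values, which is the sense in which the edits are multimodal.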

First claim

Opening claim text (preview).

We claim:

1. A system for identifying multimodal image edits for a digital image, the system comprising: one or more processors; and memory coupled to the one or more processors, the memory encoded with a set of instructions that, when executed by the one or more processors, causes the one or more processors to: receive a digital image from a client device; generate a feature vector representing the digital image, wherein generating the feature vector includes feeding the digital image into a convolutional neural network, wherein each value of the feature vector represents a feature of the digital image; determine at least one latent variable from noise data; determine a plurality of potential image edits based on the feature vector of the digital image and the at least one latent variable by feeding the feature vector and the at least one latent variable into a conditional variational autoencoder; identify a subset of image edits from the plurality of potential image edits by clustering the plurality of potential image edits; and provide a set of differently edited versions of the digital image, each edited version of the digital image comprising an image edit of the subset of image edits applied to the digital image.

2. The system of claim 1, wherein identifying a subset of image edits from the plurality of potential image edits further comprises: receiving data representing a user's previous edits performed on other digital images; clustering, based at least partially on the data representing the user's previous edits, the plurality of potential image edits to determine a plurality of clusters; and selecting a cluster center of each cluster of the plurality of clusters, the cluster center representing an image edit of the subset of image edits.

3. The system of claim 2, further comprising instructions that, when executed by the one or more processors, cause the system to associate, based at least partially on the data representing the user's previous edits, an editing category with the user.

4. The system of claim 3, wherein associating the editing category with the user comprises associating an expert editing category with the user.

5. The system of claim 1, further comprising instructions that, when executed by the one or more processors, cause the system to: determine a weight value of each feature of the digital image represented in the feature vector; and determine, based at least in further part on the weight value of each feature of the digital image represented in the feature vector, the plurality of potential image edits.

6. The system of claim 1, wherein each image edit of the subset of image edits comprises a plurality of individual image attribute values.

7. The system of claim 1, wherein identifying a subset of image edits from the plurality of potential image edits further comprises: determining a mean distribution of the plurality of potential image edits via the conditional variational autoencoder; k-means clustering the mean distribution to determine a plurality of clusters; and selecting a cluster center from each cluster of the plurality of clusters, the cluster center representing an image edit of the subset of image edits.

8. In a digital medium environment for editing digital images, a method of providing a plurality of differently edited digital images, the method comprising: receiving a digital image from a client device; generating a feature vector representing the digital image, wherein generating the feature vector includes feeding the digital image into a convolutional neural network, wherein each value of the feature vector represents a feature of the digital image; determining at least one latent variable from noise data; determining a plurality of potential image edits based on the feature vector of the digital image and the at least one latent variable by feeding the feature vector and the at least one latent variable into a conditional variational autoencoder; identifying a subset of image edits from the plurality of potential image edits by clustering the plurality of potential image edits; and providing a set of differently edited versions of the digital image, each edited version of the digital image comprising an image edit of the subset of image edits applied to the digital image.

9. The method of claim 8, wherein identifying a subset of image edits from the plurality of potential image edits comprises: receiving data representing a user's previous edits performed on other digital images; clustering, based at least partially on the data representing the user's previous edits, the plurality of potential image edits to determine a plurality of clusters; and selecting a cluster center of each cluster of the plurality of clusters, the cluster center representing an image edit of the subset of image edits.

10. The method of claim 9, further comprising associating, based at least partially on the data representing the user's previous edits, an editing category with the user.

11. The method of claim 10, wherein associating the editing category with the user comprises associating an expert editing category with the user.

12. The method of claim 8, further comprising: determining a weight value of each feature of the digital image represented in the feature vector; and determining, based at least in further part on the weight value of each feature of the digital image represented in the feature vector, the plurality of potential image edits.

13. The method of claim 8, wherein each image edit of the subset of image edits comprises at least eleven individual image attribute values.

14. The method of claim 8, wherein identifying a subset of image edits from the plurality of potential image edits further comprises: determining a mean distribution of the plurality of potential image edits via the conditional variational autoencoder; k-means clustering the mean distribution to determine a plurality of clusters; and selecting a cluster center from each cluster of the plurality of clusters, the cluster center representing an image edit of the subset of image edits.

15. A non-transitory computer readable medium comprising instructions that, when executed by at least one processor, cause a computer system to: receive a digital image from a client device; generate a feature vector representing the digital image, wherein generating the feature vector includes feeding the digital image into a convolutional neural network, wherein each value of the feature vector represents a feature of the digital image; determine at least one latent variable from noise data; determine a plurality of potential image edits based on the feature vector of the digital image and the at least one latent variable by feeding the feature vector and the at least one latent variable into a conditional variational autoencoder; identify a subset of image edits from the plurality of potential image edits by clustering the plurality of potential image edits; and provide a set of differently edited versions of the digital image, each edited version of the digital image comprising an image edit of the subset of image edits applied to the digital image.

16. The non-transitory computer readable medium of claim 15, further comprising instructions that, when executed by the at least one processor, cause the computer system to identify a subset of image edits from the plurality of potential image edits by: receiving data representing a user's previous edits performed on other digital images; clustering, based at least partially on the data representing
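The selection step in the claims (clustering the potential image edits and taking one cluster center per cluster) might look roughly like the sketch below. It is an assumption-laden illustration: it runs plain k-means (Lloyd's algorithm) directly on toy three-value edits, whereas the claims describe k-means clustering of a mean distribution produced by the conditional variational autoencoder. The helper names `squared_distance` and `k_means` and the sample data are hypothetical.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, k, iterations=50, seed=0):
    """Plain Lloyd's algorithm; returns (centers, assignments)."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest center.
        assignments = [
            min(range(k), key=lambda c: squared_distance(p, centers[c]))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                dim = len(members[0])
                new_centers.append(
                    [sum(m[d] for m in members) / len(members) for d in range(dim)]
                )
            else:
                new_centers.append(centers[c])  # keep the center of an empty cluster
        if new_centers == centers:
            break  # centers stopped moving, so the clustering has converged
        centers = new_centers
    return centers, assignments


# Hypothetical potential edits, each a short list of attribute values.
potential_edits = [
    [0.10, 0.20, 0.00], [0.12, 0.18, 0.02], [0.08, 0.22, -0.02],
    [0.90, -0.50, 0.40], [0.88, -0.52, 0.42], [0.92, -0.48, 0.38],
]
centers, assignments = k_means(potential_edits, k=2)
for center in centers:
    print([round(v, 2) for v in center])
```

Each returned center stands for one edit in the selected subset, so `k` sets how many differently edited versions of the image would be offered.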

Assignees

Adobe Inc

Inventors

Classifications

  • using neural networks · CPC title

  • using classification, e.g. of video objects · CPC title

  • Non-hierarchical techniques, e.g. based on statistics of modelling distributions · CPC title

  • with fixed number of clusters, e.g. K-means clustering · CPC title

  • G06T11/60 · Primary

    Creating or editing images; Combining images with text · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10592776B2 cover?
The present disclosure is directed towards methods and systems for determining multimodal image edits for a digital image. The systems and methods receive a digital image and analyze the digital image. The systems and methods further generate a feature vector of the digital image, wherein each value of the feature vector represents a respective feature of the digital image. Additionally, based …
Who is the assignee on this patent?
Adobe Inc
What technology area does this patent fall under?
Primary CPC classification G06T11/60. Mapped technology areas include Physics.
When was this patent published?
Publication date Mar 17, 2020 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).