Utilizing deep learning for rating aesthetics of digital images

US10002415B2 · US · B2

Patent metadata
Publication number: US-10002415-B2
Application number: US-201615097113-A
Country: US
Kind code: B2
Filing date: Apr 12, 2016
Priority date: Apr 12, 2016
Publication date: Jun 19, 2018
Grant date: Jun 19, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Systems and methods are disclosed for estimating aesthetic quality of digital images using deep learning. In particular, the disclosed systems and methods describe training a neural network to generate an aesthetic quality score for digital images. The neural network includes a training structure that compares relative rankings of pairs of training images to accurately predict a relative ranking of a digital image. Additionally, in training the neural network, an image rating system can utilize content-aware and user-aware sampling techniques to identify pairs of training images that have similar content and/or that have been rated by the same or different users. Using content-aware and user-aware sampling techniques, the neural network can be trained to accurately predict aesthetic quality ratings that reflect subjective opinions of most users as well as provide aesthetic scores for digital images that represent the wide spectrum of aesthetic preferences of various users.
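The training structure described in the abstract, together with the regression and pairwise terms recited in the claims, can be illustrated with a small sketch. This is not the patent's implementation: the hinge form of the pairwise term, the `margin` value, the `rank_weight` factor, and all function names are assumptions chosen for illustration. The only details taken from the page are that a Euclidean regression loss is computed against average user ratings, that a pairwise loss preserves the relative order of ratings within a pair, and that the two outputs are summed.

```python
from typing import List, Tuple

# Each pair is (predicted_a, predicted_b, avg_rating_a, avg_rating_b).
Pair = Tuple[float, float, float, float]


def regression_loss(predicted: List[float], average: List[float]) -> float:
    """Half mean squared (Euclidean) error against average user ratings."""
    n = len(predicted)
    return sum((p - y) ** 2 for p, y in zip(predicted, average)) / (2 * n)


def pairwise_ranking_loss(pairs: List[Pair], margin: float = 0.5) -> float:
    """Hinge penalty when predictions do not preserve the rating order.

    The hinge form and the margin are assumptions, not taken from the patent.
    """
    total = 0.0
    for pred_a, pred_b, rating_a, rating_b in pairs:
        # +1 if image a is rated at least as high as image b, else -1.
        sign = 1.0 if rating_a >= rating_b else -1.0
        total += max(0.0, margin - sign * (pred_a - pred_b))
    return total / (2 * len(pairs))


def combined_loss(pairs: List[Pair], margin: float = 0.5,
                  rank_weight: float = 1.0) -> float:
    """Sum of the regression term and the (weighted) pairwise term."""
    predicted = [p for pair in pairs for p in pair[:2]]
    average = [y for pair in pairs for y in pair[2:]]
    return (regression_loss(predicted, average)
            + rank_weight * pairwise_ranking_loss(pairs, margin))


if __name__ == "__main__":
    well_ordered = [(4.0, 2.0, 4.0, 2.0)]   # predictions match ratings
    inverted = [(2.0, 4.0, 4.0, 2.0)]       # predictions reverse the order
    print(combined_loss(well_ordered))
    print(combined_loss(inverted))
```

In this sketch a pair whose predictions match the ratings and keep their order contributes nothing to the loss, while a pair predicted in the wrong order is penalized by both terms.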

First claim

Opening claim text (preview).

We claim:

1. A computer-implemented method of estimating aesthetic quality of digital images using deep learning, the method comprising: receiving a plurality of training images with associated user provided ratings; sampling the plurality of training images to identify pairs of training images; and training a neural network to output aesthetic quality scores for identified pairs of training images that, for a given pair of training images, minimizes a difference between predicted user ratings and average user ratings of the user provided ratings for the respective training images in the given pair of training images while maintaining a relative difference between the associated user provided ratings of the training images in the given pair of training images.

2. The method as recited in claim 1, wherein the training the neural network comprises constructing a training structure including a pairwise loss model and a regression loss model, wherein: the pairwise loss model compares the relative difference between the associated user provided ratings for the identified pairs of training images; and the regression loss model minimizes the difference between the predicted user ratings and the average user ratings for the plurality of training images.

3. The method as recited in claim 2, wherein minimizing the difference between predicted user ratings and the average user ratings for the plurality of training images comprises minimizing a Euclidean loss between an average user rating of the user provided ratings for the plurality of training images and predicted user ratings for the plurality of training images.

4. The method as recited in claim 2, wherein generating an aesthetic quality score comprises summing outputs of the regression loss model and the pairwise loss model.

5. The method as recited in claim 1, wherein sampling the plurality of training images to identify pairs of training images comprises identifying the pairs of training images from the plurality of training images based on an identity of one or more users that rated each training image from the plurality of training images.

6. The method as recited in claim 5, wherein sampling the plurality of training images to identify pairs of training images comprises identifying pairs of training images from the plurality of training images that have been rated by a common user.

7. The method as recited in claim 1, wherein sampling the plurality of training images to identify pairs of training images comprises identifying pairs of training images from the plurality of training images having a predetermined difference between user ratings.

8. The method as recited in claim 7, wherein the predetermined difference between user ratings differs based on whether images of the pairs of training images are associated with user ratings from a common user or different users.

9. The method as recited in claim 1, wherein sampling the plurality of training images to identify pairs of training images comprises identifying pairs of training images having a common type of content.

10. The method as recited in claim 1, wherein sampling the plurality of training images to identify pairs of training images comprises identifying pairs of training images having a predetermined difference between user ratings based on whether the images of the pairs of training images have a common type of content or different type of content.

11. The method as recited in claim 1, wherein sampling the plurality of training images to identify pairs of training images comprises identifying pairs of training images having a threshold number of common attributes that have been identified by users that rated the plurality of training images.

12. The method as recited in claim 1, further comprising: utilizing the trained neural network to generate aesthetic quality scores for a collection of input digital images; and categorizing the collection of input digital images based on the generated aesthetic quality scores.

13. A non-transitory computer readable storage medium storing instructions thereon that, when executed by at least one processor, cause a computer system to: receive a digital image; and generate an aesthetic quality score for the digital image and an attribute quality score for each of a plurality of attributes of the digital image using a neural network having a training structure that jointly learns low level parameters for pairs of training images of a plurality of training images and includes an attribute model for each of the plurality of attributes that utilizes the jointly learned low level parameters and outputs an attribute quality score for a given attribute.

14. The non-transitory computer readable medium of claim 13, wherein the training structure further comprises a regression loss model that minimizes a difference between predicted user ratings and user provided ratings for the plurality of training images.

15. The non-transitory computer readable medium of claim 14, wherein minimizing the difference between predicted user ratings and user provided ratings for the plurality of training images comprises minimizing a Euclidean loss between a predicted overall quality rating and an average rating of the user provided ratings for each of the plurality of training images.

16. The non-transitory computer readable medium of claim 13, wherein the attribute model for each of the plurality of attributes minimizes a difference between a predicted rating for the given attribute and user provided ratings for the given attribute.

17. The non-transitory computer readable medium of claim 13, wherein the training structure further comprises a pairwise loss model that: compares a relative difference between user provided ratings for selected pairs of training images from the plurality of training images; and maintains the relative difference between the user provided ratings for the selected pairs of training images from the plurality of training images.

18. The non-transitory computer readable medium of claim 13, wherein the plurality of attributes comprise two or more of: interesting content, object emphasis, lighting, color harmony, vivid color, depth of an image field, motion blur, rule of thirds, balancing element, repetition, or symmetry.

19. A system for analyzing digital images to estimate aesthetic quality of the digital images using deep learning, the system comprising: at least one processor; a non-transitory storage medium comprising instructions that, when executed by the at least one processor, cause the system to: receive a plurality of training images with user provided ratings; sample the plurality of training images to identify pairs of images that are rated by one or more common users, pairs of images having a common type of content, or pairs of images that are rated by different users; and train a neural network to output aesthetic quality scores for identified pairs of training images that, for a given pair of training images, minimizes a difference between predicted user ratings and average user ratings of associated user provided ratings for the respective training images in the given pair of training images while maintaining a relative difference between the associated user provided ratings of the given pair of training images.

20. The system as recited in claim 19, wherein the instructions, when executed by the at least one processor, cause the system to sample the plurality of training images to identify the pairs of training images having a common
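The sampling step recited in claims 5 through 10 can be sketched as a filter over candidate pairs. This is a minimal illustration under stated assumptions, not the patented procedure: the `RatedImage` record, the `sample_pairs` function, the specific gap values, and the reading of "predetermined difference" as a minimum rating gap are all hypothetical. For simplicity each image is assumed to carry a single rater and a single content label.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import List, Tuple


@dataclass(frozen=True)
class RatedImage:
    """Hypothetical record: one image, one rater, one content label."""
    image_id: str
    user_id: str
    content: str
    rating: float


def sample_pairs(images: List[RatedImage],
                 same_user_gap: float = 0.5,
                 cross_user_gap: float = 1.0,
                 require_same_content: bool = False) -> List[Tuple[str, str]]:
    """Keep pairs whose rating gap meets a user-dependent threshold.

    The threshold differs depending on whether both images were rated by
    a common user (claim 8); the values used here are assumptions.
    """
    pairs = []
    for a, b in combinations(images, 2):
        # Content-aware sampling: optionally keep only same-content pairs.
        if require_same_content and a.content != b.content:
            continue
        # User-aware sampling: pick the gap based on rater identity.
        gap = same_user_gap if a.user_id == b.user_id else cross_user_gap
        if abs(a.rating - b.rating) >= gap:
            pairs.append((a.image_id, b.image_id))
    return pairs


if __name__ == "__main__":
    images = [
        RatedImage("a", "u1", "portrait", 4.5),
        RatedImage("b", "u1", "portrait", 3.8),
        RatedImage("c", "u2", "landscape", 3.0),
        RatedImage("d", "u2", "portrait", 4.4),
    ]
    print(sample_pairs(images))
    print(sample_pairs(images, require_same_content=True))
```

Using a smaller gap for pairs from a common user is one plausible design choice: a single rater's scores are directly comparable, whereas scores from different raters may sit on different personal scales, so a wider gap is required before the pair is treated as a reliable ordering.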

Assignees

Adobe Systems Inc

Inventors

Classifications

  • using neural networks · CPC title

  • using classification, e.g. of video objects · CPC title

  • Learning methods · CPC title

  • Image quality inspection · CPC title

  • G06T7/0002 · Primary

    Inspection of images, e.g. flaw detection · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10002415B2 cover?
Systems and methods are disclosed for estimating aesthetic quality of digital images using deep learning. In particular, the disclosed systems and methods describe training a neural network to generate an aesthetic quality score for digital images. The neural network includes a training structure that compares relative rankings of pairs of training images to accurately predict a rela…
Who is the assignee on this patent?
Adobe Systems Inc
What technology area does this patent fall under?
Primary CPC classification G06T7/0002. Mapped technology areas include Physics.
When was this patent published?
Publication date Jun 19, 2018 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 5 related publications on this page (citations in our corpus or others sharing the same primary CPC).