Multimodal style-transfer network for applying style features from multi-resolution style exemplars to input images

US10565757B2 · US · B2

Patent metadata
Publication number: US-10565757-B2
Application number: US-201715619114-A
Country: US
Kind code: B2
Filing date: Jun 9, 2017
Priority date: Jun 9, 2017
Publication date: Feb 18, 2020
Grant date: Feb 18, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A computing system transforms an input image into a stylized output image by applying first and second style features from a style exemplar. The input image is provided to a multimodal style-transfer network having a low-resolution-based stylization subnet and a high-resolution stylization subnet. The low-resolution-based stylization subnet is trained with low-resolution style exemplars to apply the first style feature. The high-resolution stylization subnet is trained with high-resolution style exemplars to apply the second style feature. The low-resolution-based stylization subnet generates an intermediate image by applying the first style feature from a low-resolution version of the style exemplar to first image data obtained from the input image. Second image data from the intermediate image is provided to the high-resolution stylization subnet. The high-resolution stylization subnet generates the stylized output image by applying the second style feature from a high-resolution version of the style exemplar to the second image data.
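The data flow in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration, not the patented implementation: `low_res_subnet` and `high_res_transform` are hypothetical placeholders standing in for trained subnets, the resampling functions and the factor of 2 are assumptions, and the final addition follows the identity connection described in the first claim on this page rather than anything stated in the abstract itself.

```python
import numpy as np

def downsample(img, factor):
    """Average-pool an (H, W, C) image by an integer factor (assumed resampler)."""
    h, w, c = img.shape
    return img.reshape(h // factor, factor, w // factor, factor, c).mean(axis=(1, 3))

def upsample(img, factor):
    """Nearest-neighbour upsampling of an (H, W, C) image (assumed resampler)."""
    return img.repeat(factor, axis=0).repeat(factor, axis=1)

def low_res_subnet(x):
    # Hypothetical stand-in for the subnet trained with low-resolution style exemplars.
    return np.clip(0.9 * x + 0.05, 0.0, 1.0)

def high_res_transform(x):
    # Hypothetical stand-in for the transforming layers of the high-resolution
    # subnet; its result plays the role of the "third image data".
    return 0.1 * (x - x.mean())

def stylize(input_image, factor=2):
    first = downsample(input_image, factor)   # first image data
    intermediate = low_res_subnet(first)      # intermediate image
    second = upsample(intermediate, factor)   # second image data
    third = high_res_transform(second)        # third image data
    output = second + third                   # identity connection: input + transformed
    return intermediate, second, third, output

img = np.random.default_rng(0).random((8, 8, 3))
intermediate, second, third, output = stylize(img)
print(intermediate.shape, output.shape)
```

Because the output is the sum of the subnet's input and its transformed result, the high-resolution stage in this sketch only has to produce a correction on top of the upsampled intermediate image.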

First claim

Opening claim text (preview).

The invention claimed is:

1. A method comprising: accessing, from a memory device, an input image and a style exemplar; transforming, by a processing device, the input image into a stylized output image by applying a first style feature and a second style feature from the style exemplar to the input image, wherein applying the first style feature and the second style feature comprises: providing the input image to a multimodal style-transfer network having a low-resolution-based stylization subnet and a high-resolution stylization subnet, generating, by the low-resolution-based stylization subnet, an intermediate image by applying the first style feature from a low-resolution version of the style exemplar to first image data obtained from the input image, wherein the low-resolution-based stylization subnet is trained with low-resolution style exemplars to apply the first style feature, providing second image data obtained from the intermediate image to an input of the high-resolution stylization subnet, wherein the high-resolution stylization subnet is trained with high-resolution style exemplars to apply the second style feature, wherein the high-resolution stylization subnet comprises an identity connection from the input of the high-resolution stylization subnet to an output of the high-resolution stylization subnet, and generating, by the high-resolution stylization subnet, the stylized output image by applying the second style feature from a high-resolution version of the style exemplar to the second image data, wherein the stylized output image comprises a sum of (i) the second image data received at the input of the high-resolution stylization subnet via the identity connection and (ii) third image data generated by the high-resolution stylization subnet transforming the second image data; and causing an output device to display the stylized output image.

2. The method of claim 1, further comprising: obtaining, by the processing device, the first image data by downsampling the input image to a first resolution, wherein the low-resolution style exemplars have the first resolution and the intermediate image is generated with the first resolution; and obtaining, by the processing device, the second image data by upsampling, to a second resolution, the intermediate image or an additional intermediate image having a higher resolution than the intermediate image, wherein the high-resolution style exemplars have the second resolution and the stylized output image is generated with the first resolution.

3. The method of claim 1, wherein the multimodal style-transfer network comprises a feed-forward neural network with subnets that include the low-resolution-based stylization subnet and the high-resolution stylization subnet, wherein the multimodal style-transfer network is trained by iteratively performing operations comprising: computing stylization losses for the subnets, wherein a stylization loss for a subnet is computed from a combination of: (i) a content loss weighted with a content weight, wherein the content loss indicates a difference in semantic content between the input image and an output of the subnet, and (ii) a texture loss weighted with a texture weight, wherein the texture loss indicates a difference in texture between the style exemplar and the output of the subnet; and adjusting the feed-forward neural network based on one or more of the stylization losses.

4. The method of claim 3, wherein adjusting the feed-forward neural network based on the stylization losses comprises: computing, for each subnet and in each iteration, a respective hierarchical stylization loss for a respective set of subsequent subnets in the feed-forward neural network, wherein each hierarchical stylization loss is computed from a respective weighted combination of stylization losses for the respective set of subsequent subnets; and modifying the feed-forward neural network such that each hierarchical stylization loss is minimized, wherein the iteration ceases based on each hierarchical stylization loss being minimized.

5. The method of claim 4, wherein the low-resolution-based stylization subnet has a first content weight and a first texture weight that is less than the first content weight, wherein the high-resolution stylization subnet has a second content weight and a second texture weight that is greater than the first content weight.

6. The method of claim 3, wherein a stylization loss for the high-resolution stylization subnet is computed with respect to the stylized output image having the sum.

7. The method of claim 3, further comprising outputting the trained multimodal style-transfer network by storing the trained multimodal style-transfer network in an additional memory device accessible by an image manipulation application.

8. The method of claim 1, wherein the multimodal style-transfer network is a convolutional neural network having a first set of layers comprised in the low-resolution-based stylization subnet and a second set of layers comprised in the high-resolution stylization subnet, wherein the high-resolution stylization subnet receives one or more of (i) an output of the low-resolution-based stylization subnet and (ii) image data generated from the output of the low-resolution-based stylization subnet.

9. A system comprising: a processing device; and a non-transitory computer-readable medium communicatively coupled to the processing device, wherein the processing device is configured to execute program code stored in the non-transitory computer-readable medium and thereby perform operations comprising: accessing an input image and a style exemplar; transforming the input image into a stylized output image by applying a first style feature and a second style feature from the style exemplar to the input image, wherein applying the first style feature and the second style feature comprises: providing the input image to a multimodal style-transfer network having a low-resolution-based stylization subnet and a high-resolution stylization subnet, generating, by the low-resolution-based stylization subnet, an intermediate image by applying the first style feature from a low-resolution version of the style exemplar to first image data obtained from the input image, wherein the low-resolution-based stylization subnet is trained with low-resolution style exemplars to apply the first style feature, providing second image data obtained from the intermediate image to an input of the high-resolution stylization subnet, wherein the high-resolution stylization subnet is trained with high-resolution style exemplars to apply the second style feature, wherein the high-resolution stylization subnet comprises an identity connection from the input of the high-resolution stylization subnet to an output of the high-resolution stylization subnet, and generating, by the high-resolution stylization subnet, the stylized output image by applying the second style feature from a high-resolution version of the style exemplar to the second image data, wherein the stylized output image comprises a sum of (i) the second image data received at the input of the high-resolution stylization subnet via the identity connection and (ii) third image data generated by the high-resolution stylization subnet transforming the second image data; and causing an output device to display the stylized output image.

10. The system of claim 9, the operations further comprising: obtaining the first image data by downsampling the input image to a first resolution, wherein the low-resolution style exemplars have the first resolution and the intermediate image is generated with the first resolution; and obtaining the second
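The training losses described in claims 3 and 4 above can be illustrated with a short NumPy sketch. Several parts are assumptions, not claim language: measuring texture with Gram matrices of feature maps, using mean squared error for both terms, including the subnet itself in its set of "subsequent subnets", and every numeric weight shown. The names `gram`, `stylization_loss`, and `hierarchical_losses` are hypothetical.

```python
import numpy as np

def gram(features):
    """Gram matrix of a (channels, positions) feature array (assumed texture statistic)."""
    _, n = features.shape
    return features @ features.T / n

def stylization_loss(content_feats, style_feats, output_feats,
                     content_weight, texture_weight):
    """Weighted combination of a content loss and a texture loss for one subnet."""
    content_loss = np.mean((output_feats - content_feats) ** 2)
    texture_loss = np.mean((gram(output_feats) - gram(style_feats)) ** 2)
    return content_weight * content_loss + texture_weight * texture_loss

def hierarchical_losses(subnet_losses, mix_weights):
    """For subnet k, a weighted sum of the stylization losses of subnets k, k+1, ...

    Whether subnet k itself belongs to its own set is an assumption made here.
    """
    count = len(subnet_losses)
    return [
        sum(w * l for w, l in zip(mix_weights[k:], subnet_losses[k:]))
        for k in range(count)
    ]

rng = np.random.default_rng(0)
content = rng.random((4, 16))   # features of the input image
style = rng.random((4, 25))     # features of the style exemplar
output = rng.random((4, 16))    # features of one subnet's output

# Hypothetical weights in the spirit of claim 5: the low-resolution subnet
# favours content, the high-resolution subnet favours texture.
low = stylization_loss(content, style, output, content_weight=1.0, texture_weight=0.5)
high = stylization_loss(content, style, output, content_weight=0.5, texture_weight=2.0)
print(hierarchical_losses([low, high], mix_weights=[1.0, 1.0]))
```

In this sketch the earliest subnet is scored on the losses of every later stage as well as its own, which is one way to read the claim's requirement that each hierarchical loss combine the losses of subsequent subnets.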

Assignees

Inventors

Classifications

  • Backpropagation, e.g. using gradient descent · CPC title

  • G06T11/60 · Primary

    Creating or editing images; Combining images with text · CPC title

  • Scaling of whole images or parts thereof, e.g. expanding or contracting · CPC title

  • Texturing; Colouring; Generation of textures or colours (retouching, inpainting or scratch removal G06T5/77) · CPC title

  • Activation functions · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10565757B2 cover?
A computing system transforms an input image into a stylized output image by applying first and second style features from a style exemplar. The input image is provided to a multimodal style-transfer network having a low-resolution-based stylization subnet and a high-resolution stylization subnet. The low-resolution-based stylization subnet is trained with low-resolution style exemplars to appl…
Who is the assignee on this patent?
Adobe Inc
What technology area does this patent fall under?
Primary CPC classification G06T11/60. Mapped technology areas include Physics.
When was this patent published?
Publication date: Feb 18, 2020 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).