Switchable propagation neural network

US11328173B2 · US · B2

Patent metadata
Field: Value
Publication number: US-11328173-B2
Application number: US-202017081805-A
Country: US
Kind code: B2
Filing date: Oct 27, 2020
Priority date: Sep 26, 2017
Publication date: May 10, 2022
Grant date: May 10, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A temporal propagation network (TPN) system learns the affinity matrix for video image processing tasks. An affinity matrix is a generic matrix that defines the similarity of two points in space. The TPN system includes a guidance neural network model and a temporal propagation module and is trained for a particular computer vision task to propagate visual properties from a key-frame represented by dense data (color), to another frame that is represented by coarse data (grey-scale). The guidance neural network model generates an affinity matrix referred to as a global transformation matrix from task-specific data for the key-frame and the other frame. The temporal propagation module applies the global transformation matrix to the key-frame property data to produce propagated property data (color) for the other frame. For example, the TPN system may be used to colorize several frames of greyscale video using a single manually colorized key-frame.
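The propagation step described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: in the patent the transformation matrix is produced by a trained guidance neural network, whereas here a hand-written Gaussian similarity on grey-scale values stands in for it. The function names, the `sharpness` parameter, and the three-pixel frames are all assumptions made for illustration.

```python
import math


def affinity_from_guidance(gray_key, gray_other, sharpness=200.0):
    """Build a row-normalised affinity matrix between two frames.

    Row i holds the similarity of pixel i in the other frame to every
    key-frame pixel. ASSUMPTION: a fixed Gaussian kernel on grey values
    replaces the learned guidance network described in the abstract.
    """
    rows = []
    for g_other in gray_other:
        weights = [math.exp(-sharpness * (g_other - g_key) ** 2) for g_key in gray_key]
        total = sum(weights)
        rows.append([w / total for w in weights])
    return rows


def propagate(G, key_property):
    """Apply the transformation matrix to the key-frame property data.

    Each pixel of the other frame receives a weighted mix of key-frame values.
    """
    return [sum(w * p for w, p in zip(row, key_property)) for row in G]


# Hypothetical 3-pixel frames: grey-scale is known for both frames,
# the colour channel only for the key-frame.
gray_key = [0.1, 0.5, 0.9]
color_key = [-0.3, 0.0, 0.4]
gray_other = [0.9, 0.1, 0.5]  # same content, pixels in a different order

G = affinity_from_guidance(gray_key, gray_other)
color_other = propagate(G, color_key)
print([round(c, 3) for c in color_other])
```

Because each row of `G` sums to one, every propagated value is a weighted average of key-frame values, which is one simple way to read "propagated property data" in the abstract.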

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method, comprising: receiving ground truth property data for a first frame of a video sequence defining values of a property of each pixel in the first frame; receiving ground truth property data for a second frame of the video sequence defining values of the property of each pixel in the second frame; receiving task-specific affinity values for transitions from the first frame to the second frame; processing, by a first switchable temporal propagation neural network, the ground truth property data for the first frame and the task-specific affinity values to produce property data for the second frame; processing, by a second switchable temporal propagation neural network, the ground truth property data for the second frame and the task-specific affinity values to produce property data for the first frame; and updating coefficients of the first switchable temporal propagation neural network to reduce differences between the ground truth property data for the second frame and the property data for the second frame.
2. The computer-implemented method of claim 1, further comprising updating coefficients of the second switchable temporal propagation neural network to reduce differences between the ground truth property data for the first frame and the property data for the first frame.
3. The computer-implemented method of claim 1, wherein a guidance neural network model generates the task-specific affinity values for a task based on task-specific data for the first frame defining values of an attribute of pixels in the first frame and task-specific data for the second frame defining values of the attribute of pixels in the second frame.
4. The computer-implemented method of claim 3, wherein the guidance neural network model is jointly trained with the first switchable temporal propagation neural network and the second switchable temporal propagation neural network using a training dataset for the task.
5. The computer-implemented method of claim 3, further comprising updating parameters of the guidance neural network model to reduce the differences between the ground truth property data for the second frame and the property data for the second frame.
6. The computer-implemented method of claim 3, further comprising updating parameters of the guidance neural network model to reduce differences between a style energy of the ground truth property data for the second frame and a style energy of the property data for the second frame.
7. The computer-implemented method of claim 1, wherein the task-specific affinity values comprise a global transformation matrix.
8. The computer-implemented method of claim 7, wherein the second switchable temporal propagation neural network is configured to produce the property data for the first frame according to an inverse transformation matrix corresponding to the global transformation matrix.
9. The computer-implemented method of claim 8, wherein the inverse transformation matrix and the global transformation matrix are orthogonal when the differences between a style energy of the ground truth property data for the second frame and a style energy of the property data for the second frame are minimized.
10. A system, comprising: a processor configured to implement a first switchable temporal propagation neural network and a second switchable temporal propagation neural network, wherein the first switchable temporal propagation neural network is configured to: receive ground truth property data for a first [frame] of a video sequence defining values of a property of each pixel in the first frame; receive task-specific affinity values for transitions from the first frame to the second frame; and process the property data for the first frame and the task-specific affinity values to produce property data for the second frame, and the second switchable temporal propagation neural network is configured to: receive ground truth property data for a second frame of the video sequence defining values of the property of each pixel in the second frame; and process ground truth property data for the second frame and the task-specific affinity values to produce property data for the first frame, wherein coefficients of the first switchable temporal propagation neural network are updated to reduce differences between the ground truth property data for the second frame and the property data for the second frame.
11. The system of claim 10, wherein coefficients of the second switchable temporal propagation neural network are updated to reduce differences between the ground truth property data for the first frame and the property data for the first frame.
12. The system of claim 10, wherein the processor is further configured to implement a guidance neural network model that generates the task-specific affinity values for a task based on task-specific data for the first frame defining values of an attribute of pixels in the first frame and task-specific data for the second frame defining values of the attribute of pixels in the second frame.
13. The system of claim 12, wherein the guidance neural network model is jointly trained with the first switchable temporal propagation neural network and the second switchable temporal propagation neural network using a training dataset for the task.
14. The system of claim 12, wherein parameters of the guidance neural network model are updated to reduce the differences between the ground truth property data for the second frame and the property data for the second frame.
15. The system of claim 12, wherein parameters of the guidance neural network model are updated to reduce differences between a style energy of the ground truth property data for the second frame and a style energy of the property data for the second frame.
16. The system of claim 10, wherein the task-specific affinity values comprise a global transformation matrix.
17. The system of claim 16, wherein the second switchable temporal propagation neural network is configured to produce the property data for the first frame according to an inverse transformation matrix corresponding to the global transformation matrix.
18. The system of claim 17, wherein the inverse transformation matrix and the global transformation matrix are orthogonal when the differences between a style energy of the ground truth property data for the second frame and a style energy of the property data for the second frame are minimized.
19. A non-transitory computer-readable media storing computer instructions that, when executed by one or more processors, cause the one or more processors to perform the steps of: receiving ground truth property data for a first frame of a video sequence defining values of a property of each pixel in the first frame; receiving ground truth property data for a second frame of the video sequence defining values of the property of each pixel in the second frame; receiving task-specific affinity values for transitions from the first frame to the second frame; processing, by a first switchable temporal propagation neural network, the ground truth property data for the first frame and the task-specific affinity values to produce property data for the second frame; processing, by a second switchable temporal propagation neural network, the ground truth property data for the second frame and the task-specific affinity values to produce property data for the first frame; and updating coefficients of the first switchable temporal propagation neural network to red
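The bidirectional training recited in claims 1, 2 and 8 can be sketched in plain Python as a toy, not as the patented networks. Each "switchable" network is reduced here to one hypothetical scalar coefficient (`a` for first-to-second, `b` for second-to-first), the global transformation matrix is a fixed permutation so that its transpose is its inverse, and "style energy" is assumed to be the sum of squared property values. None of these choices come from the claim text.

```python
def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def transpose(M):
    return [list(col) for col in zip(*M)]


def style_energy(v):
    # ASSUMPTION: squared norm used as a stand-in for "style energy".
    return sum(x * x for x in v)


def train(gt_first, gt_second, G, steps=50, lr=0.5):
    """Gradient descent on two hypothetical scalar coefficients.

    a scales the forward propagation  (first frame  -> second frame, uses G).
    b scales the backward propagation (second frame -> first frame, uses G^T).
    """
    G_inv = transpose(G)  # the inverse only because this toy G is orthogonal
    x_fwd = matvec(G, gt_first)
    x_bwd = matvec(G_inv, gt_second)
    n = len(gt_first)
    a, b = 0.5, 1.5  # deliberately wrong starting coefficients
    for _ in range(steps):
        pred_second = [a * x for x in x_fwd]
        pred_first = [b * x for x in x_bwd]
        # Gradients of the mean squared difference to the ground truth.
        grad_a = 2 * sum((p - t) * x for p, t, x in zip(pred_second, gt_second, x_fwd)) / n
        grad_b = 2 * sum((p - t) * x for p, t, x in zip(pred_first, gt_first, x_bwd)) / n
        a -= lr * grad_a  # claim 1: update the first network
        b -= lr * grad_b  # claim 2: update the second network
    return a, b


# Hypothetical data: the second frame is the first frame with pixels shifted.
G = [[0, 0, 1],
     [1, 0, 0],
     [0, 1, 0]]
gt_first = [0.2, 0.4, 0.6]
gt_second = matvec(G, gt_first)

a, b = train(gt_first, gt_second, G)
print(round(a, 3), round(b, 3))
print(style_energy(gt_first), style_energy(gt_second))
```

In this toy both coefficients move toward 1, and the two frames carry the same assumed style energy because an orthogonal matrix preserves the squared norm; that is one way to read the link between orthogonality and style energy drawn in claims 9 and 18.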

Assignees

Nvidia Corp

Inventors

Classifications

  • G06V10/82 · Primary

    using neural networks · CPC title

  • relating to colour · CPC title

  • Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN] · CPC title

  • Matching criteria, e.g. proximity measures · CPC title

  • Generating training patterns; Bootstrap methods, e.g. bagging or boosting · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11328173B2 cover?
A temporal propagation network (TPN) system learns the affinity matrix for video image processing tasks. An affinity matrix is a generic matrix that defines the similarity of two points in space. The TPN system includes a guidance neural network model and a temporal propagation module and is trained for a particular computer vision task to propagate visual properties from a key-frame represente…
Who is the assignee on this patent?
Nvidia Corp
What technology area does this patent fall under?
Primary CPC classification G06V10/82. Mapped technology areas include Physics.
When was this patent published?
Publication date: May 10, 2022 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).