System and method for content and motion controlled action video generation

US12506897B2 · US · B2

Patent metadata
Publication number: US-12506897-B2
Application number: US-202016812058-A
Country: US
Kind code: B2
Filing date: Mar 6, 2020
Priority date: Mar 31, 2017
Publication date: Dec 23, 2025
Grant date: Dec 23, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method, computer readable medium, and system are disclosed for action video generation. The method includes the steps of generating, by a recurrent neural network, a sequence of motion vectors from a first set of random variables and receiving, by a generator neural network, the sequence of motion vectors and a content vector sample. The sequence of motion vectors and the content vector sample are sampled by the generator neural network to produce a video clip.
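The data flow described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the class names `MotionRNN` and `FrameGenerator`, the layer sizes, the tanh activations, and the random untrained weights are all assumptions chosen for illustration. The sketch only shows the shape of the pipeline: a recurrent network turns a sequence of random variables into motion vectors, and a generator combines each motion vector with one fixed content vector sample to produce the frames of a clip.

```python
import math
import random


def make_matrix(rows, cols, rng):
    """Random weight matrix (untrained; illustration only)."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Plain matrix-vector product."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class MotionRNN:
    """Hypothetical recurrent network: random variables in, motion vectors out."""

    def __init__(self, noise_dim, motion_dim, rng):
        self.w_in = make_matrix(motion_dim, noise_dim, rng)
        self.w_h = make_matrix(motion_dim, motion_dim, rng)
        self.motion_dim = motion_dim

    def run(self, noise_seq):
        hidden = [0.0] * self.motion_dim
        motion_vectors = []
        for eps in noise_seq:
            from_input = matvec(self.w_in, eps)
            from_state = matvec(self.w_h, hidden)
            # Each motion vector depends on the previous one via the hidden state.
            hidden = [math.tanh(a + b) for a, b in zip(from_input, from_state)]
            motion_vectors.append(hidden)
        return motion_vectors


class FrameGenerator:
    """Hypothetical generator: (content vector, motion vector) -> one frame."""

    def __init__(self, content_dim, motion_dim, n_pixels, rng):
        self.w = make_matrix(n_pixels, content_dim + motion_dim, rng)

    def frame(self, content, motion):
        # Concatenate content and motion, then map to pixel values in [-1, 1].
        return [math.tanh(x) for x in matvec(self.w, content + motion)]


def generate_clip(rnn, gen, content, noise_seq):
    """The content vector is held fixed; only the motion vector changes per frame."""
    return [gen.frame(content, motion) for motion in rnn.run(noise_seq)]


# Assumed toy sizes; none of these values come from the patent.
NOISE_DIM, MOTION_DIM, CONTENT_DIM, N_PIXELS, N_FRAMES = 4, 3, 5, 16, 8
rng = random.Random(0)
rnn = MotionRNN(NOISE_DIM, MOTION_DIM, rng)
gen = FrameGenerator(CONTENT_DIM, MOTION_DIM, N_PIXELS, rng)

content = [rng.gauss(0.0, 1.0) for _ in range(CONTENT_DIM)]
noise_a = [[rng.gauss(0.0, 1.0) for _ in range(NOISE_DIM)] for _ in range(N_FRAMES)]
noise_b = [[rng.gauss(0.0, 1.0) for _ in range(NOISE_DIM)] for _ in range(N_FRAMES)]

clip_a = generate_clip(rnn, gen, content, noise_a)
clip_b = generate_clip(rnn, gen, content, noise_b)  # same content, different motion
```

Feeding a second set of random variables through the same recurrent network, as `noise_b` does here, yields a different sequence of motion vectors and therefore a different clip for the same content vector.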

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method, comprising: using a first neural network to generate a plurality of motion vectors; and using a second neural network to use the plurality of motion vectors to produce content, wherein the second neural network comprises a generator neural network.
2. The computer-implemented method of claim 1, wherein the first neural network comprises a recurrent neural network.
3. The computer-implemented method of claim 1, further comprising: using a third neural network to use image frames from the content to generate updated information for the second neural network.
4. The computer-implemented method of claim 3, wherein the third neural network comprises a discriminative neural network.
5. The computer-implemented method of claim 3, further comprising: using the third neural network to use sets of sequential frames from the content to generate updated information for the first and the second neural networks.
6. The computer-implemented method of claim 1, wherein the content comprises a sequence of image frames.
7. The computer-implemented method of claim 1, further comprising: passing a first set of variables to the first neural network to generate the plurality of motion vectors; and passing a second set of variables to the first neural network to generate a second set of a plurality of motion vectors that is different from the plurality of vectors to generate additional content.
8. A processor, comprising: one or more arithmetic logic units (ALUs) to use a first neural network to generate a plurality of motion vectors and a second neural network to use the plurality of motion vectors to produce content, wherein the second neural network comprises a generator neural network.
9. The processor of claim 8, wherein the first neural network comprises a recurrent neural network.
10. The processor of claim 8, further comprising one or more ALUs to use a third neural network to use image frames from the content to generate updated information for the second neural network.
11. The processor of claim 10, wherein the third neural network comprises a discriminative neural network.
12. The processor of claim 10, further comprising one or more ALUs to use the third neural network to use sets of sequential frames from the content to generate updated information for the first and the second neural networks.
13. The processor of claim 8, wherein the content comprises a sequence of video frames.
14. The processor of claim 8, further comprising one or more ALUs to: use the first neural network generate additional plurality of motion vectors using different input used to generate the plurality of motion vectors; and use the second neural network to generate additional content by using the additional plurality of vectors.
15. A system, comprising: one or more computers having one or more processors to use a first neural network to generate a plurality of motion vectors and a second neural network to use the plurality of motion vectors to produce content, wherein the second neural network comprises a generator neural network.
16. The system of claim 15, wherein the first neural network comprises a recurrent neural network.
17. The system of claim 15, further comprising one or more computers having one or more processors to use a third neural network to use image frames from the content to generate updated information for the second neural network.
18. The system of claim 17, wherein the third neural network comprises a discriminative neural network.
19. The system of claim 18, further comprising one or more computers having one or more processors to use the third neural network to use sets of sequential frames from the content to generate updated information for the first and the second neural networks.
20. The system of claim 15, wherein the content comprises a sequence of image frames.
21. The system of claim 15, further comprising one or more computers having one or more processors to pass input to the first neural network to generate additional plurality of motion vectors; and use the second neural network to generate additional content using the additional plurality of motion vectors.
22. A non-transitory machine-readable medium having stored thereon a set of instructions, which if performed by one or more processors, cause the one or more processors to use a first neural network to generate a plurality of motion vectors and a second neural network to use the plurality of motion vectors to produce content, wherein the second neural network comprises a generator neural network.
23. The non-transitory machine-readable medium of claim 22, wherein the first neural network comprises a recurrent neural network.
24. The non-transitory machine-readable medium of claim 22, having stored thereon a set of instructions, which if performed by one or more processors, cause the one or more processors to further use a third neural network to use image frames from the content to generate updated information for the second neural network.
25. The non-transitory machine-readable medium of claim 24, wherein the third neural network comprises a discriminative neural network.
26. The non-transitory machine-readable medium of claim 24, having stored thereon a set of instructions, which if performed by one or more processors, cause the one or more processors to further use the third neural network to use sets of sequential frames from the content to generate updated information for the first and the second neural networks.
27. The non-transitory machine-readable medium of claim 24, wherein the content comprises a sequence of image frames.
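The dependent claims describe a third, discriminative network that looks at individual image frames and at sets of sequential frames and produces "updated information" for the other networks. The sketch below is one possible reading of that arrangement, not the claimed implementation. It assumes the updated information is an adversarial loss signal, uses the common non-saturating GAN generator loss as a stand-in, and invents the names `LinearDiscriminator`, `image_d`, `video_d`, and `WINDOW`. The clip is random stand-in data, and no gradient step is performed.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class LinearDiscriminator:
    """Hypothetical discriminative network: one linear layer and a sigmoid."""

    def __init__(self, in_dim, rng):
        self.w = [rng.gauss(0.0, 0.1) for _ in range(in_dim)]

    def score(self, x):
        # Score in (0, 1): the discriminator's belief that x is real.
        return sigmoid(sum(w * v for w, v in zip(self.w, x)))


def generator_loss(score):
    """Non-saturating GAN loss (an assumption): low when the discriminator is fooled."""
    return -math.log(score)


# Assumed toy sizes and stand-in "generated" content; not taken from the patent.
N_PIXELS, N_FRAMES, WINDOW = 16, 8, 3
rng = random.Random(1)
clip = [[rng.uniform(-1.0, 1.0) for _ in range(N_PIXELS)] for _ in range(N_FRAMES)]

image_d = LinearDiscriminator(N_PIXELS, rng)           # judges single frames
video_d = LinearDiscriminator(N_PIXELS * WINDOW, rng)  # judges sequential frames

# Single frame: carries no temporal information, so under this reading the
# resulting signal is useful for the frame generator (claims 3, 10, 17, 24).
frame = rng.choice(clip)
image_loss = generator_loss(image_d.score(frame))

# Window of consecutive frames: depends on how frames change over time, so the
# signal can inform the motion network as well (claims 5, 12, 19, 26).
start = rng.randrange(N_FRAMES - WINDOW + 1)
window = [pixel for f in clip[start:start + WINDOW] for pixel in f]
video_loss = generator_loss(video_d.score(window))
```

In a trained system, these losses would be backpropagated to update the network weights; that step is omitted here because the claims do not specify how the updated information is applied.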

Assignees

Nvidia Corp

Inventors

Classifications

  • G06N3/08 (Primary)

    Learning methods · CPC title

  • Recurrent networks, e.g. Hopfield networks · CPC title

  • Probabilistic or stochastic networks · CPC title

  • characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU] · CPC title

  • Generative networks · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12506897B2 cover?
A method, computer readable medium, and system are disclosed for action video generation. The method includes the steps of generating, by a recurrent neural network, a sequence of motion vectors from a first set of random variables and receiving, by a generator neural network, the sequence of motion vectors and a content vector sample. The sequence of motion vectors and the content vector sampl…
Who is the assignee on this patent?
Nvidia Corp
What technology area does this patent fall under?
Primary CPC classification G06N3/08. Mapped technology areas include Physics.
When was this patent published?
Publication date Dec 23, 2025 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).