What technology area does this patent fall under?

Primary CPC classification G06V40/20. Mapped technology areas include Physics.

When was this patent published?

Publication date: Jul 17, 2025 (A1). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Scene-aware synthetic human motion generation using neural networks

US2025232506A1 · US · A1

Patent metadata
Field	Value
Publication number	US-2025232506-A1
Application number	US-202418415496-A
Country	US
Kind code	A1
Filing date	Jan 17, 2024
Priority date	Jan 17, 2024
Publication date	Jul 17, 2025
Grant date	—

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title: What the patent document calls the invention.
Abstract: A short plain-language summary of the technical disclosure.
Assignees and inventors: Who owns or filed the patent and who is credited as inventor.
Key dates: Filing, priority, publication, and grant dates set the timeline.
First independent claim: The legal scope of protection; read this for what is actually claimed.
CPC / IPC classifications: Technology tags used to group this patent with similar filings.
Citations and related patents: Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A motion diffusion model may be pre-trained on motion data, and a scene-aware component (e.g., one or more layers of a neural network) may be connected and used to extract and inject a representation of scene information into the pre-trained motion diffusion model. For example, to predict orientations of joint waypoints along a path through a particular 3D scene, a scene-aware input channel that accepts a representation of the 3D structure of the scene may be added to a pre-trained motion diffusion model. To predict orientations of joint waypoints along a path that interacts with a 3D object in the 3D scene, a scene-aware input channel that accepts a representation of the 3D object and/or a surface thereof may be added to a pre-trained motion diffusion model. As such, the resulting scene-aware motion diffusion model(s) may be tuned on motion-scene data and used to generate human motion.
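The idea of attaching a scene-aware input channel to an already trained motion model can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the dimensions, the weight names, the single-layer "denoiser", and especially the zero-initialized injection layer, which is one common way to add a conditioning branch without disturbing a pre-trained network. The abstract does not say the patent uses this scheme.

```python
import numpy as np

rng = np.random.default_rng(0)

MOTION_DIM = 6   # hypothetical per-frame motion feature size
SCENE_DIM = 4    # hypothetical scene feature size (e.g., a flattened height-map patch)
HIDDEN = 8       # hypothetical hidden width

# Stand-in for the pre-trained motion denoiser; these weights are treated as frozen.
W_in = rng.normal(size=(HIDDEN, MOTION_DIM))
W_out = rng.normal(size=(MOTION_DIM, HIDDEN))

# Scene-aware component: a small encoder plus an injection layer.
# Assumption: the injection layer starts at zero, so the combined model
# initially reproduces the pre-trained model exactly.
W_scene = rng.normal(size=(HIDDEN, SCENE_DIM))
W_inject = np.zeros((HIDDEN, HIDDEN))


def pretrained_denoiser(noisy_motion):
    """Toy denoiser that sees motion only."""
    hidden = np.tanh(W_in @ noisy_motion)
    return W_out @ hidden


def scene_aware_denoiser(noisy_motion, scene_features):
    """Same denoiser, with encoded scene features added to the hidden state."""
    hidden = np.tanh(W_in @ noisy_motion)
    scene_hidden = np.tanh(W_scene @ scene_features)
    hidden = hidden + W_inject @ scene_hidden
    return W_out @ hidden


noisy_motion = rng.normal(size=MOTION_DIM)
scene_features = rng.normal(size=SCENE_DIM)

base_out = pretrained_denoiser(noisy_motion)
scene_out = scene_aware_denoiser(noisy_motion, scene_features)
unchanged_before_tuning = bool(np.allclose(base_out, scene_out))
```

In this sketch, tuning on motion-scene data would update `W_scene` and `W_inject` (and optionally the base weights), after which the scene features start to influence the predicted motion.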

First claim

Opening claim text (preview).

What is claimed is:
1. A processor comprising: one or more processing units to generate, based at least on processing a representation of at least a portion of a three-dimensional (3D) scene using a diffusion model comprising a scene-aware component and a pre-trained motion diffusion model, a representation of scene-aware motion comprising one or more orientations of one or more joint waypoints along one or more paths of a character at least partially depicted in the 3D scene.
2. The processor of claim 1, wherein the one or more processing units are further to generate the diffusion model based at least on adding the scene-aware component to the pre-trained motion diffusion model and tuning the diffusion model using motion-scene training data.
3. The processor of claim 1, wherein the processing using the diffusion model comprises injecting a top-down height map of the 3D scene into the pre-trained motion diffusion model.
4. The processor of claim 1, wherein the processing using the diffusion model comprises injecting a 3D point cloud representing at least a portion of a 3D object in the 3D scene into the pre-trained motion diffusion model.
5. The processor of claim 1, wherein the processing using the diffusion model comprises injecting classification data representing one or more classified locations of one or more classified objects in the 3D scene into the pre-trained motion diffusion model.
6. The processor of claim 1, wherein the processing using the diffusion model comprises injecting classification data representing one or more classified locations of one or more other characters or one or more audio sources in the 3D scene into the pre-trained motion diffusion model.
7. The processor of claim 1, wherein the one or more processing units are further to update the diffusion model using training data generated based at least on retargeting motion data comprising one or more contact locations with a first object to one or more corresponding locations where a modeled body surface makes contact with a target object.
8. The processor of claim 1, wherein the processor is comprised in at least one of: a system for performing simulation operations; a system for performing digital twin operations; a system for performing light transport simulation; a system for performing collaborative content creation for 3D assets; a system for performing deep learning operations; a system for performing remote operations; a system for performing real-time streaming; a system for generating or presenting one or more of augmented reality content, virtual reality content, or mixed reality content; a system implemented using an edge device; a system implemented using a robot; a system for generating synthetic data; a system for generating synthetic data using AI; a system incorporating one or more virtual machines (VMs); a system implemented at least partially in a data center; or a system implemented at least partially using cloud computing resources.
9. A system comprising one or more processing units to generate, based at least on processing a representation of at least a portion of a three-dimensional (3D) scene using a diffusion model, a representation of scene-aware motion corresponding to a character at least partially depicted in the 3D scene.
10. The system of claim 9, wherein the one or more processing units are further to generate the diffusion model based at least on adding a scene-aware component to a pre-trained motion diffusion model and tuning the diffusion model using motion-scene training data.
11. The system of claim 9, wherein the processing using the diffusion model comprises injecting a top-down height map of the 3D scene into a pre-trained motion diffusion model of the diffusion model.
12. The system of claim 9, wherein the processing using the diffusion model comprises injecting a 3D point cloud representing at least a portion of a 3D object in the 3D scene into a pre-trained motion diffusion model of the diffusion model.
13. The system of claim 9, wherein the processing using the diffusion model comprises injecting classification data representing one or more classified locations of one or more classified objects in the 3D scene into a pre-trained motion diffusion model of the diffusion model.
14. The system of claim 9, wherein the processing using the diffusion model comprises injecting classification data representing one or more classified locations of one or more other characters or one or more audio sources in the 3D scene into a pre-trained motion diffusion model of the diffusion model.
15. The system of claim 9, wherein the one or more processing units are further to update the diffusion model using training data generated based at least on retargeting motion data comprising one or more contact locations with a first object to one or more corresponding locations where a surface of a modeled body makes contact with a target object.
16. The system of claim 9, wherein the system is comprised in at least one of: a system for performing simulation operations; a system for performing digital twin operations; a system for performing light transport simulation; a system for performing collaborative content creation for 3D assets; a system for performing deep learning operations; a system for performing remote operations; a system for performing real-time streaming; a system for generating or presenting one or more of augmented reality content, virtual reality content, or mixed reality content; a system implemented using an edge device; a system implemented using a robot; a system for generating synthetic data; a system for generating synthetic data using AI; a system incorporating one or more virtual machines (VMs); a system implemented at least partially in a data center; or a system implemented at least partially using cloud computing resources.
17. A method comprising: generating, based at least on injecting a representation of at least a portion of a three-dimensional (3D) scene into a pre-trained diffusion model, a representation of one or more orientations of one or more waypoints along one or more paths of a character in the 3D scene.
18. The method of claim 17, further comprising generating a diffusion model based at least on adding a scene-aware component to the pre-trained diffusion model and tuning the diffusion model using motion-scene training data.
19. The method of claim 17, further comprising updating a diffusion model comprising the pre-trained diffusion model using training data generated based at least on retargeting motion data comprising one or more contact locations with a first object to one or more corresponding locations where a modeled body surface makes contact with a target object.
20. The method of claim 17, wherein the method is performed by at least one of: a system for performing simulation operations; a system for performing digital twin operations; a system for performing light transport simulation; a system for performing collaborative content creation for 3D assets; a system for performing deep learning operations; a system for performing remote operations; a system for performing real-time streaming; a system for generating or presenting one or more of augmented reality content, virtual reality content, or mixed reality content; a system implemented using an edge device; a system implemented using a robot; a system for generating synthetic data;
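Several claims refer to injecting a top-down height map of the 3D scene. As a rough illustration of what such an input could look like, the sketch below rasterizes a set of 3D points into a grid that stores the maximum height per cell. The function name, the z-up convention, the cell size, and the grid layout are all assumptions; the claims do not specify how the height map is built.

```python
import numpy as np


def top_down_height_map(points, cell_size, grid_shape, origin=(0.0, 0.0), floor=0.0):
    """Rasterize (N, 3) points into a (rows, cols) grid of maximum heights.

    Assumes x and y span the ground plane and z points up (hypothetical convention).
    Cells with no points keep the `floor` value.
    """
    points = np.asarray(points, dtype=float)
    rows, cols = grid_shape
    height = np.full(grid_shape, floor, dtype=float)

    # Map ground-plane coordinates to integer cell indices.
    col = np.floor((points[:, 0] - origin[0]) / cell_size).astype(int)
    row = np.floor((points[:, 1] - origin[1]) / cell_size).astype(int)

    # Drop points that fall outside the grid.
    inside = (row >= 0) & (row < rows) & (col >= 0) & (col < cols)

    # Unbuffered in-place maximum, so repeated cell indices are handled correctly.
    np.maximum.at(height, (row[inside], col[inside]), points[inside, 2])
    return height


scene_points = [
    [0.2, 0.2, 0.5],  # two points land in the same cell; the taller one wins
    [0.3, 0.1, 0.9],
    [1.5, 0.5, 0.4],
    [5.0, 5.0, 2.0],  # outside the 2x2 grid, ignored
]
height_map = top_down_height_map(scene_points, cell_size=1.0, grid_shape=(2, 2))
```

A grid like `height_map` is the kind of 2D array that a scene encoder could consume; a 3D point cloud of an object, as in claims 4 and 12, would instead be passed as the raw `(N, 3)` array.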

Assignees

Nvidia Corp

Inventors

Classifications

G06N3/08
Learning methods · CPC title
G06N3/0464
Convolutional networks [CNN, ConvNet] · CPC title
G06V10/82
using neural networks · CPC title
G06V10/774
Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting · CPC title
G06V10/26
Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion · CPC title
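CPC symbols such as G06V40/20 have a fixed structure: a section letter, a two-digit class, a subclass letter, then a main group and subgroup separated by a slash. Section G is titled Physics in the CPC scheme. The sketch below splits a symbol into those parts; the `parse_cpc` helper and the `CpcSymbol` type are hypothetical names, and it is only an assumption that this page derives its "technology area" from the section letter.

```python
import re
from typing import NamedTuple

# CPC section titles (Y is abbreviated here).
CPC_SECTIONS = {
    "A": "Human Necessities",
    "B": "Performing Operations; Transporting",
    "C": "Chemistry; Metallurgy",
    "D": "Textiles; Paper",
    "E": "Fixed Constructions",
    "F": "Mechanical Engineering; Lighting; Heating; Weapons; Blasting",
    "G": "Physics",
    "H": "Electricity",
    "Y": "General Tagging of New Technological Developments",
}

CPC_PATTERN = re.compile(r"^([A-HY])(\d{2})([A-Z])(\d{1,4})/(\d{2,6})$")


class CpcSymbol(NamedTuple):
    section: str
    section_title: str
    cpc_class: str
    subclass: str
    main_group: str
    subgroup: str


def parse_cpc(symbol: str) -> CpcSymbol:
    """Split a CPC symbol like 'G06V40/20' into its hierarchical parts."""
    match = CPC_PATTERN.match(symbol.strip())
    if match is None:
        raise ValueError(f"not a recognized CPC symbol: {symbol!r}")
    section, class_digits, subclass_letter, main_group, subgroup = match.groups()
    return CpcSymbol(
        section=section,
        section_title=CPC_SECTIONS[section],
        cpc_class=section + class_digits,
        subclass=section + class_digits + subclass_letter,
        main_group=main_group,
        subgroup=subgroup,
    )


primary = parse_cpc("G06V40/20")
```

Here `primary.section_title` is `"Physics"` and `primary.subclass` is `"G06V"`, which matches the technology area stated for the primary classification on this page.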

Patent family

Related publications grouped by family.

View patent family 96172738

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2025232506A1 cover?: A motion diffusion model may be pre-trained on motion data, and a scene-aware component (e.g., one or more layers of a neural network) may be connected and used to extract and inject a representation of scene information into the pre-trained motion diffusion model. For example, to predict orientations of joint waypoints along a path through a particular 3D scene, a scene-aware input channel tha…
Who is the assignee on this patent?: Nvidia Corp