US2025232506A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2025232506-A1 |
| Application number | US-202418415496-A |
| Country | US |
| Kind code | A1 |
| Filing date | Jan 17, 2024 |
| Priority date | Jan 17, 2024 |
| Publication date | Jul 17, 2025 |
| Grant date | — |
Official abstract text for this publication.
A motion diffusion model may be pre-trained on motion data, and a scene-aware component (e.g., one or more layers of a neural network) may be connected and used to extract and inject a representation of scene information into the pre-trained motion diffusion model. For example, to predict orientations of joint waypoints along a path through a particular 3D scene, a scene-aware input channel that accepts a representation of the 3D structure of the scene may be added to a pre-trained motion diffusion model. To predict orientations of joint waypoints along a path that interacts with a 3D object in the 3D scene, a scene-aware input channel that accepts a representation of the 3D object and/or a surface thereof may be added to a pre-trained motion diffusion model. As such, the resulting scene-aware motion diffusion model(s) may be tuned on motion-scene data and used to generate human motion.
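The arrangement described in the abstract can be illustrated with a toy sketch: a frozen, pre-trained denoiser is wrapped together with a small scene-aware component whose output is injected into the denoiser's input. Everything here is hypothetical and for illustration only: the class names (`PretrainedMotionDenoiser`, `SceneAwareComponent`, `SceneAwareDenoiser`), the single linear layer standing in for the denoiser, and the additive injection are assumptions, not details taken from the publication. The zero initialization of the scene-aware weights is likewise an assumed design choice, made so that the combined model starts out reproducing the pre-trained behavior before any tuning on motion-scene data.

```python
import random


def linear(weights, bias, x):
    """Apply y = W x + b, with W as a list of rows and x as a flat list."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


class PretrainedMotionDenoiser:
    """Toy stand-in for a pre-trained motion diffusion denoiser (hypothetical)."""

    def __init__(self, dim, seed=0):
        rng = random.Random(seed)
        # Pretend these weights came from pre-training on motion data.
        self.w = [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(dim)]
        self.b = [0.0] * dim

    def __call__(self, noisy_motion, injected=None):
        x = noisy_motion
        if injected is not None:
            # Additive injection of scene features (an assumption of this sketch).
            x = [a + s for a, s in zip(x, injected)]
        return linear(self.w, self.b, x)


class SceneAwareComponent:
    """Maps a flattened scene representation to features for injection."""

    def __init__(self, scene_dim, motion_dim):
        # Zero-initialized (assumed) so the wrapped model initially
        # behaves exactly like the pre-trained denoiser.
        self.w = [[0.0] * scene_dim for _ in range(motion_dim)]
        self.b = [0.0] * motion_dim

    def __call__(self, scene):
        return linear(self.w, self.b, scene)


class SceneAwareDenoiser:
    """Pre-trained denoiser plus a scene-aware input channel."""

    def __init__(self, base, scene_component):
        self.base = base
        self.scene_component = scene_component

    def __call__(self, noisy_motion, scene):
        return self.base(noisy_motion, self.scene_component(scene))


base = PretrainedMotionDenoiser(dim=3)
component = SceneAwareComponent(scene_dim=4, motion_dim=3)
model = SceneAwareDenoiser(base, component)

motion = [0.1, -0.2, 0.3]      # toy noisy motion features
scene = [1.0, 0.5, 0.0, 2.0]   # toy flattened scene representation
untuned_output = model(motion, scene)
```

In this sketch, tuning would update only `component.w` and `component.b` (and possibly the base weights); once those weights are non-zero, the scene representation changes the denoiser's prediction.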
Opening claim text (preview).
What is claimed is:
1. A processor comprising: one or more processing units to generate, based at least on processing a representation of at least a portion of a three-dimensional (3D) scene using a diffusion model comprising a scene-aware component and a pre-trained motion diffusion model, a representation of scene-aware motion comprising one or more orientations of one or more joint waypoints along one or more paths of a character at least partially depicted in the 3D scene.
2. The processor of claim 1, wherein the one or more processing units are further to generate the diffusion model based at least on adding the scene-aware component to the pre-trained motion diffusion model and tuning the diffusion model using motion-scene training data.
3. The processor of claim 1, wherein the processing using the diffusion model comprises injecting a top-down height map of the 3D scene into the pre-trained motion diffusion model.
4. The processor of claim 1, wherein the processing using the diffusion model comprises injecting a 3D point cloud representing at least a portion of a 3D object in the 3D scene into the pre-trained motion diffusion model.
5. The processor of claim 1, wherein the processing using the diffusion model comprises injecting classification data representing one or more classified locations of one or more classified objects in the 3D scene into the pre-trained motion diffusion model.
6. The processor of claim 1, wherein the processing using the diffusion model comprises injecting classification data representing one or more classified locations of one or more other characters or one or more audio sources in the 3D scene into the pre-trained motion diffusion model.
7. The processor of claim 1, wherein the one or more processing units are further to update the diffusion model using training data generated based at least on retargeting motion data comprising one or more contact locations with a first object to one or more corresponding locations where a modeled body surface makes contact with a target object.
8. The processor of claim 1, wherein the processor is comprised in at least one of: a system for performing simulation operations; a system for performing digital twin operations; a system for performing light transport simulation; a system for performing collaborative content creation for 3D assets; a system for performing deep learning operations; a system for performing remote operations; a system for performing real-time streaming; a system for generating or presenting one or more of augmented reality content, virtual reality content, or mixed reality content; a system implemented using an edge device; a system implemented using a robot; a system for generating synthetic data; a system for generating synthetic data using AI; a system incorporating one or more virtual machines (VMs); a system implemented at least partially in a data center; or a system implemented at least partially using cloud computing resources.
9. A system comprising one or more processing units to generate, based at least on processing a representation of at least a portion of a three-dimensional (3D) scene using a diffusion model, a representation of scene-aware motion corresponding to a character at least partially depicted in the 3D scene.
10. The system of claim 9, wherein the one or more processing units are further to generate the diffusion model based at least on adding a scene-aware component to a pre-trained motion diffusion model and tuning the diffusion model using motion-scene training data.
11. The system of claim 9, wherein the processing using the diffusion model comprises injecting a top-down height map of the 3D scene into a pre-trained motion diffusion model of the diffusion model.
12. The system of claim 9, wherein the processing using the diffusion model comprises injecting a 3D point cloud representing at least a portion of a 3D object in the 3D scene into a pre-trained motion diffusion model of the diffusion model.
13. The system of claim 9, wherein the processing using the diffusion model comprises injecting classification data representing one or more classified locations of one or more classified objects in the 3D scene into a pre-trained motion diffusion model of the diffusion model.
14. The system of claim 9, wherein the processing using the diffusion model comprises injecting classification data representing one or more classified locations of one or more other characters or one or more audio sources in the 3D scene into a pre-trained motion diffusion model of the diffusion model.
15. The system of claim 9, wherein the one or more processing units are further to update the diffusion model using training data generated based at least on retargeting motion data comprising one or more contact locations with a first object to one or more corresponding locations where a surface of a modeled body makes contact with a target object.
16. The system of claim 9, wherein the system is comprised in at least one of: a system for performing simulation operations; a system for performing digital twin operations; a system for performing light transport simulation; a system for performing collaborative content creation for 3D assets; a system for performing deep learning operations; a system for performing remote operations; a system for performing real-time streaming; a system for generating or presenting one or more of augmented reality content, virtual reality content, or mixed reality content; a system implemented using an edge device; a system implemented using a robot; a system for generating synthetic data; a system for generating synthetic data using AI; a system incorporating one or more virtual machines (VMs); a system implemented at least partially in a data center; or a system implemented at least partially using cloud computing resources.
17. A method comprising: generating, based at least on injecting a representation of at least a portion of a three-dimensional (3D) scene into a pre-trained diffusion model, a representation of one or more orientations of one or more waypoints along one or more paths of a character in the 3D scene.
18. The method of claim 17, further comprising generating a diffusion model based at least on adding a scene-aware component to the pre-trained diffusion model and tuning the diffusion model using motion-scene training data.
19. The method of claim 17, further comprising updating a diffusion model comprising the pre-trained diffusion model using training data generated based at least on retargeting motion data comprising one or more contact locations with a first object to one or more corresponding locations where a modeled body surface makes contact with a target object.
20. The method of claim 17, wherein the method is performed by at least one of: a system for performing simulation operations; a system for performing digital twin operations; a system for performing light transport simulation; a system for performing collaborative content creation for 3D assets; a system for performing deep learning operations; a system for performing remote operations; a system for performing real-time streaming; a system for generating or presenting one or more of augmented reality content, virtual reality content, or mixed reality content; a system implemented using an edge device; a system implemented using a robot; a system for generating synthetic data;
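Claims 3 and 11 refer to a top-down height map of the 3D scene as one representation that may be injected into the model. As a rough illustration of what such a representation could look like, the sketch below rasterizes a set of scene points into a grid holding the maximum height seen in each cell. The function name `top_down_height_map`, the z-up convention, the fixed cell size, and the default floor height are all assumptions made for this example; the claims do not specify how the height map is built.

```python
import math


def top_down_height_map(points, x_range, y_range, cell_size, floor=0.0):
    """Rasterize (x, y, z) points into a grid of per-cell maximum heights.

    Assumes z is the up axis. Cells with no points keep the `floor` height.
    Returns a list of rows (indexed by y), each a list of heights (indexed by x).
    """
    x_min, x_max = x_range
    y_min, y_max = y_range
    cols = int(math.ceil((x_max - x_min) / cell_size))
    rows = int(math.ceil((y_max - y_min) / cell_size))
    grid = [[floor] * cols for _ in range(rows)]

    for x, y, z in points:
        col = math.floor((x - x_min) / cell_size)
        row = math.floor((y - y_min) / cell_size)
        # Ignore points that fall outside the mapped region.
        if 0 <= row < rows and 0 <= col < cols:
            grid[row][col] = max(grid[row][col], z)
    return grid


# Toy scene: two points share a cell, one sits in a neighbor, one is out of range.
scene_points = [
    (0.5, 0.5, 1.0),
    (0.6, 0.4, 2.0),
    (1.5, 0.5, 0.3),
    (5.0, 5.0, 9.0),
]
height_map = top_down_height_map(
    scene_points, x_range=(0.0, 2.0), y_range=(0.0, 2.0), cell_size=1.0
)
```

A grid like `height_map` could then be flattened or passed through an encoder before being fed to a scene-aware input channel; that step is outside the scope of this sketch.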
Learning methods · CPC title
Convolutional networks [CNN, ConvNet] · CPC title
using neural networks · CPC title
Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting · CPC title
Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion · CPC title