Generating three-dimensional human models representing two-dimensional humans in two-dimensional images

US12499574B2 · US · B2

Patent metadata
Field               Value
Publication number  US-12499574-B2
Application number  US-202318304144-A
Country             US
Kind code           B2
Filing date         Apr 20, 2023
Priority date       Oct 6, 2022
Publication date    Dec 16, 2025
Grant date          Dec 16, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The present disclosure relates to systems, methods, and non-transitory computer-readable media that modify two-dimensional images via scene-based editing using three-dimensional representations of the two-dimensional images. For instance, in one or more embodiments, the disclosed systems utilize three-dimensional representations of two-dimensional images to generate and modify shadows in the two-dimensional images according to various shadow maps. Additionally, the disclosed systems utilize three-dimensional representations of two-dimensional images to modify humans in the two-dimensional images. The disclosed systems also utilize three-dimensional representations of two-dimensional images to provide scene scale estimation via scale fields of the two-dimensional images. In some embodiments, the disclosed systems utilize three-dimensional representations of two-dimensional images to generate and visualize 3D planar surfaces for modifying objects in two-dimensional images. The disclosed systems further use three-dimensional representations of two-dimensional images to customize focal points for the two-dimensional images.

First claim

Opening claim text (preview).

What is claimed is:

1. A system comprising: one or more memory devices comprising a two-dimensional image; and one or more processors configured to cause the system to: extract, utilizing one or more neural networks, two-dimensional pose data corresponding to a two-dimensional skeleton with two-dimensional bones for a two-dimensional human extracted from the two-dimensional image; extract, utilizing the one or more neural networks, three-dimensional pose data and three-dimensional shape data corresponding to a three-dimensional skeleton for the two-dimensional human extracted from the two-dimensional image; and generate, within a three-dimensional space corresponding to the two-dimensional image, a three-dimensional human model representing the two-dimensional human by refining the three-dimensional skeleton of the three-dimensional pose data according to the two-dimensional skeleton of the two-dimensional pose data and the three-dimensional shape data.

2. The system of claim 1, wherein the one or more processors are configured to cause the system to: extract the two-dimensional pose data from the two-dimensional image utilizing a first neural network of the one or more neural networks; and extract the three-dimensional pose data and the three-dimensional shape data utilizing a second neural network of the one or more neural networks.

3. The system of claim 1, wherein the one or more processors are configured to cause the system to extract the three-dimensional pose data by: generating a body bounding box corresponding to a body portion of the two-dimensional human; and extracting, utilizing a neural network, three-dimensional pose data corresponding to the body portion of the two-dimensional human according to the body bounding box.

4. The system of claim 3, wherein the one or more processors are configured to cause the system to extract the three-dimensional pose data by: generating one or more hand bounding boxes corresponding to one or more hands of the two-dimensional human; and extracting, utilizing an additional neural network, additional three-dimensional pose data corresponding to the one or more hands of the two-dimensional human according to the one or more hand bounding boxes.

5. The system of claim 4, wherein the one or more processors are configured to cause the system to generate the three-dimensional human model by combining the three-dimensional pose data corresponding to the body portion of the two-dimensional human with the additional three-dimensional pose data corresponding to the one or more hands of the two-dimensional human.

6. The system of claim 1, wherein the one or more processors are configured to cause the system to generate the three-dimensional human model by iteratively modifying positions of bones in the three-dimensional skeleton based on positions of bones in the two-dimensional skeleton.

7. The system of claim 1, wherein the one or more processors are configured to cause the system to generate a modified two-dimensional image by: modifying a pose of the three-dimensional human model within the three-dimensional space; generating a modified pose of the two-dimensional human within the two-dimensional image according to the pose of the three-dimensional human model in the three-dimensional space; and generating, utilizing the one or more neural networks, the modified two-dimensional image comprising a modified two-dimensional human according to the modified pose of the two-dimensional human and a camera position associated with the two-dimensional image.

8. A non-transitory computer readable medium storing executable instructions which, when executed by a processing device, cause the processing device to perform operations comprising: extracting, utilizing one or more neural networks, two-dimensional pose data, comprising a two-dimensional skeleton with two-dimensional bones, from a two-dimensional human extracted from a two-dimensional image; extracting, utilizing the one or more neural networks, three-dimensional pose data and three-dimensional shape data corresponding to the two-dimensional human extracted from the two-dimensional image; and generating, within a three-dimensional space corresponding to the two-dimensional image, a three-dimensional human model representing the two-dimensional human by combining the two-dimensional pose data with the three-dimensional pose data and the three-dimensional shape data.

9. The non-transitory computer readable medium of claim 8, wherein: extracting the two-dimensional pose data comprises extracting a two-dimensional skeleton from a cropped portion of the two-dimensional image utilizing a first neural network of the one or more neural networks; and extracting the three-dimensional pose data comprises extracting a three-dimensional skeleton from the cropped portion of the two-dimensional image utilizing a second neural network of the one or more neural networks.

10. The non-transitory computer readable medium of claim 8, wherein extracting the three-dimensional pose data comprises: extracting a first three-dimensional skeleton corresponding to a first portion of the two-dimensional human utilizing a first neural network; and extracting a second three-dimensional skeleton corresponding to a second portion of the two-dimensional human comprising a hand utilizing a second neural network.

11. The non-transitory computer readable medium of claim 10, wherein generating the three-dimensional human model comprises: iteratively modifying positions of bones of the second three-dimensional skeleton according to positions of bones of the first three-dimensional skeleton within the three-dimensional space to merge the first three-dimensional skeleton and the second three-dimensional skeleton; and iteratively modifying positions of bones in the first three-dimensional skeleton according to positions of bones of a two-dimensional skeleton from the two-dimensional pose data.

12. A computer-implemented method comprising: extracting, by at least one processor utilizing one or more neural networks, two-dimensional pose data, comprising a two-dimensional skeleton with two-dimensional bones, from a two-dimensional human extracted from a two-dimensional image; extracting, by the at least one processor utilizing the one or more neural networks, three-dimensional pose data and three-dimensional shape data corresponding to the two-dimensional human extracted from the two-dimensional image; and generating, by the at least one processor and within a three-dimensional space corresponding to the two-dimensional image, a three-dimensional human model representing the two-dimensional human by combining the two-dimensional pose data with the three-dimensional pose data and the three-dimensional shape data.

13. The computer-implemented method of claim 12, wherein extracting the two-dimensional pose data comprises extracting, utilizing a first neural network, the two-dimensional pose data comprising a two-dimensional skeleton with two-dimensional bones and annotations indicating one or more portions of the two-dimensional skeleton.

14. The computer-implemented method of claim 13, wherein extracting the three-dimensional pose data and the three-dimensional shape data comprises extracting, utilizing a second neural network, the three-dimensional pose data comprising a three-dimensional skeleton with three-dimensional bones and the three-dimensional shape data comprising a three-dimensional mesh according to the two-dimensional human.

15. The computer-implemented method of claim 14, where
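Illustrative sketch (not part of the patent text). Claim 6 recites iteratively modifying positions of bones in the three-dimensional skeleton based on positions of bones in the two-dimensional skeleton, but the claim preview on this page does not say how that is done. The Python below is only a minimal sketch of one way such a refinement could look. It assumes a pinhole camera with a hypothetical focal length, represents each bone by its joint endpoints as plain coordinate tuples, and holds depth fixed. The names project and refine_skeleton are made up for this example and do not come from the patent.

```python
from typing import List, Tuple

Vec2 = Tuple[float, float]
Vec3 = Tuple[float, float, float]


def project(joint: Vec3, focal: float = 1.0) -> Vec2:
    """Pinhole projection of a camera-space joint (assumed camera model)."""
    x, y, z = joint
    return (focal * x / z, focal * y / z)


def refine_skeleton(
    joints_3d: List[Vec3],
    joints_2d: List[Vec2],
    focal: float = 1.0,
    steps: int = 50,
    damping: float = 0.5,
) -> List[Vec3]:
    """Nudge each 3D joint until its projection matches the paired 2D joint.

    Depth (z) is left unchanged, because a single view does not constrain it.
    """
    if len(joints_3d) != len(joints_2d):
        raise ValueError("skeletons must have the same number of joints")
    refined = [list(j) for j in joints_3d]
    for _ in range(steps):
        for joint, (u_target, v_target) in zip(refined, joints_2d):
            u, v = project((joint[0], joint[1], joint[2]), focal)
            # Map the image-space error back to camera space at this depth
            # and remove a fraction of it (a damped correction step).
            joint[0] -= damping * (u - u_target) * joint[2] / focal
            joint[1] -= damping * (v - v_target) * joint[2] / focal
    return [(j[0], j[1], j[2]) for j in refined]


# Hypothetical example data: two joints and the 2D positions they should match.
initial = [(0.0, 0.0, 2.0), (0.5, 1.0, 2.5)]
targets = [(0.1, -0.05), (0.3, 0.35)]
refined = refine_skeleton(initial, targets)
```

Each pass halves the remaining image-space error for every joint, so the projections settle onto the 2D targets within a few dozen iterations. A practical system would presumably also need to keep bone lengths and joint limits plausible and to use the shape data, none of which this sketch attempts.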

Assignees

Inventors

Classifications

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12499574B2 cover?
The present disclosure relates to systems, methods, and non-transitory computer-readable media that modify two-dimensional images via scene-based editing using three-dimensional representations of the two-dimensional images. For instance, in one or more embodiments, the disclosed systems utilize three-dimensional representations of two-dimensional images to generate and modify shadows in the tw…
Who is the assignee on this patent?
Adobe Inc
What technology area does this patent fall under?
Primary CPC classification G06T7/73. Mapped technology areas include Physics.
When was this patent published?
Publication date Dec 16, 2025 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).