Holographic calling for artificial reality

US12554221B2 · US · B2

Patent metadata
Field               Value
Publication number  US-12554221-B2
Application number  US-202117360735-A
Country             US
Kind code           B2
Filing date         Jun 28, 2021
Priority date       Jun 28, 2021
Publication date    Feb 17, 2026
Grant date          Feb 17, 2026

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A holographic calling system can capture and encode holographic data at a sender-side of a holographic calling pipeline and decode and present the holographic data as a 3D representation of a sender at a receiver-side of the holographic calling pipeline. The holographic calling pipeline can include stages to capture audio, color images, and depth images; densify the depth images to have a depth value for each pixel while generating parts masks and a body model; use the masks to segment the images into parts needed for hologram generation; convert depth images into a 3D mesh; paint the 3D mesh with color data; perform torso disocclusion; perform face reconstruction; and perform audio synchronization. In various implementations, different of these stages can be performed sender-side or receiver side. The holographic calling pipeline also includes sender-side compression, transmission over a communication channel, and receiver-side decompression and hologram output.
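The abstract notes that the stages can be divided between sender and receiver in different ways. As a rough illustration only, the sketch below lists the stages in the order the abstract names them and cuts the list at one point, placing compression, transmission, and decompression at the boundary. The stage identifiers, the single split point, and the function `split_pipeline` are hypothetical names chosen for this example; the patent does not prescribe them.

```python
# Stage names paraphrase the abstract. Their identifiers, this ordering as a
# flat list, and the single split index are assumptions for illustration.
STAGES = [
    "capture",                # audio, color images, depth images
    "densify_segment_model",  # dense depth, parts masks, body model
    "segment_parts",          # use masks to cut images into needed parts
    "mesh_generation",        # depth images -> 3D mesh
    "mesh_painting",          # apply color data to the mesh
    "torso_disocclusion",
    "face_reconstruction",
    "audio_sync",
]


def split_pipeline(split_after):
    """Return (sender_stages, receiver_stages) for a given split point.

    Everything up to and including `split_after` runs on the sender; the
    remaining stages run on the receiver.
    """
    idx = STAGES.index(split_after) + 1
    sender = STAGES[:idx] + ["compress", "transmit"]
    receiver = ["decompress"] + STAGES[idx:] + ["output_hologram"]
    return sender, receiver


if __name__ == "__main__":
    sender, receiver = split_pipeline("densify_segment_model")
    print("sender:  ", sender)
    print("receiver:", receiver)
```

Splitting after `densify_segment_model` roughly matches claim 1, in which the sender transmits the densified depth image, the color image, the segmenting masks, and the body model.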

First claim

Opening claim text (preview).

We claim:

1. A method for adjusting one or more images of a sending user in a holographic call by densification, segmentation, and body modeling, the method comprising: receiving, by at least one physical processor, a depth image captured by a depth sensor on a sender device; receiving, by the at least one physical processor, a color image captured by a camera on the sender device; providing, by the at least one physical processor, the depth image and the color image as input to a machine learning model configured to generate a densified version of the depth image, segment the densified version of the depth image and the color image to generate segmenting masks that include a first mask identifying the sending user and a second mask identifying an XR device worn by the sending user, and generate a body model of a current pose of the sending user; and transmitting, by the at least one physical processor and across a network to a receiver device, the densified version of the depth image, the color image, the segmenting masks, and the body model.

2. The method of claim 1, further comprising configuring depth and color data for application to the machine learning model at least in part by converting the color data to grayscale.

3. The method of claim 1, further comprising configuring depth and color data for application to the machine learning model at least in part by: assigning an area of interest by: applying a foreground mask determined for a previous frame to obtain an expected foreground area; determining a buffer zone around the expected foreground area based on one or more of: a framerate, a determined speed of movement for the sending user, and/or a determined expected movement range of parts of the sending user; and expanding the expected foreground area by the buffer zone to obtain the area of interest; and removing or downscaling portions of the color data that are not in the area of interest.

4. The method of claim 1, further comprising configuring depth and color data for application to the machine learning model at least in part by: assigning an area of interest by: applying a foreground mask determined for a previous frame to obtain an expected foreground area; determining a buffer zone around the expected foreground area; and expanding the expected foreground area by the buffer zone to obtain the area of interest; and removing or downscaling portions of the color data that are not in the area of interest.

5. The method of claim 1, further comprising configuring depth and color data for application to the machine learning model at least in part by removing or downscaling portions of the color data that are not in an area of interest.

6. The method of claim 1, further comprising: receiving, by the at least one physical processor, a different color image captured by another camera on the sender device, wherein the color image and the different color image are captured simultaneously for a same image frame; and providing, by the at least one physical processor, the different color image as input to the machine learning model, wherein the machine learning model is further configured to remove all portions from the color image that do not overlap with any portions of the different color image.

7. The method of claim 1, wherein an output of the machine learning model from one or more previous frames includes both stored backbone output of the machine learning model for the one or more previous frames and stored decoder output for the one or more previous frames.

8. The method of claim 1, wherein obtained depth data has less than a depth value for each pixel in at least a portion of the depth image depicting the sending user; and wherein the densified version of the depth image has a depth value for each pixel in the portion of the depth image depicting the sending user.

9. The method of claim 1, wherein the segmenting masks identify at least a foreground depicting the sending user, a face of the sending user, and portions of a body of the sending user including at least two of: a torso of the sending user; one or more arms of the sending user; one or more hands of the sending user; or a head of the sending user.

10. A non-transitory computer-readable storage medium storing instructions that, when executed by a computing system, cause the computing system to perform a process for adjusting one or more images of a sending user in a holographic call by densification, segmentation, and body modeling, the process comprising: receiving, by at least one physical processor, a depth image captured by a depth sensor on a sender device; receiving, by the at least one physical processor, a color image captured by a camera on the sender device; providing, by the at least one physical processor, the depth image and the color image as input to a machine learning model configured to generate a densified version of the depth image, segment the densified version of the depth image and the color image to generate segmenting masks that include a first mask identifying the sending user and a second mask identifying an XR device worn by the sending user, and generate a body model of a current pose of the sending user; and transmitting, by the at least one physical processor and across a network to a receiver device, the densified version of the depth image, the color image, the segmenting masks, and the body model.

11. The non-transitory computer-readable storage medium of claim 10, wherein the machine learning model was trained by: obtaining computer-generated images of people in various poses and in various environments, each computer-generated image automatically assigned tags with per-pixel depth data, segmentation data, and a body model specifying a pose of a depicted person; and for each particular image of the computer generated images: applying the particular image to the machine learning model; comparing output of the machine learning model to the tags for the particular image; and based on the comparing, applying one or more loss functions to update parameters of the machine learning model.

12. The non-transitory computer-readable storage medium of claim 10, wherein the process further comprises configuring depth and color data for application to the machine learning model at least in part by converting the color data to grayscale.

13. The non-transitory computer-readable storage medium of claim 10, the process further comprises configuring depth and color data for application to the machine learning model at least in part by: assigning an area of interest by: applying a foreground mask determined for a previous frame to obtain an expected foreground area; determining a buffer zone around the expected foreground area based on one or more of: a framerate, a determined speed of movement for the sending user, and/or a determined expected movement range of parts of the sending user; and expanding the expected foreground area by the buffer zone to obtain the area of interest; and removing or downscaling portions of the color data that are not in the area of interest.

14. The non-transitory computer-readable storage medium of claim 10, the process further comprises configuring depth and color data for application to the machine learning model at least in part by: assigning an area of interest by: applying a foreground mask determined for a previous frame to obtain an expected foreground area; determining a buffer zone around the expected foreground area; and expanding the expected foreground area by the buffer zone to obtain the area o
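The area-of-interest step recited in claims 3 and 4 (take the previous frame's foreground mask, grow it by a buffer zone, then remove color data outside the result) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the patented implementation: the buffer heuristic (speed divided by framerate), the square dilation, and the function names `buffer_pixels`, `area_of_interest`, and `remove_outside` are all hypothetical.

```python
import math


def buffer_pixels(speed_px_per_s, framerate_hz, min_buffer=1):
    """Assumed heuristic: between two frames the user moves roughly
    speed / framerate pixels, so pad the mask by at least that much."""
    return max(min_buffer, math.ceil(speed_px_per_s / framerate_hz))


def area_of_interest(prev_mask, buffer):
    """Expand the previous frame's foreground mask by `buffer` pixels
    in every direction (a simple square dilation)."""
    h, w = len(prev_mask), len(prev_mask[0])
    aoi = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if prev_mask[y][x]:
                for yy in range(max(0, y - buffer), min(h, y + buffer + 1)):
                    for xx in range(max(0, x - buffer), min(w, x + buffer + 1)):
                        aoi[yy][xx] = True
    return aoi


def remove_outside(color, aoi, fill=0):
    """Replace every pixel outside the area of interest with `fill`."""
    return [
        [px if keep else fill for px, keep in zip(row, keep_row)]
        for row, keep_row in zip(color, aoi)
    ]


if __name__ == "__main__":
    # 5x5 frame whose previous foreground was the single center pixel.
    prev_mask = [[(y == 2 and x == 2) for x in range(5)] for y in range(5)]
    color = [[7] * 5 for _ in range(5)]
    buf = buffer_pixels(speed_px_per_s=30, framerate_hz=30)
    for row in remove_outside(color, area_of_interest(prev_mask, buf)):
        print(row)
```

A faster-moving user or a lower framerate yields a larger buffer under this heuristic, so more of the frame is kept. The claims also allow downscaling instead of removal, which this sketch does not show.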

Assignees

Inventors

Classifications

  • in augmented reality scenes · CPC title

  • Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting · CPC title

  • Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion · CPC title

  • Determination of region of interest [ROI] or a volume of interest [VOI] · CPC title

  • Mixed reality (object pose determination, tracking or camera calibration for mixed reality G06T7/00) · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12554221B2 cover?
A holographic calling system can capture and encode holographic data at a sender-side of a holographic calling pipeline and decode and present the holographic data as a 3D representation of a sender at a receiver-side of the holographic calling pipeline. The holographic calling pipeline can include stages to capture audio, color images, and depth images; densify the depth images to have a depth…
Who is the assignee on this patent?
Meta Platforms Tech LLC
What technology area does this patent fall under?
Primary CPC classification H04M3/567. Mapped technology areas include Electricity.
When was this patent published?
Publication date Feb 17, 2026 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).