Holographic calling for artificial reality

US12099327B2 · US · B2

Patent metadata
Publication number: US-12099327-B2
Application number: US-202117360693-A
Country: US
Kind code: B2
Filing date: Jun 28, 2021
Priority date: Jun 28, 2021
Publication date: Sep 24, 2024
Grant date: Sep 24, 2024

Abstract

Official abstract text for this publication.

A holographic calling system can capture and encode holographic data at a sender-side of a holographic calling pipeline and decode and present the holographic data as a 3D representation of a sender at a receiver-side of the holographic calling pipeline. The holographic calling pipeline can include stages to capture audio, color images, and depth images; densify the depth images to have a depth value for each pixel while generating parts masks and a body model; use the masks to segment the images into parts needed for hologram generation; convert depth images into a 3D mesh; paint the 3D mesh with color data; perform torso disocclusion; perform face reconstruction; and perform audio synchronization. In various implementations, different of these stages can be performed sender-side or receiver side. The holographic calling pipeline also includes sender-side compression, transmission over a communication channel, and receiver-side decompression and hologram output.
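Two of the stages named in the abstract, using masks to segment the images and converting depth images into a 3D mesh, can be illustrated with a minimal sketch. The abstract does not say how the mesh is built, so the following is an assumption rather than the patented method: it uses a pinhole camera model with hypothetical intrinsics `fx`, `fy`, `cx`, `cy`, treats a depth of 0 as "no value", and connects neighboring valid pixels into triangles. The function names are illustrative only. The densification stage is omitted because it relies on a trained model.

```python
def apply_mask(depth, mask):
    """Keep depth only where the mask is set; 0 marks a pixel with no depth."""
    return [[d if m else 0 for d, m in zip(drow, mrow)]
            for drow, mrow in zip(depth, mask)]


def depth_to_mesh(depth, fx, fy, cx, cy):
    """Back-project valid pixels to 3D vertices and join neighbors into triangles.

    Assumes a pinhole camera: x = (u - cx) * z / fx, y = (v - cy) * z / fy.
    """
    index = {}      # (u, v) pixel -> vertex index
    vertices = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z > 0:
                index[(u, v)] = len(vertices)
                vertices.append(((u - cx) * z / fx, (v - cy) * z / fy, z))

    triangles = []
    for (u, v) in index:
        quad = [(u, v), (u + 1, v), (u, v + 1), (u + 1, v + 1)]
        # Emit two triangles only when all four corners have depth.
        if all(p in index for p in quad):
            a, b, c, d = (index[p] for p in quad)
            triangles.append((a, b, c))
            triangles.append((b, d, c))
    return vertices, triangles


# Toy 3x3 depth image (millimeters) and a mask selecting the upper-right block.
depth = [[0, 1000, 1000],
         [0, 1000, 1000],
         [500, 500, 500]]
mask = [[0, 1, 1],
        [0, 1, 1],
        [0, 0, 0]]

masked = apply_mask(depth, mask)
vertices, triangles = depth_to_mesh(masked, fx=500.0, fy=500.0, cx=1.0, cy=1.0)
print(len(vertices), "vertices,", len(triangles), "triangles")
```

Painting the mesh with color data would then amount to attaching a color sample to each vertex, for example by looking up the same pixel coordinates in the masked color image.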

First claim

Opening claim text (preview).

We claim:

1. A method for conducting a holographic call using a holographic call pipeline, the method comprising: establishing a communication channel between a sending device and at least one receiving device; capturing, at the sending device, color, depth, and audio data and using the color and depth data to generate one or more color images and one or more depths images; generating one or more masks for the one or more color images and one or more depth images; applying the one or more masks to the one or more color images and one or more depth images to obtain masked portions of the one or more color images and one or more depth images; compressing the masked portions of the one or more color images and one or more depth images; and synchronizing and transmitting, over the communication channel, the one or more color images, one or more depth images, and the audio data; wherein the receiving device, in response to the transmitting: decompresses the compressed portions; converts the portions of the one or more depth images into a 3D mesh; paints the portions of the one or more color images onto the 3D mesh; synchronizes the audio data with the painted 3D mesh; performs torso disocclusion on the 3D mesh; performs facial reconstruction on the 3D mesh; and outputs the painted 3D mesh as a hologram with synchronized audio.

2. The method of claim 1, wherein the communication channel is a real-time communication channel that provides latency guarantees.

3. The method of claim 1, wherein capturing the depth data includes capturing structured light by emitting a pattern of infrared (IR) light, capturing reflections of the IR light, and determining depth data based on how the pattern of IR light is distorted and/or using time-of-flight readings for parts of the pattern of IR light.

4. The method of claim 1, wherein the one or more color images include multiple color images captured from multiple cameras at different resolutions, and wherein the generating one or more masks for the one or more color images and one or more depth images is performed based on the color images captured at the lower resolution.

5. The method of claim 1, wherein the one or more depth images do not have a depth value for each pixel, and wherein the method further comprises: performing a densification procedure on the one or more depth images to assign a depth value to each pixel of the one or more depth images; wherein the densification procedure comprises applying a machine learning model trained, to densify depth image, using synthetic images of people, the synthetic images of people generated with specified depth data for each pixel.

6. The method of claim 1, wherein generating the one or more masks comprises: identifying segments for the one or more color images and/or the one or more depth image, the segments comprising at least foreground and background distinctions; wherein the identifying the segments comprises applying a machine learning model trained, to segment images, using synthetic images of people, the synthetic images of people generated with specified labeled segments.

7. The method of claim 1, wherein the one or more depth images do not have a depth value for each pixel, and wherein the method further comprises: performing a densification procedure on the one or more depth images to assign a depth value to each pixel of the one or more depth images and identify segments for the one or more color images and/or the one or more depth image, the segments comprising at least foreground and background distinctions; wherein the densification procedure comprises applying a machine learning model trained, to densify depth image and segment images, using synthetic images of people, the synthetic images of people generated with specified depth data for each pixel and labeled segments.

8. The method of claim 1, wherein at least part of the compression is performed using an RVL compression algorithm.

9. The method of claim 1, wherein the torso disocclusion on the 3D mesh includes: generating an existing model of a body of a sending user based on one or more previously captured images of the sending user; identifying one or more holes in the 3D mesh corresponding to occlusions between a depth sensor and the sending user; and filling in the one or more holes in the 3D mesh with corresponding portions from the existing model.

10. The method of claim 1, wherein the facial reconstruction on the 3D mesh includes: encoding at least a facial portion, depicting a sending user wearing an XR headset, of at least one of the one or more color images; applying a geometry branch of a machine learning model to the encoded facial portion to produce a predicted geometry of the sending user without the XR headset; applying a texture branch of a machine learning model to the encoded facial portion to produce a predicted texture of the sending user without the XR headset; and skinning the predicted texture onto the predicted geometry.

11. The method of claim 1, wherein the facial reconstruction on the 3D mesh includes: performing a pre-scan of a sending user, while not wearing an XR headset, to generate multiple expression meshes for the sending user; determining a head pose and facial expression for the sending user; selecting multiple of the expression meshes that match the determined facial expression and adjusting the selected multiple expression meshes to conform to the head pose; combining the multiple selected expression meshes; and applying depth and color blending on the combined expression mesh with at least the one or more color images.

12. A computer-readable storage medium storing instructions that, when executed by a computing system, cause the computing system to perform a process for conducting a holographic call using a holographic call pipeline, the process comprising: capturing, at a sending device, color, depth, and audio data and using the color and depth data to generate one or more color images and one or more depths images; generating one or more masks for the one or more color images and one or more depth images; applying the one or more masks to the one or more color images and one or more depth images to obtain masked portions of the one or more color images and one or more depth images; compressing the masked portions of the one or more color images and one or more depth images; and transmitting, over a communication channel, the compressed portions and the audio data; wherein a receiving device, in response to the transmitting: decompresses the compressed portions; converts the portions of the one or more depth images into a 3D mesh; paints the portions of the one or more color images onto the 3D mesh; and outputs the painted 3D mesh as a hologram with synchronized audio.

13. The computer-readable storage medium of claim 12, wherein capturing the depth data includes capturing structured light by emitting a pattern of infrared (IR) light, capturing reflections of the IR light, and determining depth data based on how the pattern of IR light is distorted and/or using time-of-flight readings for parts of the pattern of IR light.

14. The computer-readable storage medium of claim 12, wherein the one or more depth images do not have a depth value for each pixel, and wherein the process further comprises: performing a densification procedure on the one or more depth images to assign a depth value to each pixel of the one or more depth images and identify segments for the one or more color images and/or the one or more depth image, the segments comprising at least foreground and background
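Claim 8 names an RVL compression algorithm without describing it. As background, RVL (run-length plus variable-length encoding) is a published lossless scheme for depth images that stores runs of zero pixels as counts and stores non-zero pixels as small deltas from the previous non-zero value. The sketch below follows that general idea and is an assumption about what the claim refers to, not the patent's implementation. It keeps the output as a list of 4-bit nibbles instead of packing them into bytes, and all function names are illustrative.

```python
def _put_vle(value, nibbles):
    """Append a non-negative int as nibbles: 3 data bits plus a 'more follows' bit."""
    while True:
        nibble = value & 0x7
        value >>= 3
        if value:
            nibble |= 0x8
        nibbles.append(nibble)
        if not value:
            break


def _get_vle(nibbles, pos):
    """Read one variable-length int starting at pos; return (value, next_pos)."""
    value, shift = 0, 0
    while True:
        nibble = nibbles[pos]
        pos += 1
        value |= (nibble & 0x7) << shift
        shift += 3
        if not nibble & 0x8:
            return value, pos


def rvl_encode(depth):
    """Encode a flat list of depth values (0 = no depth) into nibbles."""
    nibbles, previous, i, n = [], 0, 0, len(depth)
    while i < n:
        start = i
        while i < n and depth[i] == 0:
            i += 1
        _put_vle(i - start, nibbles)          # length of the zero run
        j = i
        while j < n and depth[j] != 0:
            j += 1
        _put_vle(j - i, nibbles)              # length of the non-zero run
        for current in depth[i:j]:
            delta = current - previous
            # Zigzag mapping keeps small negative deltas small and non-negative.
            zigzag = 2 * delta if delta >= 0 else -2 * delta - 1
            _put_vle(zigzag, nibbles)
            previous = current
        i = j
    return nibbles


def rvl_decode(nibbles, count):
    """Invert rvl_encode; count is the number of pixels to recover."""
    depth, previous, pos = [], 0, 0
    while len(depth) < count:
        zeros, pos = _get_vle(nibbles, pos)
        depth.extend([0] * zeros)
        nonzeros, pos = _get_vle(nibbles, pos)
        for _ in range(nonzeros):
            zigzag, pos = _get_vle(nibbles, pos)
            delta = zigzag >> 1 if zigzag % 2 == 0 else -((zigzag + 1) >> 1)
            previous += delta
            depth.append(previous)
    return depth


# One masked scanline: zeros outside the mask, slowly varying depth inside it.
sample = [0, 0, 0, 1200, 1203, 1201, 0, 0, 980, 985, 0]
encoded = rvl_encode(sample)
assert rvl_decode(encoded, len(sample)) == sample
```

This kind of scheme suits masked depth images because the masked-out background becomes long zero runs and neighboring body pixels tend to differ by small amounts, both of which encode into few nibbles.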

Assignees

Meta Platforms Tech Llc

Inventors

Classifications

CPC titles:

  • Mixed reality (object pose determination, tracking or camera calibration for mixed reality G06T7/00)

  • Augmented reality

  • Audio processing specific to telephonic conferencing, e.g. spatial distribution, mixing of participants (echo suppression in two-way loud-speaking telephone systems H04M9/02; sound field processing per se H04S7/30)

  • Multimedia conference systems

  • Finite element generation, e.g. wire-frame surface description, {tesselation}

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12099327B2 cover?
A holographic calling system can capture and encode holographic data at a sender-side of a holographic calling pipeline and decode and present the holographic data as a 3D representation of a sender at a receiver-side of the holographic calling pipeline. The holographic calling pipeline can include stages to capture audio, color images, and depth images; densify the depth images to have a depth…
Who is the assignee on this patent?
Meta Platforms Tech Llc
What technology area does this patent fall under?
Primary CPC classification G03H1/0005. Mapped technology areas include Physics.
When was this patent published?
Publication date: Sep 24, 2024 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).