Volumetric capture of objects with a single RGBD camera

US11328486B2 · US · B2

Patent metadata
Publication number: US-11328486-B2
Application number: US-202016861530-A
Country: US
Kind code: B2
Filing date: Apr 29, 2020
Priority date: Apr 30, 2019
Publication date: May 10, 2022
Grant date: May 10, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method includes receiving a first image including color data and depth data, determining a viewpoint associated with an augmented reality (AR) and/or virtual reality (VR) display displaying a second image, receiving at least one calibration image including an object in the first image, the object being in a different pose as compared to a pose of the object in the first image, and generating the second image based on the first image, the viewpoint and the at least one calibration image.
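The role of depth data and viewpoint in the abstract can be illustrated with a purely geometric baseline. The sketch below is not the method claimed in the patent (the claims describe neural-network warping); it only shows how a pixel with known depth can be lifted to 3D and re-projected into a display viewpoint. The pinhole model, the `Intrinsics` type, the function names, and the restriction to a translated (not rotated) viewpoint are all assumptions made for illustration.

```python
from typing import NamedTuple, Optional, Tuple

Vec3 = Tuple[float, float, float]


class Intrinsics(NamedTuple):
    """Hypothetical pinhole camera parameters (focal lengths and principal point, in pixels)."""
    fx: float
    fy: float
    cx: float
    cy: float


def backproject(u: float, v: float, depth: float, k: Intrinsics) -> Vec3:
    """Lift pixel (u, v) with a known depth into a 3D point in the camera frame."""
    return ((u - k.cx) * depth / k.fx, (v - k.cy) * depth / k.fy, depth)


def project(p: Vec3, k: Intrinsics) -> Optional[Tuple[float, float]]:
    """Project a 3D point to pixel coordinates; None if it is not in front of the camera."""
    x, y, z = p
    if z <= 0.0:
        return None
    return (k.fx * x / z + k.cx, k.fy * y / z + k.cy)


def reproject_pixel(
    u: float, v: float, depth: float, k: Intrinsics, offset: Vec3
) -> Optional[Tuple[float, float]]:
    """Re-project a source pixel into a viewpoint shifted by `offset`.

    Assumption: the display viewpoint has the same orientation as the
    capturing camera and differs only by a translation, so a point's
    coordinates in the new frame are its old coordinates minus `offset`.
    """
    x, y, z = backproject(u, v, depth, k)
    moved = (x - offset[0], y - offset[1], z - offset[2])
    return project(moved, k)


if __name__ == "__main__":
    k = Intrinsics(fx=500.0, fy=500.0, cx=320.0, cy=240.0)
    # A pixel at the image centre, 2 m away, seen from a viewpoint 10 cm to the right.
    print(reproject_pixel(320.0, 240.0, 2.0, k, (0.1, 0.0, 0.0)))
```

A geometric re-projection of this kind can only place pixels that the camera actually observed. Surfaces hidden in the first image stay empty in the new view, which is one way to see why additional images of the object in other poses are useful.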

First claim

Opening claim text (preview).

What is claimed is:

1. A method for generating an image comprising: receiving a first image including color data and depth data; determining a viewpoint associated with an augmented reality (AR) and/or virtual reality (VR) display displaying a second image; receiving at least one calibration image including an object in the first image, the object being in a pose in the at least one calibration image different from a pose of the object in the first image; and generating the second image based on the first image, the viewpoint, the pose of the object in the first image, and the at least one calibration image, the first image and the at least one calibration image are captured using a single camera, and the pose of the object in the first image includes a position of a first portion of the object relative to a position of a second portion of the object and the pose of the object in the at least one calibration image includes a second position of the first portion of the object relative to the position of the second portion of the object.

2. The method of claim 1, wherein the single camera is configured to capture the color data as red, green, blue (RGB) data and at least one of capture the depth data and generate the depth data based on the color data.

3. The method of claim 1, wherein the viewpoint associated with the AR and/or VR display is different than a viewpoint associated with the first image.

4. The method of claim 1, wherein the at least one calibration image is a silhouette image of the object.

5. The method of claim 1, wherein the generating of the second image includes, determining a target pose of the object by mapping two dimensional (2D) keypoints to corresponding three dimensional (3D) points of depth data associated with the at least one calibration image, and generating the second image by warping the object in the at least one calibration image using a convolutional neural network that takes the at least one calibration image and the target pose of the object as input.

6. The method of claim 1, wherein the generating of the second image includes, generating at least one part-mask in a first pass of a convolutional neural network having the at least one calibration image as an input, generating at least one part-image in the first pass of the convolutional neural network, and generating the second image a second pass of the convolutional neural network having the at least one part-mask and the at least one part-image as input.

7. The method of claim 1, wherein the generating of the second image includes using two passes of a convolutional neural network that is trained by minimizing at least two losses associated with warping the object.

8. The method of claim 1, wherein the second image is blended using a neural network to generate missing portions of the second image.

9. The method of claim 1, wherein the second image is a silhouette image of the object, the method further comprising merging the second image with a background image.

10. The method of claim 1, further comprising: a pre-processing stage in which a plurality of images are captured while the pose of the object is changed; storing the plurality of images as the at least one calibration image; generating a similarity score for each of the at least one calibration image based on a target pose of the object; and selecting the at least one calibration image from the at least one calibration image based on the similarity score.

11. The method of claim 1, further comprising: a pre-processing stage in which a plurality of images are captured while the pose of the object is changed; storing the plurality of images as the at least one calibration image; capturing an image, during a communications event, the image including the object in a new pose, and adding the image to the stored plurality of images.

12. A non-transitory computer-readable storage medium having stored thereon computer executable program code which, when executed on a computer system, causes the computer system to perform steps comprising: receiving a first image including color data and depth data; determining a viewpoint associated with an augmented reality (AR) and/or virtual reality (VR) display displaying a second image; receiving at least one calibration image including an object in the first image, the object being in pose in the at least one calibration image different from a pose of the object in the first image; and generating the second image based on the first image, the viewpoint, a pose of the object in the first image, and the at least one calibration image, the first image and the at least one calibration image are captured using a single sensor, and the pose of the object in the first image includes a position of a first portion of the object relative to a position of a second portion of the object.

13. The non-transitory computer-readable storage medium of claim 12, wherein the single sensor is configured to capture the color data as red, green, blue (RGB) data and at least one of capture the depth data and generate the depth data based on the color data.

14. The non-transitory computer-readable storage medium of claim 12, wherein the generating of the second image includes, determining a target pose of the object by mapping two dimensional (2D) keypoints to corresponding three dimensional (3D) points of depth data associated with the at least one calibration image, and generating the second image by warping the object in the at least one calibration image using a convolutional neural network that takes the at least one calibration image and the target pose of the object as input.

15. The non-transitory computer-readable storage medium of claim 12, wherein the generating of the second image includes, generating at least one part-mask in a first pass of a convolutional neural network having the at least one calibration image as an input, generating at least one part-image in the first pass of the convolutional neural network, and generating the second image a second pass of the convolutional neural network having the at least one part-mask and the at least one part-image as input.

16. The non-transitory computer-readable storage medium of claim 12, wherein the second image is blended using a neural network to generate missing portions of the second image.

17. The non-transitory computer-readable storage medium of claim 12, wherein the second image is a silhouette image of the object, the steps further comprising merging the second image with a background image.

18. The non-transitory computer-readable storage medium of claim 12, the steps further comprising: a pre-processing stage in which a plurality of images are captured while the pose of the object is changed; storing the plurality of images as the at least one calibration image; generating a similarity score for each of the at least one calibration image based on a target pose of the object; and selecting the at least one calibration image from the at least one calibration image based on the similarity score.

19. The non-transitory computer-readable storage medium of claim 12, the steps further comprising: a pre-processing stage in which a plurality of images are captured while the pose of the object is changed; storing the plurality of images as the at least one calibration image; capturing an image, during a communications event, the image including the object in a new pose, and adding the image to the stored plurality of images.

20. A
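Two steps in the claims lend themselves to a short illustration: mapping 2D keypoints to 3D points of depth data (claims 5 and 14) and scoring stored calibration images against a target pose to select one (claims 10 and 18). The sketch below is a minimal illustration under stated assumptions, not the patented implementation. The claims do not specify the similarity metric, so the negative mean keypoint distance used here is a stand-in; the pinhole parameters and all function names are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple

Point3 = Tuple[float, float, float]
Pose = List[Point3]  # one 3D point per keypoint, in a fixed keypoint order


def lift_keypoints(
    keypoints_2d: Sequence[Tuple[int, int]],
    depth: Sequence[Sequence[float]],
    fx: float, fy: float, cx: float, cy: float,
) -> Pose:
    """Map 2D keypoints (u, v) to 3D points using a depth map indexed as depth[v][u].

    Assumes a pinhole camera with hypothetical intrinsics fx, fy, cx, cy.
    """
    pose: Pose = []
    for u, v in keypoints_2d:
        z = depth[v][u]
        pose.append(((u - cx) * z / fx, (v - cy) * z / fy, z))
    return pose


def similarity(pose_a: Pose, pose_b: Pose) -> float:
    """Assumed metric: negative mean Euclidean distance between matching keypoints.

    Higher is more similar; identical poses score 0.
    """
    if not pose_a or len(pose_a) != len(pose_b):
        raise ValueError("poses must be non-empty and have the same number of keypoints")
    total = sum(math.dist(a, b) for a, b in zip(pose_a, pose_b))
    return -total / len(pose_a)


def select_calibration(target: Pose, candidates: Dict[str, Pose]) -> Tuple[str, float]:
    """Score every stored calibration pose against the target and return the best one."""
    if not candidates:
        raise ValueError("no calibration poses stored")
    scores = {name: similarity(target, pose) for name, pose in candidates.items()}
    best = max(scores, key=lambda name: scores[name])
    return best, scores[best]


if __name__ == "__main__":
    depth_map = [[1.0, 1.0], [1.0, 2.0]]
    target = lift_keypoints([(0, 0), (1, 1)], depth_map, fx=1.0, fy=1.0, cx=0.0, cy=0.0)
    stored = {
        "front": list(target),
        "side": [(x + 1.0, y, z) for x, y, z in target],
    }
    print(select_calibration(target, stored))
```

In this toy setup each stored calibration image is represented only by its keypoint pose. A real system would keep the image alongside the pose and pass the selected image on to the warping stage.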

Assignees

Inventors

Classifications

  • in augmented reality scenes

  • G06T19/00 (Primary)

    Manipulating three-dimensional [3D] models or images for computer graphics

  • using neural networks

  • using classification, e.g. of video objects

  • G06T19/006 (Primary)

    Mixed reality (object pose determination, tracking or camera calibration for mixed reality G06T7/00)

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11328486B2 cover?
A method includes receiving a first image including color data and depth data, determining a viewpoint associated with an augmented reality (AR) and/or virtual reality (VR) display displaying a second image, receiving at least one calibration image including an object in the first image, the object being in a different pose as compared to a pose of the object in the first image, and generating …
Who is the assignee on this patent?
Google LLC
What technology area does this patent fall under?
Primary CPC classification G06T19/00. Mapped technology areas include Physics.
When was this patent published?
Publication date: May 10, 2022 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).