Surface reconstruction for environments with moving objects

US11263810B2 · US · B2

Patent metadata
Publication number: US-11263810-B2
Application number: US-202016875779-A
Country: US
Kind code: B2
Filing date: May 15, 2020
Priority date: Apr 19, 2018
Publication date: Mar 1, 2022
Grant date: Mar 1, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Optimizations are provided for reconstructing geometric surfaces for an environment that includes moving objects. Multiple depth maps for the environment are created, where some of the depth maps correspond to different perspectives of the environment. A motion state identifier is assigned to at least some pixels in at least some of the depth maps corresponding to moving objects in the environment. A composite 3D mesh is built using at least some of the multiple depth maps, by incorporating pixel information from the depth maps, while omitting pixel information identified by the motion state identifiers as being associated with moving objects.
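
The masking idea in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: it assumes the depth maps have already been registered to one common viewpoint, it averages depth per pixel instead of building a mesh, and the names `fuse_static_depth`, `DepthMap`, and `MotionMask` are hypothetical. A Boolean mask stands in for the motion state identifiers.

```python
from typing import List, Optional

# Hypothetical types for illustration only.
DepthMap = List[List[Optional[float]]]  # None means "no depth reading"
MotionMask = List[List[bool]]           # True means "pixel shows a moving object"


def fuse_static_depth(depth_maps: List[DepthMap],
                      motion_masks: List[MotionMask]) -> DepthMap:
    """Average depth per pixel across maps, skipping pixels flagged as dynamic.

    Assumes all maps share the same size and viewpoint (an assumption of this
    sketch; the abstract describes maps taken from different perspectives).
    """
    rows, cols = len(depth_maps[0]), len(depth_maps[0][0])
    fused: DepthMap = [[None] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Keep only samples that are valid and not marked as dynamic.
            samples = [
                depth[r][c]
                for depth, mask in zip(depth_maps, motion_masks)
                if not mask[r][c] and depth[r][c] is not None
            ]
            if samples:
                fused[r][c] = sum(samples) / len(samples)
    return fused


# Two 2x2 depth maps; a moving object covers pixel (1, 1) in the first map.
depth_a = [[2.0, 2.0], [2.0, 1.0]]
mask_a = [[False, False], [False, True]]
depth_b = [[2.0, 2.0], [2.0, 2.0]]
mask_b = [[False, False], [False, False]]

fused = fuse_static_depth([depth_a, depth_b], [mask_a, mask_b])
```

In this example the 1.0 reading at pixel (1, 1) is ignored, so the fused value there comes only from the second map. A pixel flagged as dynamic in every map stays `None`, which a later meshing step could treat as a hole.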

First claim

Opening claim text (preview).

What is claimed is:

1. A computer system configured to facilitate improvements in how surface reconstruction of an environment is performed, said computer system comprising: one or more processors; and one or more computer-readable hardware storage devices storing instructions that are executable by the one or more processors to cause the computer system to at least: obtain images of a real-world environment, at least two of the images being generated at different points in time; provide the images as input to a machine learning (ML) algorithm, the ML algorithm being trained to classify image objects as dynamic or static; identify that the ML algorithm classified a substantially stationary object embodied in the at least two images as being dynamic even though any movement detected for the stationary object, as detected between the at least two images, falls below and thereby satisfies a maximum movement threshold used for determining whether objects are potentially static; based on one or more of the at least two images, generate a depth map that includes depth identifying pixels, each one of said pixels being assigned a corresponding motion state identifier indicating whether each one of said pixels is reflective of a corresponding dynamic object or a corresponding static object, wherein a group of said pixels corresponds to the stationary object and are assigned motion state identifiers reflecting the stationary object as being dynamic; and based at least partially on the depth map, generate a three-dimensional (3D) mesh of the real-world environment, said generating being performed by including depth information from pixels having motion state identifiers corresponding to static objects while omitting depth information from pixels having motion state identifiers corresponding to dynamic objects, and such that depth information corresponding to the stationary object is omitted from the 3D mesh even though said any movement detected for the stationary object is determined to fall below the maximum movement threshold.

2. The computer system of claim 1, wherein image objects classified as dynamic are determined to satisfy a volatility degree while image objects classified as static are determined to not satisfy the volatility degree.

3. The computer system of claim 1, wherein pose information is also provided as input to the ML algorithm.

4. The computer system of claim 1, wherein the ML algorithm generates, as output, a label map detailing whether objects are dynamic or static.

5. The computer system of claim 1, wherein assigning motion state identifiers includes performing skeleton tracking to classify objects.

6. The computer system of claim 1, wherein morphological dilation is performed to generate a buffer surrounding the stationary object.

7. The computer system of claim 6, wherein depth information for the buffer is also refrained from being included in the 3D mesh.

8. The computer system of claim 1, wherein multiple depth maps are used to generate the 3D mesh.

9. The computer system of claim 1, wherein motion state identifiers are Boolean values.

10. The computer system of claim 1, wherein a confidence level is included as a part of each motion state identifier for each pixel of the depth map, said confidence level indicating a level of confidence regarding whether that pixel's corresponding object is dynamic or static.

11. A method for facilitating improvements in how surface reconstruction of an environment is performed, said method comprising: obtaining images of a real-world environment, at least two of the images being generated at different points in time; providing the images as input to a machine learning (ML) algorithm, the ML algorithm being trained to classify image objects as dynamic or static; identifying that the ML algorithm classified a substantially stationary object embodied in the at least two images as being dynamic even though any movement detected for the stationary object, as detected between the at least two images, falls below and thereby satisfies a maximum movement threshold used for determining whether objects are potentially static; based on one or more of the at least two images, generating a depth map that includes depth identifying pixels, each one of said pixels being assigned a corresponding motion state identifier indicating whether each one of said pixels is reflective of a corresponding dynamic object or a corresponding static object, wherein a group of said pixels corresponds to the stationary object and are assigned motion state identifiers reflecting the stationary object as being dynamic; and based at least partially on the depth map, generating a three-dimensional (3D) mesh of the real-world environment, said generating being performed by including depth information from pixels having motion state identifiers corresponding to static objects while omitting depth information from pixels having motion state identifiers corresponding to dynamic objects, and such that depth information corresponding to the stationary object is omitted from the 3D mesh even though said any movement detected for the stationary object is determined to fall below the maximum movement threshold.

12. The method of claim 11, wherein the images capture different perspectives of the real-world environment.

13. The method of claim 12, wherein, to capture the different perspectives of the real-world environment, cameras used to generate the images are physically positioned at different locations within the real-world environment.

14. The method of claim 12, wherein, to capture the different perspectives of the real-world environment, re-projections are performed on one or more of the images to obtain one of more of the different perspectives.

15. The method of claim 11, wherein image objects classified as dynamic are determined to satisfy a volatility degree while image objects classified as static are determined to not satisfy the volatility degree.

16. The method of claim 11, wherein pose information is also provided as input to the ML algorithm.

17. The method of claim 11, wherein the ML algorithm generates, as output, a label map detailing whether objects are dynamic or static.

18. The method of claim 11, wherein assigning motion state identifiers includes performing skeleton tracking to classify objects.

19. The method of claim 11, wherein morphological dilation is performed to generate a buffer surrounding the object.

20. A computer system comprising: one or more processors; and one or more computer-readable hardware storage devices that store computer-executable instructions that are executable by the one or more processors to cause the computer system to at least: obtain images of a real-world environment, at least two of the images being generated at different points in time, wherein the images of the real-world environment include one or more of a visible light image or an infrared light image; provide the images as input to a machine learning (ML) algorithm, the ML algorithm being trained to classify image objects as dynamic or static; identify that the ML algorithm classified a substantially stationary object embodied in the at least two images as being dynamic even though any movement detected for the stationary object, as detected between the at least two images, falls below and thereby satisfies a maximum movement threshold used for determining whether objects are potentially static; based on one or more of the at least two images, generate a depth map that inc
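
Claims 6, 7, and 19 mention morphological dilation to create a buffer around an object whose depth is left out of the mesh. The following is a minimal sketch of binary dilation on a Boolean motion mask (Boolean identifiers as in claim 9). The square structuring element, the `radius` parameter, and the name `dilate_mask` are assumptions made for illustration; the claims do not specify them.

```python
from typing import List

MotionMask = List[List[bool]]  # True means "pixel is treated as dynamic"


def dilate_mask(mask: MotionMask, radius: int = 1) -> MotionMask:
    """Binary dilation with a square structuring element of side 2*radius + 1.

    Every True pixel marks its whole neighbourhood as True, which grows the
    dynamic region by a buffer of `radius` pixels (clipped at the borders).
    """
    rows, cols = len(mask), len(mask[0])
    out: MotionMask = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c]:
                continue
            # Mark the neighbourhood of this dynamic pixel, staying in bounds.
            for rr in range(max(0, r - radius), min(rows, r + radius + 1)):
                for cc in range(max(0, c - radius), min(cols, c + radius + 1)):
                    out[rr][cc] = True
    return out


# A single dynamic pixel in the middle of a 5x5 mask.
mask = [[False] * 5 for _ in range(5)]
mask[2][2] = True

buffered = dilate_mask(mask, radius=1)
buffered_count = sum(cell for row in buffered for cell in row)
```

With `radius=1` the single dynamic pixel grows into a 3x3 block. Pixels inside the grown region would then be skipped during meshing in the same way as the object itself, which guards against depth readings that bleed across the object's edge.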

Assignees

Inventors

Classifications

  • Image-based rendering · CPC title

  • Aligning objects, relative positioning of parts · CPC title

  • G06T17/20 · Primary

    Finite element generation, e.g. wire-frame surface description, {tesselation} · CPC title

  • Editing of three-dimensional [3D] images, e.g. changing shapes or colours, aligning objects or positioning parts · CPC title

  • General purpose rendering architectures · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11263810B2 cover?
Optimizations are provided for reconstructing geometric surfaces for an environment that includes moving objects. Multiple depth maps for the environment are created, where some of the depth maps correspond to different perspectives of the environment. A motion state identifier is assigned to at least some pixels in at least some of the depth maps corresponding to moving objects in the environm…
Who is the assignee on this patent?
Microsoft Technology Licensing, LLC
What technology area does this patent fall under?
Primary CPC classification G06T17/20. Mapped technology areas include Physics.
When was this patent published?
Publication date: Mar 1, 2022 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).