Method and system for generating a depth map

US12536682B2 · US · B2

Patent metadata
Field: Value
Publication number: US-12536682-B2
Application number: US-202318301032-A
Country: US
Kind code: B2
Filing date: Apr 14, 2023
Priority date: Apr 15, 2022
Publication date: Jan 27, 2026
Grant date: Jan 27, 2026

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method and system for generating a depth map corresponding to a frame of a sequence of frames in a video clip is disclosed. This can involve generating a single image depth map for each of a plurality of frames, scaling the single image depth maps, and processing a time sequence of scaled single image depth maps to generate said depth map corresponding to the frame of the sequence of frames in the video clip.
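The three stages named in the abstract can be sketched as a small pipeline. This is a minimal illustration, not the patented implementation: `video_depth_map`, `estimate_depth`, `scale_map`, and `window` are hypothetical names, and the plain temporal mean in the last stage is only a stand-in, since the abstract does not say how the time sequence is processed.

```python
import numpy as np

def video_depth_map(frames, t, estimate_depth, scale_map, window=1):
    """Sketch of the abstract's three stages for target frame index t.

    estimate_depth(frame) -> 2-D array: any single image depth estimator (assumed).
    scale_map(depths, i)  -> 2-D array of per-pixel scale values for frame i (assumed).
    """
    # Stage 1: one single image depth map per frame.
    depths = [estimate_depth(f) for f in frames]
    # Stage 2: scale each single image depth map pixel by pixel.
    scaled = [d * scale_map(depths, i) for i, d in enumerate(depths)]
    # Stage 3: process a time sequence of scaled maps around frame t.
    # ASSUMPTION: a plain mean over a small window stands in for this step.
    lo, hi = max(0, t - window), min(len(scaled), t + window + 1)
    return np.mean(scaled[lo:hi], axis=0)

# Toy usage: "frames" are already depth-like arrays and scaling is the identity.
frames = [np.full((2, 2), v) for v in (1.0, 2.0, 3.0)]
result = video_depth_map(
    frames, t=1,
    estimate_depth=lambda f: f,
    scale_map=lambda depths, i: np.ones_like(depths[i]),
)
```

Passing the depth estimator and the scale function in as arguments keeps the sketch independent of any particular model or scaling scheme.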

First claim

Opening claim text (preview).

The invention claimed is:

1. A method of generating a depth map corresponding to a frame of a sequence of frames in a video clip, the method comprising: generating a single image depth map for each frame of a plurality of frames; scaling the single image depth map for each frame to generate a scaled single image depth map for said each frame by applying a scale value to each pixel of said single image depth map, wherein the scale value for each pixel of the single image depth map is generated using a method comprising: for each grid point of a plurality of grid points which are arranged across the frame: generating an initial scale value using a depth value for the grid point and depth values corresponding to the same grid point from a plurality of temporally related frames; generating a final scale value for said grid point on the basis of said grid point's initial scale value and the initial scale value of one or more neighboring grid points; and determining corresponding scale values for application to each pixel of said single image depth map from the final scale values of the grid points; and processing a time sequence of scaled single image depth maps to generate said depth map corresponding to the frame of the sequence of frames in the video clip.

2. The method of claim 1 wherein the step of generating an initial scale value using a depth value for the grid point and depth values for the same grid point from a plurality of temporally related frames comprises determining a depth value for the grid point in said frame by determining an average depth value for a region including the grid point; and wherein determining depth values corresponding to the same grid point for a plurality of temporally related frames comprises: determining a correspondence between content of said frame and content of said temporally related frames such that a location corresponding to said grid point can be determined for each of the plurality of temporally related frames, and determining an average depth value for a region including said location in each temporally related frame to determine a depth value corresponding to said grid point for each temporally related frame.

3. The method of claim 2 wherein the initial scale value for each grid point is determined using a ratio of: a measure of central tendency of a group of depth values including at least the depth values for the same grid point from the plurality of temporally related frames, to the depth value for the grid point.

4. The method of claim 3 wherein the group of depth values includes the depth value for the grid point.

5. The method of claim 2 wherein determining a correspondence between the content of said frame and the content of said temporally related frames includes analyzing optical flow between temporally adjacent frames and generating a warped depth map of each of said plurality of temporally related frames in accordance with the optical flow, whereby said location corresponding to said grid point is aligned with said grid point, and determining the average depth value for the region around said location in each temporally related frame uses the warped depth map.

6. The method of claim 5 wherein the method further includes defining a mask including pixels of said frame in which the single image depth map is determined to be either or both of: unreliable based on optical flow analysis of the plurality of frames; or have a depth greater than a threshold depth, and, wherein at least one of: determining a depth value for the grid point by determining an average depth value for a region including the grid point, and/or determining depth values corresponding to the same grid point for a plurality of temporally related frames, excludes pixels that are included in said mask.

7. The method of claim 2 wherein determining a correspondence between the content of said frame and the content of said temporally related frames includes analyzing optical flow between temporally adjacent frames and tracking the location of said grid point in each of said temporally related frames using said optical flow and determining the average depth value for a region around said location in each temporally related frame.

8. The method of claim 7, wherein the method further includes defining a mask including pixels of said frame in which the single image depth map is determined to be either or both of: unreliable based on optical flow analysis of the plurality of frames; or have a depth greater than a threshold depth, and, wherein at least one of: determining a depth value for the grid point by determining an average depth value for a region including the grid point, and/or determining depth values corresponding to the same grid point for a plurality of temporally related frames, excludes pixels that are included in said mask.

9. The method of claim 1 wherein the method includes defining a mask including pixels of said frame in which the single image depth map is determined to be either or both of: unreliable based on optical flow analysis of the plurality of frames; or have a depth greater than a threshold depth.

10. The method of claim 1, wherein the step of generating a final scale value for said grid point on the basis of said grid point's initial scale value and an initial scale value of one or more neighboring grid points comprises: determining a relative contribution of each of said one or more neighboring grid points and said grid point's initial scale value.

11. The method of claim 10 wherein the method further includes: defining a mask including pixels of said frame in which the single image depth map is determined to be either or both of: unreliable based on optical flow analysis of the plurality of frames; or have a depth greater than a threshold depth; and determining a relative contribution for said one or more neighboring grid points based on said mask.

12. The method of claim 1 wherein generating a final scale value for said grid point on the basis of said grid point's initial scale value and an initial scale value of one or more neighboring grid point includes solving a series of linear equations representing an initial scale value of each of said grid points and the initial scale value for each of said grid point's neighboring grid points.

13. The method of claim 1 wherein determining corresponding scale values for application to each pixel of said single image depth map from the final scale values of the grid points comprises generating a scale value for each pixel between said grid points by interpolation.

14. The method of claim 1 wherein determining corresponding scale values for application to each pixel of said single image depth map from the final scale values of the grid points comprises assigning a scale value for each pixel based on a position relative to said grid points.

15. The method of claim 1 wherein generating said single image depth map for each frame comprises using a deep learning model to generate said single image depth map.

16. The method of claim 15 wherein using said deep learning model comprises using a convolutional neural network to generate said single image depth map.

17. A computer system including a processor operating in accordance with execution instructions stored in a non-transitory storage medium, whereby the instructions, when executed, configure the computer system to perform the method of claim 1.

18. The computer system of claim 17 wherein the instr
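The grid-based scaling recited in claims 1, 3, 4 and 13 can be illustrated with a short NumPy sketch. It makes several assumptions that go beyond the claim text: the scene is treated as static, so each grid point sits at the same pixel location in every frame (the optical flow correspondence of claims 2, 5 and 7 is omitted); the median is used as the measure of central tendency; neighbor blending is a fixed-weight average over four neighbors rather than the linear system of claim 12; and interpolation is bilinear. All function names, `r`, and `self_weight` are hypothetical.

```python
import numpy as np

def region_mean(depth, y, x, r):
    """Average depth over a square region centred on (y, x), clipped to the image."""
    h, w = depth.shape
    return float(depth[max(0, y - r):min(h, y + r + 1),
                       max(0, x - r):min(w, x + r + 1)].mean())

def initial_scales(depths, t, ys, xs, r=2):
    """Per grid point: median region depth over all frames / region depth in frame t."""
    out = np.empty((len(ys), len(xs)))
    for i, y in enumerate(ys):
        for j, x in enumerate(xs):
            # ASSUMPTION: static content, so the same (y, x) is used in every frame.
            samples = [region_mean(d, y, x, r) for d in depths]  # includes frame t
            out[i, j] = np.median(samples) / samples[t]
    return out

def final_scales(init, self_weight=0.5):
    """Blend each grid point with the mean of its 4 neighbours (edges replicated)."""
    p = np.pad(init, 1, mode="edge")
    neigh = (p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:]) / 4.0
    return self_weight * init + (1.0 - self_weight) * neigh

def per_pixel_scales(final, ys, xs, shape):
    """Bilinearly interpolate grid scale values to every pixel of the frame."""
    h, w = shape
    # Interpolate along x for each grid row, then along y for each pixel column.
    rows = np.stack([np.interp(np.arange(w), xs, row) for row in final])
    return np.stack([np.interp(np.arange(h), ys, rows[:, c]) for c in range(w)], axis=1)

# Toy usage: three flat depth maps whose overall scale drifts from frame to frame.
depths = [np.full((8, 8), v) for v in (1.0, 2.0, 4.0)]
ys = xs = [1, 4, 6]
init = initial_scales(depths, t=0, ys=ys, xs=xs)
scale = per_pixel_scales(final_scales(init), ys, xs, depths[0].shape)
scaled_depth = depths[0] * scale
```

In this toy case frame 0 is pulled toward the median depth of the three frames, which is the kind of temporal consistency the scaling step aims for.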

Assignees

Inventors

Classifications

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12536682B2 cover?
A method and system for generating a depth map corresponding to a frame of a sequence of frames in a video clip is disclosed. This can involve generating a single image depth map for each of a plurality of frames, scaling the single image depth maps, and processing a time sequence of scaled single image depth maps to generate said depth map corresponding to the frame of the sequence of frames i…
Who is the assignee on this patent?
Blackmagic Design Pty Ltd
What technology area does this patent fall under?
Primary CPC classification G06T7/55. Mapped technology areas include Physics.
When was this patent published?
Publication date Jan 27, 2026 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).