Dynamic frame selection for scene understanding

US12536737B1 · US · B1

Patent metadata
Publication number: US-12536737-B1
Application number: US-202318135449-A
Country: US
Kind code: B1
Filing date: Apr 17, 2023
Priority date: Apr 21, 2022
Publication date: Jan 27, 2026
Grant date: Jan 27, 2026

Abstract

Official abstract text for this publication.

Various implementations disclosed herein include devices, systems, and methods that dynamically selects particular frames of image data for generating three-dimensional (3D) representations of a physical environment. For example, an example process may include acquiring frames of image data from one or more sensors in a physical environment. The process may include determining one or more deterrent properties associated with the frames of image data. The process may include selecting a subset of the frames of image data by determining that at least one of the determined deterrent properties satisfies at least one condition of one or more conditions. The process may include determining a scene understanding of the physical environment based on the selected subset of the frames. The scene understanding may determine geometric properties of the physical environment.
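
The selection step described in the abstract can be illustrated with a minimal Python sketch. This is not the patented implementation: the `Frame` fields, the two example conditions, their threshold values, and the `require_all` switch are all hypothetical and chosen only for illustration. By default the sketch follows the literal "at least one condition" wording; the downstream 3D reconstruction step is not shown.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Frame:
    """One captured frame plus properties computed by a first analysis pass."""
    index: int
    device_speed: float    # assumed unit: metres per second, from motion data
    mean_luminance: float  # assumed range: 0.0 (black) to 1.0 (white)


# A condition inspects a frame's properties and returns True when satisfied.
Condition = Callable[[Frame], bool]


def is_steady(frame: Frame, max_speed: float = 0.2) -> bool:
    # Hypothetical movement threshold; the patent text gives no value.
    return frame.device_speed < max_speed


def is_well_lit(frame: Frame, min_luminance: float = 0.1) -> bool:
    # Hypothetical luminance threshold; the patent text gives no value.
    return frame.mean_luminance >= min_luminance


def select_frames(
    frames: Iterable[Frame],
    conditions: List[Condition],
    require_all: bool = False,
) -> List[Frame]:
    """Keep frames that satisfy at least one condition (or all, if requested)."""
    combine = all if require_all else any
    return [f for f in frames if combine(c(f) for c in conditions)]


frames = [
    Frame(0, 0.05, 0.50),  # steady and well lit
    Frame(1, 0.90, 0.02),  # fast motion and dark
    Frame(2, 0.90, 0.60),  # fast motion but well lit
]
print([f.index for f in select_frames(frames, [is_steady, is_well_lit])])
# [0, 2]
print([f.index for f in select_frames(frames, [is_steady, is_well_lit], require_all=True)])
# [0]
```

Passing `require_all=True` is a stricter reading that a real pipeline might prefer, since it drops frames that fail any single check.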

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: at an electronic device having a processor and one or more sensors: acquiring frames of image data from the one or more sensors in a physical environment; determining one or more deterrent properties associated with the frames of image data by a first technique based on a first set of criteria; selecting a subset of the frames of image data for a second technique based on a second set of criteria by determining that at least one of the determined deterrent properties satisfies at least one condition of one or more conditions associated with the first technique, wherein the second set of criteria is different than the first set of criteria; and determining three-dimensional (3D) representation data associated with the physical environment based on the selected subset of the frames.

2. The method of claim 1, wherein determining the one or more deterrent properties associated with the frames of image data by the first technique based on the first set of criteria comprises determining a semantic understanding of the image data.

3. The method of claim 1, wherein determining the one or more deterrent properties associated with the frames of image data by the first technique based on the first set of criteria comprises determining image statistics of the image data.

4. The method of claim 1, wherein determining the one or more deterrent properties associated with the frames of image data by the first technique based on the first set of criteria comprises detecting a hand or a body of a user of the electronic device for each frame while acquiring the frames of image data.

5. The method of claim 1, wherein determining the one or more deterrent properties associated with the frames of image data by the first technique based on the first set of criteria comprises tracking hand or body movements of a user of the electronic device while acquiring the frames of image data.

6. The method of claim 1, wherein determining the one or more deterrent properties associated with the frames of image data by the first technique based on the first set of criteria comprises determining motion of the electronic device based on motion data.

7. The method of claim 1, wherein determining the one or more deterrent properties associated with the frames of image data by the first technique based on the first set of criteria comprises determining motion of the electronic device using sequential frames of the image data to estimate a distance traveled.

8. The method of claim 1, wherein determining that at least one of the determined deterrent properties satisfies at least one condition of the one or more conditions comprises determining that the subset of the frames of image data for the second technique includes images that comprise occluded pixels of one or more objects of the physical environment that satisfies an occlusion threshold requirement.

9. The method of claim 1, wherein determining that at least one of the determined deterrent properties satisfies at least one condition of the one or more conditions comprises determining that the subset of the frames of image data for the second technique includes images that were acquired while the electronic device was moving less than a movement threshold.

10. The method of claim 1, wherein determining that at least one of the determined deterrent properties satisfies at least one condition of the one or more conditions comprises determining that the subset of the frames of image data for the second technique includes image data that comprise luminance values that satisfy a luminance threshold condition.

11. The method of claim 1, wherein the physical environment comprises one or more objects.

12. The method of claim 11, wherein determining that at least one of the determined deterrent properties satisfies at least one condition of the one or more conditions associated with the first technique comprises: detecting that a first subset of the one or more objects comprises non-static objects; detecting that a second subset of the one or more objects comprises static objects, the first subset is different than the second subset of the one or more objects; and determining that the subset of the frames of image data only includes image data that comprises the second subset of the one or more objects.

13. The method of claim 1, wherein the second technique comprises determining a scene understanding associated with the physical environment.

14. The method of claim 1, wherein the image data comprises depth data and light intensity image data obtained during a scanning process.

15. The method of claim 1, wherein the electronic device is a head-mounted device (HMD).

16. The method of claim 1, wherein the first set of criteria is associated with a first frame rate that is different than a second frame rate associated with the second set of criteria.

17. A system comprising: a non-transitory computer-readable storage medium; and one or more processors coupled to the non-transitory computer-readable storage medium, wherein the non-transitory computer-readable storage medium comprises program instructions that, when executed on the one or more processors, cause the one or more processors to perform operations comprising: acquiring frames of image data from one or more sensors of an electronic device in a physical environment; determining one or more deterrent properties associated with the frames of image data by a first technique based on a first set of criteria; selecting a subset of the frames of image data for a second technique based on a second set of criteria by determining that at least one of the determined deterrent properties satisfies at least one condition of one or more conditions associated with the first technique, wherein the second set of criteria is different than the first set of criteria; and determining three-dimensional (3D) representation data associated with the physical environment based on the selected subset of the frames.

18. The system of claim 17, wherein determining the one or more deterrent properties associated with the frames of image data by the first technique based on the first set of criteria comprises determining a semantic understanding of the image data.

19. The system of claim 17, wherein determining the one or more deterrent properties associated with the frames of image data by the first technique based on the first set of criteria comprises determining image statistics of the image data.

20. The system of claim 17, wherein determining the one or more deterrent properties associated with the frames of image data by the first technique based on the first set of criteria comprises detecting a hand or a body of a user of the electronic device for each frame while acquiring the frames of image data.

21. A non-transitory computer-readable storage medium storing program instructions executable via one or more processors to perform operations comprising: acquiring frames of image data from one or more sensors of an electronic device in a physical environment; determining one or more deterrent properties associated with the frames of image data by a first technique based on a first set of criteria; selecting a subset of the frames of image data for a second technique based on a second set of criteria by determining that at least one of the determined deterrent properties satisfies at least one condition of one or more conditions
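
As a rough illustration of the threshold-style conditions in dependent claims 8, 9, 10, and 12, the following Python sketch expresses each as a small predicate. Everything here is an assumption made for illustration: the `FrameStats` fields, the threshold values, the label set treated as non-static, and the reading of the occlusion requirement as an upper limit. The claims depend on claim 1 individually, so combining all four checks in `frame_qualifies` is a design choice of this sketch and not something the claims require.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class FrameStats:
    """Hypothetical per-frame properties produced by a first analysis pass."""
    occluded_fraction: float        # share of object pixels occluded, 0.0 to 1.0
    distance_moved_m: float         # estimated device travel since the prior frame
    mean_luminance: float           # 0.0 (black) to 1.0 (white)
    object_labels: FrozenSet[str]   # semantic labels detected in the frame


# Assumed label set; the claims only distinguish static from non-static objects.
NON_STATIC_LABELS = frozenset({"person", "pet"})


def meets_occlusion(stats: FrameStats, max_occluded: float = 0.3) -> bool:
    # Claim 8 analogue, read here as "occlusion at or below a limit".
    return stats.occluded_fraction <= max_occluded


def meets_movement(stats: FrameStats, max_distance_m: float = 0.05) -> bool:
    # Claim 9 analogue: device moved less than a movement threshold.
    return stats.distance_moved_m < max_distance_m


def meets_luminance(stats: FrameStats, min_luminance: float = 0.1) -> bool:
    # Claim 10 analogue: luminance satisfies a threshold condition.
    return stats.mean_luminance >= min_luminance


def only_static_objects(stats: FrameStats) -> bool:
    # Claim 12 analogue: the frame contains no non-static objects.
    return stats.object_labels.isdisjoint(NON_STATIC_LABELS)


def frame_qualifies(stats: FrameStats) -> bool:
    """Combine all four checks; a real system could apply any subset."""
    return (
        meets_occlusion(stats)
        and meets_movement(stats)
        and meets_luminance(stats)
        and only_static_objects(stats)
    )


room = FrameStats(0.1, 0.01, 0.5, frozenset({"wall", "table"}))
with_person = FrameStats(0.1, 0.01, 0.5, frozenset({"wall", "person"}))
print(frame_qualifies(room), frame_qualifies(with_person))
# True False
```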

Assignees

Apple Inc

Inventors

Classifications

  • Range image; Depth image; 3D point clouds

  • Analysis of motion (motion estimation for coding, decoding, compressing or decompressing digital video signals H04N19/43, H04N19/51)

  • G06F3/011 (Primary)

    Arrangements for interaction with the human body, e.g. for user immersion in virtual reality (blind teaching G09B21/00)

  • G06T17/00 (Primary)

    Three-dimensional [3D] modelling for computer graphics

  • Human being; Person

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12536737B1 cover?
Various implementations disclosed herein include devices, systems, and methods that dynamically selects particular frames of image data for generating three-dimensional (3D) representations of a physical environment. For example, an example process may include acquiring frames of image data from one or more sensors in a physical environment. The process may include determining one or more deter…
Who is the assignee on this patent?
Apple Inc
What technology area does this patent fall under?
Primary CPC classification G06F3/011. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jan 27, 2026 (kind code B1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 9 related publications on this page (citations in our corpus or others sharing the same primary CPC).