Apparatus and methods for distance estimation using stereo imagery

US10820009B2 · US · B2

Patent metadata
Field               Value
Publication number  US-10820009-B2
Application number  US-201816104646-A
Country             US
Kind code           B2
Filing date         Aug 17, 2018
Priority date       Jul 8, 2014
Publication date    Oct 27, 2020
Grant date          Oct 27, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Frame sequences from multiple image sensors may be combined in order to form, for example, an interleaved frame sequence. Individual frames of the combined sequence may be configured by a combination (e.g., concatenation) of frames from one or more source sequences. The interleaved/concatenated frame sequence may be encoded using a motion estimation encoder. Output of the video encoder may be processed (e.g., parsed) in order to extract motion information present in the encoded video. The motion information may be utilized in order to determine a depth of a visual scene, such as by using binocular disparity between two or more images by an adaptive controller in order to detect one or more objects salient to a given task. In one variant, depth information is utilized during control and operation of mobile robotic devices.
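
The two ways of combining frames that the abstract names, interleaving and concatenation, can be illustrated with a minimal sketch. It assumes grayscale frames stored as nested lists and two sources of equal length; the function names are hypothetical and are not taken from the patent.

```python
from typing import List, Sequence

# Assumption: a frame is a grayscale image stored as rows of pixel intensities.
Frame = List[List[int]]


def interleave(left: Sequence[Frame], right: Sequence[Frame]) -> List[Frame]:
    """Alternate frames from two sources: L0, R0, L1, R1, ..."""
    if len(left) != len(right):
        raise ValueError("sequences must have the same length")
    combined: List[Frame] = []
    for left_frame, right_frame in zip(left, right):
        combined.append(left_frame)
        combined.append(right_frame)
    return combined


def concatenate_side_by_side(left: Frame, right: Frame) -> Frame:
    """Join two frames of equal height into one wider combined frame."""
    if len(left) != len(right):
        raise ValueError("frames must have the same height")
    return [left_row + right_row for left_row, right_row in zip(left, right)]


def combined_sequence(left: Sequence[Frame], right: Sequence[Frame]) -> List[Frame]:
    """One concatenated frame per contemporaneous left/right pair."""
    if len(left) != len(right):
        raise ValueError("sequences must have the same length")
    return [concatenate_side_by_side(l, r) for l, r in zip(left, right)]
```

In an interleaved sequence, adjacent frames alternate between the two cameras, so an encoder that estimates motion between consecutive frames would in effect be measuring the shift between the two views. That is one way to read the abstract's link between motion information and binocular disparity.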

First claim

Opening claim text (preview).

What is claimed:

1. A method for motion detection and distance measurement of at least one target object, comprising: receiving a first image frame from a first imaging camera and a second image frame from a second imaging camera, the first image frame being captured contemporaneous to the second image frame; combining the first and second image frame to create a first combined frame; receiving a third image frame from the first imaging camera and a fourth image frame from the second imaging camera, the third image frame being captured contemporaneous to the fourth image frame and immediately subsequent to the first and second image frames; combining the third and fourth image frames to create a second combined frame; generating an interleaved sequence of concatenated frames to evaluate distance and motion detection; determining the distance measurement based on a pixel-wise disparity between image frames of the first or second combined frames; and determining the motion of at least one target object based on a pixel-wise disparity between the first and second combined frames.

2. The method of claim 1, further comprising: extracting luminance data of pixels of the image frames within the interleaved sequence of concatenated frames; and generating macroblocks comprising pixels of similar luminance within each of the image frames of the interleaved sequence of concatenated frames.

3. The method of claim 2, wherein the determining of the distance measurement is based on an at least one pixel disparity between macroblocks of the first and second image frames of a first interleaved sequence of concatenated frames and a spatial separation of first and second imaging cameras.

4. The method of claim 2, wherein the determining of motion of the at least one target object corresponds to assigning a motion vector to each of the macroblocks within the interleaved sequence of concatenated frames, the motion vectors being assigned based on an at least one pixel disparity between macroblocks of the first combined frame and the second combined frame.

5. The method of claim 4, further comprising: determining gestures of the at least one target object within the interleaved sequence of concatenated frames by performing at least one of: (i) identifying background pixels within the interleaved sequence of concatenated frames based on spatially coherent motion or differential motion; (ii) removing pixels corresponding to the identified background pixels; and (iii) determining the gesture based on a resulting motion vector field of macroblocks within the first interleaved sequence of concatenated frames, the motion vector field being formed using the image processor by at least one remaining macroblock with an assigned motion vectors upon removal of the background pixels.

6. The method of claim 5, wherein, the resulting motion vector field of macroblocks is associated with a gesture based on an output from an adaptive predictor apparatus, and the at least one target object comprises at least one of a human, inanimate object, portions of a human, or portions of an inanimate object.

7. The method of claim 1, wherein, the first and second imaging cameras are separated spatially by a nonzero distance.

8. A non-transitory computer readable medium comprising a plurality of instructions stored thereon, that when executed by at least one processor, configure the at least one processor to, receive a first image frame from a first imaging camera and a second image frame from a second imaging camera, the first image frame being captured contemporaneous to the second image frame; combine the first and second image frames to create a first combined frame; receive a third image frame from the first imaging camera and a fourth image frame from the second imaging camera, the third image frame being captured contemporaneous to the fourth and subsequent to the first and second image frames; combine the third and fourth image frames to create a second combined frame; generate an interleaved sequence of concatenated frames to evaluate distance and motion detection; determine distance measurement based on a pixel-wise disparity between image frames of the first or second combined frames; and determine motion of at least one target object based on the pixel-wise disparity between the first and second combined frames.

9. The non-transitory computer readable medium of claim 8, wherein the at least one processor is further configured to execute the computer readable instructions to, extract luminance data of pixels of the image frames within the interleaved sequence of concatenated frames; and generate macroblocks comprising pixels of similar luminance within each of the image frames of the interleaved sequence of concatenated frames.

10. The non-transitory computer readable medium of claim 9, wherein the distance measurement determination is based on an at least one pixel disparity between macroblocks of the first and second image frames of a first interleaved sequence of concatenated frames and a spatial separation of first and second imaging cameras.

11. The non-transitory computer readable medium of claim 9, wherein the motion of the at least one target object corresponds to assigning a motion vector to each of the macroblocks within the interleaved sequence of concatenated frames, the motion vectors being assigned based on an at least one pixel disparity between macroblocks of the first combined frame and the second combined frame.

12. The non-transitory computer readable medium of claim 11, wherein the at least one processor is further configured to execute the computer readable instructions to, determine gestures of the at least one target object within the interleaved sequence of concatenated frames by performing at least one of: (i) identifying background pixels within the interleaved sequence of concatenated frames based on spatially coherent motion or differential motion; (ii) removing pixels corresponding to the identified background pixels; and (iii) determining the gesture based on a resulting motion vector field of macroblocks within the first interleaved sequence of concatenated frames, the motion vector field being formed using the image processor by at least one remaining macroblock with an assigned motion vectors upon removal of the background pixels.

13. The non-transitory computer readable medium of claim 12, wherein, the resulting motion vector field of macroblocks is associated with a gesture based on an output from an adaptive predictor apparatus, and the at least one target object comprises at least one of a human, inanimate object, portions of a human, or portions of an inanimate object.

14. The non-transitory computer readable medium of claim 8, wherein the first and second imaging cameras are separated spatially by a nonzero distance.

15. A system for motion detection and distance measurement of at least one target object, comprising: a memory having computer readable instructions thereon; and at least one processor configured to execute the computer readable instructions to, receive a first image frame from a first imaging camera and a second image frame from a second imaging camera, the first image frame being captured contemporaneous to the second image frame; combine the first and second image frames to create a first combined frame; receive a third image frame from the first imaging camera and a fourth image frame from the second imaging camera, the third image frame being captured contemporaneous to the fourth and subsequent to the first and second image frames; genera
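
Claims 3 and 10 tie the distance measurement to a pixel disparity between macroblocks and to the spatial separation of the two cameras. The following is a minimal sketch of that relationship under assumptions that do not appear on this page: a rectified pinhole stereo pair, a focal length expressed in pixels, and an explicit sum-of-absolute-differences block search standing in for the motion vectors that the abstract says are extracted from a video encoder. All function and parameter names are hypothetical.

```python
from typing import List, Optional

# Assumption: a frame is a grayscale image stored as rows of pixel intensities.
Frame = List[List[int]]


def block_sad(a: Frame, b: Frame, ax: int, bx: int, y: int, size: int) -> int:
    """Sum of absolute differences between two size x size blocks."""
    total = 0
    for dy in range(size):
        for dx in range(size):
            total += abs(a[y + dy][ax + dx] - b[y + dy][bx + dx])
    return total


def block_disparity(left: Frame, right: Frame, x: int, y: int,
                    size: int, max_shift: int) -> int:
    """Horizontal shift, in pixels, that best matches a left-frame block
    at (x, y) against the right frame."""
    best_shift = 0
    best_cost: Optional[int] = None
    for shift in range(max_shift + 1):
        # Assumption: in a rectified pair, a point appears further left
        # in the right camera's image than in the left camera's image.
        rx = x - shift
        if rx < 0:
            break
        cost = block_sad(left, right, x, rx, y, size)
        if best_cost is None or cost < best_cost:
            best_shift, best_cost = shift, cost
    return best_shift


def distance_from_disparity(disparity_px: float, focal_length_px: float,
                            baseline_m: float) -> Optional[float]:
    """Pinhole stereo relation: distance = focal length * baseline / disparity.

    Returns None when the disparity is not positive, because the relation
    gives no finite distance in that case.
    """
    if disparity_px <= 0:
        return None
    return focal_length_px * baseline_m / disparity_px
```

The same block search applied to two combined frames captured at successive times, rather than to the left and right views of one instant, would produce per-block motion vectors of the kind claims 4 and 11 describe. A larger camera separation yields a larger disparity for the same distance, which is consistent with the claims requiring a nonzero separation between the cameras.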

Assignees

  • Brain Corp

Inventors

Classifications

  • Recognition of hand or arm movements, e.g. recognition of deaf sign language (static hand signs G06V40/113) · CPC title

  • H04N19/51 (Primary)

    Motion estimation or motion compensation · CPC title

  • from motion · CPC title

  • Stereoscopic video; Stereoscopic image sequence · CPC title

  • using a sequence of stereo image pairs · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10820009B2 cover?
Frame sequences from multiple image sensors may be combined in order to form, for example, an interleaved frame sequence. Individual frames of the combined sequence may be configured by a combination (e.g., concatenation) of frames from one or more source sequences. The interleaved/concatenated frame sequence may be encoded using a motion estimation encoder. Output of the video encoder may be p…
Who is the assignee on this patent?
Brain Corp
What technology area does this patent fall under?
Primary CPC classification H04N19/51. Mapped technology areas include Electricity.
When was this patent published?
Publication date Oct 27, 2020 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).