US9854168B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9854168-B2 |
| Application number | US-201514642469-A |
| Country | US |
| Kind code | B2 |
| Filing date | Mar 9, 2015 |
| Priority date | Mar 7, 2014 |
| Publication date | Dec 26, 2017 |
| Grant date | Dec 26, 2017 |
Official abstract text for this publication.
A device is disclosed comprising a memory configured for holding video and a processor coupled to the memory. The memory contains computer-executable instructions that, when executed by the processor, cause the device to perform operations to stabilize the video, the operations comprising buffering consecutive original video frames, determining transformation matrices from subsets of the original video frames, wherein the transformation matrices represent estimates of stable camera motion, using the transformation matrices to warp the original video frames and generate video that is stabilized relative to the original video frames, and adjusting a size of a subset of original video frames in response to detecting a condition.
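The warping step in the abstract can be illustrated with a minimal Python sketch. It assumes each transformation matrix is a 3×3 matrix acting on homogeneous pixel coordinates and that inter-frame estimates are chained by matrix multiplication. The helper names (`matmul3`, `warp_point`, `translation`) and the sample motion values are hypothetical and do not come from the patent.

```python
from typing import List, Tuple

Matrix = List[List[float]]  # 3x3 transformation matrix, row-major


def matmul3(a: Matrix, b: Matrix) -> Matrix:
    """Multiply two 3x3 matrices (applies b first, then a)."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def warp_point(h: Matrix, x: float, y: float) -> Tuple[float, float]:
    """Map pixel (x, y) through h using homogeneous coordinates."""
    u = h[0][0] * x + h[0][1] * y + h[0][2]
    v = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return (u / w, v / w)


def translation(tx: float, ty: float) -> Matrix:
    """Pure-translation matrix, the simplest possible motion model."""
    return [[1.0, 0.0, tx], [0.0, 1.0, ty], [0.0, 0.0, 1.0]]


# Hypothetical inter-frame motion estimates for two buffered frames.
inter_frame = [translation(2.0, -1.0), translation(-3.0, 0.5)]

# Chain the estimates into one accumulated transformation.
accumulated = translation(0.0, 0.0)  # identity
for m in inter_frame:
    accumulated = matmul3(m, accumulated)

# Where a pixel at (10, 20) ends up under the accumulated motion.
warped = warp_point(accumulated, 10.0, 20.0)
```

A full frame warp would apply the same mapping to every pixel with interpolation; the sketch only shows the coordinate arithmetic.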
Opening claim text (preview).
What is claimed is:

1. A device, comprising: a memory configured for storing a buffer of video frames and computer-executable instructions; and a processor coupled to the memory, wherein the computer-executable instructions are executed by the processor to cause the device to perform operations to stabilize the video, wherein the operations comprise: receiving a plurality of consecutive video frames, wherein the plurality of consecutive video frames comprises a current video frame and a plurality of previous video frames; storing the plurality of consecutive video frames in the buffer; calculating a global motion estimate for the current video frame, wherein the global motion estimate is a 3×3 transformation matrix, wherein the global motion estimate describes a camera's relative motion between the current video frame and an adjacent video frame, and wherein the adjacent video frame is a video frame that was received before the current video frame; calculating a long-term camera motion estimate for the current video frame, wherein the long-term camera motion estimate is a 3×3 transformation matrix, wherein the long-term camera motion estimate is a geometric mean of an accumulation of global motion estimates, and wherein the accumulation of global motion estimates comprises the global motion estimate for the current video frame and a calculated global motion estimate for each of the previous video frames in the buffer; calculating a smoothed long-term camera motion estimate for the current video frame by applying a Kalman filter to the long-term camera motion estimate, wherein the smoothed long-term camera motion estimate is a 3×3 transformation matrix; and warping the current video frame according to the smoothed long-term camera motion estimate.

2. The device of claim 1, wherein the instructions, when executed by the processor, further cause the device to perform additional operations comprising: matching a plurality of feature points between the current video frame and the adjacent video frame; calculating a ratio between a number of inlier feature points and a total number of features points; and using the ratio to determine a process noise covariance parameter and a measurement noise covariance parameter, wherein the process noise covariance parameter and the measurement noise covariance parameter are used by the Kalman filter to control a relative weight between an a priori estimation and a measurement value in an a posteriori estimation.

3. The device of claim 1, wherein the instructions, when executed by the processor, further cause the device to perform additional operations, prior to warping the current video frame, comprising adjusting the smoothed long-term camera motion estimate according to the formula: c = (1 − F) × c, wherein c is a coefficient of the smoothed long-term camera motion estimate and F is a forgetting factor between 0 and 1.

4. The device of claim 1, wherein the instructions, when executed by the processor, further cause the device to perform additional operations comprising: generating a first transformation matrix of transformation matrices for a first subset of the plurality of consecutive video frames using a first motion model selected from a plurality of motion models; using the first transformation matrix to determine a first set of warped video frames from original video frames in the first subset; when the first set of warped video frames satisfies a condition, generating a second transformation matrix of the transformation matrices for a second subset of the subsets using the first motion model; when the first set of warped video frames does not satisfy the condition, and when the first subset can be divided into smaller subsets: dividing the first subset into a second subset and a third subset; generating a second transformation matrix and a third transformation matrix for the second subset and the third subset, respectively, using the first motion model; and using the second transformation matrix and the third transformation matrix to determine sets of warped video frames from the original video frames in the second subset and the third subset, respectively; and when the first set of warped video frames does not satisfy the condition and when the first subset cannot be divided into smaller subsets, repeating the generating and the using for the first subset using a second motion model of the plurality of motion models.

5. The device of claim 4, wherein the instructions, when executed by the processor, further cause the device to perform additional operations comprising: determining inter-frame transformation matrices between pairs of consecutive frames in the first subset; deriving corrective transformation matrices that change the inter-frame transformation matrices to match the first transformation matrix; and applying the corrective transformation matrices to the original video frames in the first subset to determine the first set of warped video frames.

6. The device of claim 4, wherein the condition comprises a constraint for out-of-bound area size and a constraint for amount of skewness of a warped video frame.

7. The device of claim 4, wherein the plurality of motion models comprise one or more of a homography model with eight degrees-of-freedom, an affine transformation model with five degrees-of-freedom, and a similarity transformation model with four degrees-of-freedom.

8. The device of claim 1, wherein a number of frames in a subset of the plurality of consecutive video frames is adjusted in response to information comprising one or more of input from a sensor on the device, input from a user, and information indicating how the device is being used.

9. A method of stabilizing video, the method comprising: receiving a plurality of consecutive video frames, wherein the plurality of consecutive video frames comprises a current video frame and a plurality of previous video frames; storing the plurality of consecutive video frames in a buffer; calculating a global motion estimate for the current video frame, wherein the global motion estimate is a 3×3 transformation matrix, wherein the global motion estimate describes a camera's relative motion between the current video frame and an adjacent video frame, and wherein the adjacent video frame is a video frame that was received before the current video frame; calculating a long-term camera motion estimate for the current video frame, wherein the long-term camera motion estimate is a 3×3 transformation matrix, wherein the long-term camera motion estimate is a geometric mean of an accumulation of global motion estimates, and wherein the accumulation of global motion estimates comprises the global motion estimate for the current video frame and a calculated global motion estimate for each of the previous video frames in the buffer; calculating a smoothed long-term camera motion estimate for the current video frame by applying a Kalman filter to the long-term camera motion estimate, wherein the smoothed long-term camera motion estimate is a 3×3 transformation matrix; warping the current video frame according to the smoothed long-term camera motion estimate.

10. The method of claim 9, further comprising: matching a plurality of feature points between the current video frame and the adjacent video frame; calculating a ratio between a number of inlier feature points and a total number of features points; and using the ratio to determine a process noise covariance parameter and a measurement noise covariance parameter, wherein the process noise covariance parameter and the measurement noise covariance parameter are used by the Kalman filter to control a relative weight between an a
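
The Kalman smoothing, inlier-ratio weighting, and forgetting-factor adjustment recited in the claims can be sketched for a single matrix coefficient as follows. This is an illustrative assumption rather than the patented implementation: the claims do not say how the inlier ratio maps to the noise covariances, so the linear mapping in `noise_from_inlier_ratio` is hypothetical, as are the class and function names and the constant-state motion model.

```python
from dataclasses import dataclass
from typing import Tuple


def noise_from_inlier_ratio(inliers: int, total: int) -> Tuple[float, float]:
    """Hypothetical mapping from inlier ratio to (process, measurement) noise.

    A higher ratio is treated as a more reliable measurement, so the
    measurement noise covariance shrinks as the ratio grows.
    """
    ratio = inliers / total if total else 0.0
    q = 1e-3 + 1e-2 * ratio          # process noise covariance (assumed form)
    r = 1e-2 + 1.0 * (1.0 - ratio)   # measurement noise covariance (assumed form)
    return q, r


@dataclass
class ScalarKalman:
    """One-dimensional Kalman filter with a constant-state model (assumed)."""
    x: float = 0.0  # a posteriori estimate of the coefficient
    p: float = 1.0  # estimate error covariance

    def update(self, z: float, q: float, r: float) -> float:
        # Predict: the state is assumed constant, so only uncertainty grows.
        x_prior = self.x
        p_prior = self.p + q
        # Gain sets the relative weight of the a priori estimate vs. measurement.
        k = p_prior / (p_prior + r)
        self.x = x_prior + k * (z - x_prior)
        self.p = (1.0 - k) * p_prior
        return self.x


def apply_forgetting(c: float, f: float) -> float:
    """Adjust a coefficient as c = (1 - F) * c, with F between 0 and 1."""
    if not 0.0 <= f <= 1.0:
        raise ValueError("forgetting factor must be between 0 and 1")
    return (1.0 - f) * c


# Same measurement, different feature-match quality.
trusted = ScalarKalman().update(10.0, *noise_from_inlier_ratio(90, 100))
doubtful = ScalarKalman().update(10.0, *noise_from_inlier_ratio(10, 100))
damped = apply_forgetting(10.0, 0.25)
```

In this sketch a frame with many inliers pulls the estimate closer to the new measurement than a frame with few inliers, which is the weighting behavior the claims attribute to the two covariance parameters.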
using feature-based methods, e.g. the tracking of corners or segments · CPC title
performed by a processor, e.g. controlling the readout of an image memory · CPC title
Motion detection · CPC title
Hierarchical, coarse-to-fine, multiscale or multiresolution image processing; Pyramid transform · CPC title
Video; Image sequence · CPC title