Systems and methods for 3-dimensional (3d) positioning of imaging device
US-2021203855-A1 · Jul 1, 2021 · US
US11755643B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11755643-B2 |
| Application number | US-202016921248-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jul 6, 2020 |
| Priority date | Jul 6, 2020 |
| Publication date | Sep 12, 2023 |
| Grant date | Sep 12, 2023 |
Official abstract text for this publication.
A video indexing system identifies groups of frames within a video frame sequence captured by a static camera during a same scene. Context metadata is generated for each frame in each group based on an analysis of fewer than all frames in the group. The frames are indexed in a database in association with the generated context metadata.
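The flow described in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation. The `Frame` class, its `scene_id` field (standing in for the output of an upstream frame classifier), the helper names, the choice of the middle frame as keyframe, and the in-memory dictionary standing in for the database are all assumptions that do not appear in the source.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence


@dataclass
class Frame:
    index: int     # position of the frame in the video sequence
    scene_id: str  # hypothetical label from an upstream frame classifier


def group_by_scene(frames: Sequence[Frame]) -> List[List[Frame]]:
    """Group consecutive frames that share the same scene label."""
    groups: List[List[Frame]] = []
    for frame in frames:
        if groups and groups[-1][0].scene_id == frame.scene_id:
            groups[-1].append(frame)
        else:
            groups.append([frame])
    return groups


def index_group_frames(
    frames: Sequence[Frame],
    analyze: Callable[[Frame], dict],
) -> Dict[int, dict]:
    """Analyze one keyframe per group and index every frame with the result.

    `analyze` is a placeholder for whatever image analysis produces
    context metadata; it is called once per group, not once per frame.
    """
    index: Dict[int, dict] = {}  # stand-in for the database
    for group in group_by_scene(frames):
        keyframe = group[len(group) // 2]  # assumed keyframe choice
        metadata = analyze(keyframe)
        for frame in group:
            index[frame.index] = metadata
    return index
```

With five frames split into two scene groups, `analyze` runs twice while all five frames receive metadata, which is the saving the abstract points to when it says metadata is generated from fewer than all frames in the group.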
Opening claim text (preview).
What is claimed is:
1. A method comprising: analyzing a sequence of video frames to determine whether the sequence was captured by a static camera; responsive to determining that the sequence of frames was captured by the static camera, subjecting the sequence of frames to a first series of processing operations that includes generating context metadata for each frame in the sequence based on an analysis of fewer than all frames in the sequence; and responsive to the determining that a select frame of the sequence of frames was captured by a moving camera, subjecting the select frame to a second series of processing operations for generating the context metadata that is different than the first series of processing operations.
2. The method of claim 1, wherein the second series of processing operations generates context metadata for the select frame based on an image analysis that is limited to the select frame.
3. The method of claim 1, wherein generating context metadata for each frame in the sequence based on the analysis of fewer than all frames in the sequence further comprises: selecting a keyframe from the sequence; generating at least a portion of the context metadata based on an analysis of the keyframe without analyzing other frames of the sequence; and indexing the other frames of the sequence in association with the generated context metadata.
4. The method of claim 3, wherein generating the context metadata further comprises: generating descriptors for multiple objects present in the keyframe.
5. The method of claim 4, wherein generating the context metadata further comprises: generating a scene label generated based on the descriptors generated for the keyframe.
6. The method of claim 3, wherein generating the context metadata further comprises: generating a region of interest (ROI) mask for the keyframe; applying the ROI mask to each of the other frames in the sequence; and omitting an area defined by the ROI mask from subsequent processing operations performed on each of the other frames of the sequence.
7. The method of claim 1, further comprising: responsive to determining that the sequence of frames was captured by a static camera, determining a size of an object present in multiple frames of the sequence by analyzing a position of the object relative to a fixed reference point that appears within each of the multiple frames.
8. The method of claim 1, further comprising: responsive to determining that the sequence of frames was captured by the static camera, executing object tracking logic that assumes a fixed camera frame of reference.
9. A video indexing system comprising: a frame classifier that classifies video frames received as part of a sequence into different static camera scene groups; a context metadata generation engine that: receives a group of frames classified as comprising a same static camera scene group of the different static camera scene groups; and analyzes fewer than all frames in the group to generate context metadata for each frame in the group; and an indexing engine that indexes each frame in the group in a database in association with the generated context metadata.
10. The video indexing system of claim 9, wherein the context metadata generation engine is further adapted to: determine that a select frame of the sequence was captured by a moving camera, based on a classification of the frame classifier; and responsive to the determination, generate context metadata for the select frame based on an image analysis limited to the select frame.
11. The video indexing system of claim 9, wherein the context metadata generation engine is further adapted to: select a keyframe from the group; and generate at least a portion of the context metadata for each frame in the group based on an analysis of the keyframe without analyzing other frames of the group.
12. The video indexing system of claim 11, wherein the context metadata generation engine is adapted to: generate a region of interest (ROI) mask for the keyframe of the group; apply the ROI mask to each frame in the group; and omit an area defined by the ROI mask from subsequent processing operations performed on the frames of the group.
13. The video indexing system of claim 11, wherein the context metadata includes descriptors for multiple objects present in the keyframe.
14. The video indexing system of claim 13, wherein the context metadata includes a scene label generated based on the descriptors generated for the keyframe.
15. The video indexing system of claim 9, further comprising: an object tracker that executes logic to track movements of objects throughout multiple frames of the group, the logic being adapted to implement constraints that assume a fixed frame of reference for a camera capturing the video.
16. One or more tangible computer-readable storage media encoding computer-executable instructions for executing a computer processes comprising: classifying video frames within a sequence into different static camera scene groups; generating context metadata for each frame in a group of the different static camera scene groups, the context metadata being generated based on an analysis of fewer than all frames in the group; and indexing each frame in the group in a database in association with the generated context metadata.
17. The one or more tangible computer-readable storage media of claim 16, wherein generating the context metadata further comprises: selecting a keyframe from the group; and generating at least a portion of the context metadata for each frame in the group based on an analysis of the keyframe without analyzing other frames of the group.
18. The one or more tangible computer-readable storage media of claim 16, wherein the computer process further comprises: determining that a select frame of the sequence was captured by a moving camera; responsive to the determination, generating context metadata for the select frame based on an image analysis limited to the select frame.
19. The one or more tangible computer-readable storage media of claim 17, wherein the computer process further comprises: generating a region of interest (ROI) mask for the keyframe of the group; applying the ROI mask to each frame in the group; and omitting an area defined by the ROI mask from subsequent processing operations performed on the frames of the group.
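The ROI mask step recited in the claims (a mask generated for the keyframe, applied to the other frames, with the masked area omitted from later processing) might look like the sketch below. The claims do not say how the mask is generated or what the subsequent processing is, so the mask is taken as an input here and a simple bright-pixel count stands in for the later processing. The names `unmasked_pixels` and `process_group`, the nested-list image representation, and the `threshold` parameter are all hypothetical.

```python
from typing import List, Tuple

Image = List[List[int]]   # grayscale pixel values, row-major (assumed format)
Mask = List[List[bool]]   # True marks a pixel to omit from later processing


def unmasked_pixels(frame: Image, mask: Mask) -> List[Tuple[int, int, int]]:
    """Return (row, col, value) for every pixel the mask does not exclude."""
    same_shape = len(frame) == len(mask) and all(
        len(row) == len(mask_row) for row, mask_row in zip(frame, mask)
    )
    if not same_shape:
        raise ValueError("frame and mask must have the same shape")
    return [
        (r, c, value)
        for r, (row, mask_row) in enumerate(zip(frame, mask))
        for c, (value, masked) in enumerate(zip(row, mask_row))
        if not masked
    ]


def process_group(frames: List[Image], mask: Mask, threshold: int = 128) -> List[int]:
    """Apply one mask to every frame in a static-camera group.

    The per-frame result is a count of unmasked pixels at or above
    `threshold`, a placeholder for whatever processing follows masking.
    """
    return [
        sum(1 for _, _, value in unmasked_pixels(frame, mask) if value >= threshold)
        for frame in frames
    ]
```

Reusing a single mask across the group relies on the camera being static, so that a region masked in the keyframe covers the same part of the scene in every other frame.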
by using information signals recorded by the same method as the main recording {(G11B27/22 takes precedence)} · CPC title
Indexing; Data structures therefor; Storage structures · CPC title
using metadata automatically derived from the content · CPC title
Detecting features for summarising video content · CPC title
using objects detected or recognised in the video content · CPC title