Who is the assignee on this patent?

Microsoft Technology Licensing, LLC

What technology area does this patent fall under?

Primary CPC classification G11B27/28. Mapped technology areas include Physics.

When was this patent published?

Publication date Sep 12, 2023 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 6 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Metadata generation for video indexing

US11755643B2 · US · B2

Patent metadata
Field	Value
Publication number	US-11755643-B2
Application number	US-202016921248-A
Country	US
Kind code	B2
Filing date	Jul 6, 2020
Priority date	Jul 6, 2020
Publication date	Sep 12, 2023
Grant date	Sep 12, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A video indexing system identifies groups of frames within a video frame sequence captured by a static camera during a same scene. Context metadata is generated for each frame in each group based on an analysis of fewer than all frames in the group. The frames are indexed in a database in association with the generated context metadata.
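The flow in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the `Frame` type, the `scene_id` field (assumed to come from an upstream classifier, with `None` standing for a frame that belongs to no static-camera group), the choice of the first frame as the analyzed frame, and the `fake_analyze` stub are all hypothetical. The per-frame fallback for ungrouped frames follows claim 2 rather than the abstract.

```python
from itertools import groupby
from typing import Callable, Dict, List, NamedTuple, Optional


class Frame(NamedTuple):
    frame_id: int
    scene_id: Optional[int]  # hypothetical classifier output; None = no static-camera group
    pixels: tuple            # placeholder for image data


def index_frames(frames: List[Frame],
                 analyze: Callable[[Frame], dict]) -> Dict[int, dict]:
    """Map each frame_id to context metadata, analyzing one frame per group."""
    index: Dict[int, dict] = {}
    # groupby merges only consecutive frames that share a scene_id.
    for scene_id, group_iter in groupby(frames, key=lambda f: f.scene_id):
        group = list(group_iter)
        if scene_id is None:
            # Ungrouped frames: analyze each frame on its own.
            for frame in group:
                index[frame.frame_id] = analyze(frame)
        else:
            # Static-camera group: analyze one frame (assumed here to be
            # the first) and reuse its metadata for every frame in the group.
            metadata = analyze(group[0])
            for frame in group:
                index[frame.frame_id] = metadata
    return index


calls: List[int] = []


def fake_analyze(frame: Frame) -> dict:
    """Stand-in for an expensive image analysis; records which frames it saw."""
    calls.append(frame.frame_id)
    return {"scene_label": f"scene-from-frame-{frame.frame_id}"}


frames = [Frame(0, 1, ()), Frame(1, 1, ()), Frame(2, 1, ()),
          Frame(3, None, ()), Frame(4, 2, ()), Frame(5, 2, ())]
index = index_frames(frames, fake_analyze)
print(len(index), "frames indexed with", len(calls), "analyses")
# prints "6 frames indexed with 3 analyses"
```

In this toy run, six frames are indexed but only frames 0, 3, and 4 are analyzed, which is the saving the abstract describes as analyzing "fewer than all frames in the group".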

First claim

Opening claim text (preview).

What is claimed is:
1. A method comprising: analyzing a sequence of video frames to determine whether the sequence was captured by a static camera; responsive to determining that the sequence of frames was captured by the static camera, subjecting the sequence of frames to a first series of processing operations that includes generating context metadata for each frame in the sequence based on an analysis of fewer than all frames in the sequence; and responsive to the determining that a select frame of the sequence of frames was captured by a moving camera, subjecting the select frame to a second series of processing operations for generating the context metadata that is different than the first series of processing operations.
2. The method of claim 1, wherein the second series of processing operations generates context metadata for the select frame based on an image analysis that is limited to the select frame.
3. The method of claim 1, wherein generating context metadata for each frame in the sequence based on the analysis of fewer than all frames in the sequence further comprises: selecting a keyframe from the sequence; generating at least a portion of the context metadata based on an analysis of the keyframe without analyzing other frames of the sequence; and indexing the other frames of the sequence in association with the generated context metadata.
4. The method of claim 3, wherein generating the context metadata further comprises: generating descriptors for multiple objects present in the keyframe.
5. The method of claim 4, wherein generating the context metadata further comprises: generating a scene label generated based on the descriptors generated for the keyframe.
6. The method of claim 3, wherein generating the context metadata further comprises: generating a region of interest (ROI) mask for the keyframe; applying the ROI mask to each of the other frames in the sequence; and omitting an area defined by the ROI mask from subsequent processing operations performed on each of the other frames of the sequence.
7. The method of claim 1, further comprising: responsive to determining that the sequence of frames was captured by a static camera, determining a size of an object present in multiple frames of the sequence by analyzing a position of the object relative to a fixed reference point that appears within each of the multiple frames.
8. The method of claim 1, further comprising: responsive to determining that the sequence of frames was captured by the static camera, executing object tracking logic that assumes a fixed camera frame of reference.
9. A video indexing system comprising: a frame classifier that classifies video frames received as part of a sequence into different static camera scene groups; a context metadata generation engine that: receives a group of frames classified as comprising a same static camera scene group of the different static camera scene groups; and analyzes fewer than all frames in the group to generate context metadata for each frame in the group; and an indexing engine that indexes each frame in the group in a database in association with the generated context metadata.
10. The video indexing system of claim 9, wherein the context metadata generation engine is further adapted to: determine that a select frame of the sequence was captured by a moving camera, based on a classification of the frame classifier; and responsive to the determination, generate context metadata for the select frame based on an image analysis limited to the select frame.
11. The video indexing system of claim 9, wherein the context metadata generation engine is further adapted to: select a keyframe from the group; and generate at least a portion of the context metadata for each frame in the group based on an analysis of the keyframe without analyzing other frames of the group.
12. The video indexing system of claim 11, wherein the context metadata generation engine is adapted to: generate a region of interest (ROI) mask for the keyframe of the group; apply the ROI mask to each frame in the group; and omit an area defined by the ROI mask from subsequent processing operations performed on the frames of the group.
13. The video indexing system of claim 11, wherein the context metadata includes descriptors for multiple objects present in the keyframe.
14. The video indexing system of claim 13, wherein the context metadata includes a scene label generated based on the descriptors generated for the keyframe.
15. The video indexing system of claim 9, further comprising: an object tracker that executes logic to track movements of objects throughout multiple frames of the group, the logic being adapted to implement constraints that assume a fixed frame of reference for a camera capturing the video.
16. One or more tangible computer-readable storage media encoding computer-executable instructions for executing a computer processes comprising: classifying video frames within a sequence into different static camera scene groups; generating context metadata for each frame in a group of the different static camera scene groups, the context metadata being generated based on an analysis of fewer than all frames in the group; and indexing each frame in the group in a database in association with the generated context metadata.
17. The one or more tangible computer-readable storage media of claim 16, wherein generating the context metadata further comprises: selecting a keyframe from the group; and generating at least a portion of the context metadata for each frame in the group based on an analysis of the keyframe without analyzing other frames of the group.
18. The one or more tangible computer-readable storage media of claim 16, wherein the computer process further comprises: determining that a select frame of the sequence was captured by a moving camera; responsive to the determination, generating context metadata for the select frame based on an image analysis limited to the select frame.
19. The one or more tangible computer-readable storage media of claim 17, wherein the computer process further comprises: generating a region of interest (ROI) mask for the keyframe of the group; applying the ROI mask to each frame in the group; and omitting an area defined by the ROI mask from subsequent processing operations performed on the frames of the group.
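The ROI-mask steps in claims 6, 12, and 19 (build a mask from the keyframe, apply it to every frame in the group, skip the masked area in later processing) might look like the sketch below. The claims do not say how the mask is derived, so the mask here is simply given by hand; the grid representation, `apply_roi_mask`, and `count_bright_pixels` (a stand-in for "subsequent processing operations") are hypothetical names chosen for illustration.

```python
from typing import List, Optional

Grid = List[List[int]]    # grayscale frame as rows of pixel values
Mask = List[List[bool]]   # True marks the area to omit from later processing


def apply_roi_mask(frame: Grid, mask: Mask) -> List[List[Optional[int]]]:
    """Return a copy of the frame with masked pixels replaced by None."""
    return [[None if omit else px for px, omit in zip(row, mask_row)]
            for row, mask_row in zip(frame, mask)]


def count_bright_pixels(masked: List[List[Optional[int]]],
                        threshold: int = 128) -> int:
    """Stand-in for later processing: counts bright pixels, skipping masked ones."""
    return sum(1 for row in masked for px in row
               if px is not None and px >= threshold)


# Assumption: this mask was produced once, from the group's keyframe.
mask = [[True, True, False],
        [True, True, False]]

# Two frames from the same static-camera group reuse the same mask.
group = [
    [[200, 10, 10], [10, 10, 250]],
    [[200, 10, 130], [10, 10, 140]],
]
counts = [count_bright_pixels(apply_roi_mask(frame, mask)) for frame in group]
print(counts)  # prints [1, 2]
```

The bright pixel at the top-left of both frames falls inside the masked area, so it never reaches the counting step. Because the camera is static, one mask computed on the keyframe stays valid for the rest of the group.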

Assignees

Microsoft Technology Licensing, LLC

Classifications

G11B27/28 · Primary
by using information signals recorded by the same method as the main recording {(G11B27/22 takes precedence)} · CPC title
G06F16/71 · Primary
Indexing; Data structures therefor; Storage structures · CPC title
G06F16/783
using metadata automatically derived from the content · CPC title
G06V20/47
Detecting features for summarising video content · CPC title
G06F16/7837
using objects detected or recognised in the video content · CPC title

Patent family

Related publications grouped by family.

View patent family 76011994

Related patents

Systems and methods for 3-dimensional (3d) positioning of imaging device

Using Domain Constraints And Verification Points To Monitor Task Performance

Technologies for dynamic performance of image analysis

Automatic animation triggering from video

Video analysis techniques for improved editing, navigation, and summarization

Method and apparatus to generate haptic feedback from video content analysis