Method, device, and medium for adaptive inference in compressed video domain

US12062252B2 · US · B2

Patent metadata
Field	Value
Publication number	US-12062252-B2
Application number	US-202117538516-A
Country	US
Kind code	B2
Filing date	Nov 30, 2021
Priority date	Nov 30, 2021
Publication date	Aug 13, 2024
Grant date	Aug 13, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Methods, devices and computer-readable media for processing a compressed video to perform an inference task are disclosed. Processing the compressed video may include selecting a subset of frame encodings of the compressed video, or zero or more modalities (RGB, motion vectors, residuals) of a frame encoding, for further processing to perform the inference task. Pre-existing motion vector and/or residual information in frame encodings of the compressed video are leveraged to adaptively and efficiently perform the inference task. In some embodiments, the inference task is an action recognition task, such as a human action recognition task.

First claim

Opening claim text (preview).

The invention claimed is:

1. A method for selecting a subset of frames decoded from a compressed video for further processing to perform an action recognition task or to train a model to perform the action recognition task, the method comprising:
   obtaining a plurality of inter frame encodings of the compressed video representative of a temporal sequence of frames, the plurality of inter frame encodings comprising:
      a first inter frame encoding representative of a first inter frame at the beginning of the temporal sequence of frames;
      a second inter frame encoding representative of a second inter frame at the end of the temporal sequence of frames; and
      a plurality of intermediate inter frame encodings, each representative of an inter frame between the first inter frame and the second inter frame in the temporal sequence of frames; and
   each intermediate inter frame encoding comprising:
      motion information of the respective intermediate inter frame relative to a respective reference frame in the temporal sequence of frames;
   processing the motion information of the plurality of intermediate inter frame encodings to generate cumulative motion information representative of motion between the first inter frame and the second inter frame;
   processing the cumulative motion information to generate decision information, the decision information indicating whether the second inter frame should be included in the subset of frames; and
   selecting the subset of frames based on the decision information.

2. The method of claim 1, wherein:
   processing the motion information of the plurality of intermediate inter frame encodings to generate cumulative motion information comprises:
      for each frame encoding of the plurality of intermediate inter frame encodings, processing the motion information to generate a motion vector field;
      processing the motion vector fields of all frame encodings of the plurality of intermediate inter frame encodings to generate a cumulative motion vector field; and
      processing the cumulative motion vector field to generate a maximum absolute magnitude of the cumulative motion vector field; and
   processing the cumulative motion information to generate decision information comprises:
      comparing the maximum absolute magnitude of the cumulative motion vector field to a motion threshold to determine whether the second inter frame should be included in the subset of frames.

3. The method of claim 2, further comprising, after selecting the subset of frames:
   storing the subset of frames for subsequent processing:
      by a trained inference model to perform the action recognition task; or
      to train an inference model to perform the action recognition task.

4. A non-transitory processor-readable medium having tangibly stored thereon instructions that, when executed by a processor of a device, cause the device to perform the method of claim 1.
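The decision step recited in claims 1 and 2 can be illustrated with a minimal sketch. It is not the patent's implementation. It assumes each motion vector field is a small grid of (dx, dy) tuples, that the cumulative field is a plain element-wise sum, and that the frame is included when the maximum magnitude exceeds the threshold; the claims only say the two values are compared. All function names are hypothetical.

```python
import math

# Assumption: a motion vector field is a list of rows, each row a list of
# (dx, dy) tuples, one per block. All fields have the same dimensions.


def accumulate_fields(fields):
    """Element-wise sum of motion vector fields (a simplifying assumption)."""
    rows, cols = len(fields[0]), len(fields[0][0])
    total = [[(0.0, 0.0)] * cols for _ in range(rows)]
    for field in fields:
        total = [
            [(tx + dx, ty + dy) for (tx, ty), (dx, dy) in zip(trow, frow)]
            for trow, frow in zip(total, field)
        ]
    return total


def max_magnitude(field):
    """Largest Euclidean magnitude of any vector in the field."""
    return max(math.hypot(dx, dy) for row in field for dx, dy in row)


def should_include(intermediate_fields, motion_threshold):
    """Decision information: True if the frame ending the sequence is kept.

    Assumption: 'greater than the threshold' means 'include'.
    """
    cumulative = accumulate_fields(intermediate_fields)
    return max_magnitude(cumulative) > motion_threshold


# Two toy 2x2 fields: one with no motion, one with a single moving block.
still = [[(0.0, 0.0), (0.0, 0.0)], [(0.0, 0.0), (0.0, 0.0)]]
moving = [[(3.0, 4.0), (0.0, 0.0)], [(0.0, 0.0), (0.0, 0.0)]]

print(should_include([still, still], 4.0))           # False
print(should_include([still, moving, moving], 4.0))  # True
```

In the second call the moving block accumulates to (6, 8), whose magnitude of 10 exceeds the threshold of 4, so the frame would be selected. A real decoder would likely need to trace each vector back through its reference frame rather than sum blocks in place.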

Assignees

Inventors

Classifications

G06V10/778: Active pattern-learning, e.g. online learning of image or video features
G06V30/19127: Extracting features by transforming the feature space, e.g. multidimensional scaling; Mappings, e.g. subspace methods
G06V30/1912: Selecting the most significant subset of features (G06V30/19127 takes precedence)
G06V10/94: Hardware or software architectures specially adapted for image or video understanding
G06V10/62: relating to a temporal dimension, e.g. time-based feature extraction; Pattern tracking

Patent family

Related publications grouped by family.

View patent family 86500517

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12062252B2 cover?: Methods, devices and computer-readable media for processing a compressed video to perform an inference task are disclosed. Processing the compressed video may include selecting a subset of frame encodings of the compressed video, or zero or more modalities (RGB, motion vectors, residuals) of a frame encoding, for further processing to perform the inference task. Pre-existing motion vector and/o…
Who is the assignee on this patent?: Kezele Irina, Shahabinejad Mostafa, Nabavi Seyed Shahabeddin, and 6 more
What technology area does this patent fall under?: Primary CPC classification G06V40/20. Mapped technology areas include Physics.
When was this patent published?: Publication date Aug 13, 2024 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Related patents

Method and device for encoding or decoding based on inter-frame prediction

Augmentation of video datasets for machine learning training

Video encoding
