Compressed content object and action detection

US11568545B2 · US · B2

Patent metadata
Publication number: US-11568545-B2
Application number: US-201916728714-A
Country: US
Kind code: B2
Filing date: Dec 27, 2019
Priority date: Nov 20, 2017
Publication date: Jan 31, 2023
Grant date: Jan 31, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Various embodiments of a framework which allow, as an alternative to resource-taxing decompression, efficient computation of feature maps using a compressed content data subset, such as video, by exploiting the motion information, such as a motion vector, present in the compressed video. This framework allows frame-specific object recognition and action detection algorithms to be applied to compressed video and other media files by executing only on I-frames in a Group of Pictures and linearly interpolating the results. Training and machine learning increase recognition accuracy. Yielding significant computational gains, this approach accelerates frame-wise feature extraction for I-frame/P-frame/P-frame videos as well as I-frame/P-frame/B-frame videos. The present techniques may also be used for segmentation to identify and label respective regions for objects in a video.
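
The "execute only on I-frames and linearly interpolate" idea from the abstract can be illustrated with a minimal sketch. It assumes a detector has already produced one bounding box for the same object on two consecutive I-frames; the `Box` type, the function name `interpolate_boxes`, and the `gop_length` parameter are hypothetical and do not come from the patent.

```python
from typing import List, Tuple

# Hypothetical box format: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def interpolate_boxes(box_start: Box, box_end: Box, gop_length: int) -> List[Box]:
    """Linearly interpolate a box across one Group of Pictures.

    Index 0 is the first I-frame and index gop_length is the next I-frame;
    the indices in between stand for the P-/B-frames that are never decoded.
    """
    if gop_length < 1:
        raise ValueError("gop_length must be at least 1")
    boxes: List[Box] = []
    for k in range(gop_length + 1):
        t = k / gop_length  # 0.0 at the first I-frame, 1.0 at the next one
        x0, y0, x1, y1 = (
            (1 - t) * a + t * b for a, b in zip(box_start, box_end)
        )
        boxes.append((x0, y0, x1, y1))
    return boxes
```

For example, `interpolate_boxes((0, 0, 10, 10), (4, 8, 14, 18), 4)` returns five boxes, and the middle one is `(2.0, 4.0, 12.0, 14.0)`. The detector itself runs only on the two I-frames; everything between them is arithmetic.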

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method, comprising: receiving two or more compressed image frames comprising at least one complete image frame and at least one partial image frame; using a neural network to extract a feature from the at least one complete image frame; estimating a feature map for the at least one partial image frame; using an iterative detector routine to detect features in the at least one partial image frame; calculating a motion vector for a feature location in the at least one partial image frame based on the at least one complete image frame and the iterative detector routine; and estimating the feature location in the at least one partial image frame using the calculated motion vector.

2. The computer-implemented method of claim 1, further comprising: receiving compressed image frames for training the neural network prior to receiving the at least one complete image frame and the at least one partial image frame.

3. The computer-implemented method of claim 1, further comprising: executing one or more of an event detection module, action detection module, or video segmentation module on the feature map.

4. The computer-implemented method of claim 1, wherein the at least one complete image frame consists of an Intracoded frame (I-frame), and the at least one partial image frame consists of a Predictive frame (P-frame) or a Bidirectional frame (B-frame).

5. The computer-implemented method of claim 1, further comprising: applying an object detection network to detect the features in the at least one partial image frame.

6. The computer-implemented method of claim 5, wherein the object detection network generates one or more proposed locations within a bounding box and iteratively checks each of the one or more proposed locations to determine if the feature is present at the proposed location.

7. The computer-implemented method of claim 1, further comprising: using an existing motion vector to calculate the motion vector for the feature location in the at least one partial image frame.

8. The computer-implemented method of claim 1, further comprising: a training element for improving detection accuracy through machine learning.

9. A computing system, comprising: at least one processor; and memory including instructions that, when executed by the at least one processor, cause the computing system to: obtain compressed content including two or more compressed frames comprising at least one complete frame and at least one partial frame; extract an object from the at least one complete frame using a neural network; estimate an object map for the at least one partial frame; detect objects in the at least one partial frame using an iterative detection algorithm; use the at least one complete frame to calculate a motion estimation vector for the object in the at least one partial frame; estimate the location of the object in the at least one partial frame using the calculated motion estimation vector; propose the estimated location of the object for verification.

10. The computing system of claim 9, wherein the instructions when executed further cause the computing system to: obtain, prior to obtaining the compressed content, obtain compressed training content for training the neural network.

11. The computing system of claim 9, wherein the instructions when executed further cause the computing system to: execute at least one an event detection module, action detection module, or video segmentation module on top of the feature map.

12. The computing system of claim 9, wherein the instructions when executed further cause the computing system to: apply an object detection algorithm to detect the object in the at least one partial frame.

13. The computing system of claim 9, wherein the instructions when executed further cause the computing system to: apply an object detection algorithm that generates one or more potential object locations within a bounding box and iteratively checks each of the one or more potential object locations to determine if the feature is present at each potential object location.

14. The computing system of claim 9, wherein the instructions when executed further cause the computing system to: use an existing motion vector compressed with the at least one complete frame to calculate the motion estimation vector for the object in the at least one partial frame.

15. A method, comprising: receiving a plurality of compressed key frames comprising at least one complete key frame and at least one partial key frame; using a neural network with machine learning algorithms to extract a feature from the at least one complete key frame; using the at least one complete key frame to estimate a feature map for the at least one partial key frame; using an iterative feature detection algorithm to detect features in the at least one partial key frame; calculating a motion vector for a feature location in the at least one partial key frame based on the at least one complete key frame and the feature map; and estimating the feature location in the at least one partial key frame using the calculated motion vector.

16. The method of claim 15, further comprising: using the feature map to execute at least one of an event detection module, action detection module, or video segmentation module.

17. The method of claim 15, wherein the neural network with machine learning algorithms has been previously trained using compressed training key frames for improved feature detection accuracy.

18. The method of claim 15, further comprising: applying an object detection network to detect the features in the at least one partial key frame.

19. The method of claim 15, further comprising: generating one or more proposed feature locations within a bounding box and iteratively checking each of the one or more proposed feature locations to determine if the feature is present at the each proposed feature location.

20. The method of claim 15, further comprising: using an existing motion vector compressed with the at least one complete key frame to calculate the motion vector for the feature location in the at least one partial key frame.
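
The claimed steps of estimating a feature map for a partial frame and locating a feature with a motion vector can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented implementation: it assumes one motion vector per feature-map cell has already been read from the compressed stream, that each vector `(dy, dx)` points from a cell in the partial frame to its source cell in the complete frame, and that a single-channel grid stands in for the neural network's feature map. The names `warp_feature_map` and `peak_location` are hypothetical.

```python
from typing import List, Tuple

Grid = List[List[float]]                   # single-channel feature map
MotionField = List[List[Tuple[int, int]]]  # per-cell (dy, dx), assumed decoded


def warp_feature_map(ref_features: Grid, motion: MotionField) -> Grid:
    """Estimate a partial frame's feature map from a complete frame's map.

    Each output cell copies the reference cell its motion vector points to.
    Vectors that point outside the grid are clamped to the nearest edge cell.
    """
    height, width = len(ref_features), len(ref_features[0])
    warped: Grid = []
    for y in range(height):
        row = []
        for x in range(width):
            dy, dx = motion[y][x]
            src_y = min(max(y + dy, 0), height - 1)
            src_x = min(max(x + dx, 0), width - 1)
            row.append(ref_features[src_y][src_x])
        warped.append(row)
    return warped


def peak_location(features: Grid) -> Tuple[int, int]:
    """Return (row, column) of the strongest response; first one wins ties."""
    best_y, best_x = 0, 0
    for y, row in enumerate(features):
        for x, value in enumerate(row):
            if value > features[best_y][best_x]:
                best_y, best_x = y, x
    return best_y, best_x
```

With a 3x3 reference map whose only response is at row 1, column 1, and a uniform motion field of `(0, -1)` (every cell takes its content from the cell to its left), the warped map has its peak at row 1, column 2: the feature is estimated to have moved one cell to the right without decoding the partial frame's pixels.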

Assignees

  • A9 Com Inc

Inventors

Classifications

  • Interaction with lists of selectable items, e.g. menus · CPC title

  • graphically representing goods, e.g. 3D product representation · CPC title

  • Interaction techniques based on cursor appearance or behaviour, e.g. being affected by the presence of displayed objects · CPC title

  • G06T7/20 · Primary

    Analysis of motion (motion estimation for coding, decoding, compressing or decompressing digital video signals H04N19/43, H04N19/51) · CPC title

  • Artificial neural networks [ANN] · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11568545B2 cover?
Various embodiments of a framework which allow, as an alternative to resource-taxing decompression, efficient computation of feature maps using a compressed content data subset, such as video, by exploiting the motion information, such as a motion vector, present in the compressed video. This framework allows frame-specific object recognition and action detection algorithms to be applied to com…
Who is the assignee on this patent?
A9 Com Inc
What technology area does this patent fall under?
Primary CPC classification G06Q30/0643. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jan 31, 2023 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).