Method for analysing media content
US-2019251360-A1 · Aug 15, 2019 · US
US11568545B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11568545-B2 |
| Application number | US-201916728714-A |
| Country | US |
| Kind code | B2 |
| Filing date | Dec 27, 2019 |
| Priority date | Nov 20, 2017 |
| Publication date | Jan 31, 2023 |
| Grant date | Jan 31, 2023 |
Official abstract text for this publication.
Various embodiments of a framework allow, as an alternative to resource-taxing decompression, efficient computation of feature maps using a compressed content data subset, such as video, by exploiting the motion information, such as a motion vector, present in the compressed video. This framework allows frame-specific object recognition and action detection algorithms to be applied to compressed video and other media files by executing only on I-frames in a Group of Pictures and linearly interpolating the results. Training and machine learning increase recognition accuracy. Yielding significant computational gains, this approach accelerates frame-wise feature extraction for I-frame/P-frame/P-frame videos as well as I-frame/P-frame/B-frame videos. The present techniques may also be used for segmentation to identify and label respective regions for objects in a video.
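As a rough illustration of the "executing only on I-frames ... and linearly interpolating the results" step, the sketch below interpolates one detection box across the frames between two I-frames. The function name `interpolate_boxes`, the `(x1, y1, x2, y2)` box layout, and the premise that the same object has already been matched on both I-frames are assumptions made for this example; the abstract does not specify them.

```python
def interpolate_boxes(box_start, box_end, gop_length):
    """Linearly interpolate a box between two I-frames (hypothetical helper).

    box_start:  (x1, y1, x2, y2) detected on the first I-frame (frame 0).
    box_end:    (x1, y1, x2, y2) detected on the next I-frame (frame gop_length).
    gop_length: number of frame steps between the two I-frames.

    Returns gop_length + 1 boxes, including both I-frame endpoints.
    """
    if gop_length < 1:
        raise ValueError("gop_length must be at least 1")

    boxes = []
    for t in range(gop_length + 1):
        alpha = t / gop_length  # 0.0 at the first I-frame, 1.0 at the next
        boxes.append(
            tuple((1 - alpha) * a + alpha * b for a, b in zip(box_start, box_end))
        )
    return boxes


# Usage: the detector runs only on the two I-frames; the frames in between
# receive interpolated boxes instead of being decoded and analysed.
track = interpolate_boxes((0, 0, 10, 10), (4, 8, 14, 18), gop_length=4)
midpoint_box = track[2]
```

Interpolation of this kind only costs a few multiplications per frame, which is where the computational saving over per-frame inference would come from, at the price of assuming roughly uniform motion within the Group of Pictures.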
Opening claim text (preview).
What is claimed is:
1. A computer-implemented method, comprising: receiving two or more compressed image frames comprising at least one complete image frame and at least one partial image frame; using a neural network to extract a feature from the at least one complete image frame; estimating a feature map for the at least one partial image frame; using an iterative detector routine to detect features in the at least one partial image frame; calculating a motion vector for a feature location in the at least one partial image frame based on the at least one complete image frame and the iterative detector routine; and estimating the feature location in the at least one partial image frame using the calculated motion vector.
2. The computer-implemented method of claim 1, further comprising: receiving compressed image frames for training the neural network prior to receiving the at least one complete image frame and the at least one partial image frame.
3. The computer-implemented method of claim 1, further comprising: executing one or more of an event detection module, action detection module, or video segmentation module on the feature map.
4. The computer-implemented method of claim 1, wherein the at least one complete image frame consists of an Intracoded frame (I-frame), and the at least one partial image frame consists of a Predictive frame (P-frame) or a Bidirectional frame (B-frame).
5. The computer-implemented method of claim 1, further comprising: applying an object detection network to detect the features in the at least one partial image frame.
6. The computer-implemented method of claim 5, wherein the object detection network generates one or more proposed locations within a bounding box and iteratively checks each of the one or more proposed locations to determine if the feature is present at the proposed location.
7. The computer-implemented method of claim 1, further comprising: using an existing motion vector to calculate the motion vector for the feature location in the at least one partial image frame.
8. The computer-implemented method of claim 1, further comprising: a training element for improving detection accuracy through machine learning.
9. A computing system, comprising: at least one processor; and memory including instructions that, when executed by the at least one processor, cause the computing system to: obtain compressed content including two or more compressed frames comprising at least one complete frame and at least one partial frame; extract an object from the at least one complete frame using a neural network; estimate an object map for the at least one partial frame; detect objects in the at least one partial frame using an iterative detection algorithm; use the at least one complete frame to calculate a motion estimation vector for the object in the at least one partial frame; estimate the location of the object in the at least one partial frame using the calculated motion estimation vector; propose the estimated location of the object for verification.
10. The computing system of claim 9, wherein the instructions when executed further cause the computing system to: obtain, prior to obtaining the compressed content, obtain compressed training content for training the neural network.
11. The computing system of claim 9, wherein the instructions when executed further cause the computing system to: execute at least one an event detection module, action detection module, or video segmentation module on top of the feature map.
12. The computing system of claim 9, wherein the instructions when executed further cause the computing system to: apply an object detection algorithm to detect the object in the at least one partial frame.
13. The computing system of claim 9, wherein the instructions when executed further cause the computing system to: apply an object detection algorithm that generates one or more potential object locations within a bounding box and iteratively checks each of the one or more potential object locations to determine if the feature is present at each potential object location.
14. The computing system of claim 9, wherein the instructions when executed further cause the computing system to: use an existing motion vector compressed with the at least one complete frame to calculate the motion estimation vector for the object in the at least one partial frame.
15. A method, comprising: receiving a plurality of compressed key frames comprising at least one complete key frame and at least one partial key frame; using a neural network with machine learning algorithms to extract a feature from the at least one complete key frame; using the at least one complete key frame to estimate a feature map for the at least one partial key frame; using an iterative feature detection algorithm to detect features in the at least one partial key frame; calculating a motion vector for a feature location in the at least one partial key frame based on the at least one complete key frame and the feature map; and estimating the feature location in the at least one partial key frame using the calculated motion vector.
16. The method of claim 15, further comprising: using the feature map to execute at least one of an event detection module, action detection module, or video segmentation module.
17. The method of claim 15, wherein the neural network with machine learning algorithms has been previously trained using compressed training key frames for improved feature detection accuracy.
18. The method of claim 15, further comprising: applying an object detection network to detect the features in the at least one partial key frame.
19. The method of claim 15, further comprising: generating one or more proposed feature locations within a bounding box and iteratively checking each of the one or more proposed feature locations to determine if the feature is present at the each proposed feature location.
20. The method of claim 15, further comprising: using an existing motion vector compressed with the at least one complete key frame to calculate the motion vector for the feature location in the at least one partial key frame.
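The feature-map estimation and location-estimation steps recited in the independent claims can be loosely sketched as follows. This is not the patented implementation: the helper names `warp_feature_map` and `estimate_location`, the use of one motion vector per feature-map cell, nearest-cell copying with edge clamping, and the convention that each vector points from a cell in the partial frame to its source cell in the complete frame are all assumptions chosen to keep the example small.

```python
def warp_feature_map(ref_features, motion_vectors):
    """Estimate a partial-frame feature map from a complete-frame feature map.

    ref_features:   2D list [row][col] of features from the complete frame.
    motion_vectors: 2D list [row][col] of (dx, dy) integer offsets, in cells,
                    pointing from each partial-frame cell to its source cell
                    in the complete frame (assumed convention).
    """
    height = len(ref_features)
    width = len(ref_features[0])
    warped = []
    for y in range(height):
        row = []
        for x in range(width):
            dx, dy = motion_vectors[y][x]
            # Clamp so vectors that point outside the frame reuse edge cells.
            src_x = min(max(x + dx, 0), width - 1)
            src_y = min(max(y + dy, 0), height - 1)
            row.append(ref_features[src_y][src_x])
        warped.append(row)
    return warped


def estimate_location(ref_box, vectors_in_box):
    """Shift a complete-frame box into the partial frame.

    ref_box:        (x1, y1, x2, y2) of the feature in the complete frame.
    vectors_in_box: list of (dx, dy) vectors covering the feature, using the
                    same partial-to-complete convention as above, so the box
                    moves by the negated mean vector.
    """
    mean_dx = sum(v[0] for v in vectors_in_box) / len(vectors_in_box)
    mean_dy = sum(v[1] for v in vectors_in_box) / len(vectors_in_box)
    x1, y1, x2, y2 = ref_box
    return (x1 - mean_dx, y1 - mean_dy, x2 - mean_dx, y2 - mean_dy)


# Usage: every cell's content came from one cell to its left in the I-frame,
# i.e. the scene content moved one cell to the right.
i_frame_features = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
vectors = [[(-1, 0)] * 3 for _ in range(3)]
p_frame_features = warp_feature_map(i_frame_features, vectors)
p_frame_box = estimate_location((10, 10, 20, 20), [(-2, 0), (-2, 0)])
```

In a real decoder the vectors would be read from the compressed stream per macroblock and rescaled to the feature-map resolution; that step is omitted here.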
Interaction with lists of selectable items, e.g. menus · CPC title
graphically representing goods, e.g. 3D product representation · CPC title
Interaction techniques based on cursor appearance or behaviour, e.g. being affected by the presence of displayed objects · CPC title
Analysis of motion (motion estimation for coding, decoding, compressing or decompressing digital video signals H04N19/43, H04N19/51) · CPC title
Artificial neural networks [ANN] · CPC title