Machine-Learning-Based Adaptation of Coding Parameters for Video Encoding Using Motion and Object Detection
US-2021168408-A1 · Jun 3, 2021 · US
US11627318B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11627318-B2 |
| Application number | US-202117523280-A |
| Country | US |
| Kind code | B2 |
| Filing date | Nov 10, 2021 |
| Priority date | Dec 7, 2020 |
| Publication date | Apr 11, 2023 |
| Grant date | Apr 11, 2023 |
Official abstract text for this publication.
Methods, systems and computer program products for producing streams of image frames. Image frames in streaming video are segmented into background segments and instance segments. A background image frame containing the background segments is created. At least some of the instance segments are classified into movable objects of interest and movable objects of non-interest. During a background update time period, the background image frame is updated when a movable object of non-interest has moved to reveal a background area, to include the revealed background area in the background image frame. A foreground image frame containing the movable objects of interest is created. Blocks of pixels of the updated background and foreground image frames are encoded. A stream of encoded foreground image frames having a first frame rate is produced. A stream of encoded updated background image frames having a second, lower frame rate is produced.
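The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the per-pixel label map stands in for the output of a segmentation and classification step, and the names `make_foreground`, `update_background`, `background_interval`, and the `"person"` and `"flag"` classes are hypothetical. As a simplification, the sketch refreshes every pixel currently labelled as background rather than only newly revealed ones. The black fill follows claim 6 and the frame rates follow claim 7.

```python
from typing import List, Optional, Set

BLACK = 0  # pixel value used where the foreground frame carries no object of interest

Frame = List[List[int]]                  # tiny grayscale frame
Labels = List[List[Optional[str]]]       # None = background, else an object class (assumed input)
Background = List[List[Optional[int]]]   # None = background pixel not yet observed


def make_foreground(frame: Frame, labels: Labels, of_interest: Set[str]) -> Frame:
    """Keep pixels that belong to objects of interest; blacken the rest."""
    return [
        [px if lab in of_interest else BLACK for px, lab in zip(row, lab_row)]
        for row, lab_row in zip(frame, labels)
    ]


def update_background(bg: Background, frame: Frame, labels: Labels) -> Background:
    """Copy every pixel currently labelled as background into the background frame."""
    return [
        [px if lab is None else old for old, px, lab in zip(bg_row, row, lab_row)]
        for bg_row, row, lab_row in zip(bg, frame, labels)
    ]


def background_interval(fg_fps: float, bg_fps: float) -> int:
    """Number of foreground frames emitted per background frame."""
    return round(fg_fps / bg_fps)


# Hypothetical one-row scene: a person (of interest) and a flag (not of interest).
of_interest = {"person"}
frame1, labels1 = [[10, 20, 30, 40]], [[None, "person", "flag", None]]
frame2, labels2 = [[11, 21, 31, 41]], [[None, "person", None, "flag"]]

bg: Background = [[None, None, None, None]]
bg = update_background(bg, frame1, labels1)  # the flag hides pixel 2
bg = update_background(bg, frame2, labels2)  # the flag has moved; pixel 2 is revealed
fg = make_foreground(frame2, labels2, of_interest)

# 30 frames per second foreground, one background frame per minute (claim 7).
interval = background_interval(30.0, 1.0 / 60.0)
```

In this toy scene the pixel behind the person stays unknown, because the abstract only describes revealing background when an object of non-interest moves. With the claim 7 rates, one background frame accompanies every 1800 foreground frames.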
Opening claim text (preview).
The invention claimed is:

1. A method, in an encoding system, for producing streams of image frames, comprising: segmenting image frames in a stream of image frames into one or more background areas and one or more objects; creating a background image frame that contains the one or more background areas; classifying at least some of the one or more objects into movable objects of interest and into movable objects of non-interest; updating, during a background update time period, the background image frame when a movable object of non-interest has moved to reveal a further background area, to include the further background area in the background image frame; at the end of the background update time period, verifying a completeness of the updates to the background image frame; in response to determining that the background image frame updates are incomplete: determining which movable object of non-interest caused the incompleteness; and including the movable object of non-interest that caused the incompleteness in the foreground image frame; in response to determining that the entire background image frame has been updated: refraining from including the movable object of non-interest in any of the background frame and the foreground frame; creating a foreground image frame that contains the movable objects of interest; encoding blocks of pixels of the updated background image frame; encoding blocks of pixels of the foreground image frame; producing a stream of encoded foreground image frames having a first frame rate; and producing a stream of encoded updated background image frames having a second frame rate that is lower than the first frame rate.

2. The method of claim 1, wherein the segmenting of image frames is done using panoptic segmentation, wherein pixels in the image frame are either assigned to a background area including a group of objects of a particular type, or assigned to an individual object.

3. The method of claim 1, further comprising receiving a user selection from a list of object types, the user selection indicating which types of objects should be considered movable objects of interest and movable objects of non-interest.

4. The method of claim 1, wherein the movable objects of interest include one or more of: humans, vehicles, weapons, bags, and face masks.

5. The method of claim 1, wherein the movement of the movable object of non-interest is tracked by a motion and object detector during the background update time period, and wherein the background image frame is updated several times before the expiration of the background update time period.

6. The method of claim 1, wherein encoding the foreground image frame includes encoding pixel data only for pixels corresponding to movable objects of interest, and encoding the remainder of the foreground image frame as black pixels.

7. The method of claim 1, wherein the first frame rate is thirty image frames per second and the second frame rate is one image frame per minute.

8. The method of claim 1, further comprising: classifying an object as a stationary object of non-interest; and updating the background image frame to include the stationary object of non-interest.

9. The method of claim 1, wherein updating the background image frame when a movable object of non-interest has moved to reveal a background area includes: comparing the movement of the movable object of non-interest with one or more of: an area-dependent threshold value, distance-dependent threshold value and a time-dependent threshold value; and when the movement of the movable object of non-interest exceeds at least one threshold value, updating the background image frame.

10. The method of claim 9, further comprising: setting the threshold values based on available computing resources.

11. The method of claim 10, wherein setting the threshold values includes: setting the threshold values such that a frequency of the updating of the background image frame is limited to a frequency of updating that can be accommodated by available computing resources.

12. An encoding system for producing streams of image frames, comprising an encoder and a motion and object detector, wherein the motion and object detector is configured to: segment image frames in a stream of image frames into one or more background areas and one or more objects; and classify at least some of the one or more objects into movable objects of interest and into movable objects of non-interest; and wherein the encoder is configured to: create a background image frame that contains the one or more background areas; update, during a background update time period, the background image frame when a movable object of non-interest has moved to reveal a further background area, to include the further background area in the background image frame; at the end of the background update time period, verifying a completeness of the updates to the background image frame; in response to determining that the background image frame updates are incomplete: determining which movable object of non-interest caused the incompleteness; and including the movable object of non-interest that caused the incompleteness in the foreground image frame; in response to determining that the entire background image frame has been updated: refraining from including the movable objects of non-interest in any of the background frame and the foreground frame; create a foreground image frame that contains the movable objects of interest; encode blocks of pixels of the updated background image frame; encode blocks of pixels of the foreground image frame; produce a stream of encoded foreground image frames having a first frame rate; and produce a stream of encoded updated background image frames having a second frame rate that is lower than the first frame rate.

13. A computer program product for producing streams of image frames, the computer program product comprising a computer readable storage medium having program instructions embodied therewith, wherein the computer readable storage medium is not a transitory signal per se, the program instructions being executable by a processor to perform a method comprising: segmenting image frames in a stream of image frames into one or more background areas and one or more objects; creating a background image frame that contains the one or more background areas; classifying at least some of the one or more objects into movable objects of interest and into movable objects of non-interest; updating, during a background update time period, the background image frame when a movable object of non-interest has moved to reveal a further background area, to include the further background area in the background image frame; at the end of the background update time period, verifying a completeness of the updates to the background image frame; in response to determining that the background image frame updates are incomplete: determining which movable object of non-interest caused the incompleteness; and including the movable object of non-interest that caused the incompleteness in the foreground image frame; in response to determining that the entire background image frame has been updated: refraining from including the movable objects of non-interest in any of the background frame and the foreground frame; creating a foreground image frame that contains the movable objects of interest; encoding blocks of pixels of the updated background image frame; encoding blocks of pixels of the foreground image frame; producing a stream of encoded foreground image frames having a first
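
Two decision points in the claims lend themselves to a short illustration: the threshold test of claim 9 and the completeness check at the end of the background update time period in claim 1. The Python sketch below is one possible reading, not the claimed implementation. The `Movement` and `Thresholds` fields, the object identifier `"flag#1"`, and the use of `None` for background pixels that were never observed are all assumptions made for the example. In particular, treating the time-dependent threshold as elapsed time since the last update is an interpretation the claims do not spell out.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class Movement:
    revealed_area_px: int  # background area uncovered by the move (assumed measure)
    distance_px: float     # how far the object moved (assumed measure)
    elapsed_s: float       # time since the last background update (assumed measure)


@dataclass
class Thresholds:
    area_px: int
    distance_px: float
    elapsed_s: float


def should_update(m: Movement, t: Thresholds) -> bool:
    """Update the background when the movement exceeds at least one threshold."""
    return (
        m.revealed_area_px > t.area_px
        or m.distance_px > t.distance_px
        or m.elapsed_s > t.elapsed_s
    )


def blocking_objects(
    background: List[List[Optional[int]]],
    object_ids: List[List[Optional[str]]],
    non_interest_ids: Set[str],
) -> Set[str]:
    """Return the non-interest objects that still cover unobserved background pixels.

    An empty result means the background is complete with respect to
    non-interest objects, so none of them needs to appear in either frame.
    A non-empty result lists the objects to include in the foreground frame.
    """
    found: Set[str] = set()
    for bg_row, id_row in zip(background, object_ids):
        for bg_px, obj in zip(bg_row, id_row):
            if bg_px is None and obj in non_interest_ids:
                found.add(obj)
    return found


# Hypothetical values for illustration only.
limits = Thresholds(area_px=4, distance_px=10.0, elapsed_s=60.0)
big_move = should_update(Movement(5, 0.0, 0.0), limits)
small_move = should_update(Movement(1, 2.0, 3.0), limits)

incomplete = blocking_objects([[10, None, 30]], [[None, "flag#1", None]], {"flag#1"})
complete = blocking_objects([[10, 20, 30]], [[None, "flag#1", None]], {"flag#1"})
```

Raising the values in `Thresholds` makes background updates less frequent, which is one way to read the resource-based tuning of claims 10 and 11.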
the incoming video signal comprising different parts having originally different frame rate, e.g. video and graphics · CPC title
Position within a video image, e.g. region of interest [ROI] · CPC title
Video; Image sequence · CPC title
Data rate or code amount at the encoder output · CPC title
Adaptive subdivision aspects, e.g. subdivision of a picture into rectangular or non-rectangular coding blocks · CPC title