US9723315B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9723315-B2 |
| Application number | US-201213443745-A |
| Country | US |
| Kind code | B2 |
| Filing date | Apr 10, 2012 |
| Priority date | Jul 1, 2011 |
| Publication date | Aug 1, 2017 |
| Grant date | Aug 1, 2017 |
Official abstract text for this publication.
A system and method for selecting frames from a video sequence that have high visual appeal and can be coded at high quality when frame rates of coded video drop to such low levels that perceptual sensations of moving video are lost. A metric is derived from a candidate input frame, and such metric is used to determine whether to increase or decrease a weight accorded to the candidate input frame. In an embodiment, the metric may be the auto-exposure data associated with the candidate input frame.
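The weighting and selection described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the `Frame` type, the `exposure` field, the `prev_exposure` argument, and the mapping `1 / (1 + rate)` are all assumptions made for illustration. The only idea taken from the text is that a frame's weight depends on how quickly the auto-exposure setting was changing when it was captured, that the highest-weighted buffered frame is coded, and that the rest are dropped without coding. The check that the coding frame rate has fallen below a threshold is left out.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class Frame:
    index: int       # position in the capture sequence
    exposure: float  # auto-exposure setting reported by the camera (hypothetical units)


def assign_weights(frames: Sequence[Frame], prev_exposure: float) -> List[float]:
    """Weight each frame by how stable the auto-exposure setting was.

    Assumption: a smaller frame-to-frame exposure change means a better frame,
    so the weight is 1 / (1 + |change|). The source only says the metric is a
    function of the rate of change; this particular mapping is illustrative.
    """
    weights = []
    last = prev_exposure
    for frame in frames:
        rate = abs(frame.exposure - last)  # change relative to the preceding frame
        weights.append(1.0 / (1.0 + rate))
        last = frame.exposure
    return weights


def select_frame(frames: Sequence[Frame], prev_exposure: float) -> Tuple[Frame, List[Frame]]:
    """Return (frame to code, frames to discard without coding)."""
    if not frames:
        raise ValueError("buffer is empty")
    weights = assign_weights(frames, prev_exposure)
    best = max(range(len(frames)), key=weights.__getitem__)
    discarded = [f for i, f in enumerate(frames) if i != best]
    return frames[best], discarded


# Usage: exposure settles over four buffered frames; the last one is the most stable.
buffer = [Frame(0, 0.20), Frame(1, 0.50), Frame(2, 0.55), Frame(3, 0.56)]
chosen, dropped = select_frame(buffer, prev_exposure=0.0)
```

In this sketch the other metrics named in the claims (luminance, face detection, motion, jitter, temporal consistency) would enter as further terms in the weight; how they are combined is not specified here.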
Opening claim text (preview).
What is claimed is:

1. A video coding method, comprising, when a coding frame rate drops below a predetermined threshold: buffering a plurality of input video frames generated by a camera, for each buffered input frame, assigning a weight based on a frame quality metric evaluating a quality of the frame, the frame quality metric being a function of a rate of change of auto-exposure settings of the camera during capture of the frame, coding a highest weighted frame of the plurality of buffered input frames, and discarding a plurality of lower-weighted frames of the plurality of buffered input frames from the buffer without coding.
2. The video coding method of claim 1, wherein the frame quality metric is derived from exposure changes between each buffered input frame and its preceding frame.
3. The video coding method of claim 1, wherein the frame quality metric is derived from estimated luminance of each buffered input frame.
4. The video coding method of claim 1, wherein the frame quality metric is derived from estimated face detection performed on each buffered input frame.
5. The video coding method of claim 4, wherein the frame quality metric further is derived from estimated luminance of a region of a detected face within each input frame.
6. The video coding method of claim 4, wherein the frame quality metric further is derived from a detected artifact of a face within each input frame.
7. The video coding method of claim 4, wherein the frame quality metric further is derived from a location of a detected face within each input frame.
8. The video coding method of claim 4, wherein the frame quality metric further is derived from a confidence score associated with a detected face within each input frame.
9. The video coding method of claim 6, wherein the artifact is a detected smile.
10. The video coding method of claim 6, wherein the artifact is detection of open eyes.
11. The video coding method of claim 6, wherein the frame quality metric is derived from an estimate of spatial complexity within each buffered input frame.
12. The video coding method of claim 1, wherein the frame quality metric is derived from an estimate of motion of each buffered input frame.
13. The video coding method of claim 1, wherein the frame quality metric is derived from an estimate of jitter associated with each input frame.
14. The video coding method of claim 1, wherein the frame quality metric is derived from an estimate of temporal consistency between each input frame and at least one previously coded frame.
15. The video coding method of claim 1, wherein the coding comprises, for each pixel block of the frame to be coded: performing a motion estimation search between the respective pixel block of the frame to be coded and a plurality of locally-stored reference frames, for each candidate reference frame identified by the search, determining a similarity measure between the respective pixel block to be coded and a matching pixel block from the respective candidate reference frame, scaling the similarity measures according to the candidate reference frames' temporal locations, and selecting a matching pixel block as a prediction reference for the pixel block to be coded based on the scaled similarity measures, and coding the input pixel block with reference to the prediction reference.
16. Video coding apparatus, comprising: a camera, a video coder system, comprising: a buffer to store input frames of a video sequence from the camera, a coding engine to code selected frames from the buffer according to temporal prediction techniques, a reference picture cache to store reconstructed video data of coded reference frames, and a controller to control operation of the video coding sequence to, when a coding frame rate drops below a predetermined threshold: for each buffered input frame, assign a weight based on a frame quality metric evaluating a quality of the frame, the frame quality metric being a function of a rate of change of auto-exposure settings of the camera during capture of the frame, code a highest weighted frame of the plurality of buffered input frames, and discard a plurality of lower-weighted frames of the plurality of buffered input frames from the buffer without coding.
17. The apparatus of claim 16, wherein the video coder comprises a pre-processor that estimates exposure of buffered frames and the frame quality metric is derived from exposure changes between each buffered input frame and its preceding frame.
18. The apparatus of claim 16, wherein the video coder comprises a pre-processor that estimates luminance of buffered frames and the frame quality metric is derived from estimated luminance of each buffered input frame.
19. The apparatus of claim 16, further comprising a face detector, wherein the frame quality metric is derived from estimated face detection performed on each buffered input frame.
20. The apparatus of claim 16, wherein the video coder comprises a pre-processor that estimates spatial complexity of buffered frames and the frame quality metric is derived from an estimate of spatial complexity within each buffered input frame.
21. The apparatus of claim 16, further comprising a motion sensor, wherein the frame quality metric is derived from an estimate of motion of each buffered input frame.
22. The apparatus of claim 16, wherein the frame quality metric is derived from an estimate of jitter associated with each input frame.
23. The apparatus of claim 16, wherein the frame quality metric is derived from an estimate of temporal consistency between each input frame and at least one previously coded frame.
24. A non-transitory machine-readable storage medium having stored thereon program instructions which, when executed by a processor perform a method, the method comprising: buffering in the storage device a plurality of input video frames generated by a camera; for each buffered input frame, assigning a weight based on a frame quality metric evaluating a quality of the frame, the frame quality metric being a function of a rate of change of auto-exposure settings of the camera during capture of the frame; coding a highest weighted frame of the plurality of buffered input frames; and discarding a plurality of lower-weighted frames of the plurality of buffered input frames from the storage device without coding.
25. The non-transitory storage medium of claim 24, wherein the frame quality metric is derived from exposure changes between each buffered input frame and its preceding frame.
26. The non-transitory storage medium of claim 24, wherein the frame quality metric is derived from estimated luminance of each buffered input frame.
27. The non-transitory storage medium of claim 24, wherein the frame quality metric is derived from estimated face detection performed on each buffered input frame.
28. The non-transitory storage medium of claim 27, wherein the frame quality metric further is derived from estimated luminance of a region of a detected face within each input frame.
29. The non-transitory storage medium of claim 27, wherein the frame quality metric further is derived from a detected artifact of a face within each input frame.
30. The non-transitory storage medium of claim 27, wherein the frame quality metric further is derived from a location
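
The reference selection recited in claim 15 can be sketched in Python as follows. This is an illustration under stated assumptions, not the claimed implementation: the similarity measure is taken to be the sum of absolute differences (SAD, where lower is better), the motion search is exhaustive, and the scaling multiplies each SAD by `1 + penalty * temporal_distance` so that temporally distant references are penalized. The claim only says the measures are scaled according to temporal location; the direction and form of the scaling, and every name below (`sad`, `best_match`, `select_reference`, `penalty`), are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

Pixels = Sequence[Sequence[int]]  # a frame or block as rows of luma samples


def sad(a: Pixels, b: Pixels) -> int:
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def crop(frame: Pixels, top: int, left: int, size: int) -> List[Sequence[int]]:
    """Extract a size x size block whose top-left corner is (top, left)."""
    return [row[left:left + size] for row in frame[top:top + size]]


def best_match(block: Pixels, ref: Pixels) -> Tuple[int, int, int]:
    """Exhaustive motion search: return (sad, top, left) of the closest block in ref."""
    size = len(block)
    best: Optional[Tuple[int, int, int]] = None
    for top in range(len(ref) - size + 1):
        for left in range(len(ref[0]) - size + 1):
            cost = sad(block, crop(ref, top, left, size))
            if best is None or cost < best[0]:
                best = (cost, top, left)
    if best is None:
        raise ValueError("reference frame is smaller than the block")
    return best


def select_reference(block: Pixels,
                     refs: Sequence[Tuple[int, Pixels]],
                     current_time: int,
                     penalty: float = 0.1) -> Tuple[int, int, int, float]:
    """Pick a prediction reference from (time, frame) candidates.

    Returns (reference index, top, left, scaled cost). Each candidate's SAD is
    scaled by its temporal distance from the frame being coded (assumed form).
    """
    best: Optional[Tuple[int, int, int, float]] = None
    for idx, (time, frame) in enumerate(refs):
        cost, top, left = best_match(block, frame)
        scaled = cost * (1.0 + penalty * abs(current_time - time))
        if best is None or scaled < best[3]:
            best = (idx, top, left, scaled)
    if best is None:
        raise ValueError("no reference frames supplied")
    return best


# Usage: the older reference matches slightly better in raw SAD (4 vs 5),
# but the temporal scaling makes the more recent reference win.
block = [[10, 20], [30, 40]]
far = [[0, 0, 0], [0, 10, 20], [0, 30, 44]]    # captured at time 0
near = [[10, 20, 0], [30, 45, 0], [0, 0, 0]]   # captured at time 9
choice = select_reference(block, [(0, far), (9, near)], current_time=10)
```

Setting `penalty=0.0` disables the scaling and reduces the sketch to an ordinary minimum-SAD choice across references.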
Coding unit complexity, e.g. amount of activity or edge presence estimation (H04N19/146 takes precedence) · CPC title
Measured or subjectively estimated visual quality after decoding, e.g. measurement of distortion (use of rate-distortion criteria H04N19/147) · CPC title
Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction · CPC title
Motion inside a coding unit, e.g. average field, frame or block difference · CPC title
the region being a picture, frame or field · CPC title