End-to-end neural network based video coding
US-2022086463-A1 · Mar 17, 2022 · US
US12555201B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12555201-B2 |
| Application number | US-202117551087-A |
| Country | US |
| Kind code | B2 |
| Filing date | Dec 14, 2021 |
| Priority date | Dec 14, 2021 |
| Publication date | Feb 17, 2026 |
| Grant date | Feb 17, 2026 |
Official abstract text for this publication.
In various embodiments, an image preprocessing application preprocesses images. To preprocess an image, the image preprocessing application executes a trained machine learning model on first data corresponding to both the image and a first set of components of a luma-chroma color space to generate first preprocessed data. The image preprocessing application executes at least a different trained machine learning model or a non-machine learning algorithm on second data corresponding to both the image and a second set of components of the luma-chroma color space to generate second preprocessed data. Subsequently, the image preprocessing application aggregates at least the first preprocessed data and the second preprocessed data to generate a preprocessed image.
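The flow in the abstract (split the image by luma-chroma component, run a different preprocessor on each set, then aggregate) can be illustrated with a minimal sketch. Everything named here is an assumption for illustration and is not taken from the patent: `box_downscale` is a simple block-averaging routine standing in for a non-machine learning algorithm, `placeholder_model` marks where a trained model would be called, and the image is a plain dictionary of 2D lists.

```python
from typing import Callable, Dict, List

Plane = List[List[float]]
Preprocessor = Callable[[Plane], Plane]


def box_downscale(plane: Plane, factor: int) -> Plane:
    """Non-ML downscaler: average each factor x factor block of samples."""
    rows, cols = len(plane) // factor, len(plane[0]) // factor
    out: Plane = []
    for r in range(rows):
        row = []
        for c in range(cols):
            block = [
                plane[r * factor + i][c * factor + j]
                for i in range(factor)
                for j in range(factor)
            ]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def placeholder_model(plane: Plane) -> Plane:
    """Hypothetical stand-in for a trained model; it only reuses box_downscale."""
    return box_downscale(plane, 2)


def preprocess_image(
    image: Dict[str, Plane], preprocessors: Dict[str, Preprocessor]
) -> Dict[str, Plane]:
    """Run one preprocessor per component and aggregate the outputs."""
    return {name: preprocessors[name](plane) for name, plane in image.items()}


# A 4x4 test image with one plane per luma-chroma component.
image = {
    "Y": [[0.0, 2.0, 4.0, 6.0] for _ in range(4)],
    "Cb": [[128.0] * 4 for _ in range(4)],
    "Cr": [[128.0] * 4 for _ in range(4)],
}
preprocessors = {
    "Y": placeholder_model,
    "Cb": placeholder_model,                      # "model" path
    "Cr": lambda plane: box_downscale(plane, 4),  # non-ML path
}
preprocessed = preprocess_image(image, preprocessors)
```

In this sketch the Cr plane is reduced more aggressively than the others, so the aggregated result holds components at different resolutions.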
Opening claim text (preview).
What is claimed is:
1. A computer-implemented method for preprocessing images, the method comprising: preprocessing a first image using different preprocessors for components of a luma-chroma color space comprising a luma component, a first chroma component, and a second chroma component, wherein preprocessing the first image comprises: executing a first preprocessor comprising a first trained machine learning model on first data corresponding to both a first image and a first set of components of the luma-chroma color space comprising the first chroma component to generate first preprocessed data, wherein the first preprocessed data comprises a downscaling of the first data; and executing a second preprocessor comprising at least one of a second trained machine learning model or a first non-machine learning algorithm on second data corresponding to both the first image and a second set of components of the luma-chroma color space comprising the second chroma component to generate second preprocessed data, wherein the second preprocessed data comprises a downscaling of the second data; aggregating at least the first preprocessed data and the second preprocessed data to generate a first preprocessed image; and aggregating the first preprocessed image with at least a second preprocessed image to generate a spatially preprocessed video.
2. The computer-implemented method of claim 1, wherein the first set of components includes at least one of a blue-difference component or a red-difference component, and the first set of components further includes a luma component.
3. The computer-implemented method of claim 1, further comprising executing the first trained machine learning model or a third trained machine learning model on third data corresponding to both the first image and a third set of components of the luma-chroma color space to generate third preprocessed data that is included in the first preprocessed image.
4. The computer-implemented method of claim 1, wherein the first image is represented in an RGB color space, and further comprising converting at least a portion of the first image to the luma-chroma color space to determine the first data and the second data.
5. The computer-implemented method of claim 1, further comprising identifying the first trained machine learning model based on a downscaling factor and at least one of a chroma subsampling ratio or a denoising type.
6. The computer-implemented method of claim 1, wherein a first resolution of the first preprocessed data is not equal to a second resolution of the second preprocessed data.
7. The computer-implemented method of claim 1, further comprising extracting the first image from a video.
8. The computer-implemented method of claim 1, wherein the first non-machine learning algorithm comprises one of a downscaling algorithm, a chroma subsampling algorithm, or a spatial denoising algorithm.
9. The computer-implemented method of claim 1, further comprising: performing one or more temporal preprocessing operations on the spatially preprocessed video to generate a preprocessed video.
10. The computer-implemented method of claim 9, further comprising encoding the preprocessed video to generate an encoded video.
11. One or more non-transitory computer readable media including instructions that, when executed by one or more processors, cause the one or more processors to preprocess images by performing the steps of: preprocessing a first image using different preprocessors for components of a luma-chroma color space comprising a luma component, a first chroma component, and a second chroma component, wherein preprocessing the first image comprises: executing a first preprocessor comprising a first trained machine learning model on first data corresponding to both a first image and a first set of components of the luma-chroma color space comprising the first chroma component to generate first preprocessed data, wherein the first preprocessed data comprises a downscaling of the first data; and preprocessing, by a second preprocessor, second data corresponding to both the first image and a second set of components of the luma-chroma color space comprising the second chroma component to generate second preprocessed data, wherein the second preprocessed data comprises a downscaling of the second data; aggregating at least the first preprocessed data and the second preprocessed data to generate a first preprocessed image; and aggregating the first preprocessed image with at least a second preprocessed image to generate a spatially preprocessed video.
12. The one or more non-transitory computer readable media of claim 11, wherein the first set of components includes at least one of a blue-difference component or a red-difference component.
13. The one or more non-transitory computer readable media of claim 11, further comprising executing the first trained machine learning model or a second trained machine learning model on third data corresponding to both the first image and a third set of components of the luma-chroma color space to generate third preprocessed data that is included in the first preprocessed image.
14. The one or more non-transitory computer readable media of claim 11, wherein the first image is represented in an RGB color space, and further comprising converting at least a portion of the first image to the luma-chroma color space to determine the first data and the second data.
15. The one or more non-transitory computer readable media of claim 11, further comprising identifying the first trained machine learning model based on at least one of a downscaling factor or a chroma subsampling ratio.
16. The one or more non-transitory computer readable media of claim 11, wherein a first resolution of the first preprocessed data is not equal to a second resolution of the second preprocessed data.
17. The one or more non-transitory computer readable media of claim 11, wherein preprocessing the second data comprises: executing a second trained machine learning model on the second data to generate downsampled data corresponding to at least one of a chroma subsampling factor or a downscaling factor; and executing a first non-machine learning algorithm on the downsampled data to generate the second preprocessed data.
18. The one or more non-transitory computer readable media of claim 11, further comprising: performing one or more temporal denoising operations on the spatially preprocessed video to generate a preprocessed video.
19. The one or more non-transitory computer readable media of claim 11, further comprising encoding the first preprocessed image to generate an encoded image.
20. A system comprising: one or more memories storing instructions; and one or more processors coupled to the one or more memories that, when executing the instructions, perform the steps of: preprocessing a first image using different preprocessors for components of a luma-chroma color space comprising a luma component, a first chroma component, and a second chroma component, wherein preprocessing the first image comprises: executing a first preprocessor comprising a first trained machine learning model on first data corresponding to both a first image and a first set of components of the luma-chroma color space comprising the first chroma component to generate first preprocessed data, wherein the first preprocessed data comprises a downscaling of the first data; and executing a second
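Claims 4 and 5 recite converting an RGB image to the luma-chroma color space and identifying a trained model from a downscaling factor together with a chroma subsampling ratio or denoising type. The sketch below illustrates both steps under stated assumptions: the claims do not name a conversion matrix, so the full-range BT.601 coefficients used by JFIF are assumed here, and `MODEL_REGISTRY`, its keys, and the model names are hypothetical placeholders rather than anything disclosed in the patent.

```python
from typing import Dict, Tuple


def rgb_to_ycbcr(r: float, g: float, b: float) -> Tuple[float, float, float]:
    """Convert one 8-bit RGB sample to Y, Cb, Cr (assumed full-range BT.601)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


# Hypothetical registry keyed by (downscaling factor, chroma subsampling ratio).
MODEL_REGISTRY: Dict[Tuple[int, str], str] = {
    (2, "4:2:0"): "chroma_model_2x_420",
    (2, "4:4:4"): "chroma_model_2x_444",
}


def identify_model(downscale_factor: int, subsampling: str) -> str:
    """Look up the model name registered for the requested configuration."""
    key = (downscale_factor, subsampling)
    if key not in MODEL_REGISTRY:
        raise KeyError(f"no model registered for {key}")
    return MODEL_REGISTRY[key]


black = rgb_to_ycbcr(0.0, 0.0, 0.0)
white = rgb_to_ycbcr(255.0, 255.0, 255.0)
selected = identify_model(2, "4:2:0")
```

Neutral colors such as black and white map to chroma values at the midpoint (128 for 8-bit samples), which is why the chroma planes of a grayscale image are flat.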
Training; Learning · CPC title
Color image · CPC title
Scaling of whole images or parts thereof, e.g. expanding or contracting · CPC title
Determination of colour characteristics · CPC title
Denoising; Smoothing · CPC title