US2022385907A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2022385907-A1 |
| Application number | US-202117645018-A |
| Country | US |
| Kind code | A1 |
| Filing date | Dec 17, 2021 |
| Priority date | May 21, 2021 |
| Publication date | Dec 1, 2022 |
| Grant date | — |
Official abstract text for this publication.
Techniques are described for compressing and decompressing data using machine learning systems. An example process can include receiving a plurality of images for compression by a neural network compression system. The process can include determining, based on a first image from the plurality of images, a first plurality of weight values associated with a first model of the neural network compression system. The process can include generating a first bitstream comprising a compressed version of the first plurality of weight values. The process can include outputting the first bitstream for transmission to a receiver.
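The last two steps of the abstract (generating a bitstream from the weight values and outputting it for a receiver) can be illustrated with a minimal sketch. Everything below is an assumption for illustration only: the function names, the 8 fractional bits, the 5-byte header layout, and the use of `zlib` as a generic stand-in for whatever compression the disclosed system actually applies. It is not the patent's codec.

```python
import struct
import zlib

FRAC_BITS = 8  # assumed number of fractional bits for the fixed-point grid


def quantize(weights, frac_bits=FRAC_BITS):
    """Map each float weight to an integer index on a 2**-frac_bits grid."""
    scale = 1 << frac_bits
    return [round(w * scale) for w in weights]


def encode_weights(weights, frac_bits=FRAC_BITS):
    """Serialize quantized weights into a byte string (the 'bitstream')."""
    q = quantize(weights, frac_bits)
    # Hypothetical header: fractional bits (1 byte) + weight count (4 bytes).
    header = struct.pack(">BI", frac_bits, len(q))
    # Each index is stored as int16, so |w| must stay below 2**(15 - frac_bits).
    payload = struct.pack(f">{len(q)}h", *q)
    # zlib is only a placeholder for a real entropy coder.
    return header + zlib.compress(payload)


def decode_weights(bitstream):
    """Receiver side: recover the dequantized weights from the byte string."""
    frac_bits, count = struct.unpack(">BI", bitstream[:5])
    payload = zlib.decompress(bitstream[5:])
    q = struct.unpack(f">{count}h", payload)
    scale = 1 << frac_bits
    return [v / scale for v in q]


if __name__ == "__main__":
    weights = [0.5, -0.25, 0.1234, 1.0]  # toy per-image weight values
    stream = encode_weights(weights)
    print(len(stream), decode_weights(stream))
```

The receiver recovers each weight to within half a quantization step, which is the trade-off a fixed-point grid makes between bitstream size and reconstruction accuracy.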
Opening claim text (preview).
What is claimed is:
1. A method of processing media data, comprising: receiving a plurality of images for compression by a neural network compression system; determining, based on a first image from the plurality of images, a first plurality of weight values associated with a first model of the neural network compression system; generating a first bitstream comprising a compressed version of the first plurality of weight values; and outputting the first bitstream for transmission to a receiver.
2. The method of claim 1, wherein at least one layer of the first model includes a positional encoding of a plurality of coordinates associated with the first image.
3. The method of claim 2, wherein the first model is configured to determine one or more pixel values corresponding to the plurality of coordinates associated with the first image.
4. The method of claim 1, further comprising: determining, based on a second image from the plurality of images, a second plurality of weight values for use by a second model associated with the neural network compression system; generating a second bitstream comprising a compressed version of the second plurality of weight values; and outputting the second bitstream for transmission to a receiver.
5. The method of claim 4, wherein the second model is configured to determine an optical flow between the first image and the second image.
6. The method of claim 5, further comprising: determining, based on the optical flow, at least one updated weight value from the first plurality of weight values.
7. The method of claim 1, further comprising: quantizing the first plurality of weight values under a weight prior to yield a plurality of quantized weight values, wherein the first bitstream comprises a compressed version of the plurality of quantized weight values.
8. The method of claim 7, wherein the weight prior is selected to minimize a rate loss associated with sending the first bitstream to the receiver.
9. The method of claim 7, wherein generating the first bitstream comprises: entropy encoding the first plurality of weight values using the weight prior.
10. The method of claim 7, wherein the first plurality of weight values is quantized using fixed-point quantization.
11. The method of claim 10, wherein the fixed-point quantization is implemented using a machine learning algorithm.
12. The method of claim 1, further comprising: selecting, based on the first image, a model architecture corresponding to the first model.
13. The method of claim 12, further comprising: generating a second bitstream comprising a compressed version of the model architecture; and outputting the second bitstream for transmission to the receiver.
14. The method of claim 12, wherein selecting the model architecture comprises: tuning, based on the first image, a plurality of weight values associated with one or more model architectures, wherein each of the one or more model architectures is associated with one or more model characteristics; determining at least one distortion between the first image and reconstructed data output corresponding to each of the one or more model architectures; and selecting the model architecture from the one or more model architectures based on the at least one distortion.
15. The method of claim 14, wherein the one or more model characteristics include at least one of a width, a depth, a resolution, a size of a convolution kernel, and an input dimension.
16. An apparatus comprising: at least one memory; and at least one processor coupled to the at least one memory and configured to: receive a plurality of images for compression by a neural network compression system; determine, based on a first image from the plurality of images, a first plurality of weight values associated with a first model of the neural network compression system; generate a first bitstream comprising a compressed version of the first plurality of weight values; and output the first bitstream for transmission to a receiver.
17. The apparatus of claim 16, wherein at least one layer of the first model includes a positional encoding of a plurality of coordinates associated with the first image.
18. The apparatus of claim 17, wherein the first model is configured to determine one or more pixel values corresponding to the plurality of coordinates associated with the first image.
19. The apparatus of claim 16, wherein the at least one processor is further configured to: determine, based on a second image from the plurality of images, a second plurality of weight values for use by a second model associated with the neural network compression system; generate a second bitstream comprising a compressed version of the second plurality of weight values; and output the second bitstream for transmission to a receiver.
20. The apparatus of claim 19, wherein the second model is configured to determine an optical flow between the first image and the second image.
21. The apparatus of claim 20, wherein the at least one processor is further configured to: determine, based on the optical flow, at least one updated weight value from the first plurality of weight values.
22. The apparatus of claim 16, wherein the at least one processor is further configured to: quantize the first plurality of weight values under a weight prior to yield a plurality of quantized weight values, wherein the first bitstream comprises a compressed version of the plurality of quantized weight values.
23. The apparatus of claim 22, wherein the weight prior is selected to minimize a rate loss associated with sending the first bitstream to the receiver.
24. The apparatus of claim 22, wherein to generate the first bitstream the at least one processor is further configured to: entropy encode the first plurality of weight values using the weight prior.
25. The apparatus of claim 22, wherein the first plurality of weight values are quantized using fixed-point quantization.
26. The apparatus of claim 25, wherein the fixed-point quantization is implemented using a machine learning algorithm.
27. The apparatus of claim 16, wherein the at least one processor is further configured to: select, based on the first image, a model architecture corresponding to the first model.
28. The apparatus of claim 27, wherein the at least one processor is further configured to: generate a second bitstream comprising a compressed version of the model architecture; and output the second bitstream for transmission to the receiver.
29. The apparatus of claim 27, wherein to select the model architecture the at least one processor is further configured to: tune, based on the first image, a plurality of weight values associated with one or more model architectures, wherein each of the one or more model architectures is associated with one or more model characteristics; determine at least one distortion between the first image and reconstructed data output corresponding to each of the one or more model architectures; and select the model architecture from the one or more model architectures based on the at least one distortion.
30. The apparatus of claim 29, wherein the one or more model characteristics include a
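
Claims 7 to 9 and 22 to 24 tie quantization and entropy encoding to a weight prior. One way to see why the choice of prior affects the rate is to estimate the ideal code length of each quantized weight as the negative base-2 logarithm of its probability under the prior. The sketch below assumes a zero-mean Gaussian prior discretized onto the quantization grid; the Gaussian form, the `sigma` and `step` values, and all function names are illustrative assumptions, since the claim text shown here does not specify the form of the prior.

```python
import math


def gaussian_cdf(x, mu, sigma):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def bin_probability(q, step, mu=0.0, sigma=0.1):
    """Prior mass of the quantization bin centred on q * step."""
    lo = (q - 0.5) * step
    hi = (q + 0.5) * step
    mass = gaussian_cdf(hi, mu, sigma) - gaussian_cdf(lo, mu, sigma)
    return max(mass, 1e-12)  # floor avoids log(0) for far-out bins


def estimated_bits(weights, step, mu=0.0, sigma=0.1):
    """Ideal entropy-coded size in bits of the quantized weights under the prior."""
    total = 0.0
    for w in weights:
        q = round(w / step)  # nearest grid index
        total += -math.log2(bin_probability(q, step, mu, sigma))
    return total


if __name__ == "__main__":
    weights = [0.01, -0.02, 0.3]  # toy weight values
    print(estimated_bits(weights, step=0.01))
```

Under this assumed prior, weights near the prior mean fall in high-probability bins and cost few bits, while outliers cost many, so a prior that matches the actual weight distribution lowers the estimated rate.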
Combinations of networks · CPC title
Motion estimation other than block-based · CPC title
Learning methods · CPC title
Entropy coding, e.g. variable length coding [VLC] or arithmetic coding · CPC title
Data rate or code amount at the encoder output · CPC title