Hybrid spatio-temporal neural models for video compression

US12598317B2 · US · B2

Patent metadata
Field               Value
Publication number  US-12598317-B2
Application number  US-202318532638-A
Country             US
Kind code           B2
Filing date         Dec 7, 2023
Priority date       Dec 7, 2023
Publication date    Apr 7, 2026
Grant date          Apr 7, 2026

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Systems and methods are provided for decoding visual content using a hybrid framework based on convolutional and neural radiance networks. A decoder receives bitstreams of model parameters, a sequence level representation, and a cross-resolution representation for reconstructing a sequence of frames. The model parameters comprise neural radiance network parameters. The decoder decodes the bitstreams of the model parameters, the sequence level representation, and the cross-resolution representation. The decoder generates, via a channel transformer, a combined representation based on the sequence level representation and the cross-resolution representation. The decoder adapts a neural network model based on the neural radiance network parameters. The decoder reconstructs the sequence of frames by determining, via the adapted neural network model based on the combined representation, pixel attribute information for each frame of the reconstructed sequence of frames. The decoder generates, for display at a client device, the reconstructed sequence of frames.
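The decoding flow in the abstract can be pictured as a short pipeline. The following is a minimal sketch only, not the patented implementation: the names `DecodedPayload`, `channel_combine`, `adapt_model`, and `reconstruct` are hypothetical, plain concatenation stands in for the channel transformer, and a clamped weighted sum stands in for the adapted neural network model. Each frame is reduced to a single pixel attribute value to keep the example small.

```python
from dataclasses import dataclass
from typing import Callable, List

Vector = List[float]


@dataclass
class DecodedPayload:
    """Hypothetical container for the three decoded bitstreams."""
    model_params: Vector          # stand-in for neural radiance network parameters
    sequence_repr: Vector         # one representation shared by the whole sequence
    cross_res_repr: List[Vector]  # one latent feature vector per frame


def channel_combine(sequence_repr: Vector, frame_latent: Vector) -> Vector:
    # Assumption: concatenation stands in for the "channel transformer".
    return list(sequence_repr) + list(frame_latent)


def adapt_model(model_params: Vector) -> Callable[[Vector], float]:
    # Assumption: "adapting" is modeled as loading weights into a toy linear model.
    def model(features: Vector) -> float:
        raw = sum(w * x for w, x in zip(model_params, features))
        return max(0.0, min(1.0, raw))  # clamp to a valid intensity range
    return model


def reconstruct(payload: DecodedPayload) -> List[float]:
    """Return one toy pixel attribute value per reconstructed frame."""
    model = adapt_model(payload.model_params)
    frames = []
    for frame_latent in payload.cross_res_repr:
        combined = channel_combine(payload.sequence_repr, frame_latent)
        frames.append(model(combined))
    return frames


payload = DecodedPayload(
    model_params=[0.5, 0.5, 1.0],
    sequence_repr=[0.2, 0.4],
    cross_res_repr=[[0.1], [0.9]],
)
frames = reconstruct(payload)
```

The point of the sketch is the order of operations: the per-frame latent features and the shared sequence level representation are combined first, and only then passed to the model whose weights arrived in the bitstream.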

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: receiving, at a client device, bitstreams of a plurality of model parameters, a sequence level representation, and a cross-resolution representation for reconstructing a sequence of frames, wherein the plurality of model parameters comprises neural radiance network parameters; decoding the bitstreams of the plurality of model parameters, the sequence level representation, and the cross-resolution representation; generating, via a channel transformer, a combined representation based on the sequence level representation and the cross-resolution representation; adapting a neural network model based on the neural radiance network parameters; reconstructing the sequence of frames by determining, via the adapted neural network model based on the combined representation, pixel attribute information for each frame of the reconstructed sequence of frames; and generating, for display at the client device, the reconstructed sequence of frames, wherein the cross-resolution representation comprises latent features corresponding to each frame of the sequence of frames.

2. The method of claim 1, wherein the adapted neural network model is trained to output pixel attribute information based on a plurality of pixel coordinates.

3. The method of claim 2, wherein the plurality of pixel coordinates comprises a timeline corresponding to the sequence of frames.

4. The method of claim 1, wherein determining, via the adapted neural network model, pixel attribute information for each frame of the reconstructed sequence of frames comprises determining a plurality of color values corresponding to a pixel of each frame based on respective spatio-temporal coordinates of the pixel.

5. The method of claim 1, wherein adapting the neural network model based on the plurality of neural radiance network parameters comprises selecting a network configuration from a plurality of pre-determined network configurations.

6. The method of claim 5, wherein the network configuration is selected based on a target reconstruction quality for the reconstructed sequence of frames.

7. The method of claim 1, wherein the client device is an extended reality (XR) device, and wherein the sequence of frames collectively defines a field of view (FoV) at the XR device.

8. The method of claim 7, wherein the sequence of frames corresponds to one or more display refresh cycles at the XR device.

9. A system comprising: communications circuitry configured to receive, at a client device, bitstreams of a plurality of model parameters, a sequence level representation, and a cross-resolution representation for reconstructing a sequence of frames, wherein the plurality of model parameters comprises neural radiance network parameters; and control circuitry configured to: decode the bitstreams of the plurality of model parameters, the sequence level representation, and the cross-resolution representation; generate, via a channel transformer, a combined representation based on the sequence level representation and the cross-resolution representation; adapt a neural network model based on the neural radiance network parameters; reconstruct the sequence of frames by determining, via the adapted neural network model based on the combined representation, pixel attribute information for each frame of the reconstructed sequence of frames; and generate, for display at the client device, the reconstructed sequence of frames, wherein the cross-resolution representation comprises latent features corresponding to each frame of the sequence of frames.

10. The system of claim 9, wherein the sequence of frames corresponds to a first resolution, and wherein the control circuitry is further configured to use a convolutional network model comprising a plurality of residual spatial attention blocks to generate the combined representation at the first resolution.

11. The system of claim 10, wherein the control circuitry is configured to output, using the adapted neural network model, pixel attribute information based on a plurality of pixel coordinates.

12. The system of claim 11, wherein the plurality of pixel coordinates comprises a timeline corresponding to the sequence of frames.

13. The system of claim 9, wherein the control circuitry is further configured to determine a plurality of color values corresponding to a pixel of each frame based on respective spatio-temporal coordinates of the pixel.

14. The system of claim 13, wherein the control circuitry is further configured to select a network configuration from a plurality of pre-determined network configurations.

15. The system of claim 14, wherein the control circuitry is configured to select the network configuration based on a target reconstruction quality for the reconstructed sequence of frames.

16. The system of claim 9, wherein the client device is an extended reality (XR) device, and the sequence of frames collectively defines a field of view (FoV) at the XR device.

17. The system of claim 16, wherein the sequence of frames corresponds to one or more display refresh cycles at the XR device.

18. A method comprising: receiving, at a client device, bitstreams of a plurality of model parameters, a sequence level representation, and a cross-resolution representation for reconstructing a sequence of frames, wherein the plurality of model parameters comprises neural radiance network parameters; decoding the bitstreams of the plurality of model parameters, the sequence level representation, and the cross-resolution representation; generating, via a channel transformer, a combined representation based on the sequence level representation and the cross-resolution representation; adapting a neural network model based on the neural radiance network parameters; reconstructing the sequence of frames by determining, via the adapted neural network model based on the combined representation, pixel attribute information for each frame of the reconstructed sequence of frames; and generating, for display at the client device, the reconstructed sequence of frames, wherein the sequence of frames corresponds to a first resolution, the method further comprising using a convolutional network model comprising a plurality of residual spatial attention blocks to generate the combined representation at the first resolution.

19. The method of claim 18, wherein the cross-resolution representation comprises latent features corresponding to each frame of the sequence of frames.

20. The method of claim 18, wherein the adapted neural network model is trained to output pixel attribute information based on a plurality of pixel coordinates.
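Two of the dependent-claim ideas lend themselves to a small illustration: querying a model with spatio-temporal pixel coordinates (claims 2 to 4) and selecting one of several pre-determined network configurations from a target reconstruction quality (claims 5 and 6). The sketch below is an assumption-laden toy, not the claimed method: `NetworkConfig`, `CONFIGS`, the quality thresholds in dB, `spatio_temporal_coords`, and `toy_color_model` are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class NetworkConfig:
    name: str
    hidden_width: int
    num_layers: int
    min_quality_db: float  # hypothetical: lowest target quality this config serves


# Hypothetical pre-determined configurations, ordered from smallest to largest.
CONFIGS = [
    NetworkConfig("small", 64, 2, 0.0),
    NetworkConfig("medium", 128, 4, 32.0),
    NetworkConfig("large", 256, 6, 38.0),
]


def select_config(target_quality_db: float) -> NetworkConfig:
    """Pick the largest config whose threshold does not exceed the target."""
    eligible = [c for c in CONFIGS if c.min_quality_db <= target_quality_db]
    candidates = eligible or CONFIGS[:1]  # fall back to the smallest config
    return max(candidates, key=lambda c: c.min_quality_db)


def spatio_temporal_coords(
    width: int, height: int, num_frames: int
) -> List[Tuple[float, float, float]]:
    """Return normalized (x, y, t) for every pixel of every frame, frame by frame."""
    coords = []
    for frame in range(num_frames):
        t = frame / max(num_frames - 1, 1)  # position on the sequence timeline
        for row in range(height):
            y = row / max(height - 1, 1)
            for col in range(width):
                x = col / max(width - 1, 1)
                coords.append((x, y, t))
    return coords


def toy_color_model(x: float, y: float, t: float) -> Tuple[float, float, float]:
    # Stand-in for the adapted network: maps (x, y, t) straight to an RGB triple.
    return (x, y, t)


config = select_config(35.0)
coords = spatio_temporal_coords(width=2, height=2, num_frames=3)
colors = [toy_color_model(*c) for c in coords]
```

In a real decoder the selected configuration would determine the shape of the network that is then filled with the transmitted parameters; here it is only returned so the selection rule can be inspected.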

Assignees

Adeia Guides Inc

Inventors

Classifications

  • H04N19/172 (Primary)

    CPC title: the region being a picture, frame or field

  • H04N19/42 (Primary)

    CPC title: characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation (H04N19/635 takes precedence)
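A CPC symbol such as H04N19/172 packs several hierarchy levels into one string: section (H, Electricity), class (04), subclass (N), main group (19), and subgroup (172). The following is a minimal sketch of how such a symbol could be split; the `CpcSymbol` type, the `parse_cpc` helper, and the regular expression are illustrative assumptions, not part of any official tooling.

```python
import re
from typing import NamedTuple


class CpcSymbol(NamedTuple):
    section: str     # one letter, A to H or Y
    cpc_class: str   # two digits
    subclass: str    # one letter
    main_group: str  # digits before the slash
    subgroup: str    # digits after the slash


# Assumed pattern for full CPC symbols such as "H04N19/172" or "H04N 19/42".
CPC_PATTERN = re.compile(r"^([A-HY])(\d{2})([A-Z])(\d{1,4})/(\d{2,})$")


def parse_cpc(symbol: str) -> CpcSymbol:
    """Split a CPC symbol into its hierarchy levels, ignoring spaces."""
    match = CPC_PATTERN.match(symbol.replace(" ", ""))
    if match is None:
        raise ValueError(f"not a recognizable CPC symbol: {symbol!r}")
    return CpcSymbol(*match.groups())


primary = parse_cpc("H04N19/172")
secondary = parse_cpc("H04N 19/42")
```

Both classifications on this page share the prefix H04N19, which is why they fall into the same section, class, subclass, and main group and differ only in the subgroup.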

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12598317B2 cover?
Systems and methods are provided for decoding visual content using a hybrid framework based on convolutional and neural radiance networks. A decoder receives bitstreams of model parameters, a sequence level representation, and a cross-resolution representation for reconstructing a sequence of frames. The model parameters comprise neural radiance network parameters. The decoder decodes the bitst…
Who is the assignee on this patent?
Adeia Guides Inc
What technology area does this patent fall under?
Primary CPC classification H04N19/172. Mapped technology areas include Electricity.
When was this patent published?
Publication date: Apr 7, 2026 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).