Transmitting image data

US12598331B2 · US · B2

Patent metadata
Publication number: US-12598331-B2
Application number: US-202318137352-A
Country: US
Kind code: B2
Filing date: Apr 20, 2023
Priority date: Feb 14, 2023
Publication date: Apr 7, 2026
Grant date: Apr 7, 2026

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A computer-implemented method of transmitting video data. A sequence of video frames is received. A warp operation for a first frame and a reference frame of the sequence of video frames is determined, wherein the warp operation defines a transformation of the reference frame to give an approximation of the first frame. One or more regions of interest of the first frame are identified. Encoded image data from the image data of the one or more regions of interest of the first frame is generated using an image encoder. The warp operation and the encoded image data are transmitted.
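The idea in the abstract can be illustrated with a minimal Python sketch. It rests on several assumptions that are not taken from the patent: the warp is a plain integer translation (dx, dy), frames are small grayscale grids stored as lists of lists, and the image encoder is skipped, so the region-of-interest pixels are passed through uncompressed. The names warp_translate, mask_outside_rois, and reconstruct are hypothetical.

```python
from typing import List, Tuple

Frame = List[List[int]]          # grayscale frame: rows of pixel values
Box = Tuple[int, int, int, int]  # (top, left, bottom, right), end-exclusive


def warp_translate(reference: Frame, dx: int, dy: int, fill: int = 0) -> Frame:
    """Assumed warp: shift the reference by (dx, dy); uncovered pixels get `fill`."""
    h, w = len(reference), len(reference[0])
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            sy, sx = y - dy, x - dx  # source pixel that lands on (x, y)
            if 0 <= sy < h and 0 <= sx < w:
                out[y][x] = reference[sy][sx]
    return out


def mask_outside_rois(frame: Frame, rois: List[Box], default: int = 0) -> Frame:
    """Keep pixels inside any region of interest; replace the rest with `default`."""
    h, w = len(frame), len(frame[0])
    out = [[default] * w for _ in range(h)]
    for top, left, bottom, right in rois:
        for y in range(max(top, 0), min(bottom, h)):
            for x in range(max(left, 0), min(right, w)):
                out[y][x] = frame[y][x]
    return out


def reconstruct(reference: Frame, dx: int, dy: int,
                roi_pixels: Frame, rois: List[Box]) -> Frame:
    """Receiver side: warp the reference, then paste the ROI pixels on top."""
    out = warp_translate(reference, dx, dy)
    h, w = len(out), len(out[0])
    for top, left, bottom, right in rois:
        for y in range(max(top, 0), min(bottom, h)):
            for x in range(max(left, 0), min(right, w)):
                out[y][x] = roi_pixels[y][x]
    return out


reference = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
first = [[9, 1, 2], [0, 4, 5], [0, 7, 8]]   # reference shifted right, one new pixel
rois = [(0, 0, 1, 1)]                        # the single pixel the warp cannot explain
roi_pixels = mask_outside_rois(first, rois)  # stands in for the encoded image data
rebuilt = reconstruct(reference, 1, 0, roi_pixels, rois)
```

In this toy case only the translation and one region-of-interest pixel need to be sent, and the receiver recovers the first frame exactly. A real system would likely use a richer warp and a lossy encoder, so the reconstruction would only approximate the frame.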

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: receiving a sequence of frames; determining a warp operation for a particular frame and a reference frame from the sequence of frames, wherein the warp operation defines a transformation of the reference frame to approximate the particular frame; generating, using an encoder, encoded image data of one or more regions of interest of the particular frame; generating a reconstructed frame using the encoded image data and the warp operation; generating a score that reflects a similarity between the reconstructed frame and the particular frame; determining that the score satisfies a threshold value; and in response to determining that the score satisfies the threshold value, transmitting the reconstructed frame.

2. The method of claim 1, in response to determining that the score does not satisfy the threshold value, the method comprises: generating, using an encoder, an encoded frame of the particular frame; transmitting the encoded frame.

3. The method of claim 2, further comprising dynamically transmitting, for each frame of the sequence of frames, either the reconstructed frame or the encoded frame of a respective frame.

4. The method of claim 1, wherein generating the score that represents a similarity between the reconstructed frame and the particular frame comprises generating the score that represents the similarity between the reconstructed frame and the particular frame using at least one of a structural similarity index metric or a video multimethod assessment fusion method.

5. The method of claim 1, wherein the one or more regions of interest comprise at least one of an eye region or a mouth region of a face in the particular frame.

6. The method of claim 1, wherein generating, using an encoder, encoded image data of one or more regions of interest of the particular frame further comprises: prior to using the encoder, generating another frame by replacing image data outside the regions of interest of the particular frame with default image data; and generating the encoded image data by encoding the other frame with the encoder.

7. The method of claim 1, further comprising estimating, using a neural network, the score of the particular frame without generating the reconstruction frame.

8. A system comprising: one or more computers and one or more storage devices on which are stored instructions that are operable, when executed by the one or more computers, to cause the one or more computers to perform operations comprising: receiving a sequence of video frames; determining a warp operation for a particular frame and a reference frame from the sequence of video frames, wherein the warp operation defines a transformation of the reference frame to approximate the particular frame; generating, using an encoder, encoded image data of one or more regions of interest of the particular frame; generating a reconstructed frame using the encoded image data and the warp operation; generating a score that reflects a similarity between the reconstructed frame and the particular frame; determining that the score satisfies a threshold value; and in response to determining that the score satisfies the threshold value, transmitting the reconstructed frame.

9. The system of claim 8, in response to determining that the score does not satisfy the threshold value, the operations comprise: generating, using an encoder, an encoded frame of the particular frame; transmitting the encoded frame.

10. The system of claim 9, further comprising dynamically transmitting, for each frame of the sequence of frames, either the reconstructed frame or the encoded frame of a respective frame.

11. The system of claim 8, wherein generating the score that represents a similarity between the reconstructed frame and the particular frame comprises generating the score that represents the similarity between the reconstructed frame and the particular frame using at least one of a structural similarity index metric or a video multimethod assessment fusion method.

12. The system of claim 8, wherein the one or more regions of interest comprise at least one of an eye region or a mouth region of a face in the particular frame.

13. The system of claim 8, wherein generating, using an encoder, encoded image data of one or more regions of interest of the particular frame further comprises: prior to using the encoder, generating another frame by replacing image data outside the regions of interest of the particular frame with default image data; and generating the encoded image data by encoding the other frame with the encoder.

14. The system of claim 8, further comprising estimating, using a neural network, the score of the particular frame without generating the reconstruction frame.

15. One or more non-transitory computer storage media encoded with computer program instructions that when executed by one or more computers cause the one or more computers to perform operations comprising: receiving a sequence of video frames; determining a warp operation for a particular frame and a reference frame from the sequence of video frames, wherein the warp operation defines a transformation of the reference frame to approximate the particular frame; generating, using an encoder, encoded image data of one or more regions of interest of the particular frame; generating a reconstructed frame using the encoded image data and the warp operation; generating a score that reflects a similarity between the reconstructed frame and the particular frame; determining that the score satisfies a threshold value; and in response to determining that the score satisfies the threshold value, transmitting the reconstructed frame.

16. The one or more non-transitory computer storage media of claim 15, in response to determining that the score does not satisfy the threshold value, the operations comprise: generating, using an encoder, an encoded frame of the particular frame; transmitting the encoded frame.

17. The one or more non-transitory computer storage media of claim 16, further comprising dynamically transmitting, for each frame of the sequence of frames, either the reconstructed frame or the encoded frame of a respective frame.

18. The one or more non-transitory computer storage media of claim 15, wherein generating the score that represents a similarity between the reconstructed frame and the particular frame comprises generating the score that represents the similarity between the reconstructed frame and the particular frame using at least one of a structural similarity index metric or a video multimethod assessment fusion method.

19. The one or more non-transitory computer storage media of claim 15, wherein the one or more regions of interest comprise at least one of an eye region or a mouth region of a face in the particular frame.

20. The one or more non-transitory computer storage media of claim 15, wherein generating, using an encoder, encoded image data of one or more regions of interest of the particular frame further comprises: prior to using the encoder, generating another frame by replacing image data outside the regions of interest of the particular frame with default image data; and generating the encoded image data by encoding the other frame with the encoder.
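The score-and-threshold decision in claims 1 and 2 can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the score is a simplified single-window (global) form of the structural similarity index, whereas practical SSIM averages over local windows; "satisfies a threshold" is assumed to mean greater than or equal to; the threshold 0.9 is arbitrary; and global_ssim, choose_payload, and encode_full are hypothetical names, with encode_full standing in for a real frame encoder.

```python
from typing import Callable, List, Tuple, Union

Frame = List[List[int]]  # grayscale frame with pixel values 0..255


def global_ssim(a: Frame, b: Frame, max_value: float = 255.0) -> float:
    """Simplified SSIM computed over the whole frame as a single window."""
    xs = [float(p) for row in a for p in row]
    ys = [float(p) for row in b for p in row]
    if not xs or len(xs) != len(ys):
        raise ValueError("frames must be non-empty and the same size")
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    vx = sum((x - mx) * (x - mx) for x in xs) / n
    vy = sum((y - my) * (y - my) for y in ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    c1 = (0.01 * max_value) ** 2  # standard SSIM stabilising constants
    c2 = (0.03 * max_value) ** 2
    return ((2 * mx * my + c1) * (2 * cov + c2)) / (
        (mx * mx + my * my + c1) * (vx + vy + c2))


def flatten_to_bytes(frame: Frame) -> bytes:
    """Placeholder 'encoder' (assumption): raw pixel bytes, no compression."""
    return bytes(p for row in frame for p in row)


def choose_payload(
    frame: Frame,
    reconstructed: Frame,
    threshold: float = 0.9,
    encode_full: Callable[[Frame], bytes] = flatten_to_bytes,
) -> Tuple[str, Union[Frame, bytes], float]:
    """Send the reconstruction if it is similar enough, else the encoded frame."""
    score = global_ssim(reconstructed, frame)
    if score >= threshold:
        return "reconstructed", reconstructed, score
    return "encoded", encode_full(frame), score


frame = [[10, 20], [30, 40]]
good = [[10, 20], [30, 40]]  # perfect reconstruction
bad = [[40, 30], [20, 10]]   # structure inverted, so the score is negative
kind_good, payload_good, score_good = choose_payload(frame, good)
kind_bad, payload_bad, score_bad = choose_payload(frame, bad)
```

Calling choose_payload once per frame of a sequence gives the per-frame switching described in claim 3. Claim 7 describes a variant in which a neural network estimates the score without building the reconstruction; that is not shown here.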

Assignees

Sony Interactive Entertainment Europe Ltd

Inventors

Classifications

  • Backpropagation, e.g. using gradient descent · CPC title

  • Convolutional networks [CNN, ConvNet] · CPC title

  • Details of filtering operations specially adapted for video compression, e.g. for pixel interpolation (H04N19/635, H04N19/86 take precedence) · CPC title

  • using transform coding · CPC title

  • characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation (H04N19/635 takes precedence) · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12598331B2 cover?
A computer-implemented method of transmitting video data. A sequence of video frames is received. A warp operation for a first frame and a reference frame of the sequence of video frames is determined, wherein the warp operation defines a transformation of the reference frame to give an approximation of the first frame. One or more regions of interest of the first frame are identified. Encoded …
Who is the assignee on this patent?
Sony Interactive Entertainment Europe Ltd
What technology area does this patent fall under?
Primary CPC classification H04N19/86. Mapped technology areas include Electricity.
When was this patent published?
Publication date Apr 7, 2026 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).