System and method for image coding using dual image models

US12401801B2 · US · B2

Patent metadata
Field: Value
Publication number: US-12401801-B2
Application number: US-202318295556-A
Country: US
Kind code: B2
Filing date: Apr 4, 2023
Priority date: Apr 11, 2022
Publication date: Aug 26, 2025
Grant date: Aug 26, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method of encoding an image comprises establishing whether objects constituting one or more predefined object types or performing one or more predefined event types are visible in the image; in response to establishing that the objects are visible, encoding at least one region-of-interest of the image using a non-generative image model, thereby obtaining first image data; and encoding any remainder of the image using a generative image model, thereby obtaining second image data, wherein use of the non-generative image model enables decoding of the first image data without relying on information derived from images other than the encoded image or, if the image is a frame in a video sequence, enables decoding of the first image data without relying on information derived from images outside the video sequence.
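The routing described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `encode_dual`, `EncodedImage`, `crop`, `mask_out`, and `raw_bytes` are hypothetical, the image is a plain list of grayscale rows, and the detector and both encoders are passed in as stand-in callables. Blanking the regions-of-interest before the remainder is encoded, and sending the whole image to the generative encoder when nothing is detected, are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom); right/bottom exclusive
Image = List[List[int]]          # grayscale rows, a stand-in for real pixel data


@dataclass
class EncodedImage:
    first_image_data: List[Tuple[Box, bytes]]  # one entry per region-of-interest
    second_image_data: bytes                   # remainder of the image


def crop(image: Image, box: Box) -> Image:
    """Cut the region-of-interest out of the image."""
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]


def mask_out(image: Image, boxes: List[Box], fill: int = 0) -> Image:
    """Copy the image and blank every region-of-interest (assumed convention)."""
    out = [row[:] for row in image]
    for left, top, right, bottom in boxes:
        for y in range(top, bottom):
            for x in range(left, right):
                out[y][x] = fill
    return out


def encode_dual(
    image: Image,
    detect: Callable[[Image], List[Box]],
    encode_non_generative: Callable[[Image], bytes],
    encode_generative: Callable[[Image], bytes],
) -> EncodedImage:
    """Route regions-of-interest and the remainder to different image models."""
    rois = detect(image)  # empty list: no predefined object or event is visible
    first = [(box, encode_non_generative(crop(image, box))) for box in rois]
    # Assumption: with no detections, the whole image counts as the remainder.
    second = encode_generative(mask_out(image, rois))
    return EncodedImage(first_image_data=first, second_image_data=second)


def raw_bytes(image: Image) -> bytes:
    """Hypothetical placeholder encoder: serialises pixels without compression."""
    return bytes(pixel for row in image for pixel in row)


if __name__ == "__main__":
    frame = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
    result = encode_dual(frame, lambda img: [(1, 0, 3, 2)], raw_bytes, raw_bytes)
    print(result)
```

In a real system the two callables would be replaced by, for example, a transform-coding encoder and a trained encoder neural network; the sketch only shows how the detection result decides which model handles which pixels.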

First claim

Opening claim text (preview).

The invention claimed is:

1. A method of encoding an image, the method comprising: establishing whether one or more objects constituting one or more predefined object types or performing one or more predefined event types are visible in the image; in response to establishing that said objects are visible, encoding at least one region-of-interest of the image using a non-generative image model, thereby obtaining first image data, wherein said one or more objects are visible in the region-of-interest, wherein encoding using the non-generative image model includes encoding in accordance with transform coding or a combination of predictive coding and transform coding; and encoding any remainder of the image only using an encoder neural network of a generative image model, thereby obtaining second image data, wherein the generative image model includes the encoder neural network and a decoder neural network, wherein at least one of the encoder neural network and the decoder neural network is generative, and wherein the encoder neural network and the decoder neural network have been trained in conjunction with each other in accordance with images other than the image to be encoded, wherein use of the non-generative image model enables decoding of the first image data without relying on information derived from images other than the encoded image or, if the image is a frame in a video sequence, enables decoding of the first image data without relying on information derived from images outside the video sequence.

2. The method of claim 1, wherein use of the non-generative image model enables decoding of the first image data: without inserting information derived from images other than the encoded image, or if the image is a frame in a video sequence, without inserting information derived from images outside the video sequence, or without processing the first image data by a function that depends on information derived from images other than the encoded image, or if the image is a frame in a video sequence, without processing the first image data by a function that depends on information derived from images outside the video sequence.

3. The method of claim 1, wherein use of the non-generative image model enables non-stochastic decoding of the first image data.

4. The method of claim 1, wherein the generative image model is adapted for decoding by an artificial neural network which has been trained using information derived from images other than the encoded image or, if the image is a frame in a video sequence, using information derived from images outside the video sequence.

5. The method of claim 1, wherein the generative image model is adapted for decoding by stochastic sampling from a probability distribution.

6. The method of claim 1, wherein the generative image model includes an artificial neural network with trainable weights, the method further comprising: in response to establishing that said objects are visible, storing a snapshot of the trainable weights.

7. The method of claim 6, further comprising: generating a digital signature for a dataset including the snapshot of the trainable weights, the first image data and the second image data.

8. The method of claim 1, wherein establishing whether said objects are visible includes one or more of: executing a visual object recognition process; executing a visual event recognition process; obtaining data from a detector configured to monitor a scene of the image or a neighborhood of the scene; and obtaining operator input.

9. The method of claim 8, wherein said establishing provides a location of a recognized object, the method further comprising: defining said at least one region-of-interest on the basis of a location of a recognized object.

10. The method of claim 9, wherein the defined at least one region-of-interest extends outside the recognized object by a nonzero margin.

11. The method of claim 8, wherein the object or event recognition process is configured to evaluate a condition that an object constituting one or more of the predefined object types or performing one or more of the predefined event types shall be visible in a predefined image area.

12. The method of claim 1, wherein: the image is a frame in a video sequence; and the encoding of one or more of said at least one region-of-interest and the remainder includes applying an inter-frame prediction method.

13. The method of claim 1, further comprising: in any frames where it is established that said objects are visible, encoding audio associated with the video sequence using a non-generative audio model; and in at least some frames where it is not established that said objects are visible, encoding said associated audio using an arbitrary audio model, wherein use of the non-generative audio model enables decoding of the audio without relying on information derived from audio data other than said associated audio.

14. A device comprising processing circuitry which is selectively operable in accordance with a non-generative image model and a generative image model, wherein use of the non-generative image model enables image data representing an encoded image to be decoded without relying on information derived from images other than the encoded image or, if the encoded image is a frame in a video sequence, enables the image data to be decoded without relying on information derived from images outside the video sequence, said device being configured to perform a method of encoding an image, the method comprising: establishing whether one or more objects constituting one or more predefined object types or performing one or more predefined event types are visible in the image; in response to establishing that said objects are visible, encoding at least one region-of-interest of the image using a non-generative image model, thereby obtaining first image data, wherein said one or more objects are visible in the region-of-interest, wherein encoding using the non-generative image model includes encoding in accordance with transform coding or a combination of predictive coding and transform coding; and encoding any remainder of the image only using an encoder neural network of a generative image model, thereby obtaining second image data, wherein the generative image model includes the encoder neural network and a decoder neural network, wherein at least one of the encoder neural network and the decoder neural network is generative, and wherein the encoder neural network and the decoder neural network have been trained in conjunction with each other in accordance with images other than the image to be encoded, wherein use of the non-generative image model enables decoding of the first image data without relying on information derived from images other than the encoded image or, if the image is a frame in a video sequence, enables decoding of the first image data without relying on information derived from images outside the video sequence.

15. A non-transitory computer-readable storage medium having stored thereon instructions for implementing a method, when executed on a device having processing capabilities, the method for encoding an image comprising: establishing whether one or more objects constituting one or more predefined object types or performing one or more predefined event types are visible in the image; in response to establishing that said objects are visible, encoding at least one region-of-interest of the image using a non-generative image model, thereby obtaining first image data, wherein said one or more objects are visible in the region-of-intere
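
Claims 9 and 10 describe a region-of-interest that is defined from the location of a recognized object and extends beyond it by a nonzero margin. A minimal Python sketch of that step follows. The function name `expand_roi`, the box convention, the uniform pixel margin, and the clamping to the image bounds are all assumptions made for illustration; the claims do not prescribe them.

```python
from typing import Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom); right/bottom exclusive


def expand_roi(box: Box, margin: int, width: int, height: int) -> Box:
    """Grow a recognized object's bounding box by `margin` pixels on every side.

    The result is clamped to the image so the region-of-interest never
    extends past the frame (an assumption of this sketch).
    """
    if margin <= 0:
        # Claim 10 calls for a nonzero margin; this sketch accepts only positive ones.
        raise ValueError("margin must be a positive number of pixels")
    left, top, right, bottom = box
    return (
        max(0, left - margin),
        max(0, top - margin),
        min(width, right + margin),
        min(height, bottom + margin),
    )


if __name__ == "__main__":
    # Hypothetical detection in a 100 x 100 frame, padded by 4 pixels.
    print(expand_roi((10, 10, 20, 20), margin=4, width=100, height=100))
```

A margin of this kind keeps pixels just outside the detected object in the non-generatively coded region, which may help when the detector's bounding box is slightly too tight.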

Assignees

Axis Ab

Inventors

Classifications

  • Artificial neural networks [ANN]

  • Target detection

  • Determination of region of interest [ROI] or a volume of interest [VOI]

  • Determining position or orientation of objects or cameras (camera calibration G06T7/80)

  • Prediction type, e.g. intra-frame, inter-frame or bidirectional frame prediction

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12401801B2 cover?
A method of encoding an image comprises establishing whether objects constituting one or more predefined object types or performing one or more predefined event types are visible in the image; in response to establishing that the objects are visible, encoding at least one region-of-interest of the image using a non-generative image model, thereby obtaining first image data; and encoding any rem…
Who is the assignee on this patent?
Axis Ab
What technology area does this patent fall under?
Primary CPC classification G06T9/002. Mapped technology areas include Physics.
When was this patent published?
Publication date: Aug 26, 2025 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 7 related publications on this page (citations in our corpus or others sharing the same primary CPC).