What technology area does this patent fall under?

Primary CPC classification G06T9/002. Mapped technology areas include Physics.

When was this patent published?

Publication date: Aug 26, 2025 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 7 related publications on this page (citations in our corpus or others sharing the same primary CPC).

System and method for image coding using dual image models

US12401801B2 · US · B2

Patent metadata
Field	Value
Publication number	US-12401801-B2
Application number	US-202318295556-A
Country	US
Kind code	B2
Filing date	Apr 4, 2023
Priority date	Apr 11, 2022
Publication date	Aug 26, 2025
Grant date	Aug 26, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title: What the patent document calls the invention.
Abstract: A short plain-language summary of the technical disclosure.
Assignees and inventors: Who owns or filed the patent and who is credited as inventor.
Key dates: Filing, priority, publication, and grant dates set the timeline.
First independent claim: The legal scope of protection; read this for what is actually claimed.
CPC / IPC classifications: Technology tags used to group this patent with similar filings.
Citations and related patents: Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method of encoding an image comprises establishing whether objects constituting one or more predefined object types or performing one or more predefined event types are visible in the image; in response to establishing that the objects are visible, encoding at least one region-of-interest of the image using a non-generative image model, thereby obtaining first image data; and encoding any remainder of the image using a generative image model, thereby obtaining second image data, wherein use of the non-generative image model enables decoding of the first image data without relying on information derived from images other than the encoded image or, if the image is a frame in a video sequence, enables decoding of the first image data without relying on information derived from images outside the video sequence.
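The control flow described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the image is modelled as a list of grayscale pixel rows, and `detect_objects`, `encode_non_generative`, and `encode_generative` are hypothetical callables supplied by the caller. Representing the "remainder" by blanking the region-of-interest pixels is likewise an assumption made only for this example.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom); right/bottom exclusive
Image = List[List[int]]          # grayscale pixel rows (assumption for illustration)


@dataclass
class EncodedImage:
    roi_boxes: List[Box]
    first_image_data: List[bytes]  # one payload per region-of-interest
    second_image_data: bytes       # payload for the remainder of the image


def crop(image: Image, box: Box) -> Image:
    """Cut the region-of-interest out of the image."""
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]


def mask_out(image: Image, boxes: List[Box], fill: int = 0) -> Image:
    """Return a copy with every ROI pixel replaced by `fill`.

    Boxes are assumed to lie inside the image bounds.
    """
    out = [row[:] for row in image]
    for left, top, right, bottom in boxes:
        for y in range(top, bottom):
            for x in range(left, right):
                out[y][x] = fill
    return out


def encode_image(
    image: Image,
    detect_objects: Callable[[Image], List[Box]],      # hypothetical detector
    encode_non_generative: Callable[[Image], bytes],   # hypothetical ROI codec
    encode_generative: Callable[[Image], bytes],       # hypothetical generative encoder
) -> EncodedImage:
    # Step 1: establish whether predefined objects or events are visible.
    boxes = detect_objects(image)
    # Step 2: encode each region-of-interest with the non-generative model.
    first = [encode_non_generative(crop(image, box)) for box in boxes]
    # Step 3: encode the remainder with the generative model.
    remainder = mask_out(image, boxes) if boxes else image
    second = encode_generative(remainder)
    return EncodedImage(boxes, first, second)
```

When the detector returns no boxes, the whole image goes through the generative path, which matches the "in response to establishing that the objects are visible" condition in the abstract.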

First claim

Opening claim text (preview).

The invention claimed is:

1. A method of encoding an image, the method comprising: establishing whether one or more objects constituting one or more predefined object types or performing one or more predefined event types are visible in the image; in response to establishing that said objects are visible, encoding at least one region-of-interest of the image using a non-generative image model, thereby obtaining first image data, wherein said one or more objects are visible in the region-of-interest, wherein encoding using the non-generative image model includes encoding in accordance with transform coding or a combination of predictive coding and transform coding; and encoding any remainder of the image only using an encoder neural network of a generative image model, thereby obtaining second image data, wherein the generative image model includes the encoder neural network and a decoder neural network, wherein at least one of the encoder neural network and the decoder neural network is generative, and wherein the encoder neural network and the decoder neural network have been trained in conjunction with each other in accordance with images other than the image to be encoded, wherein use of the non-generative image model enables decoding of the first image data without relying on information derived from images other than the encoded image or, if the image is a frame in a video sequence, enables decoding of the first image data without relying on information derived from images outside the video sequence.

2. The method of claim 1, wherein use of the non-generative image model enables decoding of the first image data: without inserting information derived from images other than the encoded image, or if the image is a frame in a video sequence, without inserting information derived from images outside the video sequence, or without processing the first image data by a function that depends on information derived from images other than the encoded image, or if the image is a frame in a video sequence, without processing the first image data by a function that depends on information derived from images outside the video sequence.

3. The method of claim 1, wherein use of the non-generative image model enables non-stochastic decoding of the first image data.

4. The method of claim 1, wherein the generative image model is adapted for decoding by an artificial neural network which has been trained using information derived from images other than the encoded image or, if the image is a frame in a video sequence, using information derived from images outside the video sequence.

5. The method of claim 1, wherein the generative image model is adapted for decoding by stochastic sampling from a probability distribution.

6. The method of claim 1, wherein the generative image model includes an artificial neural network with trainable weights, the method further comprising: in response to establishing that said objects are visible, storing a snapshot of the trainable weights.

7. The method of claim 6, further comprising: generating a digital signature for a dataset including the snapshot of the trainable weights, the first image data and the second image data.

8. The method of claim 1, wherein establishing whether said objects are visible includes one or more of: executing a visual object recognition process; executing a visual event recognition process; obtaining data from a detector configured to monitor a scene of the image or a neighborhood of the scene; and obtaining operator input.

9. The method of claim 8, wherein said establishing provides a location of a recognized object, the method further comprising: defining said at least one region-of-interest on the basis of a location of a recognized object.

10. The method of claim 9, wherein the defined at least one region-of-interest extends outside the recognized object by a nonzero margin.

11. The method of claim 8, wherein the object or event recognition process is configured to evaluate a condition that an object constituting one or more of the predefined object types or performing one or more of the predefined event type shall be visible in a predefined image area.

12. The method of claim 1, wherein: the image is a frame in a video sequence; and the encoding of one or more of said at least one region-of-interest and the remainder includes applying an inter-frame prediction method.

13. The method of claim 1, further comprising: in any frames where it is established that said objects are visible, encoding audio associated with the video sequence using a non-generative audio model; and in at least some frames where it is not established that said objects are visible, encoding said associated audio using an arbitrary audio model, wherein use of the non-generative audio model enables decoding of the audio without relying on information derived from audio data other than said associated audio.

14. A device comprising processing circuitry which is selectively operable in accordance with a non-generative image model and a generative image model, wherein use of the non-generative image model enables image data representing an encoded image to be decoded without relying on information derived from images other than the encoded image or, if the encoded image is a frame in a video sequence, enables the image data to be decoded without relying on information derived from images outside the video sequence, said device being configured to perform a method of encoding an image, the method comprising: establishing whether one or more objects constituting one or more predefined object types or performing one or more predefined event types are visible in the image; in response to establishing that said objects are visible, encoding at least one region-of-interest of the image using a non-generative image model, thereby obtaining first image data, wherein said one or more objects are visible in the region-of-interest, wherein encoding using the non-generative image model includes encoding in accordance with transform coding or a combination of predictive coding and transform coding; and encoding any remainder of the image only using an encoder neural network of a generative image model, thereby obtaining second image data, wherein the generative image model includes the encoder neural network and a decoder neural network, wherein at least one of the encoder neural network and the decoder neural network is generative, and wherein the encoder neural network and the decoder neural network have been trained in conjunction with each other in accordance with images other than the image to be encoded, wherein use of the non-generative image model enables decoding of the first image data without relying on information derived from images other than the encoded image or, if the image is a frame in a video sequence, enables decoding of the first image data without relying on information derived from images outside the video sequence.

15. A non-transitory computer-readable storage medium having stored thereon instructions for implementing a method, when executed on a device having processing capabilities, the method for encoding an image comprising: establishing whether one or more objects constituting one or more predefined object types or performing one or more predefined event types are visible in the image; in response to establishing that said objects are visible, encoding at least one region-of-interest of the image using a non-generative image model, thereby obtaining first image data, wherein said one or more objects are visible in the region-of-intere
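Claims 9 and 10 describe deriving the region-of-interest from the location of a recognized object and extending it by a nonzero margin. A minimal sketch of that step might look like the following; the function name `expand_roi`, the pixel-based margin, and the clamping to the image bounds are assumptions made for illustration and are not stated in the claim text.

```python
from typing import Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom); right/bottom exclusive


def expand_roi(box: Box, margin: int, width: int, height: int) -> Box:
    """Grow a detected object's bounding box by `margin` pixels on every side.

    The result is clamped to the image (an assumption of this sketch), so the
    effective margin can be smaller near the image border.
    """
    if margin <= 0:
        # Claim 10 calls for a nonzero margin; this sketch treats it as positive.
        raise ValueError("margin must be a positive number of pixels")
    left, top, right, bottom = box
    return (
        max(0, left - margin),
        max(0, top - margin),
        min(width, right + margin),
        min(height, bottom + margin),
    )
```

For example, `expand_roi((10, 10, 20, 20), 4, 100, 100)` yields a box that starts 4 pixels earlier and ends 4 pixels later on each axis, so that the area encoded with the non-generative model includes some context around the object.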

Assignees

Axis AB

Inventors

Classifications

G06T2207/20084
Artificial neural networks [ANN] · CPC title
G06V2201/07
Target detection · CPC title
G06V10/25
Determination of region of interest [ROI] or a volume of interest [VOI] · CPC title
G06T7/70
Determining position or orientation of objects or cameras (camera calibration G06T7/80) · CPC title
H04N19/159
Prediction type, e.g. intra-frame, inter-frame or bidirectional frame prediction · CPC title

Patent family

Related publications grouped by family.

View patent family 81306805

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12401801B2 cover?: A method of encoding an image comprises establishing whether objects constituting one or more predefined object types or performing one or more predefined event types are visible in the image; in response to establishing that the objects are visible, encoding at least one region-of-interest of the image using a non-generative image model, thereby obtaining first image data; and encoding any rem…
Who is the assignee on this patent?: Axis AB
What technology area does this patent fall under?: Primary CPC classification G06T9/002. Mapped technology areas include Physics.
When was this patent published?: Publication date: Aug 26, 2025 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 7 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Related patents

Method, apparatus, and computer program product for training a signature encoding module and a query processing module to identify objects of interest within an image utilizing digital signatures

Method and apparatus of encoding/decoding image data based on tree structure-based block division

Technologies for region-of-interest video encoding

Generative adversarial neural network assisted video reconstruction

Hybrid Motion-Compensated Neural Network with Side-Information Based Video Coding

Encoding video frames using generated region of interest maps

Region-of-Interest Aware Video Coding
