Method, apparatus, and computer program product for training a signature encoding module and a query processing module to identify objects of interest within an image utilizing digital signatures
US-2022188346-A1 · Jun 16, 2022 · US
US12401801B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12401801-B2 |
| Application number | US-202318295556-A |
| Country | US |
| Kind code | B2 |
| Filing date | Apr 4, 2023 |
| Priority date | Apr 11, 2022 |
| Publication date | Aug 26, 2025 |
| Grant date | Aug 26, 2025 |
Official abstract text for this publication.
A method of encoding an image comprises establishing whether objects constituting one or more predefined object types or performing one or more predefined event types are visible in the image; in response to establishing that the objects are visible, encoding at least one region-of-interest of the image using a non-generative image model, thereby obtaining first image data; and encoding any remainder of the image using a generative image model, thereby obtaining second image data, wherein use of the non-generative image model enables decoding of the first image data without relying on information derived from images other than the encoded image or, if the image is a frame in a video sequence, enables decoding of the first image data without relying on information derived from images outside the video sequence.
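The routing described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation; every name in it (`encode_image`, `transform_encode`, `stub_generative_encoder`, `EncodedImage`) is hypothetical. It assumes a grayscale image held as nested lists, a single rectangular, non-empty region-of-interest, a plain DCT with uniform quantization as the non-generative (transform-coding) path, and a trivial stub in place of a trained encoder neural network. Blanking the region-of-interest before the generative path, and sending the whole image down that path when no object is visible, are also assumptions of this sketch.

```python
import math
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

Image = List[List[float]]        # grayscale image as rows of pixel values
Box = Tuple[int, int, int, int]  # (top, left, bottom, right), bottom/right exclusive


def dct_1d(values: List[float]) -> List[float]:
    """Orthonormal DCT-II of a 1-D sequence, a classic transform-coding step."""
    n = len(values)
    out = []
    for k in range(n):
        s = sum(x * math.cos(math.pi * (i + 0.5) * k / n)
                for i, x in enumerate(values))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out


def transform_encode(region: Image, step: float = 4.0) -> List[List[int]]:
    """Non-generative stand-in: separable 2-D DCT, then uniform quantization."""
    rows = [dct_1d(r) for r in region]              # transform each row
    cols = [dct_1d(list(c)) for c in zip(*rows)]    # transform each column
    coeffs = [list(r) for r in zip(*cols)]          # transpose back to row-major
    return [[round(c / step) for c in r] for r in coeffs]


def stub_generative_encoder(image: Image) -> bytes:
    """Placeholder only: one byte per row mean. A real system would call a
    trained encoder neural network here (assumption of this sketch)."""
    return bytes(int(sum(row) / len(row)) % 256 for row in image)


@dataclass
class EncodedImage:
    roi: Optional[Box]
    first_image_data: Optional[List[List[int]]]  # ROI, transform-coded
    second_image_data: bytes                     # remainder, generative path


def encode_image(image: Image, roi: Optional[Box],
                 generative_encoder: Callable[[Image], bytes]) -> EncodedImage:
    """Route the ROI (if any) to transform coding and the rest to the generative encoder."""
    first = None
    remainder = [row[:] for row in image]        # copy so the input is untouched
    if roi is not None:                          # objects of interest were established
        top, left, bottom, right = roi
        region = [row[left:right] for row in image[top:bottom]]
        first = transform_encode(region)
        for r in range(top, bottom):             # blank the ROI out of the remainder
            for c in range(left, right):
                remainder[r][c] = 0.0
    second = generative_encoder(remainder)
    return EncodedImage(roi, first, second)
```

Calling `encode_image(image, (0, 0, 2, 2), stub_generative_encoder)` yields quantized DCT coefficients for the region-of-interest, which can be decoded from the encoded image alone, and a separate payload for everything else.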
Opening claim text (preview).
The invention claimed is:
1. A method of encoding an image, the method comprising: establishing whether one or more objects constituting one or more predefined object types or performing one or more predefined event types are visible in the image; in response to establishing that said objects are visible, encoding at least one region-of-interest of the image using a non-generative image model, thereby obtaining first image data, wherein said one or more objects are visible in the region-of-interest, wherein encoding using the non-generative image model includes encoding in accordance with transform coding or a combination of predictive coding and transform coding; and encoding any remainder of the image only using an encoder neural network of a generative image model, thereby obtaining second image data, wherein the generative image model includes the encoder neural network and a decoder neural network, wherein at least one of the encoder neural network and the decoder neural network is generative, and wherein the encoder neural network and the decoder neural network have been trained in conjunction with each other in accordance with images other than the image to be encoded, wherein use of the non-generative image model enables decoding of the first image data without relying on information derived from images other than the encoded image or, if the image is a frame in a video sequence, enables decoding of the first image data without relying on information derived from images outside the video sequence.
2. The method of claim 1, wherein use of the non-generative image model enables decoding of the first image data: without inserting information derived from images other than the encoded image, or if the image is a frame in a video sequence, without inserting information derived from images outside the video sequence, or without processing the first image data by a function that depends on information derived from images other than the encoded image, or if the image is a frame in a video sequence, without processing the first image data by a function that depends on information derived from images outside the video sequence.
3. The method of claim 1, wherein use of the non-generative image model enables non-stochastic decoding of the first image data.
4. The method of claim 1, wherein the generative image model is adapted for decoding by an artificial neural network which has been trained using information derived from images other than the encoded image or, if the image is a frame in a video sequence, using information derived from images outside the video sequence.
5. The method of claim 1, wherein the generative image model is adapted for decoding by stochastic sampling from a probability distribution.
6. The method of claim 1, wherein the generative image model includes an artificial neural network with trainable weights, the method further comprising: in response to establishing that said objects are visible, storing a snapshot of the trainable weights.
7. The method of claim 6, further comprising: generating a digital signature for a dataset including the snapshot of the trainable weights, the first image data and the second image data.
8. The method of claim 1, wherein establishing whether said objects are visible includes one or more of: executing a visual object recognition process; executing a visual event recognition process; obtaining data from a detector configured to monitor a scene of the image or a neighborhood of the scene; and obtaining operator input.
9. The method of claim 8, wherein said establishing provides a location of a recognized object, the method further comprising: defining said at least one region-of-interest on the basis of a location of a recognized object.
10. The method of claim 9, wherein the defined at least one region-of-interest extends outside the recognized object by a nonzero margin.
11. The method of claim 8, wherein the object or event recognition process is configured to evaluate a condition that an object constituting one or more of the predefined object types or performing one or more of the predefined event types shall be visible in a predefined image area.
12. The method of claim 1, wherein: the image is a frame in a video sequence; and the encoding of one or more of said at least one region-of-interest and the remainder includes applying an inter-frame prediction method.
13. The method of claim 1, further comprising: in any frames where it is established that said objects are visible, encoding audio associated with the video sequence using a non-generative audio model; and in at least some frames where it is not established that said objects are visible, encoding said associated audio using an arbitrary audio model, wherein use of the non-generative audio model enables decoding of the audio without relying on information derived from audio data other than said associated audio.
14. A device comprising processing circuitry which is selectively operable in accordance with a non-generative image model and a generative image model, wherein use of the non-generative image model enables image data representing an encoded image to be decoded without relying on information derived from images other than the encoded image or, if the encoded image is a frame in a video sequence, enables the image data to be decoded without relying on information derived from images outside the video sequence, said device being configured to perform a method of encoding an image, the method comprising: establishing whether one or more objects constituting one or more predefined object types or performing one or more predefined event types are visible in the image; in response to establishing that said objects are visible, encoding at least one region-of-interest of the image using a non-generative image model, thereby obtaining first image data, wherein said one or more objects are visible in the region-of-interest, wherein encoding using the non-generative image model includes encoding in accordance with transform coding or a combination of predictive coding and transform coding; and encoding any remainder of the image only using an encoder neural network of a generative image model, thereby obtaining second image data, wherein the generative image model includes the encoder neural network and a decoder neural network, wherein at least one of the encoder neural network and the decoder neural network is generative, and wherein the encoder neural network and the decoder neural network have been trained in conjunction with each other in accordance with images other than the image to be encoded, wherein use of the non-generative image model enables decoding of the first image data without relying on information derived from images other than the encoded image or, if the image is a frame in a video sequence, enables decoding of the first image data without relying on information derived from images outside the video sequence.
15. A non-transitory computer-readable storage medium having stored thereon instructions for implementing a method, when executed on a device having processing capabilities, the method for encoding an image comprising: establishing whether one or more objects constituting one or more predefined object types or performing one or more predefined event types are visible in the image; in response to establishing that said objects are visible, encoding at least one region-of-interest of the image using a non-generative image model, thereby obtaining first image data, wherein said one or more objects are visible in the region-of-intere
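Two of the dependent claims lend themselves to a small illustration: claims 9 and 10 (a region-of-interest derived from a recognized object's location and extended by a nonzero margin) and claim 7 (a signature over the weight snapshot together with both image payloads). The following is a sketch under stated assumptions, not the claimed implementation. The function names `expand_roi` and `sign_dataset` are hypothetical, boxes are assumed to be axis-aligned pixel rectangles, and HMAC-SHA256 from the Python standard library stands in for the digital signature. A deployed system would more likely use an asymmetric signature scheme.

```python
import hashlib
import hmac
from typing import Tuple

Box = Tuple[int, int, int, int]  # (top, left, bottom, right), bottom/right exclusive


def expand_roi(obj_box: Box, margin: int, height: int, width: int) -> Box:
    """Grow a recognized object's box by `margin` pixels, clamped to the image bounds."""
    if margin <= 0:
        raise ValueError("margin must be nonzero and positive")
    top, left, bottom, right = obj_box
    return (max(0, top - margin), max(0, left - margin),
            min(height, bottom + margin), min(width, right + margin))


def sign_dataset(key: bytes, weights_snapshot: bytes,
                 first_image_data: bytes, second_image_data: bytes) -> str:
    """Keyed digest over the three parts (stand-in for a digital signature).

    Each part is length-prefixed so that moving bytes from one part to the
    next changes the result.
    """
    mac = hmac.new(key, digestmod=hashlib.sha256)
    for part in (weights_snapshot, first_image_data, second_image_data):
        mac.update(len(part).to_bytes(8, "big"))
        mac.update(part)
    return mac.hexdigest()
```

Binding the weight snapshot into the signed dataset means a verifier can check that the generatively coded remainder is decoded with the same weights that were in use when the objects were detected.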
Artificial neural networks [ANN] · CPC title
Target detection · CPC title
Determination of region of interest [ROI] or a volume of interest [VOI] · CPC title
Determining position or orientation of objects or cameras (camera calibration G06T7/80) · CPC title
Prediction type, e.g. intra-frame, inter-frame or bidirectional frame prediction · CPC title