Biometric task network
US-2023260301-A1 · Aug 17, 2023 · US
US12475671B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12475671-B2 |
| Application number | US-202117542239-A |
| Country | US |
| Kind code | B2 |
| Filing date | Dec 3, 2021 |
| Priority date | Oct 7, 2021 |
| Publication date | Nov 18, 2025 |
| Grant date | Nov 18, 2025 |
Official abstract text for this publication.
A method of processing image data is provided. Pixel data for a first image is preprocessed to identify a subset of the pixel data corresponding to a region of interest depicting a scene element. The subset of the pixel data is processed at a first encoder to generate a first data structure representative of the region of interest, the first data structure identifying the scene element depicted in the region of interest. The subset of pixel data is also processed at a second encoder to generate a second data structure representative of the region of interest, the second data structure comprising values for visual characteristics associated with the scene element. The first and second data structures are outputted for use by a decoder to generate a second image approximating the region of interest of the first image.
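The data flow in the abstract can be illustrated with a minimal sketch. Everything here is a toy assumption rather than the patented implementation: the names `crop_roi`, `first_encoder`, `second_encoder`, `IdentityCode`, `AppearanceCode` and the `plan` lookup are hypothetical, the "encoders" are plain functions instead of neural networks, and the region of interest is supplied as a bounding box rather than detected. The sketch only shows the shape of the pipeline: one region of interest, an identifier that does not change when the element's configuration changes, and a second structure whose characteristics are selected according to that identity.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

Image = List[List[int]]  # toy grayscale image: rows of pixel values in 0..255


@dataclass(frozen=True)
class IdentityCode:
    """Stand-in for the 'first data structure' (scene element identifier)."""
    element_id: str


@dataclass(frozen=True)
class AppearanceCode:
    """Stand-in for the 'second data structure' (visual characteristic values)."""
    values: Dict[str, float]


def crop_roi(image: Image, box: Tuple[int, int, int, int]) -> Image:
    """Preprocessing: keep only the pixels inside (top, left, bottom, right)."""
    top, left, bottom, right = box
    return [row[left:right] for row in image[top:bottom]]


def first_encoder(roi: Image) -> IdentityCode:
    """Toy identifier: a hash of the sorted pixel values.

    Sorting discards pixel positions, so mirroring or rearranging the
    region leaves the identifier unchanged (a crude form of invariance).
    """
    pixels = sorted(p for row in roi for p in row)
    return IdentityCode(hashlib.sha1(bytes(pixels)).hexdigest()[:8])


def mean_brightness(roi: Image) -> float:
    flat = [p for row in roi for p in row]
    return sum(flat) / len(flat)


def horizontal_balance(roi: Image) -> float:
    """Left-half mean minus right-half mean; flips sign when mirrored."""
    half = len(roi[0]) // 2
    left = [p for row in roi for p in row[:half]]
    right = [p for row in roi for p in row[half:]]
    return sum(left) / len(left) - sum(right) / len(right)


CHARACTERISTICS = {
    "mean_brightness": mean_brightness,
    "horizontal_balance": horizontal_balance,
}


def second_encoder(roi: Image, identity: IdentityCode,
                   plan: Optional[Dict[str, List[str]]] = None) -> AppearanceCode:
    """Compute only the characteristics selected for this identity.

    `plan` maps an element id to characteristic names (hypothetical);
    unknown ids fall back to every available characteristic.
    """
    names = (plan or {}).get(identity.element_id, list(CHARACTERISTICS))
    return AppearanceCode({name: CHARACTERISTICS[name](roi) for name in names})


def encode(image: Image, box: Tuple[int, int, int, int],
           plan: Optional[Dict[str, List[str]]] = None
           ) -> Tuple[IdentityCode, AppearanceCode]:
    """Run preprocessing and both encoders; return both structures for a decoder."""
    roi = crop_roi(image, box)
    identity = first_encoder(roi)
    return identity, second_encoder(roi, identity, plan)
```

In this toy, mirroring the image leaves `element_id` unchanged while `horizontal_balance` flips sign, which mirrors the division of labour described in the abstract: the first structure says what is depicted, the second says how it currently looks.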
Opening claim text (preview).
What is claimed is:
1. A computer-implemented method of processing image data, the method comprising: receiving pixel data of a first image; preprocessing the received pixel data to identify a subset of the pixel data of the first image, the subset of the pixel data corresponding to a region of interest of the first image depicting at least one scene element; first processing the subset of the pixel data of the first image at a first encoder to generate a first data structure representative of the region of interest of the first image, the first data structure comprising a scene element identifier identifying the at least one scene element depicted in the region of interest of the first image, wherein the scene element identifier is invariant to changes in a configuration of the at least one scene element between different images depicting the at least one scene element; second processing the subset of the pixel data of the first image at a second encoder to generate a second data structure representative of the region of interest of the first image, the second data structure comprising values for one or more visual characteristics associated with the at least one scene element depicted in the region of interest of the first image; and outputting the first data structure and the second data structure for use by a decoder to generate a second image approximating the region of interest of the first image, wherein the one or more visual characteristics, the values of which are to be included in the second data structure, are determined by the second encoder based on the identity of the at least one scene element as determined by the first encoder.
2. The computer-implemented method of claim 1 wherein the second encoder is configured to determine the one or more visual characteristics by identifying features of the region of interest which are visually salient.
3. The computer-implemented method of claim 1, wherein the first encoder comprises a convolutional neural network that uses a differentiable loss function.
4. The computer-implemented method of claim 3, wherein the differentiable loss function comprises a triplet loss function.
5. The computer-implemented method of claim 1, wherein the first encoder is configured to distinguish between the at least one scene element that is depicted in the region of interest and at least one second scene element, the at least one scene element and the at least one second scene element being of a common scene element type.
6. The computer-implemented method of claim 1, wherein the scene element identifier is indicative of generic structural characteristics of content of the region of interest in comparison to other regions of the image and/or other images.
7. The computer-implemented method of claim 1, wherein the second encoder comprises a convolutional neural network configured to output a vector comprising the values of the one or more visual characteristics.
8. The computer-implemented method of claim 1, wherein the second encoder is configured to determine visual details of the region of interest to which the subset of the pixel data corresponds that are not captured by the first processing at the first encoder.
9. The computer-implemented method of claim 1, wherein the second encoder is configured to locate one or more landmarks in the region of interest to which the subset of the pixel data corresponds, wherein the one or more visual characteristics comprise coordinates of the one or more landmarks in the region of interest.
10. The computer-implemented method of claim 1, wherein the one or more visual characteristics relate to one or more of: lighting, orientation, movement, and perspective in the region of interest.
11. The computer-implemented method of claim 1, comprising generating, using an image generator module, the second image using the scene element identifier and the values of the one or more visual characteristics.
12. The computer-implemented method of claim 11, wherein the first encoder and/or the second encoder are trained using back-propagation of errors based on a comparison between the region of interest of the first image and the second image generated by the image generator module.
13. The computer-implemented method of claim 11, wherein the first encoder and/or the second encoder are trained using a discriminator function configured to determine whether the second image generated by the image generator module is a real image or a synthesized image, the discriminator function being configured to produce a composite set of loss functions that can be minimized using stochastic gradient descent and backpropagation through the first encoder and/or the second encoder.
14. The computer-implemented method of claim 13, wherein the composite set of loss functions are calculated in a latent space of a neural network that takes as inputs the subset of the pixel data corresponding to the region of interest of the first image and the second image generated by the image generator module.
15. The computer-implemented method of claim 11, wherein the first encoder and/or the second encoder are trained using one or more optimizing functions configured to score a loss of fidelity between the region of interest of the first image and the second image generated by the image generator module based on one or more of mean absolute error, mean squared error, and/or structural similarity index metrics that can be minimized using stochastic gradient descent and backpropagation through the first encoder and/or the second encoder.
16. The computer-implemented method of claim 1, wherein the second image comprises a photorealistic rendering of the region of interest to which the subset of the pixel data corresponds.
17. A computer-implemented method of generating an image at a decoder, the method comprising: receiving a first data structure representative of a region of interest of a first image, the first data structure generated by a first encoder and comprising a scene element identifier identifying at least one scene element depicted in the region of interest of the first image, wherein the scene element identifier is invariant to changes in a configuration of the at least one scene element between different images depicting the at least one scene element; receiving a second data structure representative of the region of interest of the first image, the second data structure comprising values for one or more visual characteristics associated with the at least one scene element depicted in the region of interest of the first image; and generating for display, using the first data structure and the second data structure, a second image approximating the region of interest of the first image, wherein the one or more visual characteristics, the values of which are to be included in the second data structure, are determined by a second encoder based on the identity of the at least one scene element.
18. A computing device comprising: a processor; and a memory, wherein the computing device is arranged to perform, using the processor, a method of processing image data, the method comprising: receiving pixel data of a first image; preprocessing the received pixel data to identify a subset of the pixel data corresponding to a region of interest of the first image depicting at least one scene element; first processing the subset of the pixel data of the first image at a first encoder to generate a first data structure representative of the region of interest of
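The training losses named in claims 4 and 15 can be written down compactly. The following is a minimal sketch under stated assumptions, not the loss formulation used in the patent: the triplet loss is shown in one common form (Euclidean distance with a hinge), the `margin` default of 0.2 is an arbitrary illustrative value, and inputs are plain Python sequences rather than network outputs. The structural similarity index mentioned in claim 15 is omitted for brevity.

```python
import math
from typing import Sequence


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor: Sequence[float], positive: Sequence[float],
                 negative: Sequence[float], margin: float = 0.2) -> float:
    """Hinge-style triplet loss (one common form; margin is an assumption).

    The loss is zero once the anchor is closer to the positive (same scene
    element) than to the negative (different element) by at least `margin`.
    """
    return max(0.0, euclidean(anchor, positive)
               - euclidean(anchor, negative) + margin)


def mean_absolute_error(original: Sequence[float],
                        generated: Sequence[float]) -> float:
    """Average absolute per-pixel difference between two flattened images."""
    return sum(abs(o - g) for o, g in zip(original, generated)) / len(original)


def mean_squared_error(original: Sequence[float],
                       generated: Sequence[float]) -> float:
    """Average squared per-pixel difference between two flattened images."""
    return sum((o - g) ** 2 for o, g in zip(original, generated)) / len(original)
```

In a real training loop these quantities would be computed on tensors in an automatic-differentiation framework so that gradients can flow back through the encoders; the pure-Python version only fixes the arithmetic.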
Backpropagation, e.g. using gradient descent · CPC title
Combinations of networks · CPC title
Comparing pixel values or logical combinations thereof, or feature values having positional relevance, e.g. template matching · CPC title
using neural networks · CPC title
Generative networks · CPC title