What technology area does this patent fall under?

Primary CPC classification G06V10/774. Mapped technology areas include Physics.

When was this patent published?

Publication date: Jan 20, 2026 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 7 related publications on this page (citations in our corpus or others sharing the same primary CPC).

System and method with diffusion-based outlier synthesis for anomaly detection

US12530875B2 · US · B2

Patent metadata
Field	Value
Publication number	US-12530875-B2
Application number	US-202318476614-A
Country	US
Kind code	B2
Filing date	Sep 28, 2023
Priority date	Sep 28, 2023
Publication date	Jan 20, 2026
Grant date	Jan 20, 2026

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title: What the patent document calls the invention.
Abstract: A short plain-language summary of the technical disclosure.
Assignees and inventors: Who owns or filed the patent and who is credited as inventor.
Key dates: Filing, priority, publication, and grant dates set the timeline.
First independent claim: The legal scope of protection; read this for what is actually claimed.
CPC / IPC classifications: Technology tags used to group this patent with similar filings.
Citations and related patents: Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A computer-implemented system and method relate to anomaly detection. Latent code of a source image is obtained. The latent code is designated as a target image. Source embedding data is generated from the source image. Text data, which is of a different domain than that of the source image, is obtained. Text embedding data is generated from the text data. Additional embedding data is generated using the source embedding data and the text embedding data. The additional embedding data provides guidance for modifying the source image. A modified image is generated via an iterative process that includes at least one iteration, where each iteration includes at least (i) encoding the target image to generate target embedding data, (ii) generating updated embedding data by combining the target embedding data and the additional embedding data, (iii) decoding the updated embedding data to generate a new image, and (iv) assigning the new image as the target image and the modified image. A non-anomalous label is generated for the source image and an anomalous label is generated for the modified image. A machine learning model is trained or fine-tuned using a dataset, which includes at least the source image with the non-anomalous label and the modified image with the anomalous label.
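The iterative process and labeling step in the abstract can be sketched roughly as follows. This is a toy illustration, not the patented implementation: `encode`, `decode`, `synthesize_outlier`, and `build_dataset` are hypothetical names, images are plain lists of floats, the encoder and decoder are identity stand-ins for a pretrained diffusion model, and the additive blend with a `strength` factor is only one assumed way of "combining" the embeddings.

```python
def encode(image):
    # Hypothetical stand-in for the second image encoder (identity here).
    return list(image)


def decode(embedding):
    # Hypothetical stand-in for the image decoder (identity here).
    return list(embedding)


def synthesize_outlier(source_image, guidance, strength=0.5, iterations=3):
    """Iteratively push a target image along a guidance vector."""
    # Assumption: the "latent code" of the source is just a copy of it.
    target = list(source_image)
    modified = target
    for _ in range(iterations):
        target_emb = encode(target)                       # step (i)
        updated = [t + strength * g                       # step (ii), assumed additive blend
                   for t, g in zip(target_emb, guidance)]
        new_image = decode(updated)                       # step (iii)
        target = modified = new_image                     # step (iv)
    return modified


def build_dataset(source_images, guidance):
    """Pair each source image (label 0) with its synthesized outlier (label 1)."""
    dataset = []
    for image in source_images:
        dataset.append((list(image), 0))                            # non-anomalous
        dataset.append((synthesize_outlier(image, guidance), 1))    # anomalous
    return dataset
```

A classifier trained on the output of `build_dataset` would see each original image as a non-anomalous example and its edited counterpart as an anomalous one, which matches the labeling the abstract describes.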

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method for anomaly detection, the computer-implemented method comprising: receiving a source image associated with a first domain; obtaining a latent code of the source image, the latent code being designated as a target image; encoding, via a first image encoder, the source image to generate source embedding data; obtaining text data associated with a second domain; encoding, via a first text encoder, the text data to generate text embedding data; generating additional embedding data using the source embedding data and the text embedding data, the additional embedding data providing guidance for modifying the source image; generating a modified image via an iterative process that includes at least one iteration, each iteration including: encoding, via a second image encoder, the target image to generate target embedding data, generating updated embedding data by combining the target embedding data and the additional embedding data, decoding, via an image decoder, the updated embedding data to generate a new image, and assigning the new image as the target image and the modified image, generating a dataset that includes at least the source image and the modified image; and training or fine-tuning a machine learning model using the dataset.

2. The computer-implemented method of claim 1, wherein: the machine learning model is configured to perform anomaly detection; and the machine learning model comprises a classifier or a regression model.

3. The computer-implemented method of claim 1, wherein: the first image encoder is a part of a pretrained vision language model; the first text encoder is a part of the pretrained vision language model; the second image encoder is a part of a pretrained diffusion model; and the image decoder is a part of the pretrained diffusion model.

4. The computer-implemented method of claim 1, further comprising: generating directional loss data for the source image using the source embedding data and the text embedding data, wherein the additional embedding data is generated at least by minimizing the directional loss data.

5. The computer-implemented method of claim 4, further comprising: computing a first difference term between the text embedding data and the source embedding data; and computing a second difference term between the target embedding data and the source embedding data, wherein the directional loss data is generated via computations that use the first difference term and the second difference term.

6. The computer-implemented method of claim 1, further comprising: defining a strength level to indicate an amount of modifying the source image with respect to the text data, wherein the updated embedding data is generated using the strength level to affect an impact of the additional embedding data.

7. The computer-implemented method of claim 1, further comprising: employing the machine learning model to generate prediction data in response to receiving current sensor data, the prediction data indicating that the current sensor data is anomalous data or non-anomalous data; and controlling an actuator using the prediction data.

8. A system for anomaly detection, the system comprising: one or more processors; at least one non-transitory computer readable medium in data communication with the one or more processors, the at least one non-transitory computer readable medium having computer readable data including instructions stored thereon that, when executed by the one or more processors is configured to cause the one or more processors to perform a method that comprises: receiving a source image associated with a first domain; obtaining a noisy image to use as a target image, the noisy image comprising Gaussian noise; encoding, via a first image encoder, the source image to generate source embedding data; obtaining text data associated with a second domain; encoding, via a first text encoder, the text data to generate text embedding data; generating additional embedding data using the source embedding data and the text embedding data, the additional embedding data providing guidance for modifying the source image; generating a modified image via an iterative process that includes at least one iteration, each iteration including: encoding, via a second image encoder, the target image to generate target embedding data, generating updated embedding data by combining the target embedding data and the additional embedding data, decoding, via an image decoder, the updated embedding data to generate a new image, and assigning the new image as the target image and the modified image, generating a dataset that includes at least the source image and the modified image; and training or fine-tuning a machine learning model using the dataset.

9. The system of claim 8, wherein: the machine learning model is configured to perform anomaly detection; and the machine learning model comprises a classifier or a regression model.

10. The system of claim 8, wherein: the first image encoder is a part of a pretrained vision language model; the first text encoder is a part of the pretrained vision language model; the second image encoder is a part of a pretrained diffusion model; and the image decoder is a part of the pretrained diffusion model.

11. The system of claim 8, wherein: generating directional loss data for the source image using the source embedding data and the text embedding data, wherein the additional embedding data is generated at least by minimizing the directional loss data.

12. The system of claim 11, further comprising: computing a first difference term between the text embedding data and the source embedding data; and computing a second difference term between the target embedding data and the source embedding data, wherein the directional loss data is generated via computations that use the first difference term and the second difference term.

13. The system of claim 8, further comprising: defining a strength level to indicate an amount of modifying the source image with respect to the text data, wherein the updated embedding data is generated using the strength level to affect an impact of the additional embedding data.

14. The system of claim 8, further comprising: employing the machine learning model to generate prediction data in response to receiving current sensor data, the prediction data indicating that the current sensor data is anomalous data or non-anomalous data; and controlling an actuator using the prediction data.

15. A non-transitory computer readable medium having computer readable data including instructions stored thereon, the computer readable data being executable by one or more processors to perform a method that comprises: receiving a source image associated with a first domain; obtaining a noisy image to use as a target image, the noisy image comprising Gaussian noise; encoding, via a first image encoder, the source image to generate source embedding data; obtaining text data associated with a second domain; encoding, via a first text encoder, the text data to generate text embedding data; generating additional embedding data using the source embedding data and the text embedding data, the additional embedding data providing guidance for modifying the source image; generating a modified image via an iterative process that includes at least one iteration, each iteration including: encoding, via a second image encoder, the target image to generate ta
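Claims 4 and 5 describe a directional loss built from two difference terms: text embedding minus source embedding, and target embedding minus source embedding. The page does not give the formula, so the sketch below assumes a cosine-based form (one minus the cosine similarity of the two difference vectors), which is a common choice in text-guided image editing. The function names and the use of plain Python lists as embeddings are illustrative assumptions.

```python
import math


def subtract(a, b):
    """Element-wise difference of two equal-length vectors."""
    return [x - y for x, y in zip(a, b)]


def cosine_similarity(a, b):
    """Cosine of the angle between a and b; 0.0 if either vector is zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def directional_loss(source_emb, text_emb, target_emb):
    """Assumed form: 1 - cos(target - source, text - source)."""
    text_direction = subtract(text_emb, source_emb)      # first difference term
    image_direction = subtract(target_emb, source_emb)   # second difference term
    return 1.0 - cosine_similarity(image_direction, text_direction)
```

Under this assumed form the loss is 0 when the edit moves the image embedding exactly in the direction of the text, 1 when the two directions are orthogonal, and 2 when they point in opposite directions.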

Assignees

Robert Bosch GmbH

Inventors

Classifications

G06V10/774 (primary): Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
G06V10/776: Validation; Performance evaluation
G06V10/82: using neural networks
G06V40/40: Spoof detection, e.g. liveness detection
H04N19/46: Embedding additional information in the video signal during the compression process (H04N19/517, H04N19/68, H04N19/70 take precedence)
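CPC symbols such as those listed above follow a fixed pattern: a section letter, a two-digit class, a subclass letter, a main group number, a slash, and a subgroup number. The sketch below splits a symbol into those parts, which is how a symbol like G06V10/774 maps to the "Physics" technology area (section G). The function name `parse_cpc` and the two-entry section table are illustrative assumptions, not part of this page's data.

```python
import re

# Only the two sections that appear on this page are listed here.
CPC_SECTIONS = {"G": "Physics", "H": "Electricity"}

CPC_PATTERN = re.compile(r"^([A-HY])(\d{2})([A-Z])(\d{1,4})/(\d{2,6})$")


def parse_cpc(symbol):
    """Split a CPC symbol such as 'G06V10/774' into its hierarchy levels."""
    match = CPC_PATTERN.match(symbol.strip())
    if match is None:
        raise ValueError(f"not a CPC symbol: {symbol!r}")
    section, cls, subclass, main_group, subgroup = match.groups()
    return {
        "section": section,
        "section_title": CPC_SECTIONS.get(section),  # None if not in the table
        "class": section + cls,
        "subclass": section + cls + subclass,
        "main_group": main_group,
        "subgroup": subgroup,
    }
```

For example, `parse_cpc("G06V10/774")["section_title"]` evaluates to `"Physics"`, and `parse_cpc("H04N19/46")["subclass"]` evaluates to `"H04N"`.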

Patent family

Related publications grouped by family.

View patent family 94978274

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12530875B2 cover?: A computer-implemented system and method relate to anomaly detection. Latent code of a source image is obtained. The latent code is designated as a target image. Source embedding data is generated from the source image. Text data, which is of a different domain than that of the source image, is obtained. Text embedding data is generated from the text data. Additional embedding data is generated…
Who is the assignee on this patent?: Robert Bosch GmbH

Related patents

Generating entity contribution scores using configuration instrumentation metrics

Generating and visualizing planar surfaces within a three-dimensional space for modifying objects in a two-dimensional editing interface

System and Method with Language-Guided Self-Supervised Semantic Segmentation

Generating modified two-dimensional images by customizing focal points via three-dimensional representations of the two-dimensional images

Locked-Model Multimodal Contrastive Tuning

Modifying digital images utilizing a language guided image editing model

Semantic image manipulation using visual-semantic joint embeddings
