Detecting poisoned training data for artificial intelligence models using causal analysis

US2025077954A1 · US · A1

Patent metadata
Publication number: US-2025077954-A1
Application number: US-202318459117-A
Country: US
Kind code: A1
Filing date: Aug 31, 2023
Priority date: Aug 31, 2023
Publication date: Mar 6, 2025
Grant date: not listed

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Methods and systems for managing artificial intelligence (AI) models are disclosed. To manage AI models, an instance of an AI model may not be re-trained using training data determined to be potentially poisoned. By doing so, malicious attacks intending to influence the AI model using poisoned training data may be prevented. To do so, a first causal relationship present in historical training data may be compared to a second causal relationship present in a candidate training data set. The first causal relationship and the second causal relationship may be expected to be similar within a threshold. If a difference between the first causal relationship and the second causal relationship is not within the threshold, the candidate training data may be treated as including poisoned training data.
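The comparison described in the abstract can be illustrated with a minimal sketch. The abstract does not say how a causal relationship is estimated or how the difference is measured, so everything here is an assumption made for illustration: a least-squares slope between one feature and one label stands in for the causal relationship, the absolute slope difference stands in for the quantification, and the names `slope`, `relationship_difference`, and `is_potentially_poisoned` are hypothetical.

```python
from statistics import mean


def slope(features, labels):
    """Least-squares slope of labels on features.

    Assumption: a linear slope is only a simple stand-in for a causal
    relationship; the publication does not prescribe an estimator.
    """
    fx, fy = mean(features), mean(labels)
    num = sum((x - fx) * (y - fy) for x, y in zip(features, labels))
    den = sum((x - fx) ** 2 for x in features)
    if den == 0:
        raise ValueError("feature has no variance; slope is undefined")
    return num / den


def relationship_difference(historical, candidate):
    """Quantify how far the candidate relationship is from the historical one."""
    return abs(slope(*historical) - slope(*candidate))


def is_potentially_poisoned(historical, candidate, threshold):
    """True when the difference is not within the threshold."""
    return relationship_difference(historical, candidate) > threshold


# Each data set is a (features, labels) pair for the same feature and label.
historical = ([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
clean = ([5.0, 6.0, 7.0, 8.0], [10.0, 12.0, 14.0, 16.0])
flipped = ([5.0, 6.0, 7.0, 8.0], [16.0, 14.0, 12.0, 10.0])

print(is_potentially_poisoned(historical, clean, threshold=0.5))
print(is_potentially_poisoned(historical, flipped, threshold=0.5))
```

In this toy data the `clean` set preserves the historical feature-to-label relationship while the `flipped` set reverses it, so only the latter exceeds the illustrative threshold of 0.5.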

First claim

Opening claim text (preview).

What is claimed is:

1. A method of managing an artificial intelligence (AI) model, the method comprising: obtaining a candidate training data set usable to update an instance of the AI model; identifying a historical training data set, the historical training data set being obtained prior to the candidate training data set and the historical training data set already having been considered as trustworthy; obtaining a quantification of a difference between causal relationships in the candidate training data set and the historical training data set; making a determination regarding whether the quantification is within a threshold for the quantification; and in a second instance of the determination in which the quantification is not within the threshold: treating the candidate training data set as comprising poisoned training data.

2. The method of claim 1, further comprising: in a first instance of the determination in which the quantification is within the threshold: obtaining a second instance of the AI model using at least the candidate training data set.

3. The method of claim 1, wherein obtaining the quantification comprises: identifying a first causal relationship of the causal relationships in the historical training data set; and identifying a second causal relationship of the causal relationships in the candidate training data set.

4. The method of claim 3, wherein the first causal relationship and the second causal relationship relate same features and same labels.

5. The method of claim 4, wherein the first causal relationship is based on a first feature present in the historical training data set and a first label present in the historical training data set.

6. The method of claim 5, wherein the second causal relationship is based on a second feature present in the candidate training data set and a second label present in the candidate training data set.

7. The method of claim 6, wherein the first feature is based on first measurements of a quantity during a first period of time and the second feature is based on second measurements of the quantity during a second period of time, the first period of time being prior to the second period of time.

8. The method of claim 7, wherein the first label is based on third measurements of a second quantity during the first period of time and the second label is based on fourth measurements of the second quantity during the second period of time.

9. The method of claim 1, wherein the quantification of the difference is based on forms of the causal relationships.

10. The method of claim 1, wherein the threshold is based on a level of tolerance for use of poisoned training data in the AI model.

11. A non-transitory machine-readable medium having instructions stored therein, which when executed by a processor, cause the processor to perform operations for managing an artificial intelligence (AI) model, the operations comprising: obtaining a candidate training data set usable to update an instance of the AI model; identifying a historical training data set, the historical training data set being obtained prior to the candidate training data set and the historical training data set already having been considered as trustworthy; obtaining a quantification of a difference between causal relationships in the candidate training data set and the historical training data set; making a determination regarding whether the quantification is within a threshold for the quantification; and in a second instance of the determination in which the quantification is not within the threshold: treating the candidate training data set as comprising poisoned training data.

12. The non-transitory machine-readable medium of claim 11, further comprising: in a first instance of the determination in which the quantification is within the threshold: obtaining a second instance of the AI model using at least the candidate training data set.

13. The non-transitory machine-readable medium of claim 11, wherein obtaining the quantification comprises: identifying a first causal relationship of the causal relationships in the historical training data set; and identifying a second causal relationship of the causal relationships in the candidate training data set.

14. The non-transitory machine-readable medium of claim 13, wherein the first causal relationship and the second causal relationship relate same features and same labels.

15. The non-transitory machine-readable medium of claim 14, wherein the first causal relationship is based on a first feature present in the historical training data set and a first label present in the historical training data set.

16. A data processing system, comprising: a processor; and a memory coupled to the processor to store instructions, which when executed by the processor, cause the processor to perform operations for managing an artificial intelligence (AI) model, the operations comprising: obtaining a candidate training data set usable to update an instance of the AI model; identifying a historical training data set, the historical training data set being obtained prior to the candidate training data set and the historical training data set already having been considered as trustworthy; obtaining a quantification of a difference between causal relationships in the candidate training data set and the historical training data set; making a determination regarding whether the quantification is within a threshold for the quantification; and in a second instance of the determination in which the quantification is not within the threshold: treating the candidate training data set as comprising poisoned training data.

17. The data processing system of claim 16, further comprising: in a first instance of the determination in which the quantification is within the threshold: obtaining a second instance of the AI model using at least the candidate training data set.

18. The data processing system of claim 16, wherein obtaining the quantification comprises: identifying a first causal relationship of the causal relationships in the historical training data set; and identifying a second causal relationship of the causal relationships in the candidate training data set.

19. The data processing system of claim 18, wherein the first causal relationship and the second causal relationship relate same features and same labels.

20. The data processing system of claim 19, wherein the first causal relationship is based on a first feature present in the historical training data set and a first label present in the historical training data set.
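The two branches of the determination in claims 1 and 2 can be sketched as a small gating function. This is an illustrative outline under stated assumptions, not the claimed implementation: `manage_update`, `Decision`, `toy_quantify`, and `toy_train` are hypothetical names, the quantification and training steps are passed in as callables because the claims do not fix them, and the toy ratio-based quantification is only a placeholder for a real causal analysis.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional, Sequence, Tuple

Row = Tuple[float, float]  # (feature, label); an assumed record layout


@dataclass
class Decision:
    quantification: float
    within_threshold: bool
    action: str  # "retrain" or "quarantine"


def manage_update(
    quantify: Callable[[Sequence[Row], Sequence[Row]], float],
    train: Callable[[Sequence[Row]], Any],
    historical: Sequence[Row],
    candidate: Sequence[Row],
    threshold: float,
) -> Tuple[Decision, Optional[Any]]:
    """Gate a model update on the quantified difference between data sets."""
    q = quantify(historical, candidate)
    if q <= threshold:
        # First instance of the determination: obtain a second model instance
        # using at least the candidate training data set.
        model = train(list(historical) + list(candidate))
        return Decision(q, True, "retrain"), model
    # Second instance: treat the candidate set as comprising poisoned data.
    return Decision(q, False, "quarantine"), None


def toy_quantify(historical: Sequence[Row], candidate: Sequence[Row]) -> float:
    """Placeholder quantification: difference of mean label/feature ratios."""
    def ratio(rows: Sequence[Row]) -> float:
        return sum(y / x for x, y in rows) / len(rows)
    return abs(ratio(historical) - ratio(candidate))


def toy_train(rows: Sequence[Row]) -> int:
    """Placeholder training step: returns the number of rows it was given."""
    return len(rows)


historical = [(1.0, 2.0), (2.0, 4.0)]
candidate_ok = [(3.0, 6.0), (4.0, 8.2)]
candidate_bad = [(3.0, -6.0), (4.0, -8.0)]

ok_decision, ok_model = manage_update(toy_quantify, toy_train, historical, candidate_ok, 0.1)
bad_decision, bad_model = manage_update(toy_quantify, toy_train, historical, candidate_bad, 0.1)
print(ok_decision.action, ok_model)
print(bad_decision.action, bad_model)
```

Passing the quantification and training steps as callables keeps the gate independent of any particular causal-analysis method. The threshold of 0.1 is arbitrary here; claim 10 ties the real value to a tolerance for poisoned data.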

Assignees

Dell Products Lp

Inventors

Classifications

  • G06N20/00 (Primary)

    Machine learning (CPC title)

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2025077954A1 cover?
Methods and systems for managing artificial intelligence (AI) models are disclosed. To manage AI models, an instance of an AI model may not be re-trained using training data determined to be potentially poisoned. By doing so, malicious attacks intending to influence the AI model using poisoned training data may be prevented. To do so, a first causal relationship present in historical training…
Who is the assignee on this patent?
Dell Products Lp
What technology area does this patent fall under?
Primary CPC classification G06N20/00. Mapped technology areas include Physics.
When was this patent published?
Publication date: Mar 6, 2025 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).