Automatically determining poisonous attacks on neural networks

US11645515B2 · US · B2

Patent metadata
Field	Value
Publication number	US-11645515-B2
Application number	US-201916571323-A
Country	US
Kind code	B2
Filing date	Sep 16, 2019
Priority date	Sep 16, 2019
Publication date	May 9, 2023
Grant date	May 9, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Embodiments relate to a system, program product, and method for automatically determining which activation data points in a neural model have been poisoned to erroneously indicate association with a particular label or labels. A neural network is trained using potentially poisoned training data. Each of the training data points is classified using the network to retain the activations of the last hidden layer, and segment those activations by the label of corresponding training data. Clustering is applied to the retained activations of each segment, and a cluster assessment is conducted for each cluster associated with each label to distinguish clusters with potentially poisoned activations from clusters populated with legitimate activations. The assessment includes executing a set of analyses and integrating the results of the analyses into a determination as to whether a training data set is poisonous based on determining if resultant activation clusters are poisoned.
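The pipeline in the abstract (retain last-hidden-layer activations, segment them by label, cluster each segment, then assess each cluster) can be illustrated with a minimal sketch. It assumes the activations have already been extracted as lists of floats. The 2-means clustering, the `max_share` and `min_separation` thresholds, and all function names are hypothetical stand-ins chosen for illustration; the patent's own cluster assessment integrates several analyses and is not reproduced here.

```python
import math
from collections import defaultdict


def two_means(points, iterations=20):
    """Plain 2-means with a deterministic farthest-point initialisation."""
    first = points[0]
    second = max(points, key=lambda p: math.dist(p, first))
    centers = [list(first), list(second)]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assign every point to its nearest center (ties go to cluster 0).
        assignment = [
            0 if math.dist(p, centers[0]) <= math.dist(p, centers[1]) else 1
            for p in points
        ]
        # Move each center to the mean of its members.
        for c in (0, 1):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment, centers


def assess_segment(points, max_share=0.35, min_separation=3.0):
    """Return positions of points in a small, well-separated cluster.

    Assumed heuristic (not the patent's assessment): a cluster is suspicious
    when it holds under `max_share` of the segment and the distance between
    centers exceeds `min_separation` times the mean point-to-center distance.
    """
    if len(points) < 2:
        return []
    assignment, centers = two_means(points)
    sizes = [assignment.count(0), assignment.count(1)]
    small = 0 if sizes[0] <= sizes[1] else 1
    share = sizes[small] / len(points)
    spread = sum(
        math.dist(p, centers[a]) for p, a in zip(points, assignment)
    ) / len(points)
    separation = math.dist(centers[0], centers[1]) / (spread + 1e-12)
    if 0 < share < max_share and separation > min_separation:
        return [i for i, a in enumerate(assignment) if a == small]
    return []


def find_suspicious_points(activations, labels):
    """Segment activations by label, cluster each segment, flag odd clusters."""
    segments = defaultdict(list)
    for index, (vec, label) in enumerate(zip(activations, labels)):
        segments[label].append((index, vec))
    flagged = {}
    for label, items in segments.items():
        local = assess_segment([vec for _, vec in items])
        flagged[label] = [items[i][0] for i in local]
    return flagged


# Toy data: eight clean "stop" activations, six "yield" activations, and two
# outlying activations that also carry the "stop" label.
activations = [
    [0.0, 0.0], [0.2, 0.0], [0.0, 0.2], [0.2, 0.2],
    [0.1, 0.1], [0.1, 0.0], [0.0, 0.1], [0.2, 0.1],
    [1.0, 1.0], [1.1, 1.0], [1.2, 1.0], [1.3, 1.0], [1.4, 1.0], [1.5, 1.0],
    [5.0, 5.0], [5.1, 5.0],
]
labels = ["stop"] * 8 + ["yield"] * 6 + ["stop"] * 2
print(find_suspicious_points(activations, labels))
```

On this toy input the two outlying "stop" activations (dataset indices 14 and 15) form a small, distant cluster and are flagged, while the evenly split "yield" segment is not. In practice the activations would come from a trained network and the thresholds would need tuning per data set.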

First claim

Opening claim text (preview).

What is claimed is:
1. A computer system comprising: a processor operatively coupled to memory; and an artificial intelligence (AI) platform, in communication with the processor, having machine learning (ML) tools to process an untrusted data set, the tools comprising: a training manager configured to train a neural model with the untrusted data set; a ML manager, operatively coupled to the training manager, configured to classify each data point in the untrusted data set using the trained neural model, and to retain activations of one or more designated layers in the trained neural model; a cluster manager, operatively coupled to the ML manager, configured to apply a clustering technique on the retained activations for each label, and for each cluster to assess integrity of data in the cluster, including to analyze information from the untrusted data set and the clustered activations, the information comprising content of the data in the untrusted data set, noise distribution data with respect to the untrusted data set, and evidence of a preliminary cluster classification; and a classification manager, operatively coupled to the cluster manager, the classification manager configured to assign a poisonous classification or a legitimate classification to the assessed cluster, the assigned classification corresponding to the integrity assessment.
2. The system of claim 1, wherein the integrity assessment of the cluster data further comprises the cluster manager configured to select a preliminary topic assignment or a topic assignment based on the analysis of the analyzed information.
3. The system of claim 2, wherein the topic assignment based on the analysis further comprises the cluster manager configured to analyze topic text indicative of the poisonous classification or the legitimate classification.
4. The system of claim 1, wherein the evidence of the preliminary cluster classification further comprises the cluster manager configured to analyze one or more of: known classification data associated with the untrusted data set; and/or determined classification data associated with the clustered activations.
5. The system of claim 1, wherein the analysis of the noise distribution data further comprises the cluster manager configured to: select the noise distribution data from the group consisting of: noise data extracted through analysis of the untrusted data set and known noise distribution data provided with the untrusted data set.
6. The system of claim 1, wherein the cluster manager is configured to rank the integrity assessments of the clusters as a function of historical performance.
7. The system of claim 1, wherein the training manager is configured to retrain the neural model based on one or more of the integrity assessments.
8. A computer program product to utilize machine learning to process an untrusted data set, the computer program product comprising: a computer readable storage medium having program code embodied therewith, the program code executable by a processor to: train a neural model with the untrusted data set; classify each data point in the untrusted data set using the trained neural model; retain activations of one or more designated layers in the trained neural model; apply a clustering technique on the retained activations for each label, and for each cluster assess integrity of data in the cluster, including program code executable by the processor to analyze information from the untrusted data set and the clustered activations, the information comprising content of the data in the untrusted data set, noise distribution data with respect to the untrusted data set, and evidence of a preliminary cluster classification; responsive to the analysis, selectively determine a poisonous classification or a legitimate classification of the untrusted data set; and assign the selectively determined classification to the untrusted data set.
9. The computer program product of claim 8, wherein integrity assessment of the cluster data further comprises program code executable by the processor to select a preliminary topic assignment or a topic assignment based on the analysis of the analyzed information.
10. The computer program product of claim 9, wherein the topic assignment based on the analysis further comprises program code executable by the processor to analyze topic text indicative of the poisonous classification or the legitimate classification.
11. The computer program product of claim 8, wherein the evidence of the preliminary cluster classification further comprises program code executable by the processor to analyze one or more of: known classification data associated with the untrusted data set; and/or determined classification data associated with the clustered activations.
12. The computer program product of claim 8, wherein analysis of the noise distribution data further comprises program code executable by the processor to: select the noise distribution data from the group consisting of: noise data extracted through analysis of the untrusted data set and known noise distribution data provided with the untrusted data set.
13. The computer program product of claim 8, further comprising program code executable by the processor to rank the integrity assessments of the clusters as a function of historical performance.
14. A method comprising: receiving, by a neural network, an untrusted data set, each data point of the untrusted data set having a label; training a neural model using the untrusted data set; classifying each data point in the untrusted data set using the trained neural model, and retaining activations of one or more designated layers in the trained neural model; applying a clustering technique on the retained activations for each label; assessing integrity of data in the untrusted data set, including analyzing information from the untrusted data set and the clustered activations, the information comprising content of the data in the untrusted data set, noise distribution data with respect to the untrusted data set, and evidence of a preliminary cluster classification; responsive to the analysis, selectively determining a poisonous classification or a legitimate classification of the untrusted data set; and assigning the selectively determined classification to the untrusted data set.
15. The method of claim 14, wherein the cluster data includes a preliminary topic assignment or a topic assignment based on the analysis of the analyzed information.
16. The method of claim 15, wherein the topic assignment based on the analysis includes topic text indicative of the poisonous classification or the legitimate classification.
17. The method of claim 14, wherein the evidence of the preliminary cluster classification includes one or more of: known classification data associated with the untrusted data set; and/or determined classification data associated with the clustered activations.
18. The method of claim 14, wherein: the noise distribution data is selected from the group consisting of: noise data extracted through analysis of the untrusted data set and known noise distribution data provided with the untrusted data set.
19. The method of claim 14, wherein the assessing integrity of data in the untrusted data set comprises conducting a plurality of integrity assessments, and wherein the method further comprises ranking the integrity assessments of the clusters as a function of historical performance.
20. The method of claim 14, …
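The claims describe several integrity assessments per cluster, a ranking of those assessments by historical performance, and a final poisonous or legitimate classification. One possible reading is sketched below, assuming each analysis produces a yes/no opinion and carries a historical accuracy score. The `Assessment` type, the analysis names, the accuracy values, and the weighted-vote rule are all hypothetical; the claims do not specify how the results are combined.

```python
from dataclasses import dataclass


@dataclass
class Assessment:
    name: str                   # hypothetical analysis name
    says_poisonous: bool        # this analysis's opinion of the cluster
    historical_accuracy: float  # assumed score in [0, 1] from past runs


def rank_assessments(assessments):
    """Order assessments from most to least historically reliable."""
    return sorted(assessments, key=lambda a: a.historical_accuracy, reverse=True)


def classify_cluster(assessments):
    """Accuracy-weighted vote (an assumed rule, not taken from the claims)."""
    total = sum(a.historical_accuracy for a in assessments)
    if total == 0:
        return "legitimate"
    poisonous = sum(a.historical_accuracy for a in assessments if a.says_poisonous)
    return "poisonous" if poisonous / total > 0.5 else "legitimate"


# The three kinds of information named in claim 1, with made-up scores.
cluster_assessments = [
    Assessment("data content", True, 0.9),
    Assessment("noise distribution", False, 0.6),
    Assessment("preliminary cluster classification", True, 0.7),
]
for a in rank_assessments(cluster_assessments):
    print(a.name, a.historical_accuracy)
print(classify_cluster(cluster_assessments))
```

Weighting by historical accuracy is only one way to use a ranking; a real system could instead defer to the top-ranked assessment or require agreement among several.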

Assignees

IBM

Inventors

Classifications

G06N3/09
Supervised learning · CPC title
G06N3/0499
Feedforward networks · CPC title
G06V10/762
using clustering, e.g. of similar faces in social networks · CPC title
G06V10/771
Feature selection, e.g. selecting representative features from a multi-dimensional feature space · CPC title
G06V10/776
Validation; Performance evaluation · CPC title

Patent family

Related publications grouped by family.

View patent family 74869702

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11645515B2 cover?: Embodiments relate to a system, program product, and method for automatically determining which activation data points in a neural model have been poisoned to erroneously indicate association with a particular label or labels. A neural network is trained using potentially poisoned training data. Each of the training data points is classified using the network to retain the activations of the la…
Who is the assignee on this patent?: IBM
What technology area does this patent fall under?: Primary CPC classification G06N3/08. Mapped technology areas include Physics.
When was this patent published?: Publication date May 9, 2023 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).