Automatically Determining Whether an Activation Cluster Contains Poisonous Data
US-2021081708-A1 · Mar 18, 2021 · US
US11188789B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11188789-B2 |
| Application number | US-201816057706-A |
| Country | US |
| Kind code | B2 |
| Filing date | Aug 7, 2018 |
| Priority date | Aug 7, 2018 |
| Publication date | Nov 30, 2021 |
| Grant date | Nov 30, 2021 |
Official abstract text for this publication.
One embodiment provides a method comprising receiving a training set comprising a plurality of data points, where a neural network is trained as a classifier based on the training set. The method further comprises, for each data point of the training set, classifying the data point with one of a plurality of classification labels using the trained neural network, and recording neuronal activations of a portion of the trained neural network in response to the data point. The method further comprises, for each classification label that a portion of the training set has been classified with, clustering a portion of all recorded neuronal activations that are in response to the portion of the training set, and detecting one or more poisonous data points in the portion of the training set based on the clustering.
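The flow in the abstract (segment recorded activations by predicted label, cluster each segment, flag data points from the clustering) can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names `two_means` and `flag_suspects` are hypothetical, the tiny deterministic 2-means routine is a stand-in chosen for this sketch (the text above only says "clustering"), and the activations are assumed to be already recorded as plain lists of floats alongside the labels the trained network assigned.

```python
from collections import defaultdict
from typing import Hashable, List, Sequence

Vector = Sequence[float]


def _dist2(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two activation vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def _mean(vectors: List[Vector]) -> List[float]:
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def two_means(points: List[Vector], iters: int = 20) -> List[int]:
    """Assumed stand-in clustering: a small deterministic 2-means.

    Centroids start at the first point and the point farthest from it.
    Returns a 0/1 cluster assignment for each point.
    """
    c0 = list(points[0])
    c1 = list(max(points, key=lambda p: _dist2(p, c0)))
    assign = [0] * len(points)
    for _ in range(iters):
        assign = [0 if _dist2(p, c0) <= _dist2(p, c1) else 1 for p in points]
        g0 = [p for p, a in zip(points, assign) if a == 0]
        g1 = [p for p, a in zip(points, assign) if a == 1]
        if not g0 or not g1:
            break  # everything collapsed into one cluster
        c0, c1 = _mean(g0), _mean(g1)
    return assign


def flag_suspects(activations: List[Vector], labels: List[Hashable]) -> List[int]:
    """Return indices of data points that fall in the smaller cluster
    of their label segment (hypothetical helper for illustration)."""
    # Segment the recorded activations by the label the network assigned.
    segments = defaultdict(list)
    for idx, label in enumerate(labels):
        segments[label].append(idx)

    suspects = []
    for idxs in segments.values():
        if len(idxs) < 2:
            continue  # nothing to cluster
        assign = two_means([activations[i] for i in idxs])
        size1 = sum(assign)
        size0 = len(assign) - size1
        if size0 == 0 or size1 == 0:
            continue  # only one cluster found in this segment
        # Assumption: on a tie, cluster 1 is treated as the smaller one.
        minority = 0 if size0 < size1 else 1
        suspects.extend(i for i, a in zip(idxs, assign) if a == minority)
    return sorted(suspects)
```

Note that a rule of this shape flags the minority cluster even in a segment that contains no poisoned data, so in practice the flagged points would still need review before being discarded.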
Opening claim text (preview).
The invention claimed is:

1. A method comprising: receiving a training set comprising a plurality of data points, wherein a neural network is trained as a classifier based on the training set; for each data point of the training set: classifying the data point with one of a plurality of classification labels using the trained neural network; and recording neuronal activations of a portion of the trained neural network in response to the data point; and for each classification label that a portion of the training set has been classified with: clustering a portion of all recorded neuronal activations that are in response to the portion of the training set; and detecting one or more poisonous data points in the portion of the training set based on the clustering.
2. The method of claim 1, further comprising: training an initial neural network based on the training set, resulting in the trained neural network.
3. The method of claim 1, wherein the training set is an untrusted data set.
4. The method of claim 1, wherein the neural network is a convolutional neural network.
5. The method of claim 4, wherein the portion of the neural network is a last hidden layer in the neural network.
6. The method of claim 1, wherein the neural network is a region-based convolutional neural network (R-CNN).
7. The method of claim 6, wherein the portion of the neural network is a last hidden layer corresponding to a proposed region of interest in the R-CNN.
8. The method of claim 1, further comprising: segmenting all the recorded neuronal activations into one or more segments in accordance with the plurality of classification labels; and for each segment, clustering neuronal activations included in the segment.
9. The method of claim 8, wherein clustering neuronal activations included in the segment comprises: applying a clustering method that clusters the neuronal activations included in the segment into two clusters.
10. The method of claim 9, further comprising: classifying a smallest cluster of the two clusters as poisonous, wherein, for each neuronal activation included in the smallest cluster, a data point in the training set that resulted in the neuronal activation is identified as a poisonous data point.
11. The method of claim 8, wherein clustering neuronal activations included in the segment comprises: applying a clustering method that clusters the neuronal activations included in the segment into a set of clusters; and determining a total number of clusters included in the set of clusters.
12. The method of claim 11, further comprising: classifying the training set as legitimate in response to determining the total number of clusters is one.
13. The method of claim 11, further comprising: in response to determining the total number of clusters is more than one: classifying a largest cluster of the set of clusters as legitimate; and classifying each remaining cluster of the set of clusters as poisonous, wherein, for each neuronal activation included in the remaining cluster, a data point in the training set that resulted in the neuronal activation is identified as a poisonous data point.
14. The method of claim 8, further comprising: for each cluster generated in response to the clustering: for each neuronal activation included in the cluster, identifying a data point in the training set that resulted in the neuronal activation; generating an average of all data points identified; and providing the average to a user to determine whether all the data points identified are poisonous or legitimate.
15. A system comprising: at least one processor; and a non-transitory processor-readable memory device storing instructions that when executed by the at least one processor causes the at least one processor to perform operations including: receiving a training set comprising a plurality of data points, wherein a neural network is trained as a classifier based on the training set; for each data point of the training set: classifying the data point with one of a plurality of classification labels using the trained neural network; and recording neuronal activations of a portion of the trained neural network in response to the data point; and for each classification label that a portion of the training set has been classified with: clustering a portion of all recorded neuronal activations that are in response to the portion of the training set; and detecting one or more poisonous data points in the portion of the training set based on the clustering.
16. The system of claim 15, wherein the operations further comprise: segmenting all the recorded neuronal activations into one or more segments in accordance with the plurality of classification labels; and for each segment, clustering neuronal activations included in the segment.
17. The system of claim 16, wherein clustering neuronal activations included in the segment comprises: applying a clustering method that clusters the neuronal activations included in the segment into two clusters; and classifying a smallest cluster of the two clusters as poisonous, wherein, for each neuronal activation included in the smallest cluster, a data point in the training set that resulted in the neuronal activation is identified as a poisonous data point.
18. The system of claim 16, wherein clustering neuronal activations included in the segment comprises: applying a clustering method that clusters the neuronal activations included in the segment into a set of clusters; determining a total number of clusters included in the set of clusters; in response to determining the total number of clusters is one, classifying the training set as legitimate; and in response to determining the total number of clusters is more than one: classifying a largest cluster of the set of clusters as legitimate; and classifying each remaining cluster of the set of clusters as poisonous, wherein, for each neuronal activation included in the remaining cluster, a data point in the training set that resulted in the neuronal activation is identified as a poisonous data point.
19. The system of claim 16, wherein the operations further comprise: for each cluster generated in response to the clustering: for each neuronal activation included in the cluster, identifying a data point in the training set that resulted in the neuronal activation; generating an average of all data points identified; and providing the average to a user to determine whether all the data points identified are poisonous or legitimate.
20. A computer program product comprising a computer-readable hardware storage medium having program code embodied therewith, the program code being executable by a computer to implement a method comprising: receiving a training set comprising a plurality of data points, wherein a neural network is trained as a classifier based on the training set; for each data point of the training set: classifying the data point with one of a plurality of classification labels using the trained neural network; and recording neuronal activations of a portion of the trained neural network in response to the data point; and for each classification label that a portion of the training set has been classified with: clustering a portion of all recorded neuronal activations that are in response to the portion of the training set; and detecting one or more poisonous data points in the portion of the training set based on the clustering.
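The cluster-count rule (claims 11 to 13) and the per-cluster averaging step (claim 14) lend themselves to a short sketch. The code below is an illustration under stated assumptions, not the patented implementation: `label_clusters` and `cluster_averages` are hypothetical names, cluster assignments are assumed to come from whatever clustering method was applied to one label segment, data points are assumed to be equal-length lists of floats (for example flattened images), and the tie-breaking behavior is a choice made for this sketch.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence


def label_clusters(cluster_ids: Sequence[Hashable]) -> Dict[Hashable, str]:
    """Mark each cluster of one segment as 'legitimate' or 'poisonous'.

    One cluster: it is legitimate. More than one: the largest cluster
    is legitimate and every remaining cluster is poisonous.
    """
    sizes = defaultdict(int)
    for cid in cluster_ids:
        sizes[cid] += 1
    if not sizes:
        return {}
    if len(sizes) == 1:
        return {cid: "legitimate" for cid in sizes}
    # Assumption: if two clusters tie for largest, the one seen first wins.
    largest = max(sizes, key=lambda c: sizes[c])
    return {
        cid: ("legitimate" if cid == largest else "poisonous")
        for cid in sizes
    }


def cluster_averages(
    data_points: Sequence[Sequence[float]],
    cluster_ids: Sequence[Hashable],
) -> Dict[Hashable, List[float]]:
    """Average the original data points behind each cluster so a
    reviewer can inspect one representative per cluster."""
    groups = defaultdict(list)
    for point, cid in zip(data_points, cluster_ids):
        groups[cid].append(point)
    return {
        cid: [sum(col) / len(pts) for col in zip(*pts)]
        for cid, pts in groups.items()
    }
```

In a workflow along these lines, the averages would be shown to a person who makes the final call on whether each cluster is poisonous or legitimate, while the size-based rule offers an automatic first pass.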
Character recognition · CPC title
Classification techniques · CPC title
using neural networks · CPC title
Generating training patterns; Bootstrap methods, e.g. bagging or boosting · CPC title
Backpropagation, e.g. using gradient descent · CPC title