Learning device, control device, learning method, and recording medium
US-2021181728-A1 · Jun 17, 2021 · US
US12112268B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12112268-B2 |
| Application number | US-202318136830-A |
| Country | US |
| Kind code | B2 |
| Filing date | Apr 19, 2023 |
| Priority date | Mar 6, 2019 |
| Publication date | Oct 8, 2024 |
| Grant date | Oct 8, 2024 |
Official abstract text for this publication.
A method for generating a dual-class dataset is disclosed. A single-class dataset and a context dataset are obtained. The context dataset can be labeled. A model can be trained using the combination of the single-class dataset and the labeled context dataset. The model can be run on the context dataset. The data points that are classified the same as the data points included in the single-class dataset can be removed from the labeled context dataset and added to the single-class dataset. These steps can be repeated until the model no longer classifies any context data points into the single class.
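The loop described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the data points are 1-D floats, the classifier is a toy nearest-centroid rule standing in for whatever model is actually trained, and the names `train_centroids`, `classify`, and `build_dual_class` are hypothetical.

```python
from statistics import mean


def train_centroids(first, second):
    """Toy 'training' step: one centroid per labeled dataset (assumed model)."""
    return mean(first), mean(second)


def classify(model, x):
    """Assign x to whichever centroid is strictly closer; ties go to 'second'."""
    c_first, c_second = model
    return "first" if abs(x - c_first) < abs(x - c_second) else "second"


def build_dual_class(first, second, max_rounds=100):
    """Move context points classified as 'first' until a round moves nothing."""
    first, second = list(first), list(second)
    rounds = 0
    while second and rounds < max_rounds:
        rounds += 1
        model = train_centroids(first, second)
        moved = [x for x in second if classify(model, x) == "first"]
        if not moved:
            break  # nothing reclassified: the dataset is stable
        first.extend(moved)
        second = [x for x in second if classify(model, x) != "first"]
    return first, second, rounds


# Single-class dataset (known first-category points) and a mixed context dataset.
single_class = [9.0, 10.0, 11.0]
context = [1.0, 2.0, 9.5, 10.5, 1.5, 3.0]

first_final, second_final, rounds_run = build_dual_class(single_class, context)
print(sorted(first_final), second_final, rounds_run)
```

In this toy run the first round moves 9.5 and 10.5 into the single-class set, and the second round moves nothing, so the loop stops. A nearest-centroid rule can drift when the two categories overlap, so a real model and a stopping criterion would need more care than this sketch shows.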
Opening claim text (preview).
The invention claimed is:

1. A non-transitory computer-accessible medium having stored thereon computer-executable instructions for generating a first dual-class dataset, wherein, when a computing hardware arrangement executes the instructions, the computing arrangement is configured to perform procedures comprising:
   (a) accessing a first dataset including data points belonging to a first category of data points;
   (b) accessing a second dataset including data points belonging to the first category of data points and a second category of data points;
   (c) labeling each data point in the first dataset with a first label to generate a first labeled dataset, and labeling each data point in the second dataset with a second label to generate a second labeled dataset;
   (d) training a classification model using the first labeled dataset and the second labeled dataset;
   (e) using the classification model, classifying each data point in the second labeled dataset as belonging to one of the first category of data points or the second category of data points;
   (f) for each data point in the second labeled dataset classified as belonging to the first category of data points, removing the data point from the second dataset and adding the data point to the first dataset; and
   (g) generating the first dual-class dataset using the first dataset and the second dataset.
2. The non-transitory computer-accessible medium of claim 1, further configured to perform procedures comprising: repeating steps (c)-(g) to generate a second dual-class dataset.
3. The non-transitory computer-accessible medium of claim 2, wherein in repeating the step (d), a second classification model is used.
4. The non-transitory computer-accessible medium of claim 1, further configured to perform procedures comprising: continue repeating steps (c)-(g) to generate a new dual-class dataset until the new dual-class dataset is the same as the dual-class dataset from a prior run.
5. The non-transitory computer-accessible medium of claim 1, further configured to perform procedures comprising: prior to accessing a second dataset, generating the second dataset by scraping data from Internet pages, Internet websites or databases.
6. The non-transitory computer-accessible medium of claim 1, wherein the first dataset includes data points relating to fraudulent transactions.
7. The non-transitory computer-accessible medium of claim 1, wherein the first dataset includes data points relating to phone numbers.
8. The non-transitory computer-accessible medium of claim 1, wherein the first category of data includes telephone numbers and the second category of data includes non-telephone number text.
9. The non-transitory computer-accessible medium of claim 1, further configured to perform procedures comprising: sampling the first dual-class dataset according to a sampling technique.
10. The non-transitory computer-accessible medium of claim 9, wherein the sampling technique is undersampling or oversampling the data points in the first dataset.
11. The non-transitory computer-accessible medium of claim 9, wherein the sampling technique is undersampling or oversampling the data points in the second dataset.
12. The non-transitory computer-accessible medium of claim 9, wherein the sampling technique is Synthetic Minority Oversampling Technique.
13. The non-transitory computer-accessible medium of claim 9, wherein the sampling technique is Modified Synthetic Minority Oversampling Technique.
14. The non-transitory computer-accessible medium of claim 9, wherein the sampling technique is Random Undersampling.
15. The non-transitory computer-accessible medium of claim 9, wherein the sampling technique is Random Oversampling.
16. The non-transitory computer-accessible medium of claim 1, further configured to perform procedures comprising: calculating a performance value for the classification model.
17. The non-transitory computer-accessible medium of claim 16, wherein the performance value is an area under a curve.
18. The non-transitory computer-accessible medium of claim 16, wherein the performance value is an accuracy rate.
19. The non-transitory computer-accessible medium of claim 16, wherein the performance value is a precision rate.
20. The non-transitory computer-accessible medium of claim 16, wherein the performance value is a recall rate.
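Claims 9 to 20 name several sampling techniques and performance values without defining them. As a rough illustration only, the sketch below shows the standard textbook forms of random oversampling, random undersampling, and the accuracy, precision, and recall rates. The function names, the `seed` parameter, and the sample data are hypothetical and do not come from the patent. SMOTE, its modified variant, and area under a curve are not covered here.

```python
import random


def random_oversample(minority, majority, seed=0):
    """Duplicate randomly chosen minority points until both sets are equal in size."""
    rng = random.Random(seed)
    extra = rng.choices(minority, k=max(0, len(majority) - len(minority)))
    return list(minority) + extra, list(majority)


def random_undersample(minority, majority, seed=0):
    """Keep a random subset of the majority set, matching the minority size."""
    rng = random.Random(seed)
    k = min(len(minority), len(majority))
    return list(minority), rng.sample(list(majority), k)


def performance_values(y_true, y_pred, positive=1):
    """Accuracy, precision, and recall computed from confusion-matrix counts."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    return {
        "accuracy": (tp + tn) / len(pairs) if pairs else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Hypothetical imbalanced dual-class dataset: 2 first-category, 5 second-category points.
first_set = ["a", "b"]
second_set = ["v", "w", "x", "y", "z"]

over_first, over_second = random_oversample(first_set, second_set)
under_first, under_second = random_undersample(first_set, second_set)

# Hypothetical true labels and model predictions (1 = first category).
scores = performance_values([1, 1, 1, 0, 0, 0, 0, 1], [1, 0, 1, 0, 1, 0, 0, 1])
print(len(over_first), len(under_second), scores)
```

Oversampling keeps every original point and adds duplicates, while undersampling discards majority points, so the choice trades overfitting risk against information loss.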
Weakly supervised learning, e.g. semi-supervised or self-supervised learning · CPC title
Feedforward networks · CPC title
Supervised learning · CPC title
Combinations of networks · CPC title
Text processing (natural language analysis G06F40/20; semantic analysis G06F40/30; processing or translation of natural language G06F40/40) · CPC title