Systems and methods for evaluating networks
US-9654503-B1 · May 16, 2017 · US
US11481672B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11481672-B2 |
| Application number | US-201916423315-A |
| Country | US |
| Kind code | B2 |
| Filing date | May 28, 2019 |
| Priority date | Nov 29, 2018 |
| Publication date | Oct 25, 2022 |
| Grant date | Oct 25, 2022 |
Official abstract text for this publication.
A database including various datasets and metadata associated with each respective dataset is provided. These datasets were used to train predictive models. The database stores a performance value associated with the model trained with each dataset. When provided with a new dataset, a server can determine various metadata for the new dataset. Using the metadata, the server can search the database and retrieve datasets which have similar metadata values. The server can narrow the search based on the performance value associated with the dataset. Based on the retrieved datasets, the server can recommend at least one sampling technique. The sampling technique can be determined based on the one or more sampling techniques that were used in association with the retrieved datasets.
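The lookup described in the abstract can be sketched roughly as below. This is a minimal illustration, not the patented implementation: the `DatasetRecord` fields, the `recommend_sampling` function, the tolerance of 0.05, and the sample records are all hypothetical. The 0.8 performance threshold is borrowed from the claims, where it applies to area under a curve.

```python
from dataclasses import dataclass


@dataclass
class DatasetRecord:
    """One database row; the field names are hypothetical."""
    name: str
    metadata_value: float      # summary statistic stored for the dataset
    sampling_technique: str    # technique used when the model was trained
    performance_value: float   # efficacy of the trained model, e.g. AUC


def recommend_sampling(new_metadata, records, tolerance=0.05, min_performance=0.8):
    """Return the technique of the best-performing record with similar metadata."""
    candidates = [
        r for r in records
        if abs(r.metadata_value - new_metadata) <= tolerance  # similar metadata
        and r.performance_value > min_performance             # narrow by performance
    ]
    if not candidates:
        return None  # nothing similar enough performed well enough
    best = max(candidates, key=lambda r: r.performance_value)
    return best.sampling_technique


# Illustrative records only; these values are not taken from the patent.
records = [
    DatasetRecord("a", 0.42, "RUS", 0.75),
    DatasetRecord("b", 0.40, "RUS+MSMOTE", 0.91),
    DatasetRecord("c", 0.90, "SMOTE", 0.95),
]
print(recommend_sampling(0.41, records))  # prints RUS+MSMOTE
```

Record "a" is close in metadata but falls below the performance threshold, and record "c" performs well but is not similar, so only record "b" qualifies.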
Opening claim text (preview).
The invention claimed is:
1. A method comprising: receiving, by a transceiver of a server, a first dataset including labeled data points belonging to two classes, a first number of labeled data points belonging to a first class is larger than a second number of labeled data points belonging to a second class; calculating, using a processor of the server, a first metadata value for the first dataset, wherein the first metadata value is a weighted average of a standard deviation, an average and a median of the labeled data points; selecting, using the processor, a selected sampling technique associated with a selected dataset, wherein: the selected dataset is one of a plurality of datasets stored in a database; each of the plurality of datasets includes selected data points and is associated with a metadata value, a sampling technique and a performance value, wherein the metadata value is another weighted average of another standard deviation, another average and another median of the selected data points of the respective dataset; the performance value is a measure of efficacy of a predictive model trained with the respective dataset and is specificity; and the first metadata value matches the metadata value associated with the selected dataset; and sampling, using the processor, the first dataset using the selected sampling technique to generate a new subset, wherein the selected sampling technique is a combination of Random Under-Sampling of the first class of data points by discarding a plurality of the labeled data points of the first class and Modified Synthetic Minority Over-Sampling the second class of data points by multiplying a plurality of the labeled data points of the second class.
2. The method of claim 1, wherein the selected sampling technique is one of the following: the sampling technique associated with the selected dataset; or based on the sampling technique associated with the selected dataset.
3. The method of claim 1, wherein the performance value associated with the selected dataset is higher than a threshold value.
4. The method of claim 3, wherein the performance value is additionally one of accuracy, precision, recall, or area under a curve.
5. The method of claim 4, wherein the performance value is an area under a curve and the threshold value is 0.8.
6. The method of claim 1, further comprising providing the new subset to a classifier as training data.
7. The method of claim 6, wherein the classifier uses the training data to train a predictive model.
8. The device of claim 1, wherein the selected sampling technique further includes one or a combination of the following: Synthetic Minority Over-sampling Technique; or Random Over-Sampling.
9. The method of claim 1, wherein the sampling technique is: Synthetic Minority Over-sampling Technique; Modified synthetic minority oversampling technique; Random Under-Sampling; or Random Over-Sampling.
10. The method of claim 1, wherein the first metadata value matches the metadata value associated with the selected dataset only if: the first metadata value is equal to the metadata value; or the first metadata value is within a tolerance range of the metadata value.
11. A device comprising: a processor, a memory, a reader, a transceiver and a display, wherein: the transceiver is configured to receive a payment request including a payment amount and an account number from a terminal; and the transceiver is configured to transmit a message to the terminal, wherein: the message is created by the processor using a predictive model and the predictive model was trained using training data, the training data was a subset of a first dataset sampled according to a selected sampling technique; the selected sampling technique is associated with a selected dataset, wherein in the selected dataset, a first number of labeled data points belonging to a first class is larger than a second number of labeled data points belonging to a second class; and the selected dataset is one of a plurality of datasets stored in a database, each dataset including selected data points and being associated with a sampling technique, a metadata value and a performance value, wherein: the performance value is a measure of efficacy of a model trained with the sampling technique associated with the respective dataset and is specificity; the metadata value is a weighted average of a standard deviation, an average and a median of the selected data points of the respective dataset; the selected sampling technique is a combination of Random Under-Sampling of the first class of data points by discarding a plurality of the labeled data points of the first class and Modified Synthetic Minority Over-Sampling the second class of data points by multiplying a plurality of the labeled data points of the second class; and a first metadata value of the first dataset matches the metadata value of the selected dataset.
12. The device of claim 11, wherein the selected sampling technique is: the sampling technique associated with the selected dataset; or based on the sampling technique associated with the selected dataset.
13. The device of claim 11, wherein the performance value associated with the selected dataset is higher than a threshold value.
14. The device of claim 11, wherein the selected sampling technique is one or more of the following: Synthetic Minority Over-sampling Technique; or Random Over-Sampling.
15. A system comprising: a server; and a terminal including a processor, a memory, a reader, a transceiver and a display, wherein: the reader is configured to scan a payment card for an account number; the transceiver is configured to transmit a payment request including a payment amount and the account number to the server; and the transceiver is configured to receive a message from the server, wherein: the message is created by the server using a predictive model and the predictive model was trained using training data, the training data was a subset of a first dataset sampled according to a selected sampling technique; the selected sampling technique is associated with a selected dataset, wherein in the selected dataset, a first number of labeled data points belonging to a first class is larger than a second number of labeled data points belonging to a second class; the selected dataset is one of a plurality of datasets stored in a database, each dataset including selected data points and being associated with a sampling technique, a metadata value and a performance value; the performance value is a measure of efficacy of a model trained with the sampling technique associated with the respective dataset and is specificity; a first metadata value of the first dataset matches the metadata value of the selected dataset; wherein the selected sampling technique is a combination of Random Under-Sampling of the first class of data points by discarding a plurality of the labeled data points of the first class and Modified Synthetic Minority Over-Sampling the second class of data points by multiplying a plurality of the labeled data points of the second class; and the first metadata value is a weighted average of a standard deviation, an average and a median of the selected data points of the respective dataset.
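Two steps recited in claim 1, the metadata value and the combined sampling, might look like the following sketch. Several things here are assumptions rather than anything the claims specify: data points are treated as plain numbers, the weights are equal, the population standard deviation is used, and the function names `metadata_value` and `resample` are hypothetical. The over-sampling step uses plain replication as a stand-in for "multiplying" the minority points; it does not implement the Modified Synthetic Minority Over-Sampling technique itself.

```python
import random
import statistics


def metadata_value(points, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Weighted average of standard deviation, mean and median (weights assumed)."""
    w_std, w_mean, w_med = weights
    total = w_std + w_mean + w_med
    return (
        w_std * statistics.pstdev(points)
        + w_mean * statistics.mean(points)
        + w_med * statistics.median(points)
    ) / total


def resample(majority, minority, keep_majority, copies, seed=0):
    """Discard majority points at random and replicate minority points."""
    rng = random.Random(seed)
    # Random Under-Sampling: keep only a random subset of the larger class.
    kept = rng.sample(majority, keep_majority)
    # Stand-in for over-sampling: repeat each minority point `copies` times.
    grown = minority * copies
    return kept, grown


# Illustrative data only: ten majority points and two minority points.
majority = list(range(10))
minority = [100, 101]
kept, grown = resample(majority, minority, keep_majority=4, copies=2)
print(len(kept), len(grown))  # prints 4 4
print(metadata_value(majority + minority))
```

After resampling, the two classes in this toy example are the same size, which is the balancing effect the combined technique aims for.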
Machine learning · CPC title
using ranking · CPC title
using data annotations, e.g. user-defined metadata · CPC title
Inference or reasoning models · CPC title