Benchmark test method and device for supervised learning algorithm in distributed environment

US2019019111A1 · US · A1

Patent metadata
Publication number: US-2019019111-A1
Application number: US-201816134939-A
Country: US
Kind code: A1
Filing date: Sep 18, 2018
Priority date: Mar 18, 2016
Publication date: Jan 17, 2019
Grant date: (none listed)

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

There is provided a benchmark test method and device for a supervised learning algorithm in a distributed environment. The method includes: acquiring a first benchmark test result determined according to output data in a benchmark test; acquiring a distributed performance indicator in the benchmark test, and determining the distributed performance indicator as a second benchmark test result; and obtaining a combined benchmark test result by combining the first benchmark test result and the second benchmark test result.
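The combining step in the abstract can be illustrated with a minimal sketch. It assumes the first result holds accuracy-style indicators and the second holds resource indicators, as the claims enumerate them. The class names, field units, and the `combine_results` helper are hypothetical and do not come from the patent, which does not specify a data layout.

```python
from dataclasses import dataclass, asdict
from typing import Dict, Union

Number = Union[int, float]


@dataclass(frozen=True)
class OutputResult:
    """First benchmark test result, determined from output data (hypothetical layout)."""
    precision: float
    recall: float
    accuracy: float


@dataclass(frozen=True)
class DistributedResult:
    """Second benchmark test result: distributed performance indicators (hypothetical layout)."""
    cpu: float       # processor usage; percent is an assumed unit
    mem: float       # memory usage; megabytes is an assumed unit
    iterate: int     # iteration count
    duration: float  # usage time; seconds is an assumed unit


def combine_results(first: OutputResult, second: DistributedResult) -> Dict[str, Number]:
    """Merge both results into one record, one possible reading of 'combining'."""
    combined: Dict[str, Number] = {}
    combined.update(asdict(first))
    combined.update(asdict(second))
    return combined


if __name__ == "__main__":
    first = OutputResult(precision=0.9, recall=0.8, accuracy=0.85)
    second = DistributedResult(cpu=55.0, mem=2048.0, iterate=12, duration=30.5)
    print(combine_results(first, second))
```

A flat record keeps both kinds of indicator side by side, so two algorithms can be compared on correctness and on resource cost in one place.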

First claim

Opening claim text (preview).

1. A benchmark test method for a supervised learning algorithm in a distributed environment, comprising: acquiring a first benchmark test result determined according to output data in a benchmark test; acquiring a distributed performance indicator in the benchmark test, and determining the distributed performance indicator as a second benchmark test result; and obtaining a combined benchmark test result by combining the first benchmark test result and the second benchmark test result.

2. The method according to claim 1, wherein before the first benchmark test result is acquired, the method further comprises: determining a to-be-tested supervised learning algorithm; performing a benchmark test on the to-be-tested supervised learning algorithm according to an assessment model to obtain output data; and determining the first benchmark test result according to the output data in the benchmark test.

3. The method according to claim 2, wherein performing the benchmark test on the to-be-tested supervised learning algorithm comprises one of the following: performing the benchmark test on the to-be-tested supervised learning algorithm according to a cross-validation model to obtain output data; performing the benchmark test on the to-be-tested supervised learning algorithm according to a Label proportional distribution model to obtain output data; or, performing the benchmark test on the to-be-tested supervised learning algorithm according to a cross-validation model and a Label proportional distribution model to obtain output data respectively.

4. The method according to claim 3, wherein performing the benchmark test on the to-be-tested supervised learning algorithm according to the cross-validation model to obtain the output data comprises: obtaining a test data sample; equally dividing data in the test data sample into N portions; and executing M rounds of benchmark tests on the N portions of data, wherein each round of benchmark test comprises the following: determining, in the N portions of data, N−1 portions as training data and the remaining one portion as prediction data, wherein in the M rounds of benchmark tests, each portion of data has one chance to be determined as prediction data, and M and N are positive integers; providing the determined N−1 portions of training data to the to-be-tested supervised learning algorithm for learning to obtain a function; and providing input data in the determined one portion of prediction data to the function to obtain the output data.

5. The method according to claim 3, wherein performing the benchmark test on the to-be-tested supervised learning algorithm according to the Label proportional distribution model to obtain the output data comprises: obtaining a test data sample comprising data having a first label and data having a second label; equally dividing the data having the first label and the data having the second label in the test data sample into N portions respectively; and executing M rounds of benchmark tests on the 2N portions of data obtained through the equal division, wherein each round of benchmark test comprises the following: determining, in the N portions of data having the first label, one portion as training data and remaining one or more portions as prediction data, and determining, in the N portions of data having the second label, one portion as training data and remaining one or more portions as prediction data, wherein M and N are positive integers; providing the determined training data having the first label and the second label to the to-be-tested supervised learning algorithm for learning to obtain a function; and providing input data in the determined prediction data having the first label and the second label to the function to obtain the output data.

6. The method according to claim 2, wherein the first benchmark test result comprises at least one of the following indicators: true positive rate (TP), true negative rate (TN), false positive rate (FP), false negative rate (FN), precision (Precision), recall rate (Recall), or accuracy (Accuracy); and the second benchmark test result comprises at least one of the following indicators: processor usage (CPU) of the to-be-tested supervised learning algorithm, memory usage (MEM) of the to-be-tested supervised learning algorithm, an iteration count (Iterate) of the to-be-tested supervised learning algorithm, or usage time (Duration) of the to-be-tested supervised learning algorithm.

7. The method according to claim 2, wherein after obtaining the combined benchmark test result, the method further comprises: determining an F1 score according to the first benchmark test result; and performing a performance assessment on the to-be-tested supervised learning algorithm by: in response to F1 scores being identical or close to each other, determining that a to-be-tested supervised learning algorithm having a smaller Iterate value has better performance; and, in response to F1 indicators being identical, determining that a to-be-tested supervised learning algorithm having a smaller CPU, MEM, Iterate, or Duration value has better performance.

8. A benchmark test system for a supervised learning algorithm in a distributed environment, comprising: one or more memories configured to store executable program code; and one or more processors configured to read the executable program code stored in the one or more memories to cause the benchmark test system to perform: acquiring a first benchmark test result determined according to output data in a benchmark test; acquiring a distributed performance indicator in the benchmark test; determining the distributed performance indicator as a second benchmark test result; and obtaining a combined benchmark test result by combining the first benchmark test result and the second benchmark test result.

9. The system according to claim 8, wherein the one or more processors are configured to read the executable program code to cause the benchmark test system to further perform: determining a to-be-tested supervised learning algorithm before the first benchmark test result determined according to the output data in the benchmark test is acquired; and performing a benchmark test on the to-be-tested supervised learning algorithm according to an assessment model to obtain the output data.

10. The system according to claim 9, wherein the one or more processors are configured to read the executable program code to cause the benchmark test system to further perform one of the following: performing a benchmark test on the to-be-tested supervised learning algorithm according to a cross-validation model to obtain the output data; performing a benchmark test on the to-be-tested supervised learning algorithm according to a Label proportional distribution model to obtain the output data; or performing a benchmark test on the to-be-tested supervised learning algorithm respectively according to a cross-validation model and a Label proportional distribution model to obtain the output data.

11. The system according to claim 10, wherein the one or more processors are configured to read the executable program code to cause the benchmark test system to further perform: obtaining a test data sample; equally dividing data in the test data sample into N portions; in each round of benchmark test, determining, in the N portions of data, N−1 portions as training data and the remaining one portion as prediction data, wherein in the M rounds of benchmark tests, each portion of data has one chance to be determined as prediction data, and M and N are positive integers; in each round of benchmark test,
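The cross-validation rounds of claim 4 and the F1 comparison of claim 7 can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: it takes M equal to N so that each portion is held out exactly once, requires the sample count to divide evenly by N, uses the standard F1 formula 2PR/(P+R), and treats "close" as an absolute difference within a hypothetical `tolerance`. All function names are invented for the example.

```python
from typing import Iterator, List, Optional, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_equally(samples: Sequence[T], n: int) -> List[List[T]]:
    """Divide the samples into n equal portions (assumes len(samples) % n == 0)."""
    if n < 2:
        raise ValueError("n must be at least 2")
    if len(samples) % n != 0:
        raise ValueError("sample count must be divisible by n for an equal split")
    size = len(samples) // n
    return [list(samples[i * size:(i + 1) * size]) for i in range(n)]


def cross_validation_rounds(samples: Sequence[T], n: int) -> Iterator[Tuple[List[T], List[T]]]:
    """Yield (training, prediction) pairs; each portion is the prediction data once."""
    portions = split_equally(samples, n)
    for held_out in range(n):
        # The other n-1 portions are flattened into the training data.
        training = [s for i, portion in enumerate(portions) if i != held_out for s in portion]
        yield training, portions[held_out]


def f1_score(precision: float, recall: float) -> float:
    """Standard F1: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


Candidate = Tuple[str, float, int]  # (name, F1 score, Iterate value); hypothetical layout


def prefer_fewer_iterations(a: Candidate, b: Candidate, tolerance: float = 0.01) -> Optional[str]:
    """If the F1 scores are identical or close, return the name with the smaller Iterate.

    Returns None when the F1 scores differ by more than the assumed tolerance,
    because the claim text does not say how to rank that case. Ties on Iterate
    go to the first candidate, which is also an assumption.
    """
    if abs(a[1] - b[1]) > tolerance:
        return None
    return a[0] if a[2] <= b[2] else b[0]


if __name__ == "__main__":
    for training, prediction in cross_validation_rounds(list(range(6)), 3):
        print("train:", training, "predict:", prediction)
    print(prefer_fewer_iterations(("A", 0.90, 40), ("B", 0.905, 25)))
```

In a real benchmark each `training` list would be passed to the algorithm under test to obtain a function, and the inputs in `prediction` would be fed to that function to produce the output data from which precision and recall are computed.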

Assignees

  • Alibaba Group Holding Ltd

Inventors

Classifications

  • G06N99/005 · Primary

    Physics · mapped topic

  • Benchmarking · CPC title

  • where the computing system is distributed, e.g. networked systems, clusters, multiprocessor systems (multiprogramming arrangements G06F9/46; allocation of resources G06F9/50) · CPC title

  • for test execution, e.g. scheduling of test suites · CPC title

  • Monitoring · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2019019111A1 cover?
There is provided a benchmark test method and device for a supervised learning algorithm in a distributed environment. The method includes: acquiring a first benchmark test result determined according to output data in a benchmark test; acquiring a distributed performance indicator in the benchmark test, and determining the distributed performance indicator as a second benchmark test result; an…
Who is the assignee on this patent?
Alibaba Group Holding Ltd
What technology area does this patent fall under?
Primary CPC classification G06N99/005. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jan 17, 2019 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).