Method and system for query performance prediction
US-2019171742-A1 · Jun 6, 2019 · US
US11222046B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11222046-B2 |
| Application number | US-202016888575-A |
| Country | US |
| Kind code | B2 |
| Filing date | May 29, 2020 |
| Priority date | Mar 15, 2018 |
| Publication date | Jan 11, 2022 |
| Grant date | Jan 11, 2022 |
Official abstract text for this publication.
Implementations of the present specification provide abnormal sample prediction methods and apparatuses. The method includes: obtaining a sample to be tested, wherein the sample to be tested comprises feature data with a given dimension, and wherein the given dimension is a first quantity; performing dimension reduction processing on the sample to be tested by using multiple dimension reduction methods to obtain multiple processed samples; inputting the multiple processed samples to multiple corresponding processing models to obtain scores of the multiple processed samples, wherein an ith processing model Mi in the multiple processing models scores the corresponding processed sample Si based on a hypersphere Qi; determining a comprehensive score of the sample to be tested based on scores of the multiple processed samples; and classifying, based on the comprehensive score, the sample to be tested as an abnormal sample.
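The flow described in the abstract (several dimension reductions, one hypersphere model per reduced space, then a combined score) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the function names, the synthetic data, the equal weights, and the zero threshold are hypothetical. The hypersphere fit is also a deliberately simplified stand-in (center at the mean, radius at a distance quantile), not a true SVDD solver.

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_pca(X, k):
    """Return a function that projects samples onto the top-k principal components of X."""
    mean = X.mean(axis=0)
    # Rows of vt are the principal directions of the centered data.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    components = vt[:k]
    return lambda Z: (np.atleast_2d(Z) - mean) @ components.T

def fit_random_sampling(X, k, rng):
    """Return a function that keeps k randomly chosen feature columns."""
    idx = rng.choice(X.shape[1], size=k, replace=False)
    return lambda Z: np.atleast_2d(Z)[:, idx]

def fit_hypersphere(Xr, coverage=0.95):
    """Simplified stand-in for SVDD (assumption): center at the mean,
    radius chosen so that `coverage` of the historical samples lie inside."""
    center = Xr.mean(axis=0)
    radius = np.quantile(np.linalg.norm(Xr - center, axis=1), coverage)
    return center, radius

def sphere_score(Zr, sphere):
    """Signed distance to the sphere surface: positive outside, negative inside."""
    center, radius = sphere
    return np.linalg.norm(Zr - center, axis=1) - radius

# Synthetic "historical sample set" with 20 features (hypothetical data).
X_hist = rng.normal(size=(500, 20))

# Two different dimension reduction processes, each reducing 20 -> 5 dimensions.
reducers = [fit_pca(X_hist, 5), fit_random_sampling(X_hist, 5, rng)]
spheres = [fit_hypersphere(reduce(X_hist)) for reduce in reducers]
weights = np.array([0.5, 0.5])  # hypothetical equal weights

def comprehensive_score(x):
    """Weighted sum of the per-model scores for one sample."""
    scores = np.array([sphere_score(reduce(x), s)[0]
                       for reduce, s in zip(reducers, spheres)])
    return float(weights @ scores)

def is_abnormal(x, threshold=0.0):
    """Classify a sample as abnormal when its comprehensive score exceeds the threshold."""
    return comprehensive_score(x) > threshold

normal_sample = np.zeros(20)        # near the center of the historical data
outlier_sample = np.full(20, 10.0)  # far from the historical data
```

In this sketch, `is_abnormal(outlier_sample)` flags the distant point while `is_abnormal(normal_sample)` does not. Using two reducers means a feature dropped by the random sampling can still contribute through the PCA projection, which is the intuition behind combining several reduced views.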
Opening claim text (preview).
What is claimed is:
1. A computer-implemented method for sample prediction, comprising: obtaining a historical sample set, wherein a sample dimension of the historical sample set is a first quantity; training a first processing model, wherein training the first processing model comprises: processing the historical sample set as a historical sample set of a first dimension by using a first dimension reduction process, wherein the first dimension is less than the first quantity; determining a first hypersphere in a first space corresponding to the first dimension by using a support vector domain description (SVDD) method in the first space, wherein a first quantity of samples in the historical sample set are surrounded by the first hypersphere, and wherein a radius of the first hypersphere satisfies a first predetermined condition; training a second processing model, wherein training the second processing model comprises: processing the historical sample set as a historical sample set of a second dimension by using a different second dimension reduction process that is different than the first dimension reduction process, wherein the second dimension is less than the first quantity; determining a second hypersphere in a second space corresponding to the second dimension by using a SVDD method in the second space, wherein a second quantity of samples in the historical sample set are surrounded by the second hypersphere, and wherein a radius of the second hypersphere satisfies a second predetermined condition; obtaining a sample to be tested, wherein the sample to be tested comprises feature data with a given dimension, and wherein the given dimension is the first quantity; performing the first dimension reduction process on the sample to be tested; generating a first processed sample of the first dimension that is less than the first quantity based on performing the first dimension reduction process on the sample to be tested; performing the different second dimension reduction process on the sample to be tested, wherein performing the second dimension reduction process and the first dimension reduction process alleviates information loss caused by dimension reduction; generating a second processed sample of the second dimension that is less than the first quantity based on performing the second dimension reduction process on the sample to be tested; inputting the first processed sample of the first dimension to the first processing model, wherein the first processing model generates a first score of the first processed sample based on the first hypersphere determined by using the SVDD method in the first space corresponding to the first dimension; inputting the second processed sample of the second dimension to the second processing model, where the second processing model generates a second score of the second processed sample based on the second hypersphere determined by using the SVDD method in the second space corresponding to the second dimension; determining a comprehensive score of the sample to be tested based on at least the first score of the first processed sample and the second score of the second processed sample; and classifying, based on the comprehensive score, the sample to be tested.
2. The computer-implemented method of claim 1, wherein the first dimension reduction process and the second dimension reduction process comprise at least one of an operation dimension reduction method and a feature sampling dimension reduction method.
3. The computer-implemented method of claim 2, wherein the operation dimension reduction method comprises one or more of the following: a principal component analysis (PCA) method, a least absolute shrinkage and selection operator (LASSO) method, a linear discriminant analysis (LDA) method, and a wavelet analysis method.
4. The computer-implemented method of claim 2, wherein the feature sampling dimension reduction method comprises one or more of the following: a random sampling method, a hash sampling method, a filter feature selection method, and a wrapper feature selection method.
5. The computer-implemented method of claim 1, wherein determining the comprehensive score of the sample to be tested comprises: determining a relative location of the first processed sample relative to the first hypersphere in the first space corresponding to the first dimension; and determining the first score of the first processed sample based on the relative location.
6. The computer-implemented method of claim 5, wherein the relative location comprises one of the following: outside, inside, or above the first hypersphere; a distance between the first processed sample and a center of the first hypersphere in the first space corresponding to the first dimension; and a distance between the first processed sample and a surface of the first hypersphere in the first space corresponding to the first dimension.
7. The computer-implemented method of claim 1, wherein determining the comprehensive score of the sample to be tested comprises: performing weighted summation on at least the first score of the first processed sample and the second score of the second processed sample.
8. A non-transitory, computer-readable medium storing one or more instructions executable by a computer system to perform operations comprising: obtaining a historical sample set, wherein a sample dimension of the historical sample set is a first quantity; training a first processing model, wherein training the first processing model comprises: processing the historical sample set as a historical sample set of a first dimension by using a first dimension reduction process, wherein the first dimension is less than the first quantity; determining a first hypersphere in a first space corresponding to the first dimension by using a support vector domain description (SVDD) method in the first space, wherein a first quantity of samples in the historical sample set are surrounded by the first hypersphere, and wherein a radius of the first hypersphere satisfies a first predetermined condition; training a second processing model, wherein training the second processing model comprises: processing the historical sample set as a historical sample set of a second dimension by using a different second dimension reduction process that is different than the first dimension reduction process, wherein the second dimension is less than the first quantity; determining a second hypersphere in a second space corresponding to the second dimension by using a SVDD method in the second space, wherein a second quantity of samples in the historical sample set are surrounded by the second hypersphere, and wherein a radius of the second hypersphere satisfies a second predetermined condition; obtaining a sample to be tested, wherein the sample to be tested comprises feature data with a given dimension, and wherein the given dimension is the first quantity; performing the first dimension reduction process on the sample to be tested; generating a first processed sample of the first dimension that is less than the first quantity based on performing the first dimension reduction process on the sample to be tested; performing the different second dimension reduction process on the sample to be tested, wherein performing the second dimension reduction process and the first dimension reduction process alleviates information loss caused by dimension reduction; generating a second processed sample of the second dimension that is less than the first quantity based on performing the second dimension reduction process on the sample to be tested; inputting the first processed sample of the first dimension to the first processing model, wherein the first pr
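The claims refer to a hypersphere found by an SVDD method, a score based on a sample's location relative to that hypersphere, and a weighted summation of scores. As a hedged illustration, the block below writes out the soft-margin SVDD problem as it is commonly stated in the literature, followed by one possible distance-to-surface score and the weighted sum. The symbols $a_i$, $R_i$, $\xi_j$, $C$, $s_i$, and $w_i$ are assumptions introduced here; the claims only require that the radius satisfy a predetermined condition and do not fix this particular objective or score.

```latex
% Common soft-margin SVDD formulation in the i-th reduced space:
% find the center a_i and radius R_i of a small hypersphere Q_i that
% encloses most reduced historical samples x_j, with slack \xi_j.
\begin{aligned}
\min_{R_i,\, a_i,\, \xi}\quad & R_i^2 + C \sum_{j} \xi_j \\
\text{subject to}\quad & \lVert x_j - a_i \rVert^2 \le R_i^2 + \xi_j, \qquad \xi_j \ge 0 .
\end{aligned}

% One possible score for the processed sample S_i: signed distance to the
% surface of Q_i (positive outside, negative inside).
s_i = \lVert S_i - a_i \rVert - R_i

% Comprehensive score as a weighted summation over the processing models.
s = \sum_{i} w_i \, s_i
```

A larger $C$ penalizes samples left outside the sphere more heavily, so the fitted radius grows to enclose more of the historical set; a smaller $C$ yields a tighter sphere that treats more historical points as outliers.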
Probabilistic graphical models, e.g. probabilistic networks · CPC title
Clustering or classification · CPC title
Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem" (market predictions or forecasting for commercial activities G06Q30/0202) · CPC title
Ensemble learning · CPC title
for evaluating statistical data {, e.g. average values, frequency distributions, probability functions, regression analysis (forecasting specially adapted for a specific administrative, business or logistic context G06Q10/04)} · CPC title