Mapping search nodes to a search head using a tenant identifier
US-11275733-B1 · Mar 15, 2022 · US
US2022300518A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2022300518-A1 |
| Application number | US-202117206335-A |
| Country | US |
| Kind code | A1 |
| Filing date | Mar 19, 2021 |
| Priority date | Mar 19, 2021 |
| Publication date | Sep 22, 2022 |
| Grant date | — |
Official abstract text for this publication.
A method and a computer program product are used for generating an index of a scoring payload dataset. Correlation coefficients are calculated for correlations between input data values and output data values of the machine learning model provided by the scoring payload datasets, as well as performance data values of the processes provided by process datasets. The features whose feature values are used as input data values are ranked according to their importance using the correlation coefficients. For the features of a set of highest-ranking features, feature value sets with feature values of the respective features are selected from the scoring payload datasets, and a database index of the selected feature value sets is generated.
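The flow summarized in the abstract (correlate, rank, select, index) can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation disclosed in the publication: the column names (`amount`, `age`, `noise`, `prediction`, `duration`), the sample rows, the helper names, the use of the Pearson coefficient with absolute values, and the choice of an in-memory SQLite index are all hypothetical.

```python
import math
import sqlite3


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0


def rank_features(rows, features, targets):
    """Rank features by their largest absolute correlation with any target column."""
    scores = {}
    for f in features:
        xs = [r[f] for r in rows]
        scores[f] = max(abs(pearson(xs, [r[t] for r in rows])) for t in targets)
    ranked = sorted(features, key=lambda f: scores[f], reverse=True)
    return ranked, scores


def build_index(rows, top_features):
    """Store the selected feature value sets and create a database index over them."""
    conn = sqlite3.connect(":memory:")
    cols = ", ".join(f'"{f}" REAL' for f in top_features)
    conn.execute(f"CREATE TABLE payload (process_id INTEGER, {cols})")
    placeholders = ", ".join("?" for _ in range(len(top_features) + 1))
    conn.executemany(
        f"INSERT INTO payload VALUES ({placeholders})",
        [(r["process_id"], *(r[f] for f in top_features)) for r in rows],
    )
    idx_cols = ", ".join(f'"{f}"' for f in top_features)
    conn.execute(f"CREATE INDEX payload_top_features ON payload ({idx_cols})")
    return conn


# Hypothetical combined datasets: model inputs, model output, and a performance measure.
combined = [
    {"process_id": 1, "amount": 1, "age": 5, "noise": 2, "prediction": 2, "duration": 10},
    {"process_id": 2, "amount": 2, "age": 3, "noise": 2, "prediction": 4, "duration": 9},
    {"process_id": 3, "amount": 3, "age": 6, "noise": 2, "prediction": 6, "duration": 8},
    {"process_id": 4, "amount": 4, "age": 2, "noise": 2, "prediction": 8, "duration": 7},
    {"process_id": 5, "amount": 5, "age": 7, "noise": 2, "prediction": 10, "duration": 6},
    {"process_id": 6, "amount": 6, "age": 1, "noise": 2, "prediction": 12, "duration": 5},
]

ranked, scores = rank_features(
    combined, ["amount", "age", "noise"], ["prediction", "duration"]
)
top = ranked[:2]  # keep the two highest-ranking features
conn = build_index(combined, top)
matches = conn.execute(
    'SELECT process_id FROM payload WHERE "amount" = ?', (3,)
).fetchall()
```

Only the feature values of the highest-ranking features are written to the indexed table, which mirrors the idea of indexing a reduced, more relevant slice of the scoring payload rather than every input column.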
Opening claim text (preview).
What is claimed is:
1. A method for generating an index of a scoring payload dataset, the method comprising: providing a set of scoring payload datasets; providing a set of process datasets, the process datasets being assigned to processes of a plurality of processes, the process datasets comprising performance data values providing performance measures for the processes to which the process datasets are assigned; combining the provided scoring payload datasets and the provided process datasets assigned to a same process to provide a set of combined datasets; calculating correlation coefficients for correlations between features of a first set of features and output data values; ranking the features of the first set of features using the correlation coefficients, wherein the higher the features are ranked, the larger the correlation coefficients calculated for the features are; selecting a set of highest-ranking features; selecting, for the features of the set of highest-ranking features, feature value sets from the scoring payload datasets, the selected feature value sets comprising the feature values of the scoring payload datasets assigned to the features of the set of highest-ranking features; and generating a database index of the selected feature value sets.
2. The method of claim 1, further comprising assigning the scoring payload datasets to processes of the plurality of processes, wherein: the method is executed at runtime; the machine learning model is trained to predict process results for the processes of the plurality of processes; and the scoring payload datasets comprise: first sets of first feature values provided to the machine learning model as input data values for predicting process results of the processes to which the scoring payload datasets are assigned, the first feature values being assigned to features of the first set of features; and output data values received from the machine learning model as output in response to providing the first sets of first feature values of the scoring payload datasets as input, the output data values of the scoring payload datasets describing the process results predicted for the processes to which the scoring payload datasets are assigned.
3. The method of claim 1: further comprising pre-processing a dataset chosen from the group consisting of the scoring payload datasets, the process datasets, and the combined datasets; and wherein the pre-processing comprises converting non-numerical data values comprised by the combined datasets to numerical data values.
4. The method of claim 1, further comprising splitting the set of combined datasets into batches according to a classification of the combined datasets, the batches comprising subsets of the combined datasets with combined datasets assigned to a same class, wherein the calculating of the correlation coefficients, the ranking of the features of the first set of features, the selecting of the set of highest-ranking features, the selecting of the feature value sets, and the generating of the database index are performed batchwise.
5. The method of claim 4, wherein the batchwise performance is executed for a plurality of the batches in parallel.
6. The method of claim 4, wherein the batchwise performance is executed subsequently for one batch after another.
7. The method of claim 4, wherein the first feature values are used for the classification of the combined datasets.
8. The method of claim 1, wherein the process datasets further comprise second sets of second feature values, the second feature values being assigned to features of a second set of features characterizing the processes to which the process datasets are assigned.
9. The method of claim 8, wherein the second feature values are used for the classification of the combined datasets.
10. The method of claim 1: wherein the correlation coefficients are further calculated for correlations between the features of the first set of features and the performance data values using the combined datasets; wherein the correlation coefficients are calculated as part of a correlation matrix, the correlation matrix being calculated using the second feature values in addition to the first feature values, the output data values, and the performance data values; and further comprising extracting the correlation coefficients for the correlations between the features of the first set of features and the output data values as well as the correlation coefficients for the correlations between the features of the first set of features and the performance data values from the correlation matrix for the ranking of the features of the first set of features.
11. The method of claim 1, further comprising displaying the correlation coefficients of the selected set of highest-ranking features.
12. The method of claim 1, further comprising storing the correlation coefficients of the selected set of highest-ranking features.
13. The method of claim 1, wherein the database index further indexes the correlation coefficients of the selected set of highest-ranking features.
14. The method of claim 1, further comprising executing a data analysis with the selected feature value sets, the data analysis comprising executing one or more searches using the database index.
15. A computer program product for selecting feature value sets from a set of scoring payload datasets of a machine learning model for indexing, the computer program product comprising a computer readable storage medium having program instructions embodied therewith, the scoring payload datasets being assigned to processes of a plurality of processes, the machine learning model being trained to predict process results for the processes of the plurality of processes, the scoring payload datasets comprising first sets of first feature values provided to the machine learning model as input data values for predicting process results of the processes to which the scoring payload datasets are assigned, the first feature values being assigned to features of a first set of features, the scoring payload datasets further comprising output data values received from the machine learning model as output in response to providing the first sets of first feature values of the scoring payload datasets as input, the output data values of the scoring payload datasets describing the process results predicted for the processes to which the scoring payload datasets are assigned, the program instructions being executable by a processor of a computer system to cause the computer system to: provide the set of scoring payload datasets; provide a set of process datasets, the process datasets being assigned to the processes of the plurality of processes, the process datasets comprising performance data values providing performance measures for the processes to which the process datasets are assigned; combine provided scoring payload datasets and provided process datasets assigned to the same process to provide a set of combined datasets; calculate correlation coefficients for correlations between the features of the first set of features and the output data values as well as correlations between the features of the first set of features and the performance data values using the combined datasets; rank the features of the first set of features according to their importance using the correlation coefficients, wherein the features are ranked the higher, the larger the correlation coefficients calculated for the features are; select a set of highest-ranking fea
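The batchwise variant described in claims 4 to 6 (split the combined datasets by class, then rank features per batch, either in parallel or one batch after another) might look like the sketch below. Everything named here is an assumption made for illustration: the `kind` classification column, the sample rows, the helper names, the Pearson coefficient as the correlation measure, and the use of a thread pool for the parallel case.

```python
import math
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0


def top_features(batch, features, targets, k):
    """Return the k features with the largest absolute correlation to any target."""
    def score(f):
        xs = [row[f] for row in batch]
        return max(abs(pearson(xs, [row[t] for row in batch])) for t in targets)
    return sorted(features, key=score, reverse=True)[:k]


def split_into_batches(rows, class_key):
    """Group combined datasets so that each batch holds rows of a single class."""
    batches = defaultdict(list)
    for row in rows:
        batches[row[class_key]].append(row)
    return dict(batches)


def rank_batchwise(rows, class_key, features, targets, k, parallel=True):
    """Rank features per batch, in parallel or one batch after another."""
    batches = split_into_batches(rows, class_key)
    if parallel:
        with ThreadPoolExecutor() as pool:
            futures = {
                label: pool.submit(top_features, batch, features, targets, k)
                for label, batch in batches.items()
            }
            return {label: fut.result() for label, fut in futures.items()}
    return {
        label: top_features(batch, features, targets, k)
        for label, batch in batches.items()
    }


# Hypothetical combined datasets with a class label per process.
combined = [
    {"kind": "loan", "amount": 1, "age": 3, "prediction": 1, "duration": 2},
    {"kind": "loan", "amount": 2, "age": 1, "prediction": 2, "duration": 4},
    {"kind": "loan", "amount": 3, "age": 4, "prediction": 3, "duration": 6},
    {"kind": "loan", "amount": 4, "age": 2, "prediction": 4, "duration": 8},
    {"kind": "claim", "amount": 3, "age": 1, "prediction": 10, "duration": 5},
    {"kind": "claim", "amount": 1, "age": 2, "prediction": 20, "duration": 4},
    {"kind": "claim", "amount": 4, "age": 3, "prediction": 30, "duration": 3},
    {"kind": "claim", "amount": 2, "age": 4, "prediction": 40, "duration": 2},
]

args = (combined, "kind", ["amount", "age"], ["prediction", "duration"], 1)
parallel_result = rank_batchwise(*args, parallel=True)
sequential_result = rank_batchwise(*args, parallel=False)
```

In this toy data the highest-ranking feature differs per class, which illustrates why a per-batch ranking can select different feature value sets for indexing than a single ranking over all combined datasets. Both execution modes produce the same ranking; only the scheduling differs.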
Indexing structures · CPC title
Machine learning · CPC title
Column-oriented storage; Management thereof · CPC title
Tablespace storage structures; Management thereof · CPC title
using ranking · CPC title