Avoidance of intermediate data skew in a massive parallel processing environment
US-2015186466-A1 · Jul 2, 2015 · US
US10956434B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10956434-B2 |
| Application number | US-201615388065-A |
| Country | US |
| Kind code | B2 |
| Filing date | Dec 22, 2016 |
| Priority date | Dec 22, 2016 |
| Publication date | Mar 23, 2021 |
| Grant date | Mar 23, 2021 |
Official abstract text for this publication.
Example embodiments of a system and method for analyzing and causing presentation of an impact or influence of a value of a dimension on a data set are described. In an example embodiment, a data set organized according to a first dimension is accessed from a data store. An influence value indicating an influence on the data set of at least one value of the first dimension is calculated. At least one of an influence rating and an influence ranking of the at least one value of the first dimension is determined based on the calculated influence value. The influence rating or ranking of the at least one value relative to other values of the first dimension is caused to be presented in conjunction with at least a portion of the data set organized according to the first dimension.
Opening claim text (preview).
What is claimed is:
1. A system comprising: one or more hardware processors; and a memory storing instructions that, when executed by at least one of the one or more hardware processors, cause the system to perform operations comprising: retrieving, from a data store, a data set organized according to a first dimension; calculating an influence value indicating an influence, on the data set, of a dimension value of the first dimension, the calculating of the influence value based on one or more factors, the one or more factors determined by employing at least one classification tree model, the one or more factors including an effect of the dimension value on a measure of the data set; determining at least one of an influence rating and an influence ranking of the dimension value of the first dimension based on the calculated influence value, the at least one of the influence rating and the influence ranking providing a ranking or a rating of the dimension value relative to one or more other values of the data set; causing, via a display device, presentation of the at least one of the influence rating and the influence ranking of the dimension value of the first dimension relative to the one or more other values of the retrieved data set in conjunction with a presentation of at least a portion of the retrieved data set; and allowing modification of the dimension value to visualize, an effect of the modification on the one or more other values of the retrieved data set based on the influence dimension value.
2. The system of claim 1, the calculating of the influence value comprising: multiplying, for each of the one or more factors, the factor by a corresponding coefficient to yield a corresponding influence term; and summing the corresponding influence terms.
3. The system of claim 2, the calculating of the influence value further comprising: dividing the summed corresponding influence terms to produce the influence value.
4. The system of claim 1, the operations further comprising: generating, for the dimension value of the first dimension, the at least one classification tree model; and determining, based on a weight associated with each leaf node of the at least one classification tree model, each of the one or more factors.
5. The system of claim 4, wherein a target variable of the at least one classification tree model comprises a classification of a group of measures of the data set, the group of measures including the measure of the data set.
6. The system of claim 4, wherein a root node of the at least one classification tree model comprises the first dimension.
7. The system of claim 4, wherein the data set is organized according to a plurality of dimensions comprising the first dimension and an input variable of the at least one classification tree comprises a second dimension of the plurality of dimensions.
8. The system of claim 1, the operations further comprising: employing a regression tree model to determine an additional factor of the influence value.
9. The system of claim 1, the calculating of the influence value comprising: determining a number of times at least one conditional visual format is applied to the dimension value of the first dimension.
10. The system of claim 1, the calculating of the influence value comprising: determining a number of presentation filters being based on the dimension value of the first dimension, the influence value being based on the number of presentation filters.
11. The system of claim 1, the calculating of the influence value comprising: determining a number of times the dimension value of the first dimension is employed in one or more data set visualizations to be presented to a user, the influence value being based on the number of times the dimension value of the first dimension is employed in the one or more data set visualizations.
12. The system of claim 1, the calculating of the influence value comprising: determining a number of times the dimension value of the first dimension is employed to merge a plurality of data tables of the data set, the influence value being based on the number of times the dimension value of the first dimension is employed to merge the plurality of data tables of the data set.
13. The system of claim 1, the calculating of the influence value comprising: determining a number of times the dimension value of the first dimension is employed in one or more geographic dimension hierarchies of the data set, the influence value being based on the number of times the dimension value of the first dimension is employed in the one or more geographic dimension hierarchies of the data set.
14. The system of claim 1, the calculating of the influence value comprising: determining a number of times the dimension value of the first dimension is employed in one or more user-defined dimension hierarchies of the data set, the influence value being based on the number of times the dimension value of the first dimension is employed in the one or more user-defined dimension hierarchies of the data set.
15. The system of claim 1, the operations further comprising: receiving a user selection of one of the dimension value of the first dimension; and causing, via the display device, presentation of an indication of at least one factor of a plurality of numerical factors upon which the influence rating of the selected one of the dimension value of the first dimension is based.
16. The system of claim 1, the first dimension comprising a geographic dimension.
17. The system of claim 1, the system further comprising: a communication network interface configured to communicate via a communication network with a client device, the client device comprising the display device.
18. The system of claim 1, the calculating of the influence value comprising: determining a number of times the dimension value of the first dimension is employed in one or more user interface pages to be presented to a user, the influence value being based on the number of times the dimension value of the first dimension is employed in the one or more user interface pages.
19. A method comprising: retrieving from a data store, a data set organized according to a first dimension; calculating, using at least one hardware processor of a machine, an influence value indicating an influence of a dimension value of the first dimension, the calculating of the influence value based on one or more factors, the one or more factors determined by employing at least one classification tree model, the one or more factors including an effect of the dimension value on a measure of the data set; determining at least one of an influence rating and an influence ranking of the dimension value of the first dimension based on the calculated influence value, the at least one of the influence rating and the influence ranking providing a ranking or a rating of the dimension value relative to one or more other values of the data set; causing, via a display device, presentation of the at least one of the influence rating and the influence ranking of the dimension value of the first dimension relative to the one or more other values of the retrieved data set in conjunction with a presentation of at least a portion of the retrieved data set; and allowing modification of the dimension value to visualize an effect of the modification on the one or more other values of the retrieved data set based on the influence dimension value.
20. A non-transitory computer-readable storage medium storing ins
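The arithmetic recited in claims 2 and 3 (multiply each factor by a coefficient, sum the resulting terms, then divide) and the ranking step of claim 1 can be illustrated with a minimal Python sketch. The claims do not state what the summed terms are divided by, so the sketch assumes the divisor is the sum of the coefficients, which turns the result into a weighted average. The function names, factor names, and sample numbers are hypothetical and are not taken from the patent.

```python
from typing import Dict, List, Tuple


def influence_value(factors: Dict[str, float],
                    coefficients: Dict[str, float]) -> float:
    """Combine factors into one influence value.

    Each factor is multiplied by its coefficient (claim 2), the terms are
    summed (claim 2), and the sum is divided (claim 3). The divisor used
    here, the sum of the coefficients, is an assumption of this sketch.
    """
    terms = [factors[name] * coefficients[name] for name in factors]
    divisor = sum(coefficients[name] for name in factors)
    if divisor == 0:
        raise ValueError("coefficients must not sum to zero")
    return sum(terms) / divisor


def rank_dimension_values(scores: Dict[str, float]) -> List[Tuple[int, str, float]]:
    """Order dimension values by influence value, highest first (rank 1)."""
    ordered = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [(rank, name, value)
            for rank, (name, value) in enumerate(ordered, start=1)]


if __name__ == "__main__":
    # Hypothetical coefficients and factors for two values of a
    # geographic dimension.
    coefficients = {"measure_effect": 0.5, "filter_count": 0.25,
                    "visualization_count": 0.25}
    factors_by_value = {
        "France": {"measure_effect": 0.8, "filter_count": 2,
                   "visualization_count": 5},
        "Spain": {"measure_effect": 0.4, "filter_count": 0,
                  "visualization_count": 2},
    }
    scores = {value: influence_value(factors, coefficients)
              for value, factors in factors_by_value.items()}
    for rank, value, score in rank_dimension_values(scores):
        print(rank, value, round(score, 3))
```

In this sketch the usage-count factors (filters, visualizations) stand in for the kinds of counts described in claims 9 through 14 and 18; a real system would also need to derive the measure-effect factor from a classification tree model, which is outside the scope of this example.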
Multi-dimensional databases or data warehouses, e.g. MOLAP or ROLAP · CPC title
Presentation of query results · CPC title
Filtering based on additional data, e.g. user or group profiles (filtering in web context G06F16/9535, G06F16/9536) · CPC title