Data cube generation

US9965503B2 · US · B2

Patent metadata
Publication number: US-9965503-B2
Application number: US-201514825132-A
Country: US
Kind code: B2
Filing date: Aug 12, 2015
Priority date: Aug 12, 2015
Publication date: May 8, 2018
Grant date: May 8, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Disclosed are a computer-implemented method for generating a data cube from data, a system and a computer program product. The method comprises selecting a candidate granularity from a plurality of candidate granularities determined for a dimension of the data cube, where a data distribution obtained in the selected candidate granularity satisfies a predetermined condition; and generating the data cube based on the selected candidate granularity for the dimension.
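The select-then-generate flow in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the candidate granularities (`hour`, `day`), the `spread_index` measure, the threshold, and all function names are assumptions introduced here, and the "cube" is reduced to a single aggregated time dimension.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean, pstdev

# Hypothetical candidate granularities for a time dimension: name -> bucket function.
CANDIDATES = {
    "day": lambda t: t.replace(hour=0, minute=0, second=0, microsecond=0),
    "hour": lambda t: t.replace(minute=0, second=0, microsecond=0),
}


def aggregate(records, bucket):
    """Sum values per bucket, giving the data distribution at one granularity."""
    totals = defaultdict(float)
    for timestamp, value in records:
        totals[bucket(timestamp)] += value
    return dict(totals)


def spread_index(distribution):
    """Illustrative index (an assumption): coefficient of variation of bucket totals."""
    values = list(distribution.values())
    avg = mean(values)
    return pstdev(values) / avg if avg else 0.0


def select_granularity(records, candidates, index, threshold):
    """Return the first candidate whose index exceeds the threshold, else None."""
    for name, bucket in candidates.items():
        if index(aggregate(records, bucket)) > threshold:
            return name
    return None


def generate_cube(records, candidates, index, threshold):
    """Aggregate the data along the dimension at the selected granularity."""
    name = select_granularity(records, candidates, index, threshold)
    if name is None:
        return None, {}
    return name, aggregate(records, candidates[name])
```

With data whose daily totals are flat but whose hourly totals vary, the `day` candidate fails the illustrative condition and `hour` is selected; a real system would apply whatever predetermined condition the cube designer chooses.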

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method for generating a data cube from data, comprising: selecting a candidate granularity from a plurality of candidate granularities determined for a dimension of the data cube, where a data distribution obtained in the selected candidate granularity satisfies a predetermined condition, wherein the selecting a candidate granularity from the plurality of candidate granularities for the dimension comprises: calculating an index of a data distribution obtained in each candidate granularity; determining whether the index satisfies the predetermined condition; and selecting the candidate granularity in response to determining that the index satisfies the predetermined condition; and generating the data cube based on the selected candidate granularity for the dimension.

2. The method of claim 1, wherein the data distribution obtained in each candidate granularity is obtained by aggregating the data based on the each candidate granularity, and the index is periodicities of the data distribution for different time periods.

3. The method of claim 2, wherein the determining whether the index satisfies the predetermined condition comprises: determining that the index satisfies the predetermined condition in response to the periodicities of the data distribution for the different time periods having a similarity degree greater than a first threshold.

4. The method of claim 1, wherein the data distribution obtained in each candidate granularity is obtained by aggregating the data based on the each candidate granularity, and the index is a distinction degree between different segments of the data distribution.

5. The method of claim 4, wherein the determining whether the index satisfies the predetermined condition comprises: determining that the index satisfies the predetermined condition in response to the distinction degree between the different segments of the data distribution being greater than a second threshold.

6. The method of claim 1, wherein the data distribution obtained in each candidate granularity includes data distributions associated with a same time period, which correspond to different division units of the each candidate granularity, and the index is a correlation degree between the data distributions associated with the same time period.

7. The method of claim 6, wherein the determining whether the index satisfies the predetermined condition comprises: determining that the index satisfies the predetermined condition in response to the correlation degree satisfying a predetermined relation.

8. A computer system for generating a data cube from data, comprising: one or more processors; a memory coupled to at least one of the processors; a set of computer program instructions stored in the memory, which, when executed by at least one of the processors, perform actions of: selecting a candidate granularity from a plurality of candidate granularities determined for a dimension of the data cube, where a data distribution obtained in the selected candidate granularity satisfies a predetermined condition, wherein the selecting a candidate granularity from the plurality of candidate granularities for the dimension comprises: calculating an index of a data distribution obtained in each candidate granularity; determining whether the index satisfies the predetermined condition; and selecting the candidate granularity in response to determining that the index satisfies the predetermined condition; and generating the data cube based on the selected candidate granularity for the dimension.

9. The computer system of claim 8, wherein the data distribution obtained in each candidate granularity is obtained by aggregating the data based on the each candidate granularity, and the index is periodicities of the data distribution for different time periods.

10. The computer system of claim 9, wherein the determining whether the index satisfies the predetermined condition comprises: determining that the index satisfies the predetermined condition in response to the periodicities of the data distribution for the different time periods having a similarity degree greater than a first threshold.

11. The computer system of claim 8, wherein the data distribution obtained in each candidate granularity is obtained by aggregating the data based on the each candidate granularity, and the index is a distinction degree between different segments of the data distribution.

12. The computer system of claim 11, wherein the determining whether the index satisfies the predetermined condition comprises: determining that the index satisfies the predetermined condition in response to the distinction degree between the different segments of the data distribution being greater than a second threshold.

13. The computer system of claim 8, wherein the data distribution obtained in each candidate granularity includes data distributions associated with a same time period, which correspond to different division units of the each candidate granularity, and the index is a correlation degree between the data distributions associated with the same time period.

14. The computer system of claim 13, wherein the determining whether the index satisfies the predetermined condition comprises: determining that the index satisfies the predetermined condition in response to the correlation degree satisfying a predetermined relation.

15. A computer program product for generating a data cube from data, comprising: a computer readable storage medium having thereon: first program instructions executable by a processor to cause the processor to select a candidate granularity from a plurality of candidate granularities determined for a dimension of the data cube, where a data distribution obtained in the selected candidate granularity satisfies a predetermined condition, wherein the selecting a candidate granularity from the plurality of candidate granularities for the dimension comprises: calculating an index of a data distribution obtained in each candidate granularity; determining whether the index satisfies the predetermined condition; and selecting the candidate granularity in response to determining that the index satisfies the predetermined condition; and second program instructions executable by the processor to cause the processor to generate the data cube based on the selected candidate granularity for the dimension.

16. The computer program product of claim 15, wherein the data distribution obtained in each candidate granularity is obtained by aggregating the data based on the each candidate granularity, and the index is periodicities of the data distribution for different time periods or a distinction degree between different segments of the data distribution.

17. The computer program product of claim 15, the data distribution obtained in each candidate granularity includes data distributions associated with a same time period, which correspond to different division units of the each candidate granularity, and the index is a correlation degree between the data distributions associated with the same time period.
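The dependent claims name three kinds of index: periodicity similarity across time periods, a distinction degree between segments, and a correlation degree between division units over the same time period. The claim text shown here does not fix how any of them is computed, so the sketch below is only one plausible reading. The use of Pearson correlation, the "lowest correlation between consecutive periods" rule, the "spread of segment means" rule, and all function names are assumptions made for illustration.

```python
from math import sqrt
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y) if var_x and var_y else 0.0


def periodicity_similarity(series, period):
    """Claims 2-3 (assumed measure): lowest correlation between consecutive periods."""
    chunks = [series[i:i + period] for i in range(0, len(series) - period + 1, period)]
    if len(chunks) < 2:
        return 0.0
    return min(pearson(a, b) for a, b in zip(chunks, chunks[1:]))


def distinction_degree(series, segments):
    """Claims 4-5 (assumed measure): spread of segment means relative to the overall mean."""
    size = len(series) // segments
    if size == 0:
        raise ValueError("more segments than data points")
    means = [mean(series[i * size:(i + 1) * size]) for i in range(segments)]
    overall = mean(series)
    return (max(means) - min(means)) / overall if overall else 0.0


def correlation_degree(unit_a, unit_b):
    """Claims 6-7 (assumed measure): correlation of two division units over the same period."""
    return pearson(unit_a, unit_b)
```

Under this reading, a candidate granularity would pass the condition of claim 3 when `periodicity_similarity` exceeds the first threshold, and the condition of claim 5 when `distinction_degree` exceeds the second threshold. The "predetermined relation" of claim 7 is left open by the claim text, so any comparison applied to `correlation_degree` is a design choice.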

Assignees

Inventors

Classifications

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9965503B2 cover?
Disclosed are a computer-implemented method for generating a data cube from data, a system and a computer program product. The method comprises selecting a candidate granularity from a plurality of candidate granularities determined for a dimension of the data cube, where a data distribution obtained in the selected candidate granularity satisfies a predetermined condition; and generating the d…
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06F16/2264. Mapped technology areas include Physics.
When was this patent published?
Publication date May 8, 2018 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).