Automated Remote Music Identification and Publishing System and Method
US-2024427820-A1 · Dec 26, 2024 · US
US2017046367A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2017046367-A1 |
| Application number | US-201514821915-A |
| Country | US |
| Kind code | A1 |
| Filing date | Aug 10, 2015 |
| Priority date | Aug 10, 2015 |
| Publication date | Feb 16, 2017 |
| Grant date | — |
Official abstract text for this publication.
Conventionally, in addition to indexing, a synopsis of a base table of a database is used to skip and compress data. However, scanning of the entire synopsis for all queries is required, which takes a long time when the synopsis gets significantly big in a large data warehouse. A method for efficient data skipping and compression through vertical partitioning of data is provided to eliminate the cost of synopsis storage overhead while enabling the synopsis search functionality.
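The conventional approach the abstract criticizes can be pictured as a per-block summary table that every query has to walk in full. The sketch below is only an illustration under assumptions: the publication does not specify the synopsis layout, so the min/max entries and the names `SynopsisEntry`, `build_synopsis`, and `blocks_to_scan` are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SynopsisEntry:
    """Hypothetical synopsis row: value range covered by one data block."""
    block_id: int
    min_value: int
    max_value: int


def build_synopsis(blocks: List[List[int]]) -> List[SynopsisEntry]:
    # One entry per non-empty block, so the synopsis grows with the table.
    return [SynopsisEntry(i, min(b), max(b)) for i, b in enumerate(blocks) if b]


def blocks_to_scan(synopsis: List[SynopsisEntry], low: int, high: int) -> List[int]:
    # Every entry is inspected for every query; this full pass is the cost
    # the abstract describes for large data warehouses.
    return [e.block_id for e in synopsis
            if e.max_value >= low and e.min_value <= high]


blocks = [[1, 5, 9], [20, 25, 31], [40, 44, 48]]
synopsis = build_synopsis(blocks)
candidates = blocks_to_scan(synopsis, 22, 30)
```

Only the blocks whose range overlaps the query need to be read, but finding them still takes a pass over the whole synopsis.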
Opening claim text (preview).
What is claimed is:
1. A method for scanning and skipping data blocks, the method comprising: partitioning projection of each data value of a set of data values into a plurality of data types, wherein the date types include numerical and/or comparable bytes value; and storing the plurality of data types in a set of separate columns, wherein there is a separate column for each data type; wherein: at least the step of storing the plurality of data types is performed by computer software running on computer hardware.
2. The method of claim 1, further comprising: retrieving a data block, the data block comprising a subset of the plurality of data types; applying a plurality of predicates on the data block, the plurality of predicates corresponding to the plurality of data types; skipping the data block, conditioned upon the failure of any predicate of the plurality of predicates; and returning the data block, conditioned upon the passing of each predicate of the plurality of predicates.
3. The method of claim 1, wherein the step of partitioning projection of each data value includes: transforming each data value into a transformed value through the use of a custom formula that includes a geospatial grid; and dividing the transformed value into a set of most significant digits and least significant digits.
4. The method of claim 1, further comprising: identifying a set of correlated values from the plurality of data types.
5. The method of claim 4, wherein the set of correlated values includes a prefix, a postfix, and/or a set of substrings.
6. The method of claim 4, further comprising: storing only once the set of correlated values.
7. The method of claim 1, further comprising: compressing the plurality of data types in the set of separate columns.
8. The method of claim 1, further comprising: sorting the plurality of data types in the set of separate columns.
9. The method of claim 1, further comprising: generating a set of range summaries of the plurality of data types.
10. A computer program product for scanning and skipping data blocks, the computer program product comprising a computer readable storage medium having stored thereon: first program instructions programmed to partition projection of each data value of a set of data values into a plurality of data types, wherein the date types include numerical and/or comparable bytes value; and second program instructions programmed to store the plurality of data types in a set of separate columns, wherein there is a separate column for each data type; wherein: at least the step of storing the plurality of data types is performed by computer software running on computer hardware.
11. The computer program product of claim 10, further comprising: third program instructions programmed to retrieve a data block, the data block comprising a subset of the plurality of data types; fourth program instructions programmed to apply a plurality of predicates on the data block, the plurality of predicates corresponding to the plurality of data types; fifth program instructions programmed to skip the data block, conditioned upon the failure of any predicate of the plurality of predicates; and sixth program instructions programmed to return the data block, conditioned upon the passing of each predicate of the plurality of predicates.
12. The computer program product of claim 10, further comprising: third program instructions programmed to identify a set of correlated values from the plurality of data types.
13. The computer program product of claim 12, wherein the set of correlated values includes a prefix, a postfix, and/or a set of substrings.
14. The computer program product of claim 12, further comprising: fourth program instructions programmed to store only once the set of correlated values.
15. A computer system for scanning and skipping data blocks, the computer system comprising: a processor(s) set; and a computer readable storage medium; wherein: the processor set is structured, located, connected, and/or programmed to run program instructions stored on the computer readable storage medium; and the program instructions include: first program instructions programmed to partition projection of each data value of a set of data values into a plurality of data types, wherein the date types include numerical and/or comparable bytes value; and second program instructions programmed to store the plurality of data types in a set of separate columns, wherein there is a separate column for each data type; wherein: at least the step of storing the plurality of data types is performed by computer software running on computer hardware.
16. The computer system of claim 15, further comprising: third program instructions programmed to retrieve a data block, the data block comprising a subset of the plurality of data types; fourth program instructions programmed to apply a plurality of predicates on the data block, the plurality of predicates corresponding to the plurality of data types; fifth program instructions programmed to skip the data block, conditioned upon the failure of any predicate of the plurality of predicates; and sixth program instructions programmed to return the data block, conditioned upon the passing of each predicate of the plurality of predicates.
17. The computer system of claim 15, further comprising: third program instructions programmed to compress the plurality of data types in the set of separate columns.
18. The computer system of claim 15, further comprising: third program instructions programmed to sort the plurality of data types in the set of separate columns.
19. The computer system of claim 15, further comprising: third program instructions programmed to generate a set of range summaries of the plurality of data types.
20. The computer system of claim 15, further comprising: third program instructions programmed to identify a set of correlated values from the plurality of data types.
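
One possible reading of claims 1 to 3 is sketched below: each value is divided into most significant and least significant digits, each part is stored in its own column, and a block is skipped as soon as one per-column predicate fails. This is a minimal sketch under stated assumptions, not the implementation disclosed in the publication. The split factor `SPLIT`, the `Block` layout, the restriction to non-negative integers and equality lookups, and the helper names are all hypothetical, and the geospatial transform of claim 3 is left out.

```python
from typing import List, NamedTuple, Tuple

SPLIT = 1000  # hypothetical: the three lowest decimal digits form the "low" part


class Block(NamedTuple):
    """Hypothetical block: one column per partitioned part of the value."""
    high: List[int]  # most significant digits
    low: List[int]   # least significant digits


def partition(value: int) -> Tuple[int, int]:
    # Divide a non-negative integer into (most significant, least significant).
    return divmod(value, SPLIT)


def build_blocks(values: List[int], block_size: int) -> List[Block]:
    blocks = []
    for start in range(0, len(values), block_size):
        parts = [partition(v) for v in values[start:start + block_size]]
        blocks.append(Block([h for h, _ in parts], [l for _, l in parts]))
    return blocks


def find_equal(blocks: List[Block], target: int) -> List[Tuple[int, int]]:
    want_high, want_low = partition(target)
    hits = []
    for block_id, block in enumerate(blocks):
        # One predicate per column; the block is skipped if any of them fails.
        predicates = (want_high in block.high, want_low in block.low)
        if not all(predicates):
            continue
        # The block passed every predicate, so it is returned for a row-level
        # check (passing blocks may still contain no exact match).
        for row, (h, l) in enumerate(zip(block.high, block.low)):
            if h == want_high and l == want_low:
                hits.append((block_id, row))
    return hits


values = [12045, 12999, 13001, 47045, 47500, 48000]
blocks = build_blocks(values, 3)
hits = find_equal(blocks, 47045)
```

In this sketch the first block is skipped for the lookup of 47045 because none of its most significant parts equals 47. A lookup of 13045 passes both column predicates on the first block yet matches no row, which is why the row-level check is kept after a block is returned.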
Ensuring data consistency and integrity · CPC title
Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors · CPC title
Unary operations; Data partitioning operations · CPC title
Physics · mapped topic