Mutations in a column store

US2016328429A1 · US · A1

Patent metadata
FieldValue
Publication numberUS-2016328429-A1
Application numberUS-201615149128-A
CountryUS
Kind codeA1
Filing dateMay 7, 2016
Priority dateMar 17, 2015
Publication dateNov 10, 2016
Grant date

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Columnar storage provides many performance and space saving benefits for analytic workloads, but previous mechanisms for handling single row update transactions in column stores suffer from poor performance. A columnar data layout facilitates both low-latency random access capabilities together with high-throughput analytical access capabilities, simplifying Hadoop architectures for use cases involving real-time data. In disclosed embodiments, mutations within a single row are executed atomically across columns and do not necessarily include the entirety of a row. This allows for faster updates without the overhead of reading or rewriting larger columns.

First claim

Opening claim text (preview).

I/We claim: 1 . A system facilitating low-latency random access capabilities together with high-throughput analytical access capabilities in connection with a request for processing the stored data, the system comprising: a database table distributing data partitioned into a plurality of horizontal tablets, each horizontal tablet in the plurality of horizontal tablets storing the data in a plurality of rows; the database table including a plurality of columns arranged according to a pre-defined schema; a column in the plurality of columns including a primary key column that stores a key uniquely identifying each row in the plurality of rows by mapping each row to exclusively a single tablet in the plurality of tablets, wherein each tablet in the plurality of tablets comprises: a plurality of DiskRowSets for storing the data, each DiskRowSet in the plurality of DiskRowSets including: a base data module existing in disk and storing a subset of rows in the plurality of rows according to a column-organized representation based upon writing each column in the plurality of columns as a single contiguous block, a Bloom filter of the set of keys included in the primary key column for detecting membership of the set of keys in the each DiskRowSet, a delta store module existing in memory and maintaining a mapping for mutating the subset of rows included in the each DiskRowSet, and a single MemRowSet existing in memory and implemented as a concurrent Binary tree (B-tree), the single MemRowSet receiving new data to be inserted into the database table, buffering the new data as a recently-inserted row, and flushing the recently-inserted row to a DiskRowSet in the plurality of DiskRowSets. 2 . The system of claim 1 , wherein the plurality of tablets are hosted on one or more tablet servers, the one or more tablet servers lacking Hadoop Distributed File System (HDFS) data storage capabilities. 3 . The system of claim 1 , wherein the key is the sole index for manipulating the each row in the plurality of rows. 4 . The system of claim 1 , wherein the each DiskRowSet is disjointed from another DiskRowSet in the plurality of DiskRowSets. 5 . The system of claim 1 , wherein the primary key is included in at most one DiskRowSet in the tablet. 6 . The system of claim 1 , wherein the single MemRowSet is a first MemRowSet, database table is configured for: concurrent to the flushing of the first MemRowSet, providing access to the first MemRowSet based on a mapping in the B-tree of the first MemRowSet; and generating a second MemRowSet in the memory by replacing the first MemRowSet. 7 . The system of claim 6 , wherein the database table is configured for: determining, based on a query to the Bloom filter in the each DiskRowSet that no key in the set of keys overlaps with a key associated with the newly inserted row. 8 . The system of claim 1 , wherein the flushing the recently-inserted row to a DiskRowSet in the plurality of DiskRowSets is according to a predetermined schedule defined by a compaction policy. 9 . The system of claim 1 , wherein the pre-defined schema supports one or more of the following data types: STRING, TIMESTAMP (INT 64), FLOAT, BINARY, DOUBLE, INT8, INT16, INT32, and INT 64. 10 . The system of claim 1 , wherein the mapping for mutating a row in the subset of rows is based on an ordinal index of the row within the DiskRowSet, a MVCC timestamp indicating a time when an operation corresponding to the updating the row was received, and a binary-encoded list of changes to the row. 11 . The system of claim 1 , wherein the single MemRowSet buffers the data corresponding to the recently-inserted row in a row-wise layout. 12 . A method for facilitating low-latency random access capabilities together with high-throughput analytical access capabilities in connection with a request for processing the stored data, the method comprising: distributing, into a database table, data partitioned into a plurality of horizontal tablets, each horizontal tablet in the plurality of horizontal tablets storing the data in a plurality of rows; the database table including a plurality of columns arranged according to a pre-defined schema; a column in the plurality of columns including a primary key column that stores a key uniquely identifying each row in the plurality of rows by mapping each row to exclusively a single tablet in the plurality of tablets, wherein each tablet in the plurality of tablets comprises a plurality of DiskRowSets existing in disk, each DiskRowSet in the plurality of DiskRowSets including: a base data module existing in disk and storing a subset of rows in the plurality of rows according to a column-organized representation based upon writing each column in the plurality of columns as a single contiguous block, a Bloom filter of the set of keys included in the primary key column for detecting membership of the set of keys in the each DiskRowSet, a delta store module existing in memory and maintaining a mapping for mutating the subset of rows included in the each DiskRowSet, a single MemRowSet existing in memory and implemented as a concurrent Binary tree (B-tree), and when the request for processing the stored data is related to an insert operation, receiving, at the single MemRowSet, new data to be inserted, buffering, at the single MemRowSet, the new data as a recently-inserted row, and flushing, from the single MemRowSet, the recently-inserted row to a DiskRowSet in the plurality of DiskRowSets. 13 . The method of claim 12 , wherein the plurality of tablets are hosted on one or more tablet servers, the one or more tablet servers lacking Hadoop Distributed File System (HDFS™)) data storage capabilities. 14 . The method of claim 12 , wherein any row in the plurality of rows is included in exactly one DiskRowSet in the plurality of DiskRowSets. 15 . The method of claim 12 , wherein one or more mutations to the data includes a singly linked list comprising one or more nodes and stored in the single MemRowSet, wherein each of the one or more nodes is defined according to the one or more mutations to the data, the head of the linked list pointing to a row in a DiskRowSet in the plurality of DiskRowSets. 16 . The method of claim 15 , wherein the each node includes a transaction ID that monotonically increases for the each tablet in the plurality of tablets. 17 . The method of claim 12 , wherein the delta store module includes a plurality of UNDO files and a plurality of REDO files, wherein the plurality of REDO files include mutations that were applied to the subset of rows stored in the base data module after a time when the subset of rows was last flushed or compacted, and wherein the plurality of UNDO files include mutations that were applied to the subset of rows stored in the base data module prior to a time when the subset of rows was last flushed or compacted. 18 . The method of claim 12 , wherein mutations to the row in the subset of rows row are executed atomically across one or more columns without including an entirety of the row. 19 . A non-transitory computer-readable medium comprising a set of instructions that, when executed by one or more processors, cause a machine to perform the operations of: distributing, into a database table, data partitioned into a plurality of horizontal tablets, each horizontal tablet in the plurality of horizontal tablets storing the data in a plurality of rows; the database table including a plurality of columns arranged according to a pre-defined sch

Assignees

Inventors

Classifications

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2016328429A1 cover?
Columnar storage provides many performance and space saving benefits for analytic workloads, but previous mechanisms for handling single row update transactions in column stores suffer from poor performance. A columnar data layout facilitates both low-latency random access capabilities together with high-throughput analytical access capabilities, simplifying Hadoop architectures for use cases i…
Who is the assignee on this patent?
Cloudera Inc
What technology area does this patent fall under?
Primary CPC classification G06F16/221. Mapped technology areas include Physics.
When was this patent published?
Publication date Thu Nov 10 2016 00:00:00 GMT+0000 (Coordinated Universal Time) (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).