Table scan predicate with integrated semi-join filter
US-2024419650-A1 · Dec 19, 2024 · US
US2016328429A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2016328429-A1 |
| Application number | US-201615149128-A |
| Country | US |
| Kind code | A1 |
| Filing date | May 7, 2016 |
| Priority date | Mar 17, 2015 |
| Publication date | Nov 10, 2016 |
| Grant date | — |
A practical reading order for non-experts. Skip the full description unless you need deep technical detail.
What the patent document calls the invention.
A short plain-language summary of the technical disclosure.
Who owns or filed the patent and who is credited as inventor.
Filing, priority, publication, and grant dates set the timeline.
The legal scope of protection — read this for what is actually claimed.
Technology tags used to group this patent with similar filings.
Prior art links and similar publications in this corpus.
Official abstract text for this publication.
Columnar storage provides many performance and space saving benefits for analytic workloads, but previous mechanisms for handling single row update transactions in column stores suffer from poor performance. A columnar data layout facilitates both low-latency random access capabilities together with high-throughput analytical access capabilities, simplifying Hadoop architectures for use cases involving real-time data. In disclosed embodiments, mutations within a single row are executed atomically across columns and do not necessarily include the entirety of a row. This allows for faster updates without the overhead of reading or rewriting larger columns.
Opening claim text (preview).
I/We claim: 1 . A system facilitating low-latency random access capabilities together with high-throughput analytical access capabilities in connection with a request for processing the stored data, the system comprising: a database table distributing data partitioned into a plurality of horizontal tablets, each horizontal tablet in the plurality of horizontal tablets storing the data in a plurality of rows; the database table including a plurality of columns arranged according to a pre-defined schema; a column in the plurality of columns including a primary key column that stores a key uniquely identifying each row in the plurality of rows by mapping each row to exclusively a single tablet in the plurality of tablets, wherein each tablet in the plurality of tablets comprises: a plurality of DiskRowSets for storing the data, each DiskRowSet in the plurality of DiskRowSets including: a base data module existing in disk and storing a subset of rows in the plurality of rows according to a column-organized representation based upon writing each column in the plurality of columns as a single contiguous block, a Bloom filter of the set of keys included in the primary key column for detecting membership of the set of keys in the each DiskRowSet, a delta store module existing in memory and maintaining a mapping for mutating the subset of rows included in the each DiskRowSet, and a single MemRowSet existing in memory and implemented as a concurrent Binary tree (B-tree), the single MemRowSet receiving new data to be inserted into the database table, buffering the new data as a recently-inserted row, and flushing the recently-inserted row to a DiskRowSet in the plurality of DiskRowSets. 2 . The system of claim 1 , wherein the plurality of tablets are hosted on one or more tablet servers, the one or more tablet servers lacking Hadoop Distributed File System (HDFS) data storage capabilities. 3 . The system of claim 1 , wherein the key is the sole index for manipulating the each row in the plurality of rows. 4 . The system of claim 1 , wherein the each DiskRowSet is disjointed from another DiskRowSet in the plurality of DiskRowSets. 5 . The system of claim 1 , wherein the primary key is included in at most one DiskRowSet in the tablet. 6 . The system of claim 1 , wherein the single MemRowSet is a first MemRowSet, database table is configured for: concurrent to the flushing of the first MemRowSet, providing access to the first MemRowSet based on a mapping in the B-tree of the first MemRowSet; and generating a second MemRowSet in the memory by replacing the first MemRowSet. 7 . The system of claim 6 , wherein the database table is configured for: determining, based on a query to the Bloom filter in the each DiskRowSet that no key in the set of keys overlaps with a key associated with the newly inserted row. 8 . The system of claim 1 , wherein the flushing the recently-inserted row to a DiskRowSet in the plurality of DiskRowSets is according to a predetermined schedule defined by a compaction policy. 9 . The system of claim 1 , wherein the pre-defined schema supports one or more of the following data types: STRING, TIMESTAMP (INT 64), FLOAT, BINARY, DOUBLE, INT8, INT16, INT32, and INT 64. 10 . The system of claim 1 , wherein the mapping for mutating a row in the subset of rows is based on an ordinal index of the row within the DiskRowSet, a MVCC timestamp indicating a time when an operation corresponding to the updating the row was received, and a binary-encoded list of changes to the row. 11 . The system of claim 1 , wherein the single MemRowSet buffers the data corresponding to the recently-inserted row in a row-wise layout. 12 . A method for facilitating low-latency random access capabilities together with high-throughput analytical access capabilities in connection with a request for processing the stored data, the method comprising: distributing, into a database table, data partitioned into a plurality of horizontal tablets, each horizontal tablet in the plurality of horizontal tablets storing the data in a plurality of rows; the database table including a plurality of columns arranged according to a pre-defined schema; a column in the plurality of columns including a primary key column that stores a key uniquely identifying each row in the plurality of rows by mapping each row to exclusively a single tablet in the plurality of tablets, wherein each tablet in the plurality of tablets comprises a plurality of DiskRowSets existing in disk, each DiskRowSet in the plurality of DiskRowSets including: a base data module existing in disk and storing a subset of rows in the plurality of rows according to a column-organized representation based upon writing each column in the plurality of columns as a single contiguous block, a Bloom filter of the set of keys included in the primary key column for detecting membership of the set of keys in the each DiskRowSet, a delta store module existing in memory and maintaining a mapping for mutating the subset of rows included in the each DiskRowSet, a single MemRowSet existing in memory and implemented as a concurrent Binary tree (B-tree), and when the request for processing the stored data is related to an insert operation, receiving, at the single MemRowSet, new data to be inserted, buffering, at the single MemRowSet, the new data as a recently-inserted row, and flushing, from the single MemRowSet, the recently-inserted row to a DiskRowSet in the plurality of DiskRowSets. 13 . The method of claim 12 , wherein the plurality of tablets are hosted on one or more tablet servers, the one or more tablet servers lacking Hadoop Distributed File System (HDFS™)) data storage capabilities. 14 . The method of claim 12 , wherein any row in the plurality of rows is included in exactly one DiskRowSet in the plurality of DiskRowSets. 15 . The method of claim 12 , wherein one or more mutations to the data includes a singly linked list comprising one or more nodes and stored in the single MemRowSet, wherein each of the one or more nodes is defined according to the one or more mutations to the data, the head of the linked list pointing to a row in a DiskRowSet in the plurality of DiskRowSets. 16 . The method of claim 15 , wherein the each node includes a transaction ID that monotonically increases for the each tablet in the plurality of tablets. 17 . The method of claim 12 , wherein the delta store module includes a plurality of UNDO files and a plurality of REDO files, wherein the plurality of REDO files include mutations that were applied to the subset of rows stored in the base data module after a time when the subset of rows was last flushed or compacted, and wherein the plurality of UNDO files include mutations that were applied to the subset of rows stored in the base data module prior to a time when the subset of rows was last flushed or compacted. 18 . The method of claim 12 , wherein mutations to the row in the subset of rows row are executed atomically across one or more columns without including an entirety of the row. 19 . A non-transitory computer-readable medium comprising a set of instructions that, when executed by one or more processors, cause a machine to perform the operations of: distributing, into a database table, data partitioned into a plurality of horizontal tablets, each horizontal tablet in the plurality of horizontal tablets storing the data in a plurality of rows; the database table including a plurality of columns arranged according to a pre-defined sch
Column-oriented storage; Management thereof · CPC title
Trees, e.g. B+trees · CPC title
Tablespace storage structures; Management thereof · CPC title
Physics · mapped topic
Physics · mapped topic
Related publications grouped by family.
Answers are generated from the same data shown on this page.