Storage engine for hybrid data processing

US11789936B2 · US · B2

Patent metadata
Field               Value
Publication number  US-11789936-B2
Application number  US-202117462853-A
Country             US
Kind code           B2
Filing date         Aug 31, 2021
Priority date       Aug 31, 2021
Publication date    Oct 17, 2023
Grant date          Oct 17, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The present disclosure describes storage techniques for hybrid transactional and analytical processing. Data captured by a first processing engine may be received. The first processing engine may be configured to perform online transactional processing. Multiple replicas of logical logs generated based on the data may be distributed to a Delta Store by applying a quorum protocol on the multiple replicas. Data in the Delta Store are stored in a row format and are visible to a query for online analytical processing performed by a second processing engine. Data may be flushed from the Delta Store to a Base Store based on one or more predetermined rules. Data in the Base Store are stored in a columnar format and may be accessible by the second processing engine.
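The data flow in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the class and function names, the majority-style acknowledgement count, and the row-count flush rule are all assumptions chosen for illustration, and every row is assumed to share the same columns.

```python
from dataclasses import dataclass, field


@dataclass
class DeltaStore:
    """Row-format store; appended rows are immediately visible to scans."""
    rows: list = field(default_factory=list)  # list of (lsn, row dict)

    def append(self, lsn, row):
        self.rows.append((lsn, row))


@dataclass
class BaseStore:
    """Columnar store; each flushed batch becomes one block of column lists."""
    blocks: list = field(default_factory=list)


def quorum_write(replicas, lsn, row, quorum):
    """Send one log entry to every reachable replica.

    Hypothetical quorum rule: the write counts as successful once at
    least `quorum` replicas have acknowledged it. A replica of None
    stands in for an unreachable node.
    """
    acks = 0
    for replica in replicas:
        if replica is None:
            continue
        replica.append(lsn, row)
        acks += 1
    return acks >= quorum


def flush(delta, base, max_rows):
    """Hypothetical flush rule: convert rows to columns once max_rows is reached."""
    if len(delta.rows) < max_rows:
        return False
    columns = {}
    for _, row in delta.rows:  # assumes all rows have the same keys
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    base.blocks.append(columns)
    delta.rows.clear()
    return True


# Usage: three replicas, one unreachable, quorum of two.
replicas = [DeltaStore(), DeltaStore(), None]
ok = quorum_write(replicas, 1, {"id": 1, "amount": 10}, quorum=2)
quorum_write(replicas, 2, {"id": 2, "amount": 25}, quorum=2)
base = BaseStore()
flushed = flush(replicas[0], base, max_rows=2)
```

In this sketch the same two records exist first as rows in the Delta Store and then as column lists in the Base Store, which mirrors the row-to-columnar transition the abstract describes.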

First claim

Opening claim text (preview).

What is claimed is:

1. A method, comprising: receiving data captured by a first processing engine, wherein the first processing engine is configured to perform online transactional processing; distributing multiple replicas of logical logs generated based on the data to a Delta Store in a storage system by applying a quorum protocol on the multiple replicas, wherein data in the Delta Store are stored in a row format and visible to a query for online analytical processing performed by a second processing engine; flushing data from the Delta Store to a Base Store in the storage system based on one or more predetermined rules, wherein data in the Base Store are stored in a columnar format and accessible by the second processing engine, the data in the Base Store are partitioned into a plurality of partitions based on partition keys, and each partition in the Base Store comprises a plurality of data blocks; wherein the storage system is configured to persist a same data in different formats to be consumed by the first processing engine and the second processing engine, respectively; and wherein the storage system, the first processing engine, and the second processing engine are configured to be decoupled from each other.

2. The method of claim 1, further comprising: updating the data in the Delta Store by performing a delete operation and an insert operation.

3. The method of claim 1, wherein the Delta Store comprises an insertion Delta list and a deletion Delta list, and the insertion Delta list and the deletion Delta list are sorted based on Log Sequential Numbers.

4. The method of claim 1, wherein each of the plurality of data blocks comprises a plurality of column files and a metadata file, and the metadata file comprises metadata associated with the plurality of column files.

5. The method of claim 1, further comprising: applying a delete bitmap to at least one batch of data flushed from the Delta Store to the Base Store, wherein the delete bitmap comprises information indicative of rows that are removed after flushing from the Delta Store.

6. The method of claim 1, further comprising: applying a groom operation on the Base Store to merge data blocks in one of the plurality of partitions.

7. The method of claim 1, further comprising: scanning the plurality of data blocks by applying predicate filters and projection operations.

8. The method of claim 1, further comprising: in response to determining that an online analytical processing query comprises an aggregate operator indicative of an aggregation, pushing down at least one part of the aggregation to a scan of the Delta Store and the Base Store.

9. The method of claim 1, further comprising: performing a continuous scan on the Delta Store and the Base Store based on a map of scan identifications and iterator instances; and returning scan results in batches.

10. A system, comprising: at least one processor; and at least one memory communicatively coupled to the at least one processor and comprising instructions that upon execution by the at least one processor cause the system to perform operations comprising: receiving data captured by a first processing engine, wherein the first processing engine is configured to perform online transactional processing; distributing multiple replicas of logical logs generated based on the data to a Delta Store in a storage system by applying a quorum protocol on the multiple replicas, wherein data in the Delta Store are stored in a row format and visible to a query for online analytical processing performed by a second processing engine; flushing data from the Delta Store to a Base Store in the storage system based on one or more predetermined rules, wherein data in the Base Store are stored in a columnar format and accessible by the second processing engine, the data in the Base Store are partitioned into a plurality of partitions based on partition keys, and each partition in the Base Store comprises a plurality of data blocks; wherein the storage system is configured to persist a same data in different formats to be consumed by the first processing engine and the second processing engine, respectively; and wherein the storage system, the first processing engine, and the second processing engine are configured to be decoupled from each other.

11. The system of claim 10, the operations further comprising: updating the data in the Delta Store by performing a delete operation and an insert operation.

12. The system of claim 10, wherein the Delta Store comprises an insertion Delta list and a deletion Delta list, and the insertion Delta list and the deletion Delta list are sorted based on Log Sequential Numbers (LSNs).

13. The system of claim 10, wherein each of the plurality of data blocks comprises a plurality of column files and a metadata file, and the metadata file comprises metadata associated with the plurality of column files.

14. The system of claim 10, the operations further comprising: applying a delete bitmap to at least one batch of data flushed from the Delta Store to the Base Store, wherein the delete bitmap comprises information indicative of rows that are removed after flushing from the Delta Store.

15. The system of claim 10, the operations further comprising: applying a groom operation on the Base Store to merge data blocks in one of the plurality of partitions.

16. The system of claim 10, the operations further comprising: scanning the plurality of data blocks by applying predicate filters and projection operations.

17. The system of claim 10, the operations further comprising: in response to determining that an online analytical processing query comprises an aggregate operator indicative of an aggregation, pushing down at least one part of the aggregation to a scan of the Delta Store and the Base Store.

18. The system of claim 10, the operations further comprising: performing a continuous scan on the Delta Store and the Base Store based on a map of scan identifications and iterator instances; and returning scan results in batches.

19. A non-transitory computer-readable storage medium, comprising computer-readable instructions that upon execution by a system cause the system to implement operations comprising: receiving data captured by a first processing engine, wherein the first processing engine is configured to perform online transactional processing; distributing multiple replicas of logical logs generated based on the data to a Delta Store in a storage system by applying a quorum protocol on the multiple replicas, wherein data in the Delta Store are stored in a row format and visible to a query for online analytical processing performed by a second processing engine; flushing data from the Delta Store to a Base Store in the storage system based on one or more predetermined rules, wherein data in the Base Store are stored in a columnar format and accessible by the second processing engine, the data in the Base Store are partitioned into a plurality of partitions based on partition keys, and each partition in the Base Store comprises a plurality of data blocks; wherein the storage system is configured to persist a same data in different formats to be consumed by the first processing engine and the second processing engine, respectively; and wherein the storage system, the first processing engine, and the second processing engine are configured to be decoupled from each other.

20. The non-transitory computer-readable storage medium of claim 19,
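The scan-related claims (delete bitmap, predicate filters and projection, and a continuous scan keyed by scan identifications that returns results in batches) can be sketched together as follows. This is an illustrative sketch only: `ScanManager`, its method names, the in-memory block layout, and the batch size are hypothetical and are not taken from the patent.

```python
import itertools


class ScanManager:
    """Keeps a map of scan id -> live iterator so a scan can resume across calls."""

    def __init__(self, batch_size):
        self.batch_size = batch_size
        self._scans = {}               # scan id -> iterator instance
        self._ids = itertools.count(1)

    def open_scan(self, delta_rows, base_block, delete_bitmap, predicate, columns):
        scan_id = next(self._ids)
        self._scans[scan_id] = self._rows(
            delta_rows, base_block, delete_bitmap, predicate, columns
        )
        return scan_id

    def next_batch(self, scan_id):
        """Return up to batch_size rows; an unknown or finished scan yields []."""
        iterator = self._scans.get(scan_id)
        if iterator is None:
            return []
        batch = list(itertools.islice(iterator, self.batch_size))
        if len(batch) < self.batch_size:
            del self._scans[scan_id]   # iterator exhausted, drop it from the map
        return batch

    @staticmethod
    def _rows(delta_rows, base_block, delete_bitmap, predicate, columns):
        # Base Store block: column name -> list of values (columnar layout).
        row_count = len(next(iter(base_block.values()), []))
        for i in range(row_count):
            if delete_bitmap[i]:       # row removed after it was flushed
                continue
            row = {name: values[i] for name, values in base_block.items()}
            if predicate(row):
                yield {c: row[c] for c in columns}   # projection
        # Delta Store rows are already in row format.
        for row in delta_rows:
            if predicate(row):
                yield {c: row[c] for c in columns}


# Usage: one flushed block with a deleted row, plus one unflushed Delta row.
base_block = {"id": [1, 2, 3, 4], "amount": [10, 25, 40, 5]}
delete_bitmap = [False, True, False, False]
delta_rows = [{"id": 5, "amount": 60}]

manager = ScanManager(batch_size=2)
scan_id = manager.open_scan(
    delta_rows, base_block, delete_bitmap, lambda r: r["amount"] >= 10, ["id"]
)
first = manager.next_batch(scan_id)
second = manager.next_batch(scan_id)
third = manager.next_batch(scan_id)
```

Keeping the iterator in a map lets the caller fetch the next batch with only the scan id, which is one plausible reading of a map of scan identifications and iterator instances.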

Assignees

Inventors

Classifications

  • Updates performed during online database operations; commit processing · CPC title

  • Databases characterised by their database models, e.g. relational or object models · CPC title

  • Multi-dimensional databases or data warehouses, e.g. MOLAP or ROLAP · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11789936B2 cover?
The present disclosure describes storage techniques for hybrid transactional and analytical processing. Data captured by a first processing engine may be received. The first processing engine may be configured to perform online transactional processing. Multiple replicas of logical logs generated based on the data may be distributed to a Delta Store by applying a quorum protocol on the multipl…
Who is the assignee on this patent?
Lemon Inc
What technology area does this patent fall under?
Primary CPC classification G06F16/2379. Mapped technology areas include Physics.
When was this patent published?
Publication date Oct 17, 2023 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).