Non-uniform pagination of columnar data

US11080187B2 · US · B2

Patent metadata
FieldValue
Publication numberUS-11080187-B2
Application numberUS-202016900702-A
CountryUS
Kind codeB2
Filing dateJun 12, 2020
Priority dateDec 10, 2018
Publication dateAug 3, 2021
Grant dateAug 3, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A computer implemented system and method of memory management for an in-memory database. The system implements a paged data vector using non-uniform compression of its chunks. In this manner, the system achieves greater compression than systems that use uniform compression.

First claim

Opening claim text (preview).

What is claimed is: 1. A computer implemented method of memory management for an in-memory database, the method comprising: storing, in a secondary storage, a paged data vector, wherein the paged data vector includes at least one node, wherein the at least one node has a plurality of chunks, wherein the plurality of chunks are compressed using non-uniform compression, wherein the plurality of chunks are logically arranged in the paged data vector as a plurality of pages; receiving a data request; identifying a subset of the plurality of pages that relate to the data request; loading, from the secondary storage to a main memory, at least one page of the subset of the plurality of pages that have been identified as relating to the data request; and executing the data request using the at least one page of the subset of the plurality of pages in the main memory, wherein the paged data vector is generated by a method including: calculating a chunk size for a data vector, wherein the chunk size is calculated to minimally target a target compression ratio. 2. The method of claim 1 , wherein for the non-uniform compression, at least a first chunk is compressed using a first compression type and at least a second chunk is compressed using a second compression type, wherein the first chunk differs from the second chunk, and wherein the first compression type differs from the second compression type. 3. The method of claim 1 , wherein the paged data vector is generated by a method including: encoding the data vector according to the chunk size to form a paged uniform-partition tree data structure corresponding to the paged data vector. 4. The method of claim 1 , wherein calculating the chunk size includes: selecting an initial chunk size; and setting the target compression ratio. 5. The method of claim 1 , wherein calculating the chunk size comprises: selecting an initial chunk size; partitioning the data vector into a plurality of preliminary chunks; compressing each of the plurality of preliminary chunks using a respective selected compression type, and calculating a plurality of compression ratios; and calculating a target space amount based on the compression ratios, and calculating a page size based on a smallest fitting page that fits the target space amount. 6. The method of claim 1 , wherein calculating the chunk size comprises: selecting an initial chunk size; partitioning the data vector into a plurality of preliminary chunks; compressing each of the plurality of preliminary chunks using a respective selected compression type, and calculating a plurality of compression ratios; and setting a target compression ratio based on comparing the compression ratios and an error tolerance. 7. The method of claim 1 , wherein calculating the chunk size comprises: selecting an initial chunk size; partitioning the data vector into a plurality of preliminary chunks; compressing each of the plurality of preliminary chunks using a respective selected compression type, and calculating a plurality of compression ratios; setting a target compression ratio based on comparing the compression ratios and an error tolerance; and calculating a target space amount based on the compression ratios, and calculating a page size based on a smallest fitting page that fits the target space amount. 8. The method of claim 1 , wherein identifying the subset of the plurality of pages that relate to the data request comprises: traversing the plurality of chunks in the paged data vector, starting at a root node, one chunk at a time. 9. The method of claim 1 , wherein the paged data vector has a root node and at least one child node. 10. The method of claim 9 , wherein the root node corresponds to a logical representation of the plurality of chunks, and wherein a child node corresponds to a single chunk of the plurality of chunks of the root node. 11. The method of claim 9 , wherein the at least one child node corresponds to at least one oversize chunk, wherein a particular child node corresponds to a particular oversize chunk. 12. The method of claim 9 , wherein the at least one child node corresponds to a plurality of child nodes including a first child node and a second child node, wherein the second child node is a child of the first child node. 13. The method of claim 1 , wherein the paged data vector has a root node that is a single node that contains the plurality of chunks. 14. A non-transitory computer readable medium storing a computer program for controlling a computer system to execute processing for memory management for an in-memory database, the processing comprising: storing, in a secondary storage, a paged data vector, wherein the paged data vector includes at least one node, wherein the at least one node has a plurality of chunks, wherein the plurality of chunks are compressed using non-uniform compression, wherein the plurality of chunks are logically arranged in the paged data vector as a plurality of pages; receiving a data request; identifying a subset of the plurality of pages that relate to the data request; loading, from the secondary storage to a main memory, at least one page of the subset of the plurality of pages that have been identified as relating to the data request; and executing the data request using the at least one page of the subset of the plurality of pages in the main memory, wherein the paged data vector is generated by processing that includes: calculating a chunk size for a data vector, wherein the chunk size is calculated to minimally target a target compression ratio. 15. The non-transitory computer readable medium of claim 14 , wherein the paged data vector is generated by processing that includes: encoding the data vector according to the chunk size to form a paged uniform-partition tree data structure corresponding to the paged data vector. 16. A system for memory management for an in-memory database, the system comprising: at least one processor that is configured to control the system to receive a data request; a main memory; a secondary storage that is configured to store a paged data vector, wherein the paged data vector includes at least one node, wherein the at least one node has a plurality of chunks, wherein the plurality of chunks are compressed using non-uniform compression, wherein the plurality of chunks are logically arranged in the paged data vector as a plurality of pages; a decoder component that is configured to identify a subset of the plurality of pages that relate to the data request; and a page loader component that is configured to load, from the secondary storage to the main memory, at least one page of the subset of the plurality of pages that have been identified as relating to the data request, wherein the at least one processor is configured to control the system to execute the data request using the at least one page of the subset of the plurality of pages in the main memory, the system further comprising: a chunk size calculator component that is configured to calculate a chunk size for a data vector, wherein the chunk size is calculated to minimally target a target compression ratio. 17. The system of claim 16 , further comprising: an encoder component that is configured to encode the data vector according to the chunk size to form a paged uniform-partition tree data structure corresponding to the paged data vector. 18. The system of claim 16 , further comprising: an encoder component, wherein the chunk size calculator component is configured to select

Assignees

Inventors

Classifications

  • Trees, e.g. B+trees · CPC title

  • Column-oriented storage; Management thereof · CPC title

  • Tablespace storage structures; Management thereof · CPC title

  • Data format conversion from or to a database · CPC title

  • with main memory updating (G06F12/0806 takes precedence) · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11080187B2 cover?
A computer implemented system and method of memory management for an in-memory database. The system implements a paged data vector using non-uniform compression of its chunks. In this manner, the system achieves greater compression than systems that use uniform compression.
Who is the assignee on this patent?
Sap Se
What technology area does this patent fall under?
Primary CPC classification G06F16/2282. Mapped technology areas include Physics.
When was this patent published?
Publication date Tue Aug 03 2021 00:00:00 GMT+0000 (Coordinated Universal Time) (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).