Compressing a multi-version database
US-9305046-B2 · Apr 5, 2016 · US
US10042552B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10042552-B2 |
| Application number | US-201414553435-A |
| Country | US |
| Kind code | B2 |
| Filing date | Nov 25, 2014 |
| Priority date | Nov 25, 2014 |
| Publication date | Aug 7, 2018 |
| Grant date | Aug 7, 2018 |
A practical reading order for non-experts. Skip the full description unless you need deep technical detail.
What the patent document calls the invention.
A short plain-language summary of the technical disclosure.
Who owns or filed the patent and who is credited as inventor.
Filing, priority, publication, and grant dates set the timeline.
The legal scope of protection — read this for what is actually claimed.
Technology tags used to group this patent with similar filings.
Prior art links and similar publications in this corpus.
Official abstract text for this publication.
As part of a columnar in-memory database, value identifiers are inserted into a backing array in-memory until such time that it is determined that such backing array does not have sufficient capacity. A new backing array is then generated that includes the value identifiers in the old backing array and which has sufficient capacity. The old backing array can be flushed from memory when there are no active operations using such backing array. Such an arrangement allows for readers and non-structural writers to operate concurrently.
Opening claim text (preview).
What is claimed is: 1. A method comprising: mapping, in a columnar in-memory database, each distinct value in a column to a different value identifier, a set of value identifiers comprising the different value identifiers; populating a first backing array of an index vector by inserting at each given position in the first backing array, the value identifier corresponding to the value that the corresponding row has for the column, the first backing array having a predefined chunk of allocated memory, each position in the index vector being logically n-bits wide, wherein each different value identifier in the set of value identifiers has a binary representation that is less than or equal to n bits; determining that the first backing array does not have capacity in the predefined chunk of allocated memory for a subsequent value identifier to be inserted therein; generating, based on the determining, a second backing array in a different chunk of allocated memory including the set of value identifiers and having capacity for the subsequent value identifier to be inserted therein; and inserting the subsequent value identifier in the second backing array. 2. The method of claim 1 , wherein the determining is based on there being no empty row positions for the subsequent value identifier. 3. The method of claim 1 , wherein the determining is based on the subsequent value identifier having a binary representation that exceeds n bits. 4. The method of claim 1 , further comprising: accessing, by at least one writer the index vector; and accessing, by at least one reader, the index vector, where the at least one writer and the at least one reader access the index vector concurrently. 5. The method of claim 1 further comprising: flushing the first backing array from memory after the subsequent value identifier is inserted into the second backing array. 6. The method of claim 5 , wherein the first backing array is flushed when there are no outstanding readers that registered prior to the establishment of the second backing array. 7. The method of claim 6 , wherein the readers register with a garbage collector. 8. The method of claim 1 , wherein there are a plurality of instances of index vectors and the second backing array is generated on an instance-by-instance basis. 9. The method of claim 1 further comprising: associating a semaphore with the index vector; and assigning the semaphore to a first writer seeking to generate the second backing array. 10. The method of claim 9 further comprising: copying, by the first writer, the value identifiers from the first backing array into the second backing array; and releasing, by the first writer, the semaphore after copying the value identifiers from the first backing array into the second backing array. 11. The method of claim 10 further comprising: assigning the released semaphore to the second writer, the second writer seeking to execute a write functor; and executing the write functor by the second writer on the second backing array. 12. The method of claim 1 , wherein only one writer is allowed to perform structural changes to the first backing array at a given time. 13. The method of claim 12 , wherein the writer allowed to perform structural changes to the first backing array at the given time is provided ownership of an exclusion mechanism that prevents other writers from performing structural changes. 14. The method of claim 13 , wherein there are two or more writers concurrently seeking to perform structural changes to the first backing array, and the writer or writers without the exclusion mechanism wait for the release and subsequent provision of ownership of the exclusion mechanism before structural changes are made to the second backing array. 15. The method of claim 14 , wherein there is at least one writer concurrently seeking to perform non-structural changes to the first backing array while another writer owns the exclusion mechanism, and the writer without the exclusion mechanism waits for the release of the exclusion mechanism before non-structural changes are made to the second backing array. 16. The method of claim 13 further comprising: querying, by a writer, the exclusion mechanism to either obtain ownership of the exclusion mechanism or to put the writer to sleep until the exclusion mechanism is available to such writer. 17. The method of claim 13 , wherein the exclusion mechanism is selected from a group consisting of: semaphores, mutexes, and spinlocks. 18. A non-transitory computer program product storing instructions which, when executed by at least one data processor forming part of at least one computing system, result in operations comprising: mapping, in a columnar in-memory database, each distinct value in a column to a different value identifier, a set of value identifiers comprising the different value identifiers; populating a first backing array of an index vector by inserting at each given position in the first backing array, the value identifier corresponding to the value that the corresponding row has for the column, the first backing array having a predefined chunk of allocated memory; determining that the first backing array does not have capacity in the predefined chunk of allocated memory for a subsequent value identifier to be inserted therein; generating, based on the determining, a second backing array in a different chunk of allocated memory including the set of value identifiers and having capacity for the subsequent value identifier to be inserted therein; and inserting the subsequent value identifier in the second backing array. 19. A system comprising: at least one data processor; and memory storing instructions which, when executed by the at least one data processor, result in operations comprising: mapping, in a columnar in-memory database, each distinct value in a column to a different value identifier, a set of value identifiers comprising the different value identifiers; populating a first backing array of an index vector by inserting at each given position in the first backing array, the value identifier corresponding to the value that the corresponding row has for the column, the first backing array having a predefined chunk of allocated memory; determining that the first backing array does not have capacity in the predefined chunk of allocated memory for a subsequent value identifier to be inserted therein; generating, based on the determining, a second backing array in a different chunk of allocated memory including the set of value identifiers and having capacity for the subsequent value identifier to be inserted therein; and inserting the subsequent value identifier in the second backing array.
Single storage device · CPC title
Indexing structures · CPC title
Column-oriented storage; Management thereof · CPC title
Databases characterised by their database models, e.g. relational or object models · CPC title
Management of blocks · CPC title
Related publications grouped by family.
Answers are generated from the same data shown on this page.