Versioned Insert Only Hash Table for In-Memory Columnar Stores
US-2016147750-A1 · May 26, 2016 · US
US10528539B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10528539-B2 |
| Application number | US-201615200556-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jul 1, 2016 |
| Priority date | Jul 1, 2016 |
| Publication date | Jan 7, 2020 |
| Grant date | Jan 7, 2020 |
A practical reading order for non-experts. Skip the full description unless you need deep technical detail.
What the patent document calls the invention.
A short plain-language summary of the technical disclosure.
Who owns or filed the patent and who is credited as inventor.
Filing, priority, publication, and grant dates set the timeline.
The legal scope of protection — read this for what is actually claimed.
Technology tags used to group this patent with similar filings.
Prior art links and similar publications in this corpus.
Official abstract text for this publication.
In an example, there is disclosed an apparatus, comprising: a data store comprising a hash table having for at least some rows a hash entry indexed by a hash value, and comprising a hash chain of one or more pointers to a history buffer, and a spill counter; and one or more logic elements, including at least one hardware logic element, comprising a data compressor to: inspect a string0 comprising n bytes at position p in a data file; get the spill counter from a hash entry corresponding to string0; inspect a string1 comprising n bytes at p+k, wherein k is a positive integer; get the spill counter from a hash entry corresponding to string1; determine that the spill counter for string1 is less than the spill counter for string0; and search a chain1 (the hash chain of a hash entry corresponding to string1) for a matching string of size at least n+k with an offset of −k.
Opening claim text (preview).
What is claimed is: 1. An apparatus, comprising: a data store comprising a hash table having, for at least some rows, a hash entry indexed by a hash value, wherein the hash entry comprises a hash chain of one or more pointers to a history buffer, and further comprises a spill counter; and hardware circuitry of a data compressor to: inspect a string0 comprising n bytes of data at position p bytes in a data file; get the spill counter from a first hash entry corresponding to string0; inspect a string1 comprising n bytes of data at p+k, wherein k is a positive integer number of bytes; get the spill counter from a second hash entry corresponding to string1; determine that the spill counter for string1 is less than the spill counter for string0; and search the particular hash chain of the second hash entry corresponding to string1 for a matching string of size at least n+k with an offset of −k. 2. The apparatus of claim 1 , wherein the spill counter comprises a counter for a number of spill rows used in a spill table. 3. The apparatus of claim 1 , wherein the spill counter comprises a function of a length of the particular hash chain. 4. The apparatus of claim 1 , wherein the data compressor is to update the spill counter of a hash entry upon appropriating a new row of a spill table for the hash entry. 5. The apparatus of claim 1 , wherein the data compressor is to update the spill counter of a hash entry upon searching a hash chain for the hash entry. 6. The apparatus of claim 1 , wherein determining that the spill counter for the particular hash chain is less than that spill counter for a chain of the first hash entry comprises determining that the spill counter for the particular hash chain is at least two less than the spill counter for the chain of the first hash entry. 7. The apparatus of claim 1 , wherein the data compressor is further to: inspect a string2 comprising n bytes at p+2k; get the spill counter for a hash entry corresponding to string2; determine that the spill counter for string2 is less than the spill counter for string1; and search the hash chain of a hash entry corresponding to string2 for a matching string of size at least n+2k with an offset of −2k. 8. The apparatus of claim 7 , wherein the data compressor is further to determine that no matching string of at least n+2k was found, and to fall back to a found match of a smaller size. 9. The apparatus of claim 1 , wherein the data compressor is further to determine that the search of the particular hash chain yielded no matches of size at least n+k, and search the hash chain of the first hash entry for a matching string of at least size n. 10. The apparatus of claim 8 , wherein the data compressor is further to: determine that a ratio of search failures involving the particular hash chain is higher than a threshold value; determine that chain-select is not efficient based on determining that the ratio of search failures is higher than the threshold value; and disable chain-select based on determining that the chain-select is not efficient. 11. The apparatus of claim 9 , wherein the ratio of search failures comprises a ratio of search failures involving the particular hash chain to searches involving the particular hash chain. 12. The apparatus of claim 10 , wherein the data compressor is further to re-enable chain-select after processing a number of input bytes. 13. The apparatus of claim 1 , wherein the data compressor is implemented at least partly in microcode. 14. The apparatus of claim 1 , wherein k=1. 15. A method of providing a data compressor, comprising: providing a hash table having, for at least some rows, a hash entry indexed by a hash value, wherein the hash entry comprises a hash chain of one or more pointers to a history buffer, and further comprises a spill counter; inspecting a string0 comprising n bytes of data at position p bytes in a data file; getting the spill counter from a first hash entry corresponding to string0; inspecting a string1 comprising n bytes of data at p+k, wherein k is a positive integer number of bytes; getting the spill counter from a second hash entry corresponding to string1; determining that the spill counter for string1 is less than the spill counter for string0; and searching the particular hash chain of the second hash entry corresponding to string1 for a matching string of size at least n+k with an offset of −k. 16. The method of claim 15 , wherein the spill counter comprises a counter for a number of spill rows used in a spill table. 17. The method of claim 15 , wherein the spill counter comprises a function of a length of the particular hash chain. 18. The method of claim 15 , further comprising updating the spill counter of a hash entry upon appropriating a new row of a spill table for the hash entry. 19. The method of claim 15 , further comprising updating the spill counter of a hash entry upon searching a hash chain for the hash entry. 20. The method of claim 15 , wherein determining that the spill counter for the particular hash chain is less than that spill counter for a chain of the first hash entry comprises determining that the spill counter for the particular hash chain is at least two less than the spill counter for the chain of the first hash entry. 21. The method of claim 15 , further comprising: inspecting a string2 comprising n bytes at p+2k; getting the spill counter for a hash entry corresponding to string2; determining that the spill counter for string2 is less than the spill counter for string1; and searching a chain2 (the hash chain of a hash entry corresponding to string2) for a matching string of size at least n+2k with an offset of −2k. 22. The method of claim 21 , further comprising determining that no matching string of at least n+2k was found, and to fall back to a found match of a smaller size. 23. The method of claim 15 , further comprising determining that the search of the particular hash chain yielded no matches of size at least n+k, and search the hash chain of the first hash entry for a matching string of at least size n. 24. The method of claim 23 , further comprising: determining that a ratio of search failures involving the particular hash chain is higher than a threshold value; determining that chain-select is not efficient based on determining that the ratio of search failures is higher than the threshold value; and disabling chain-select based on determining that the chain-select is not efficient. 25. The method of claim 24 , wherein the ratio of search failures comprises a ratio of search failures involving the particular hash chain to searches involving the particular hash chain. 26. The method of claim 25 , further comprising re-enabling chain-select after processing a number of input bytes. 27. The method of claim 15 , wherein k=1. 28. One or more tangible, non-transitory computer-readable mediums having stored thereon executable instructions for: providing a hash table having, for at least some rows, a hash entry indexed by a hash value, wherein the hash entry comprises a hash chain of one or more pointers to a history buffer, and further comprises a spill counter; inspecting a string0 comprising n bytes of data at position p bytes in a data file; getting the spill counter from a first hash entry corresponding to string0; inspecting a string1 comprisin
Query execution · CPC title
Data buffering arrangements · CPC title
Single storage device · CPC title
Error detection codes other than CRC and single parity bit codes · CPC title
Hash tables · CPC title
Related publications grouped by family.
Answers are generated from the same data shown on this page.