Cache for efficient record lookups in an lsm data structure

US2022188317A1 · US · A1

Patent metadata
FieldValue
Publication numberUS-2022188317-A1
Application numberUS-202217653820-A
CountryUS
Kind codeA1
Filing dateMar 7, 2022
Priority dateJan 30, 2018
Publication dateJun 16, 2022
Grant date

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Techniques are disclosed relating to maintaining a cache usable to locate data stored in a data structure. A computer system, in various embodiments, maintains a data structure having a plurality of levels that store files for a database. The files may include one or more records that each have a key and corresponding data. The computer system may also maintain a cache for the database whose entries store, for a key, an indication of a location of a corresponding record in a file of the data structure. In some embodiments, the computer system receives a request to access a particular record stored in the data structure where the request specifies a key usable to locate the particular record. The computer system may retrieve, from the cache via the key, a particular indication of a location of the particular record and may use the particular indication to access the particular record.

First claim

Opening claim text (preview).

What is claimed is: 1 . A method, comprising: maintaining a log-structured merge-tree (LSM tree) that includes a plurality of levels that store files of a database, wherein a first one of the files is included in a first level of the LSM tree and stores a record having data and a key; maintaining a cache whose entries store, for keys, indications of locations of corresponding records in the LSM tree, wherein a particular one of the entries stores, for the key, an indication of a location of the record of the first file in the first level of the LSM tree; performing a merge operation in which the record is copied into a second file of a second level of the LSM tree; receiving a request to access the record using the key; retrieving, from the cache based on the key, the indication of the location of the record in the first file; and accessing, based on the indication, the record from the first file of the first level instead of the second file of the second level. 2 . The method of claim 1 , further comprising: performing an atomic store operation to store, in the particular entry of the cache, an alias name for the first file, wherein the alias name is the indication of the location of the record and is smaller in memory size than a file name of the first file. 3 . The method of claim 2 , further comprising: subsequent to retrieving the indication of the location, identifying the file name of the first file based on a mapping between file names and alias names, wherein the record is accessed from the first file using the file name of the first file. 4 . The method of claim 1 , further comprising: invalidating the particular entry that includes the indication in response to determining that the first file is inaccessible through the LSM tree. 5 . The method of claim 1 , further comprising: invalidating the particular entry that includes the indication in response to determining that a new record has been written that updates content of the record. 6 . The method of claim 1 , further comprising: prior to the particular entry storing the indication of the location of the record: performing a search of the LSM tree for the record; and in response to locating the record, storing the indication in the particular entry. 7 . The method of claim 1 , further comprising: in response to receiving another request to access the record using a different key: scanning an index structure using the different key to obtain the key; and accessing, based on the particular entry and the key, the record from the first file of the first level. 8 . A non-transitory computer-readable medium having program instructions stored thereon that are capable of causing a computer system to perform operations comprising: maintaining a log-structured merge-tree (LSM tree) that includes a plurality of levels that store files of a database, wherein a first one of the files is included in a first level of the LSM tree and stores a record having data and a key; maintaining a cache whose entries store, for keys, indications of locations of corresponding records in the LSM tree, wherein a particular one of the entries stores, for the key, an indication of a location of the record of the first file in the first level of the LSM tree; performing a merge operation in which the record is copied into a second file of a second level of the LSM tree; receiving a request to access the record using the key; retrieving, from the cache using the key, the indication of the location of the record in the first file; and accessing, based on the indication, the record from the first file of the first level instead of the second file of the second level. 9 . The medium of claim 8 , wherein the operations further comprise: performing a hash derivation function using the key to derive a hash value indicative of the particular entry to be used for storing the indication of the location of the record; and storing, based on the hash value, the indication in the particular entry. 10 . The medium of claim 8 , wherein the operations further comprise: overwriting, in the particular entry of the cache, the indication of the location of the record with an indication of a location of a different record based on a key of the different record hashing to a same hash value associated with the record of the first file. 11 . The medium of claim 8 , wherein the operations further comprise: invalidating the particular entry that includes the indication in response to determining that the first file is inaccessible through the LSM tree. 12 . The medium of claim 8 , wherein the operations further comprise: writing an additional file to a top level of the LSM tree, wherein the additional file includes a different record associated with the key; and subsequent to writing the additional file, invalidating the particular entry of the cache that includes the indication. 13 . The medium of claim 8 , wherein indications of locations are stored within the entries of the cache using atomic store instructions. 14 . The medium of claim 8 , wherein the operations further comprise: prior to the particular entry storing the indication of the location of the record: performing a search of the LSM tree for the record; and in response to locating the record, storing the indication in the particular entry. 15 . A system, comprising: at least one processor; and memory having program instructions stored thereon that are capable of causing a computer system to perform operations comprising: maintaining a log-structured merge-tree (LSM tree) that includes a plurality of levels that store files of a database, wherein a first one of the files is included in a first level of the LSM tree and stores a record having data and a key; maintaining a cache whose entries store, for keys, indications of locations of corresponding records in the LSM tree, wherein a particular one of the entries stores, for the key, an indication of a location of the record of the first file in the first level of the LSM tree; performing a merge operation in which the record is copied into a second file of a second level of the LSM tree; receiving a request to access the record using the key; retrieving, from the cache using the key, the indication of the location of the record in the first file; and accessing, based on the indication, the record from the first file of the first level instead of the second file of the second level. 16 . The system of claim 15 , wherein the operations further comprise: invalidating the particular entry that includes the indication in response to determining that a new record has been written that updates content of the record. 17 . The system of claim 15 , wherein the indication is an alias name of the first file, and wherein the operations further comprise identifying a file name of the first file based on a mapping of alias names to file names. 18 . The system of claim 17 , wherein the alias name of the first file is smaller in memory size than the file name of the first file. 19 . The system of claim 15 , wherein the operations further comprise: in response to determining that no in-progress threads reading the LSM tree can access the record: invalidating the particular entry of the cache; and deleting the record of the first file in the first level. 20 . The system of claim 15 , wherein information is stored within the entries of the cache using atomic store instructions.

Assignees

Inventors

Classifications

  • Database cache management · CPC title

  • Caching, prefetching or hoarding of files · CPC title

  • using clearing, invalidating or resetting means · CPC title

  • Trees, e.g. B+trees · CPC title

  • Mapping to a database · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2022188317A1 cover?
Techniques are disclosed relating to maintaining a cache usable to locate data stored in a data structure. A computer system, in various embodiments, maintains a data structure having a plurality of levels that store files for a database. The files may include one or more records that each have a key and corresponding data. The computer system may also maintain a cache for the database whose en…
Who is the assignee on this patent?
Salesforce Com Inc
What technology area does this patent fall under?
Primary CPC classification G06F16/24552. Mapped technology areas include Physics.
When was this patent published?
Publication date Thu Jun 16 2022 00:00:00 GMT+0000 (Coordinated Universal Time) (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).