Flash memory cache including for use with persistent key-value store

US9436596B2 · US · B2

Patent metadata
FieldValue
Publication numberUS-9436596-B2
Application numberUS-201313919727-A
CountryUS
Kind codeB2
Filing dateJun 17, 2013
Priority dateMay 5, 2010
Publication dateSep 6, 2016
Grant dateSep 6, 2016

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Described is using flash memory, RAM-based data structures and mechanisms to provide a flash store for caching data items (e.g., key-value pairs) in flash pages. A RAM-based index maps data items to flash pages, and a RAM-based write buffer maintains data items to be written to the flash store, e.g., when a full page can be written. A recycle mechanism makes used pages in the flash store available by destaging a data item to a hard disk or reinserting it into the write buffer, based on its access pattern. The flash store may be used in a data deduplication system, in which the data items comprise chunk-identifier, metadata pairs, in which each chunk-identifier corresponds to a hash of a chunk of data that indicates. The RAM and flash are accessed with the chunk-identifier (e.g., as a key) to determine whether a chunk is a new chunk or a duplicate.

First claim

Opening claim text (preview).

What is claimed is: 1. In a computing environment, a system comprising: a secondary storage device including data items; and a RAM-based index that contains one or more pointers for determining location of the data items in the secondary storage device, at least one of the one or more pointers is divided into a first subspace and a second subspace, the first subspace pointing to a location in RAM, and the second subspace pointing to a location in the secondary storage device. 2. The system of claim 1 , wherein the RAM-based index includes a truncated cuckoo hash table, in which an entry of the RAM-based index comprises a compact footprint checksum and a pointer to a page, the RAM-based index of each data item configured for storage in one of a plurality of locations in the table, and wherein the compact footprint checksum validates whether the data item is stored in the page. 3. The system of claim 1 , wherein the data items comprise key-value pairs, with the key and associated value comprising arbitrary byte arrays. 4. The system of claim 1 , wherein the secondary storage device comprises at least one of a non-volatile memory devices or a hard drive device. 5. The system of claim 1 , further comprising: a storage mechanism configured to store the data items to the secondary storage device, including writing the data items to a page and inserting a compact index of the data items into the RAM-based index. 6. The system of claim 5 , wherein the storage mechanism further comprises a RAM-based write buffer configured to maintain the data items to be written to the secondary storage, and wherein the storage mechanism is configured to write a page of the data items from the RAM-based write buffer to the secondary storage device when the data items fill a page, or writes less than a page of the data items from the RAM-based write buffer to the secondary store device when a coalesce time is reached. 7. The system of claim 1 , wherein the secondary storage device comprises a flash store, and further comprising a disk-based storage and a recycle mechanism that makes a page in the flash store again available for use by destaging at least some of the data items from the flash store to the disk-based storage. 8. The system of claim 7 , wherein the recycle mechanism includes an oldest first policy, least recently used policy, or first-in, first out policy. 9. The system of claim 1 , wherein the secondary storage device comprises a flash store, and further comprising a data structure that includes information that indicates to a high probability whether a data item has been recently accessed, and a recycle mechanism that makes a page in the flash store available by processing valid data items on the page, including destaging a data item from the page in the flash store to a disk-based storage or reinserting the data item into a RAM-based write buffer to be written back to the flash store, based on whether the information in the data structure indicates that the data item has been recently accessed. 10. The system of claim 1 , wherein the data items in the secondary storage device are checked whether the data items are pointed by the pointers in the RAM-based index. 11. In a computing environment, a method performed on at least one processor, comprising: maintaining key-value pairs in a secondary storage device; and maintaining a RAM-based index with compact index that contains one or more pointers to determine whether the key-value pairs are located in the secondary storage device, at least one of the one or more pointers is divided into a first subspace and a second subspace, the first subspace pointing to a location in RAM, and the second subspace pointing to a location in the secondary storage. 12. The method of claim 11 further comprising, writing the key-value pairs to the secondary storage device, and adding an entry into the RAM-based index for the key-value pairs. 13. The method of claim 12 further comprising, looking for a key in the RAM-based index, wherein looking for the key of the key-value pair is performed by a first thread, wherein writing the key-value pair is performed by a second thread, and wherein a recycling process is performed by a third thread, in which at least one of the threads uses a locking mechanism. 14. The method of claim 11 , wherein the RAM-based index comprises a truncated cuckoo hash table, wherein an entry of the RAM-based index comprises a compact checksum and one of the one or more pointers, wherein each key is stored in one of multiple locations in the RAM-based index with the locations determined by multiple hash functions, wherein a compact checksum is calculated for each of the locations and its associated key, wherein looking up the key comprises checking the multiple locations in the RAM-based index whether the stored checksum matches the checksum of the associated key, and wherein the pointers of the locations with a checksum match are returned as the pointers to a location in which the key-value pairs may be stored. 15. The method of claim 14 , wherein when the key does not exist in the secondary storage device, no pointers are retrieved, or if one or more of the pointers are retrieved, the method further comprises checking content pointed to by the one or more of the pointers to validate if the key is stored in the location, and if so, returning the value of the key-value pair. 16. The method of claim 12 , wherein writing the key-value pairs into the secondary storage device includes appending the key-value pair to a logical end of the secondary storage device, retrieving one or more of the pointers of the existing key in the RAM-based index, checking the one or more of the pointers to determine if a previous version of the key-value pair is stored in the secondary storage device and the RAM-based index, and if the previous version of the key-value pair exists, replacing the one or more of the pointers to the previous version of the key-value pair with the one or more of the pointers to a new version of the key-value pair and rendering the previous version of the key-value pair as not pointed to by any of the pointers in the RAM-based index to thereby be processed by a recycling process, and if no previous version of the key-value pair exists, storing the one or more of the pointers of a new version of the key-value pair in an unoccupied locations if one exists, and if no unoccupied location is found, relocating the one of pointers stored in an occupied location to an alternative location or destaging the one of the pointers and the associated key-value pair to another storage device. 17. The method of claim 16 , wherein relocating the one of the pointers includes: (a) retrieving keys associated with the one of the pointers as potential relocation candidates, (b) finding an alternative location for each of the relocation candidates, (c) if an unoccupied alternative location is found, relocating the relocation candidate, (d) and if all alternative locations of all relocation candidates are occupied, adding the keys at all the alternative locations as relocation candidates. 18. The method of claim 16 , wherein the recycling process treats the content stored in the secondary storage device as a stream, and for each key-value pair in the secondary storage device, checks if it is pointed by a pointer in the RAM-based index, and if pointed to, performs: copying the key-value pair into a new stream; garbage collecting at least a portion of a previous stream, and periodically check pointing the RAM-based index into a storage device i

Assignees

Inventors

Classifications

  • in block erasable memory, e.g. flash memory · CPC title

  • for peripheral storage systems, e.g. disk cache · CPC title

  • with two or more cache hierarchy levels (with multilevel cache hierarchies G06F12/0811) · CPC title

  • for memories with random access ports synchronised on clock signal pulse trains, e.g. synchronous memories, self timed memories · CPC title

  • with age lists, e.g. queue, most recently used [MRU] list or least recently used [LRU] list · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9436596B2 cover?
Described is using flash memory, RAM-based data structures and mechanisms to provide a flash store for caching data items (e.g., key-value pairs) in flash pages. A RAM-based index maps data items to flash pages, and a RAM-based write buffer maintains data items to be written to the flash store, e.g., when a full page can be written. A recycle mechanism makes used pages in the flash store availa…
Who is the assignee on this patent?
Microsoft Technology Licensing Llc
What technology area does this patent fall under?
Primary CPC classification G06F12/0246. Mapped technology areas include Physics.
When was this patent published?
Publication date Tue Sep 06 2016 00:00:00 GMT+0000 (Coordinated Universal Time) (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).