File storage system including tiers

US9824092B2 · US · B2

Patent metadata
Field: Value
Publication number: US-9824092-B2
Application number: US-201514741154-A
Country: US
Kind code: B2
Filing date: Jun 16, 2015
Priority date: Jun 16, 2015
Publication date: Nov 21, 2017
Grant date: Nov 21, 2017

Abstract

Official abstract text for this publication.

Data storage systems and processes are provided including processes for handling write and read requests to a storage system. A storage system can include data stores, such as a log store, a hash store and a journal store. Data can be written to a log store, a log store can be converted to a hash store, and hash stores can be merged into a journal store. A storage system can use optimizations in writing and storing data, to provide lower latency, lower levels of write amplification and higher throughput.
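The flow in the abstract (write to a log store, convert a log store to a hash store, merge hash stores into a journal store) can be pictured with a toy in-memory model. This is only an illustrative sketch, not the patented implementation: the class `TieredStore`, its method names, the two thresholds, and the newest-to-oldest read order are all assumptions made for the example. The thresholds loosely mirror the capacity and count triggers mentioned in claims 12 and 13.

```python
class TieredStore:
    """Toy model of the log -> hash -> journal flow (hypothetical names)."""

    def __init__(self, log_capacity=2, max_hash_stores=2):
        self.log_capacity = log_capacity        # assumed log store capacity threshold
        self.max_hash_stores = max_hash_stores  # assumed hash store count threshold
        self.log = []          # append-only list of (key, value) writes
        self.hash_stores = []  # converted log stores, oldest first
        self.journal = {}      # merged key -> value view

    def write(self, key, value):
        # All writes land in the log store first.
        self.log.append((key, value))
        if len(self.log) >= self.log_capacity:
            self._convert_log()

    def _convert_log(self):
        # Convert the full log store into a keyed hash store; later writes win.
        self.hash_stores.append(dict(self.log))
        self.log = []
        if len(self.hash_stores) >= self.max_hash_stores:
            self._merge_hash_stores()

    def _merge_hash_stores(self):
        # Merge oldest first so newer hash stores overwrite older values.
        for hash_store in self.hash_stores:
            self.journal.update(hash_store)
        self.hash_stores = []

    def read(self, key):
        # Assumed read path: check the newest tier first, then older tiers.
        for k, v in reversed(self.log):
            if k == key:
                return v
        for hash_store in reversed(self.hash_stores):
            if key in hash_store:
                return hash_store[key]
        return self.journal.get(key)
```

With both thresholds set to 2, the fourth write triggers a second conversion and then a merge, after which every key is served from the journal tier.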

First claim

Opening claim text (preview).

The invention claimed is:

1. A computer-implemented method for storing data in a multi-tier storage system, the method comprising: generating a log store data instance comprising a plurality of data instances, wherein a data instance comprises a metadata portion and a data portion; writing the log store data instance into an extent store; generating a hash store data instance for one or more log store data instances, wherein the hash store data instance is a single file having a header, a hash portion, a hash metadata portion, and a hash data portion, wherein generating the hash store data instance for the one or more log store data instances is based on converting the one or more log store data instances to the hash store data instance, wherein converting the one or more log store data instances comprises: (a) determining a size of the metadata portions of the one or more log store data instances; (b) calculating disk offsets for the metadata portions; (c) copying the metadata portions and the data portions of the one or more log store data instances; and (d) writing the metadata portions and data portions based on calculated disk offsets; writing the hash store data instance into the extent store; merging a plurality of hash store data instances to a journal store data instance, wherein the journal store data instance is a file format comprising a journal index portion and a journal data portion, the journal index portion comprising a plurality of hash metadata portions having offsets that point to corresponding journal data portions; merging hash metadata portions of the plurality of hash metadata instances into an index file portion of the journal store data instance; adding hash data portions of the plurality of hash metadata instances into data files; and writing the journal store data instance into the extent store.

2. The computer-implemented method of claim 1, wherein writing the log store data instance into an extent store comprises defining a hash for metadata portions of the plurality of data instances, wherein the hash is a cuckoo hash.

3. The computer-implemented method of claim 1, wherein a size of the log store data instance is substantially proportional to the size of a block size of a storage device associated with the log store data instance.

4. The computer-implemented method of claim 1, wherein the hash portion comprises file offsets that point to corresponding hash metadata portions, and wherein the hash metadata portions comprise hash metadata offsets to corresponding hash data portions.

5. The computer-implemented method of claim 1, wherein writing the journal store data instance comprises appending the journal store data instance to an end of an existing journal store data instance.

6. The computer-implemented method of claim 1, wherein writing the log store data instance, the hash store data instance, and the journal store data instance comprises writing the log store data instance, the hash store data instance, and the journal store data instance into append blocks in a compressed format in one or more physical storage containers, wherein the append blocks in compressed format are associated with corresponding logical uncompressed sizes.

7. The media of claim 1, wherein generating the journal store data instance is based on: appending the plurality of hash store data instances to the journal store data instance, wherein appending the plurality of hash store data instances comprises: merging hash metadata portions of the plurality of hash metadata portions into the index file portion of the journal store data instance; and adding hash data portions of the hash metadata portions into data files.

8. The media of claim 7, further comprising: compacting the journal store data instance based on re-writing the index file portion and the data files of the journal store data, wherein compacting the journal store data instance includes performing one or more pending delete operations associated with the journal store data instance.

9. The media of claim 1, wherein the extent store comprises one or more solid state drives.

10. One or more computer storage media having computer-executable instructions embodied thereon that, when executed, by one or more processors, causes the one or more processors to perform a method for storing data in a multi-tier storage system, the method comprising: converting a log store data instance to a hash store data instance, the log store data instance comprising a metadata portion and a data portion, wherein converting the log store data instance comprises: (a) determining a size of the metadata portion of the log store data instance; (b) calculating disk offsets for the metadata portion; (c) copying the metadata portion and the data portion; and (d) writing the metadata portion and the data portion based on calculated disk offsets; merging a plurality of hash store data instances to a journal store data instance, a hash store data instance comprising a single file having a header, a hash, a hash metadata portion, and a hash data portion, wherein merging the plurality of hash store data instances comprises: merging hash metadata portions of the plurality of hash metadata instances into an index file portion of the journal store data instance; and adding hash data portions of the plurality of hash metadata instances into data files.

11. The computer storage media of claim 10, further comprising: compacting the journal store data instance based on re-writing the index file portion and the data files of the journal store data.

12. The computer storage media of claim 10, wherein converting the log store data instance to a hash store data instance is triggered based on one of the following: determining that a log store capacity threshold has been met; or determining an occurrence of a write failure associated with a physical write associated with the log store.

13. The media of claim 10, wherein merging the hash store data instance to the journal store data instance is triggered based on one of the following: determining that a plurality of hash store data instances have met a threshold limit time as hash store data instances; or determining that a count of a plurality of hash store data instances has met a threshold limit.

14. The media of claim 11, wherein compacting the journal store data instance is triggered based one of the following: determining a space constraints threshold limit in a physical store has been met; determining a threshold amount of deletes exist; or determining a predefined time limit threshold that triggers compacting has been met.

15. A system for storing data in a multi-tier storage system comprising: a processor and a memory configured for providing computer program instructions to the processor; a log store component configured for: generating a log store data instance comprising a plurality of data instances, wherein each data instance comprises at metadata portion and a data portion; and writing the log store data instance into an extent store; a hash store component configured for: generating a hash store data instance for a plurality of log store data instances, wherein the hash store data instance is a single file having a header, a hash portion, a hash metadata portion, and a hash data portion, and wherein the hash store component is generated based on a first condition associated with the log store component wherein generating the hash store data instance for the one or more log store data instances is based on converting the one or more log store data instances to the hash stor
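Conversion steps (a) through (d) in claims 1 and 10, together with the offset chain in claim 4 (hash portion to hash metadata, hash metadata to hash data), can be sketched as a single-buffer layout. Everything concrete here is an assumption for illustration: the byte layout, the `HSH1` magic value, the function names, and the use of CRC32 with linear probing (claim 2 names a cuckoo hash, which is not implemented in this sketch). Keys are assumed to be unique.

```python
import struct
import zlib

# Hypothetical on-disk layout: header | hash slots | metadata | data
HEADER = struct.Struct("<4sI")      # magic, number of hash slots
SLOT = struct.Struct("<Q")          # file offset of a metadata entry, 0 = empty
META_FIXED = struct.Struct("<HQI")  # key length, data offset, data length


def convert_log_to_hash(records):
    """Build one hash store blob from (key: bytes, data: bytes) log records."""
    nslots = max(1, 2 * len(records))
    # (a) determine the size of each metadata portion
    meta_sizes = [META_FIXED.size + len(key) for key, _ in records]
    # (b) calculate offsets for the metadata portions, then the data portions
    pos = HEADER.size + nslots * SLOT.size
    meta_offsets = []
    for size in meta_sizes:
        meta_offsets.append(pos)
        pos += size
    data_offsets = []
    for _, data in records:
        data_offsets.append(pos)
        pos += len(data)
    # (c) copy and (d) write metadata and data at the calculated offsets
    buf = bytearray(pos)
    HEADER.pack_into(buf, 0, b"HSH1", nslots)
    for (key, data), m_off, d_off in zip(records, meta_offsets, data_offsets):
        META_FIXED.pack_into(buf, m_off, len(key), d_off, len(data))
        key_start = m_off + META_FIXED.size
        buf[key_start:key_start + len(key)] = key
        buf[d_off:d_off + len(data)] = data
        # Hash portion: slot -> metadata offset (linear probing, an assumption)
        slot = zlib.crc32(key) % nslots
        while SLOT.unpack_from(buf, HEADER.size + slot * SLOT.size)[0] != 0:
            slot = (slot + 1) % nslots
        SLOT.pack_into(buf, HEADER.size + slot * SLOT.size, m_off)
    return bytes(buf)


def lookup(blob, key):
    """Follow hash slot -> metadata entry -> data; return None if absent."""
    _magic, nslots = HEADER.unpack_from(blob, 0)
    slot = zlib.crc32(key) % nslots
    for _ in range(nslots):
        (m_off,) = SLOT.unpack_from(blob, HEADER.size + slot * SLOT.size)
        if m_off == 0:
            return None
        key_len, d_off, d_len = META_FIXED.unpack_from(blob, m_off)
        key_start = m_off + META_FIXED.size
        if blob[key_start:key_start + key_len] == key:
            return blob[d_off:d_off + d_len]
        slot = (slot + 1) % nslots
    return None
```

Computing every offset before any byte is written is what lets the whole hash store be emitted as one file in a single pass, which is the point of steps (a) and (b) preceding (c) and (d).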

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9824092B2 cover?
Data storage systems and processes are provided including processes for handling write and read requests to a storage system. A storage system can include data stores, such as a log store, a hash store and a journal store. Data can be written to a log store, a log store can be converted to a hash store, and hash stores can be merged into a journal store. A storage system can use optimizations i…
Who is the assignee on this patent?
Microsoft Technology Licensing, LLC
What technology area does this patent fall under?
Primary CPC classification G06F17/30153. Mapped technology areas include Physics.
When was this patent published?
Publication date Nov 21, 2017 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).