Storing data in a log-structured format in a two-tier storage system

US11803469B2 · US · B2

Patent metadata
Field               Value
Publication number  US-11803469-B2
Application number  US-202117410673-A
Country             US
Kind code           B2
Filing date         Aug 24, 2021
Priority date       Aug 24, 2021
Publication date    Oct 31, 2023
Grant date          Oct 31, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The disclosure herein describes storing data using a capacity data storage tier and a smaller performance data storage tier. The capacity data storage tier includes capacity data storage hardware configured to store log-structured leaf pages (LLPs), and the performance data storage tier includes performance data storage hardware. A virtual address table (VAT) includes a set of virtual address entries referencing the LLPs. A tree-structured index includes index nodes referencing the set of virtual address entries of the VAT. Data to be stored is received, and at least a first portion of metadata associated with the received data is stored in the LLPs using the VAT, and at least a second portion of metadata associated with the received data is stored in the performance data storage tier. The architecture reduces space usage of the performance data storage tier.
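
As a rough illustration of the indirection the abstract describes, the following Python sketch models a VAT whose entries reference LLP locations in the capacity tier, and an index whose entries reference VAT entries rather than physical locations. It is a minimal sketch under stated assumptions, not the patented implementation: a sorted list stands in for the tree-structured index, and every class, field, and method name (TwoTierStore, VatEntry, add_leaf, relocate, and so on) is hypothetical and does not come from the patent.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass
class VatEntry:
    """One virtual address entry: references an LLP's current location."""
    segment_id: int   # segment in the capacity tier that holds the LLP
    offset: int       # position of the LLP inside that segment


class TwoTierStore:
    def __init__(self):
        # Capacity tier (large, slower): segment_id -> {offset: LLP payload}.
        self.capacity_segments = {}
        # Performance tier (small, faster): the VAT and the index.
        self.vat = []            # virtual address -> VatEntry
        self.index_keys = []     # sorted first key of each leaf (stand-in for a tree)
        self.index_vaddrs = []   # parallel list: virtual address of each leaf

    def add_leaf(self, first_key, segment_id, offset, page):
        """Store an LLP at (segment_id, offset) and index it by virtual address."""
        self.capacity_segments.setdefault(segment_id, {})[offset] = page
        self.vat.append(VatEntry(segment_id, offset))
        vaddr = len(self.vat) - 1
        pos = bisect_right(self.index_keys, first_key)
        self.index_keys.insert(pos, first_key)
        self.index_vaddrs.insert(pos, vaddr)
        return vaddr

    def read(self, key):
        """Follow index -> VAT entry -> LLP and return the value, if any."""
        pos = bisect_right(self.index_keys, key) - 1
        if pos < 0:
            return None
        entry = self.vat[self.index_vaddrs[pos]]
        page = self.capacity_segments[entry.segment_id][entry.offset]
        return page.get(key)

    def relocate(self, vaddr, new_segment_id, new_offset):
        """Move an LLP; only its VAT entry changes, the index is untouched."""
        entry = self.vat[vaddr]
        page = self.capacity_segments[entry.segment_id].pop(entry.offset)
        self.capacity_segments.setdefault(new_segment_id, {})[new_offset] = page
        entry.segment_id, entry.offset = new_segment_id, new_offset
```

In this sketch, relocating a leaf page rewrites one VAT entry and leaves the index alone, which is the practical benefit of routing index references through virtual addresses in a log-structured layout.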

First claim

Opening claim text (preview).

What is claimed is:

1. A method performed by at least one processor, the method comprising: connecting to a capacity data storage tier including capacity data storage hardware configured to store log-structured leaf pages (LLPs); connecting to a performance data storage tier including performance data storage hardware; generating a virtual address table (VAT) including a set of virtual address entries, wherein the virtual address entries include references to the LLPs; creating a tree-structured index including a set of index nodes, wherein a subset of index nodes of the tree-structured index includes references to the set of virtual address entries of the VAT, the VAT and the tree-structured index being stored in the performance data storage tier; receiving data to be stored; and storing (i) at least a first portion of metadata associated with the received data in the LLPs in the capacity data storage tier using the VAT and (ii) at least a second portion of the metadata associated with the received data in the performance data storage tier; wherein the performance data storage hardware has a lower data storage capacity than the capacity data storage hardware, and the performance data storage hardware has a faster data rate than the capacity data storage hardware.

2. The method of claim 1, further comprising: storing a set of dirty LLPs in a cache in the performance data storage tier based on the received data; detecting a flush operation trigger associated with the cache; grouping the set of dirty LLPs into a new LLP segment based on the detected flush operation trigger; writing the new LLP segment to the capacity data storage tier; and for each dirty LLP in the new LLP segment: identifying a virtual address entry in the VAT that includes a reference to a previous version of the dirty LLP; updating the reference in the identified virtual address entry to a location of the dirty LLP in the new LLP segment; incrementing a usage value for the new LLP segment; and decrementing a usage value of a segment in which the previous version of the dirty LLP is stored.

3. The method of claim 2, further comprising: receiving a write instruction including write data and a target write address; and performing a write operation on a dirty LLP in the cache based on the write data and target write address of the received write instruction, wherein performing the write operation on the dirty LLP causes the flush operation trigger.

4. The method of claim 1, further comprising: receiving a read instruction including a read address; searching the tree-structured index based on the read address; identifying a referenced virtual address entry of the VAT; identifying a referenced LLP in the identified virtual address entry; accessing the identified LLP; and responding to the read instruction with data from the accessed LLP.

5. The method of claim 1, further comprising: receiving a free page instruction including a target address of a target LLP to be freed; searching the tree-structured index based on the target address; identifying a referenced virtual address entry of the VAT; identifying the target LLP referenced in the identified virtual address entry; decrementing a usage value of a segment of the identified target LLP in the capacity data storage tier; and setting an allocation flag of the identified virtual address entry to indicate that the identified virtual address entry is free.

6. The method of claim 1, further comprising: receiving a virtual address entry allocation request; based on a sequential allocation index of the VAT including a virtual address in a valid virtual address range: identifying a first virtual address entry using the virtual address included in the sequential allocation index; setting an allocation flag of the identified first virtual address entry to indicate that the identified first virtual address entry is allocated; providing the identified first virtual address entry in response to the virtual address entry allocation request; and increment the sequential allocation index to a next virtual address; and based on the sequential allocation index including a virtual address outside the valid virtual address range: identifying a first virtual address entry in a free entry list of the VAT; setting an allocation flag of the identified first virtual address entry to indicate that the identified first virtual address entry is allocated; providing the identified first virtual address entry in response to the virtual address entry allocation request; identifying a second virtual address entry to which the identified first virtual address entry is linked; removing the identified first virtual address entry from the free entry list; and setting the second virtual address entry as a new first virtual address entry in the free entry list.

7. The method of claim 1, wherein the metadata is used to identify, classify, or describe the received data, and wherein the second portion is different from the first portion.

8. A system comprising: at least one processor; a capacity data storage tier including capacity data storage hardware configured to store log-structured leaf pages (LLPs); a performance data storage tier including performance data storage hardware; a virtual address table (VAT) including a set of virtual address entries, wherein the virtual address entries include references to the LLPs; a tree-structured index including a set of index nodes, wherein a subset of index nodes of the tree-structured index includes references to the set of virtual address entries of the VAT, the VAT and the tree-structured index being stored in the performance data storage tier; wherein the performance data storage hardware has a lower data storage capacity than the capacity data storage hardware, and the performance data storage hardware has a faster data rate than the capacity data storage hardware; and at least one memory comprising computer program code, the at least one memory and the computer program code configured to, with the at least one processor, cause the at least one processor to: receive data to be stored in the system; and store (i) at least a first portion of metadata associated with the received data in the LLPs in the capacity data storage tier using the VAT and (ii) at least a second portion of the metadata associated with the received data in the performance data storage tier.

9. The system of claim 8, wherein the at least one memory and the computer program code are configured to, with the at least one processor, further cause the at least one processor to: store a set of dirty LLPs in a cache in the performance data storage tier based on the received data; detect a flush operation trigger associated with the cache; group the set of dirty LLPs into a new LLP segment based on the detected flush operation trigger; write the new LLP segment to the capacity data storage tier; and for each dirty LLP in the new LLP segment: identify a virtual address entry in the VAT that includes a reference to a previous version of the dirty LLP; update the reference in the identified virtual address entry to a location of the dirty LLP in the new LLP segment; increment a usage value for the new LLP segment; and decrement a usage value of a segment in which the previous version of the dirty LLP is stored.

10. The system of claim 9, wherein the at least one memory and the computer program code are configured to, with the at least one processor, further cause the at least one processor to: receive a write instruction including write data and a target write address; and perform a metadata update operation on a dirty…
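
The flush, free, and allocation steps recited in claims 2, 5, and 6 can be sketched in Python roughly as follows. This is a hedged illustration only: the Vat class, the flush function, the in-memory lists and dicts standing in for the two tiers, and all field names are assumptions made for the example, and pushing a freed entry onto the free entry list is an assumption the claims do not spell out.

```python
from collections import defaultdict


class Vat:
    """Hypothetical VAT with a sequential allocation index and a free entry list."""

    def __init__(self, size):
        self.location = [None] * size     # vaddr -> (segment_id, slot), or None
        self.allocated = [False] * size   # allocation flag per entry
        self.next_free = [None] * size    # link to the next entry in the free list
        self.seq_index = 0                # sequential allocation index
        self.free_head = None             # first entry of the free entry list

    def allocate(self):
        if self.seq_index < len(self.location):   # index still in the valid range
            vaddr = self.seq_index
            self.seq_index += 1
        elif self.free_head is not None:          # otherwise use the free list
            vaddr = self.free_head
            self.free_head = self.next_free[vaddr]  # linked entry becomes the new head
            self.next_free[vaddr] = None
        else:
            raise MemoryError("no free virtual address entries")
        self.allocated[vaddr] = True
        return vaddr

    def free(self, vaddr, usage):
        """Mark an entry free and decrement the usage value of its segment."""
        loc = self.location[vaddr]
        if loc is not None:
            usage[loc[0]] -= 1
        self.location[vaddr] = None
        self.allocated[vaddr] = False
        self.next_free[vaddr] = self.free_head    # assumption: push onto free list
        self.free_head = vaddr


def flush(dirty, vat, segments, usage, new_segment_id):
    """Group dirty LLPs into one new segment, write it, then repoint VAT entries.

    dirty maps vaddr -> page payload cached in the performance tier.
    """
    items = list(dirty.items())
    segments[new_segment_id] = [page for _, page in items]  # single segment write
    for slot, (vaddr, _) in enumerate(items):
        old = vat.location[vaddr]
        if old is not None:
            usage[old[0]] -= 1                    # previous version is superseded
        vat.location[vaddr] = (new_segment_id, slot)
        usage[new_segment_id] += 1
    dirty.clear()
```

In this sketch, a segment whose usage value drops to zero holds no LLP that any VAT entry still references, so the per-segment counters give a cheap way to spot segments that no longer contain live pages.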

Assignees

VMware, Inc.

Inventors

Classifications

  • with main memory updating (G06F12/0806 takes precedence) · CPC title

  • using page tables, e.g. page table structures · CPC title

  • Trees, e.g. B+trees · CPC title

  • Reliability improvement, data loss prevention, degraded operation etc · CPC title

  • G06F16/13 · Primary

    File access structures, e.g. distributed indices (arrangements of input from, or output to, record carriers G06F3/06) · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11803469B2 cover?
The disclosure herein describes storing data using a capacity data storage tier and a smaller performance data storage tier. The capacity data storage tier includes capacity data storage hardware configured to store log-structured leaf pages (LLPs), and the performance data storage tier includes performance data storage hardware. A virtual address table (VAT) includes a set of virtual address e…
Who is the assignee on this patent?
VMware, Inc.
What technology area does this patent fall under?
Primary CPC classification G06F12/0804. Mapped technology areas include Physics.
When was this patent published?
Publication date Oct 31, 2023 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).