Data storage system with threshold-based container splitting in cache flushing structure

US12554641B2 · US · B2

Patent metadata
Field               Value
Publication number  US-12554641-B2
Application number  US-202418636587-A
Country             US
Kind code           B2
Filing date         Apr 16, 2024
Priority date       Apr 16, 2024
Publication date    Feb 17, 2026
Grant date          Feb 17, 2026

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A working-set structure is used to organize cached data for storing to persistent storage, which includes leaf structures and page descriptors (PDs) for data pages to be persisted. Upon adding a new PD located in an address range of an existing leaf structure, a PD population count of the existing leaf structure is compared to a predetermined PD population threshold. When the count is below the threshold, the new PD is incorporated into an existing set of PDs for the existing leaf structure, and otherwise (a) a new leaf structure is created, and (b) the new leaf structure is used for the new PD and later-added PDs in the address range. Flush parallelism is enhanced by avoiding large differences in PD population across a set of leaf structures.
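
The splitting rule in the abstract can be illustrated with a minimal sketch. Everything named here (`Leaf`, `WorkingSet`, `add_pd`, `range_size`, the threshold value of 4) is hypothetical and chosen for illustration; the patent does not give data layouts or numbers. PDs are modeled as bare integer addresses, and a count equal to the threshold is treated as a split, following the abstract's "otherwise" wording.

```python
from dataclasses import dataclass, field

PD_THRESHOLD = 4  # hypothetical value; the source gives no concrete number


@dataclass
class Leaf:
    """One leaf structure covering a fixed address range (illustrative only)."""
    range_start: int
    pds: list = field(default_factory=list)  # PDs modeled as plain addresses

    @property
    def pd_count(self) -> int:
        return len(self.pds)


class WorkingSet:
    """Maps each address range to a chain of leaves; the last leaf is active."""

    def __init__(self, range_size, threshold=PD_THRESHOLD):
        self.range_size = range_size
        self.threshold = threshold
        self.chains = {}  # range start -> list of Leaf, oldest first

    def add_pd(self, address):
        start = address - address % self.range_size
        chain = self.chains.setdefault(start, [Leaf(start)])
        leaf = chain[-1]
        if leaf.pd_count >= self.threshold:
            # Threshold reached: the old leaf keeps its existing PDs, and
            # this PD plus later ones in the range go to a fresh leaf.
            leaf = Leaf(start)
            chain.append(leaf)
        leaf.pds.append(address)
        return leaf
```

With `range_size=100` and a threshold of 4, adding ten PDs at addresses 0 through 9 would leave the range with three leaves holding 4, 4, and 2 PDs, so no single leaf grows far beyond the others.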

First claim

Opening claim text (preview).

What is claimed is:

  1. A method of flushing cached data to persistent storage in a data storage system, comprising:

    using a working-set structure to organize cached data for storing to persistent storage, the working-set structure including respective leaf structures referring to corresponding page descriptors (PDs) for respective data pages to be persisted, the leaf structures being associated with respective distinct address ranges and corresponding sets of PDs; and

    upon adding a new PD to the working-set structure for eventual flushing of the respective data page, the PD being located in an address range of a single existing leaf structure:

    (1) comparing a PD population count of the existing leaf structure to a predetermined PD population threshold, wherein the PD population count comprises a total number of PDs currently associated with the existing leaf structure;

    (2) in response to the PD population count being less than the PD population threshold, incorporating the new PD into an existing set of PDs for the existing leaf structure, and incrementing the PD population count; and

    (3) in response to the PD population count being greater than the PD population threshold, (a) creating a new leaf structure for the address range, and (b) using the new leaf structure instead of the existing leaf structure for a new set of PDs including the new PD and later-added PDs in the address range, while maintaining the existing leaf structure for referencing the existing set of PDs.

  2. The method of claim 1, wherein the PD population threshold is less than a maximum number of PDs that could be contained in a single set of PDs for a single leaf, the maximum number of PDs being associated with a size of individual ones of a plurality of fixed-length storage segments on the persistent storage that are used to store the data pages, wherein the PD population threshold comprises an optimal number of PDs to be processed by individual ones of a plurality of flusher processes, wherein each flusher process persists data pages to a respective one of the fixed-length storage segments in the persistent storage, and wherein the individual flusher processes operate independently and in parallel to persist data pages to respective ones of the plurality of fixed-length storage segments in the persistent storage.

  3. The method of claim 2, wherein the new leaf structure is created with no explicit dependency on the existing leaf structure with respect to the order in which their respective data pages must be flushed, such that the data pages for the new leaf structure are subsequently flushed by a first one of the flusher processes in parallel with flushing of the data pages for the existing leaf structure by a second one of the flusher processes.

  4. The method of claim 3, wherein the new leaf structure is marked with a Parallel Flush Allowed marker to indicate the absence of an order-based dependency on the existing leaf structure.

  5. The method of claim 4, wherein the existing leaf structure is also marked with a Parallel Flush Allowed marker to indicate that one or more later-created leaf structures in the address range, including the new leaf structure, may be processed for flushing of the respective data pages in parallel with the data pages of others of the later-created leaf structures.

  6. The method of claim 5, wherein the parallel flushing of data pages among the others of the later-created leaf structures overrides timeline-based dependencies reflected in a chaining of the existing leaf structure and the later-created leaf structures that reflects sequential creation of the leaf structures and corresponding sequential flushing, and the Parallel Flush Allowed marker overrides the timeline-based dependencies for the respective leaf containers, allowing for the new leaf structure to be designated for flushing even if a later-created leaf structure is not yet flushed.

  7. The method of claim 5, wherein the existing leaf structure and the later-created leaf structures form a chain of leaf structures having dependencies and non-dependencies, the non-dependencies indicated by the Parallel Flush Allowed marker, the dependencies indicated by a Flush Before or Flush After marker indicated the data pages for a respective leaf structure must be flushed before or after, respectively, the data pages for a respective other leaf structure, the dependencies forming flush barriers such that parallel flushing is allowed for all leaf structures in a group between two leaf structures having dependencies, while the dependencies are also observed in the sequence of flushing for the respective dependency-containing leaf structures.

  8. A data storage apparatus, comprising:

    temporary storage;

    long-term persistent storage; and

    processing circuitry coupled to memory configured to flush cached data to persistent storage by:

    using a working-set structure to organize cached data for storing to persistent storage, the working-set structure including respective leaf structures referring to corresponding page descriptors (PDs) for respective data pages to be persisted, the leaf structures being associated with respective distinct address ranges and corresponding sets of PDs; and

    upon adding a new PD to the working-set structure for eventual flushing of the respective data page, the PD being located in an address range of a single existing leaf structure:

    (1) comparing a PD population count of the existing leaf structure to a predetermined PD population threshold, wherein the PD population count comprises a total number of PDs currently associated with the existing leaf structure;

    (2) in response to the PD population count being less than the PD population threshold, incorporating the new PD into an existing set of PDs for the existing leaf structure, and incrementing the PD population count; and

    (3) in response to the PD population count being greater than the PD population threshold, (a) creating a new leaf structure for the address range, and (b) using the new leaf structure instead of the existing leaf structure for a new set of PDs including the new PD and later-added PDs in the address range, while maintaining the existing leaf structure for referencing the existing set of PDs.

  9. The data storage apparatus of claim 8, wherein the PD population threshold is less than a maximum number of PDs that could be contained in a single set of PDs for a single leaf, the maximum number of PDs being associated with a size of individual ones of a plurality of fixed-length storage segments on the persistent storage that are used to store the data pages, wherein the PD population threshold comprises an optimal number of PDs to be processed by individual ones of a plurality of flusher processes, wherein each flusher process persists data pages to a respective one of the fixed-length storage segments in the persistent storage, and wherein the individual flusher processes operate independently and in parallel to persist data pages to respective ones of the plurality of fixed-length storage segments in the persistent storage.

  10. The data storage apparatus of claim 9, wherein the new leaf structure is created with no explicit dependency on the existing leaf structure with respect to the order in which their respective data pages must be flushed, such that the data pages for the new leaf structure are subsequently flushed by a first one of the flusher processes in parallel with flushing of the data pages for the existing leaf structure by a second one of the flusher processes.

  11. The data storage apparatus of claim 10, wherein the new leaf structure is marked with a Parallel Flush Allowed marker to indicate the absence of an orde
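
The flush-barrier behavior described in claim 7 can be sketched roughly as follows. This is one possible reading, not the patented implementation: each leaf is reduced to a hypothetical `(name, parallel_allowed)` pair, the separate Flush Before and Flush After markers are collapsed into a single "not parallel" flag, and `flush_batches`, `flush_chain`, and `flush_leaf` are illustrative names. Runs of parallel-allowed leaves form one batch that is flushed concurrently, while each dependency-carrying leaf is flushed alone and in chain order.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import groupby


def flush_batches(chain):
    """Split a chain of (name, parallel_allowed) pairs into ordered batches."""
    batches = []
    for parallel, group in groupby(chain, key=lambda leaf: leaf[1]):
        names = [name for name, _ in group]
        if parallel:
            batches.append(names)  # one group, eligible for concurrent flushing
        else:
            batches.extend([n] for n in names)  # each barrier leaf goes alone
    return batches


def flush_chain(chain, flush_leaf, workers=4):
    """Flush batches in order; leaves inside one batch run in parallel."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for batch in flush_batches(chain):
            # list() waits for the whole batch before the next batch starts,
            # which is how the barrier ordering is preserved in this sketch.
            list(pool.map(flush_leaf, batch))
```

For a chain such as `[("A", True), ("B", True), ("C", False), ("D", True)]`, leaves A and B may flush concurrently, C flushes only after both finish, and D flushes after C.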

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12554641B2 cover?
A working-set structure is used to organize cached data for storing to persistent storage, which includes leaf structures and page descriptors (PDs) for data pages to be persisted. Upon adding a new PD located in an address range of an existing leaf structure, a PD population count of the existing leaf structure is compared to a predetermined PD population threshold. When the count is below the…
Who is the assignee on this patent?
Dell Products L.P.
What technology area does this patent fall under?
Primary CPC classification G06F12/0802. Mapped technology areas include Physics.
When was this patent published?
Publication date Feb 17, 2026 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 7 related publications on this page (citations in our corpus or others sharing the same primary CPC).