Low hiccup time fail-back in active-active dual-node storage systems with large writes

US12411620B2 · US · B2

Patent metadata
Field: Value
Publication number: US-12411620-B2
Application number: US-202318204444-A
Country: US
Kind code: B2
Filing date: Jun 1, 2023
Priority date: Jun 1, 2023
Publication date: Sep 9, 2025
Grant date: Sep 9, 2025

Abstract

Official abstract text for this publication.

Techniques for limiting or reducing storage accessibility “hiccups” in active-active clustered systems that perform large writes. The techniques can include executing, by a surviving, failover, or rebooted node of an active-active clustered system, a specialized recovery protocol that includes treating each large write request from a host computer as a plurality of small write requests while execution of the specialized recovery protocol is in progress, draining all dedicated sub-ubers for a primary and secondary node of the active-active clustered system, and, having completed execution of the specialized recovery protocol, resuming normal treatment of large write requests from the host computer. In this way, the need and complexity of managing large writes and maintaining their corresponding sub-uber information during recovery from a forced reboot, crash, or disaster involving the primary or secondary node can be avoided, and storage accessibility hiccups due to performing the large writes can be limited or reduced.
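
The mode switch described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the patent: the class name SurvivingNode, the method names, and the 4-byte ELEMENT_SIZE are hypothetical choices made for readability, and draining is reduced to emptying a list. The sketch only shows that a chunk is split into per-element small writes while recovery is in progress and is ingested whole otherwise.

```python
from dataclasses import dataclass, field

ELEMENT_SIZE = 4  # bytes per data element (assumed; tiny so the demo is readable)


@dataclass
class SurvivingNode:
    """Toy model of the node running the recovery protocol (hypothetical names)."""
    recovering: bool = False
    small_writes: list = field(default_factory=list)  # small-write path
    sub_uber: list = field(default_factory=list)      # dedicated sub-uber (large-write path)

    def handle_large_write(self, chunk: bytes) -> str:
        if self.recovering:
            # Recovery in progress: treat the chunk as one small write per element.
            for offset in range(0, len(chunk), ELEMENT_SIZE):
                self.small_writes.append(chunk[offset:offset + ELEMENT_SIZE])
            return "small"
        # Normal treatment: ingest the whole chunk into a dedicated sub-uber.
        self.sub_uber.append(chunk)
        return "large"

    def run_recovery(self, writes_during_recovery):
        """Enter recovery, serve host writes as small writes, drain, then resume."""
        self.recovering = True
        paths = [self.handle_large_write(c) for c in writes_during_recovery]
        drained = list(self.sub_uber)  # placeholder for draining the sub-ubers
        self.sub_uber.clear()
        self.recovering = False        # resume normal treatment of large writes
        return paths, drained


node = SurvivingNode()
before = node.handle_large_write(b"AAAABBBB")      # normal path
paths, drained = node.run_recovery([b"CCCCDDDD"])  # peer node went inactive
after = node.handle_large_write(b"EEEEFFFF")       # normal path again
print(before, paths, after, node.small_writes)
```

Because no new chunk reaches a sub-uber while the flag is set, the drain in this sketch only has to deal with chunks ingested before recovery began, which is the simplification the abstract points to.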

First claim

Opening claim text (preview).

What is claimed is:

1. A method of limiting or reducing storage accessibility hiccups in an active-active clustered system that performs user data chunk write operations, the active-active clustered system including a first storage node and a second storage node, the method comprising:
  executing, by the first storage node of the active-active clustered system, a specialized recovery protocol for data and/or metadata associated with the user data chunk write operations, the first storage node being an active node, the specialized recovery protocol comprising, in response to the second storage node of the active-active clustered system transitioning from being an active node to an inactive node:
    treating, by the first storage node, each large write request from among one or more large write requests from a host computer as a plurality of small write requests, the large write request corresponding to a request to write a respective user data chunk containing a plurality of data elements, each small write request corresponding to a request to write a respective data element from among the plurality of data elements, the first storage node and the second storage node having their own dedicated sub-ubers into which one or more user data chunks associated with one or more large write requests have been stored or ingested;
    draining, by the first storage node, one or more dedicated sub-ubers associated with the first storage node;
    returning, by the first storage node, the one or more dedicated sub-ubers associated with the first storage node to a computerized sub-uber manager;
    draining, by the first storage node, one or more dedicated sub-ubers associated with the second storage node; and
    returning, by the first storage node, the one or more dedicated sub-ubers associated with the second storage node to the computerized sub-uber manager, whereby the first storage node takes responsibility for its own dedicated sub-ubers and those of the second storage node with regard to draining and returning them to the computerized sub-uber manager; and
  having completed execution of the specialized recovery protocol, resuming, by the first storage node, normal treatment of large write requests from the host computer.

2. The method of claim 1 wherein the active-active clustered system includes multiple storage tiers, the multiple storage tiers including a page descriptor (PD) tier, a page buffer (PB) tier, and a user data (UD) tier, and wherein resuming normal treatment of large write requests from the host computer includes: for each large write request, performing a user data chunk write operation including logging PD metadata associated with the user data chunk write operation in the PD tier, and storing, directly to the UD tier, a user data chunk associated with the user data chunk write operation, the stored user data chunk being made up of a plurality of data elements.

3. The method of claim 2 wherein storing a user data chunk directly to the UD tier includes storing the user data chunk directly to a respective dedicated sub-uber from among the dedicated sub-ubers in the UD tier, and wherein performing a user data chunk write operation includes performing asynchronous flush operations on the dedicated sub-ubers in the UD tier.

4. The method of claim 2 wherein treating each large write request from the host computer as a plurality of small write requests includes: for each small write request, performing a small write operation including logging PB data associated with the small write operation in the PB tier, logging PD metadata associated with the small write operation in the PD tier, and maintaining in-memory representations of the PB data and the PD metadata based on the logged PB data and the logged PD metadata.

5. The method of claim 1 wherein the first storage node includes a sub-uber resource allocator, and wherein draining the one or more dedicated sub-ubers associated with the first storage node and draining the one or more dedicated sub-ubers associated with the second storage node include: waiting for inflight user data chunk write operations to complete.

6. The method of claim 5 wherein draining the one or more dedicated sub-ubers associated with the first storage node includes purging, from the sub-uber resource allocator, addresses of the one or more user data chunks stored or ingested into the dedicated sub-ubers of the first storage node, wherein draining the one or more dedicated sub-ubers associated with the second storage node includes purging, from the sub-uber resource allocator, addresses of the one or more user data chunks stored or ingested into the dedicated sub-ubers of the second storage node, and wherein the method comprises: having purged the addresses of user data chunks from the sub-uber resource allocator and waited for the inflight user data chunk write operations to complete, performing forced flush operations on the dedicated sub-ubers of the first storage node in the UD tier, and performing forced flush operations on the dedicated sub-ubers of the second storage node in the UD tier.

7. The method of claim 6 comprising: directing the sub-uber resource allocator to allocate storage for additional dedicated sub-ubers in the UD tier.

8. The method of claim 1 wherein the first storage node corresponds to a failover node, wherein the second storage node corresponds to a failback node, and wherein the method comprises: in response to the failover node becoming inactive during execution of the specialized recovery protocol, restarting the specialized recovery protocol by the failback node.

9. A system for limiting or reducing storage accessibility hiccups in an active-active clustered system that performs user data chunk write operations, the active-active clustered system including a first storage node and a second storage node, the system comprising:
  a memory; and
  processing circuitry configured to execute program instructions out of the memory to:
    execute, by the first storage node of the active-active clustered system, a specialized recovery protocol for data and/or metadata associated with user data chunk write operations, the first storage node being an active node, the specialized recovery protocol comprising, in response to the second storage node of the active-active clustered system transitioning from being an active node to an inactive node:
      treating, by the first storage node, each large write request from among one or more large write requests from a host computer as a plurality of small write requests, the large write request corresponding to a request to write a respective user data chunk containing a plurality of data elements, each small write request corresponding to a request to write a respective data element from among the plurality of data elements, the first storage node and the second storage node having their own dedicated sub-ubers into which one or more user data chunks associated with one or more large write requests have been stored or ingested;
      draining, by the first storage node, one or more dedicated sub-ubers associated with the first storage node;
      returning, by the first storage node, the one or more dedicated sub-ubers associated with the first storage node to a computerized sub-uber manager;
      draining, by the first storage node, one or more dedicated sub-ubers associated with the second storage node; and
      returning, by the first storage node, the one or more dedicated sub-ubers associated with the second storage node to the computerized sub-uber manager, whereby the first storage node takes responsibility for its own dedicated sub-ubers and those of the second storage node with regard to draining and returning them to the computerized sub-uber manager; and havin…
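
The drain-and-return steps recited in claims 1, 5, and 6 can be sketched as follows. This is an illustrative Python model under stated assumptions, not the patented implementation: SubUber, Allocator, manager_pool, and the node labels "A" and "B" are hypothetical, waiting and forced flushing are replaced by simple field updates, and handling the survivor's sub-ubers before the peer's simply mirrors the order in which claim 1 lists the steps.

```python
from dataclasses import dataclass, field


@dataclass
class SubUber:
    owner: str            # node this sub-uber is dedicated to
    inflight: int = 0     # chunk writes still in progress
    flushed: bool = False


@dataclass
class Allocator:
    """Hypothetical sub-uber resource allocator holding chunk addresses."""
    addresses: dict = field(default_factory=dict)  # chunk address -> owning node

    def purge(self, owner: str) -> None:
        self.addresses = {a: o for a, o in self.addresses.items() if o != owner}


def drain_and_return(owner, sub_ubers, allocator, manager_pool, log):
    """Drain one node's dedicated sub-ubers, then hand them to the manager."""
    mine = [s for s in sub_ubers if s.owner == owner]
    allocator.purge(owner)       # purge chunk addresses from the allocator
    log.append(("purge", owner))
    for s in mine:
        s.inflight = 0           # stand-in for waiting on in-flight chunk writes
    log.append(("wait", owner))
    for s in mine:
        s.flushed = True         # stand-in for a forced flush in the UD tier
    log.append(("flush", owner))
    manager_pool.extend(mine)    # return the sub-ubers to the sub-uber manager
    log.append(("return", owner))


def recover(survivor, peer, sub_ubers, allocator, manager_pool):
    """The surviving node drains its own sub-ubers, then the inactive peer's."""
    log = []
    drain_and_return(survivor, sub_ubers, allocator, manager_pool, log)
    drain_and_return(peer, sub_ubers, allocator, manager_pool, log)
    return log


sub_ubers = [SubUber("A", inflight=2), SubUber("B", inflight=1), SubUber("B")]
allocator = Allocator({0x10: "A", 0x20: "B"})
manager_pool = []
log = recover("A", "B", sub_ubers, allocator, manager_pool)
print(log)
```

Claim 6 requires only that the purge and the wait both precede the forced flush; the sketch purges first, which is one admissible ordering. Allocating storage for additional dedicated sub-ubers (claim 7) would follow the final return and is left out here.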

Assignees

Inventors

Classifications

  • Vertical data movement, i.e. input-output transfer; data movement between one or more hosts and one or more storage devices · CPC title

  • Non-volatile semiconductor memory device, e.g. flash memory, one time programmable memory [OTP] · CPC title

  • Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS] · CPC title

  • G06F3/0622 · Primary

    in relation to access · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12411620B2 cover?
Techniques for limiting or reducing storage accessibility “hiccups” in active-active clustered systems that perform large writes. The techniques can include executing, by a surviving, failover, or rebooted node of an active-active clustered system, a specialized recovery protocol that includes treating each large write request from a host computer as a plurality of small write requests while ex…
Who is the assignee on this patent?
Dell Products L.P.
What technology area does this patent fall under?
Primary CPC classification G06F3/0622. Mapped technology areas include Physics.
When was this patent published?
Publication date: Sep 9, 2025 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).