Efficient deduplication in a metadata delta log architecture

US2025238372A1 · US · A1

Patent metadata
Publication number: US-2025238372-A1
Application number: US-202418417463-A
Country: US
Kind code: A1
Filing date: Jan 19, 2024
Priority date: Jan 19, 2024
Publication date: Jul 24, 2025
Grant date: (none listed)

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

In at least one embodiment, destaging a write of a user data (UD) log may include: performing deduplication that determines content written by the write to a logical address LA is a duplicate of existing content; and establishing mapping information of a chain of metadata (MD) pages mapping LA to a physical address PA storing the existing content, wherein the chain includes a MD leaf page and a VLB (virtual layer block) page. An update of a MD log can update an indirect pointer (IDP) field of a MD leaf entry of the MD leaf page to reference a VLB entry of the VLB page where the VLB entry further includes PA. A second update of the MD log can increment a reference count of the VLB entry. The second update can be recorded blindly in the MD log without reading or accessing reference count during destaging the write.
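The "blind" recording described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the class names (`IdpTuple`, `ExtendedIncref`, `MdLog`), the field layout, and the `destage_dedup_write` helper are all hypothetical, and page locking is left out. The point it shows is that both updates are appended to the MD log as one transaction, and the reference count itself is never read on this path.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass(frozen=True)
class IdpTuple:
    """Hypothetical log record: point a MD leaf entry's IDP field at a VLB entry."""
    leaf_page: int
    leaf_entry: int
    vlb_page: int
    vlb_entry: int


@dataclass(frozen=True)
class ExtendedIncref:
    """Hypothetical log record: add one to a VLB entry's reference count.

    It carries a back reference to the IDP tuple it was recorded with.
    """
    vlb_page: int
    vlb_entry: int
    back_ref: IdpTuple
    redirect: bool = False


@dataclass
class MdLog:
    """Toy in-memory MD log; a list append stands in for a transactional commit."""
    records: List[Union[IdpTuple, ExtendedIncref]] = field(default_factory=list)

    def commit(self, txn: List[Union[IdpTuple, ExtendedIncref]]) -> None:
        # All tuples of one transaction are appended together.
        self.records.extend(txn)


def destage_dedup_write(md_log: MdLog, leaf_page: int, leaf_entry: int,
                        vlb_page: int, vlb_entry: int) -> ExtendedIncref:
    """Record both updates for a deduplicated write.

    No reference count is read here: the increment is logged as a delta
    and applied later, when the MD log itself is destaged.
    """
    idp = IdpTuple(leaf_page, leaf_entry, vlb_page, vlb_entry)
    incref = ExtendedIncref(vlb_page, vlb_entry, back_ref=idp)
    md_log.commit([idp, incref])
    return incref


log = MdLog()
destage_dedup_write(log, leaf_page=10, leaf_entry=3, vlb_page=77, vlb_entry=5)
```

In this sketch the cost of a deduplicated write is two log appends, regardless of how many logical addresses already share the content.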

First claim

Opening claim text (preview).

1. A computer-implemented method comprising: receiving a write operation that writes first content to a first target logical address; recording the write operation in a user data (UD) log; destaging the write operation from the UD log including: performing deduplication processing that determines the first content written by the write operation is a duplicate of the first content as currently stored at a first physical address on non-volatile storage; and establishing mapping information of a chain of metadata (MD) pages that maps the first target logical address to the first physical address of the first content, wherein the chain of MD pages includes a MD leaf page and a VLB (virtual layer block) page, and wherein said establishing includes: recording in a MD log two corresponding updates in connection with deduplication of the first content, the two corresponding updates including a first update and a second update, wherein the first update updates an indirect pointer (IDP) field of a first MD leaf entry of the MD leaf page to reference a first VLB entry of the VLB page where the first VLB entry further includes the first physical address of the first content, and wherein the second update increments a reference count of the first VLB entry, wherein said recording includes recording the second update in the MD log without reading or accessing a current value of the reference count of the first VLB entry during said destaging the write operation from the UD log, where the reference count of the first VLB entry denotes a number of logical addresses that reference the first content as stored at the first physical address, wherein the first update of the MD log is an IDP tuple that identifies the first MD leaf entry, and identifies a first address or location of the first VLB entry, wherein the second update of the MD log is an extended increment reference count (incref) tuple that identifies the first VLB entry including the reference count, and includes a back reference to the IDP tuple denoting the first update, and wherein said destaging the write operation includes: acquiring a shared or read lock on the first VLB page and acquiring an exclusive or write lock on the MD leaf page; adding the IDP tuple to a current MD transaction; adding the extended incref tuple to the current MD transaction; and committing the current MD transaction to the MD log, wherein said committing includes transactionally storing the IDP tuple and the extended incref tuple for the write operation in the MD log.

2-4. (canceled)

5. The computer-implemented method of claim 1, wherein the extended incref tuple includes a redirection flag indicating whether redirection resolution is needed for the IDP field of the first MD leaf entry to determine whether a first current value of the IDP field denotes a valid address or location of the first VLB entry.

6. The computer-implemented method of claim 5, wherein said destaging the write operation includes: determining whether redirection resolution of the IDP field of the first MD leaf entry is needed; and responsive to determining redirection resolution of the IDP field of the first MD leaf entry is needed setting the redirection flag of the extended incref tuple to true, and otherwise setting the redirection flag of the extended incref tuple to false.

7. The computer-implemented method of claim 5, further comprising performing MD log destaging of recorded updates to the first VLB page including: aggregating a first set (S) of relevant increment tuples from the MD log that increment the reference count of the first VLB entry, wherein S includes the extended incref tuple; calculating an updated value for the reference count by incrementing a current value of the reference count in accordance with a number of increments denoted by the relevant increment tuples of S; and determining whether the updated value exceeds a maximum allowable value (MAX) for the reference count.

8. The computer-implemented method of claim 7, wherein said MD log destaging of recorded updates to the first VLB page includes: responsive to determining that the updated value does exceed MAX, performing first processing including: calculating an excess value with respect to the updated value of the reference count, wherein the excess value denotes an amount by which the updated value exceeds MAX; creating a second VLB page including a second VLB entry corresponding to the first VLB entry of the first VLB page; and selecting a first number of increment tuples from S, wherein the first number equals the excess value, wherein the first number of increment tuples selected includes the extended incref tuple.

9. The computer-implemented method of claim 8, wherein the first processing includes: identifying, using the back reference of the extended incref tuple, the first MD leaf entry; and recording in the MD log a second IDP tuple that updates the IDP of the first MD leaf entry to reference or point to the second VLB entry of the second VLB.

10. The computer-implemented method of claim 9, wherein said first processing further includes: recording in the MD log a second number of increment reference tuples each incrementing a second reference count of the second VLB entry, wherein the second number equals the excess value.

11. The computer-implemented method of claim 10, wherein the second number of increment reference tuples includes one or more extended incref tuples incrementing the second reference count of the second VLB entry.

12. The computer-implemented method of claim 10, wherein the second number of increment reference tuples includes one or more non-extended incref tuples incrementing the second reference count of the second VLB entry.

13. The computer-implemented method of claim 12, wherein each of the non-extended incref tuples does not include a back reference to a corresponding IDP tuple, and wherein each of the non-extended incref tuples does not include a redirection flag field.

14. The computer-implemented method of claim 13, wherein said first processing includes: persistently storing MAX as a value for the reference count of the first VLB entry.

15. The computer-implemented method of claim 10, wherein said MD log destaging of recorded updates to the first VLB page is included in a first MD log destage cycle, and the method includes performing a second MD log destage cycle that includes destaging from the MD log the second IDP tuple and the second number of increment reference tuples.

16. The computer-implemented method of claim 15, wherein during the first MD log destage cycle, the MD log that is destaged is a first in-memory MD log instance that is in a frozen state; wherein during the first MD log destage cycle, a second in-memory MD log instance is in an active state, and the second IDP tuple and the second number of increment reference tuples are recorded in the second in-memory MD log instance during the first MD log destage cycle; and wherein during the second MD log destage cycle, the second in-memory MD log instance is in the frozen state and the first in-memory MD log instance is in the active state.

17. The computer-implemented method of claim 7, wherein said MD log destaging of recorded updates to the first VLB page includes: determining whether the redirection flag of the extended incref tuple is true; responsive to determining that the redirection flag of the extended incref tuple is true, performing second processing including: determining whether the first current value of the IDP field of the first MD leaf entry is invalid becau
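
The reference-count overflow handling recited in claims 7 through 10 and 14 can be sketched roughly as follows. This is an illustrative reading of the claim language, not the patented implementation: `Incref`, `destage_increfs`, the tuple shapes, and the tiny `MAX_REFCOUNT` value are hypothetical, every increment is assumed to carry a back reference, and picking the most recent increments to redirect is an arbitrary choice made for this sketch.

```python
from dataclasses import dataclass
from typing import List, Tuple

MAX_REFCOUNT = 4  # hypothetical limit, kept tiny so the overflow path is easy to see


@dataclass(frozen=True)
class Incref:
    """Hypothetical extended incref record with a back reference to its leaf entry."""
    vlb_entry: str
    leaf_entry: str


def destage_increfs(current: int, increments: List[Incref], second_vlb_entry: str,
                    max_ref: int = MAX_REFCOUNT) -> Tuple[int, List[tuple]]:
    """Apply aggregated increments to one VLB entry's reference count.

    Returns the value to persist for the first VLB entry and the new MD log
    records that redirect any excess references to a second VLB entry.
    Assumes current <= max_ref on entry.
    """
    updated = current + len(increments)
    if updated <= max_ref:
        return updated, []

    # Amount by which the aggregated count would exceed the limit.
    excess = updated - max_ref
    # Assumption of this sketch: redirect the most recent `excess` increments.
    moved = increments[-excess:]

    new_records: List[tuple] = []
    for inc in moved:
        # Use the back reference to find the leaf entry and repoint its IDP field.
        new_records.append(("IDP", inc.leaf_entry, second_vlb_entry))
        # One increment against the second VLB entry per redirected reference.
        new_records.append(("INCREF", second_vlb_entry))

    # The first VLB entry is persisted at the maximum allowable value.
    return max_ref, new_records


pending = [Incref("vlb1:5", "leafA:0"), Incref("vlb1:5", "leafB:7"), Incref("vlb1:5", "leafC:2")]
stored, follow_up = destage_increfs(current=3, increments=pending, second_vlb_entry="vlb2:5")
```

In the claims, the follow-up records are written to the active MD log instance and applied in a later destage cycle; the sketch simply returns them to the caller.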

Assignees

Inventors

Classifications

  • G06F3/0641 · Primary

    De-duplication techniques · CPC title

  • using page tables, e.g. page table structures · CPC title

  • with multilevel cache hierarchies · CPC title

  • with main memory updating (G06F12/0806 takes precedence) · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2025238372A1 cover?
In at least one embodiment, destaging a write of a user data (UD) log may include: performing deduplication that determines content written by the write to a logical address LA is a duplicate of existing content; and establishing mapping information of a chain of metadata (MD) pages mapping LA to a physical address PA storing the existing content, wherein the chain includes a MD leaf page and a…
Who is the assignee on this patent?
Dell Products Lp
What technology area does this patent fall under?
Primary CPC classification G06F3/0641. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jul 24, 2025 (kind code A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).