Change block tracking for transfer of data for backups

US2020026777A1 · US · A1

Patent metadata
Field: Value
Publication number: US-2020026777-A1
Application number: US-201816041697-A
Country: US
Kind code: A1
Filing date: Jul 20, 2018
Priority date: Jul 20, 2018
Publication date: Jan 23, 2020
Grant date:

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

In one approach, a set of data blocks or files is tracked for changes between snapshots. This may be done by a file system filter running in kernel mode. The data blocks or files that are tagged as unchanged are not transferred to backup because there is no need to update since the last backup. Other data blocks and files may be first tested for change, for example by comparing digital fingerprints of the current data versus the previously backed up data, before transferring to backup.
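The transfer decision summarized in the abstract can be sketched as follows. This is a minimal illustration and not the patent's implementation: the function names, the in-memory data structures, and the choice of SHA-256 as the digital fingerprint are all assumptions made for the example.

```python
import hashlib
from typing import Dict, List, Set


def fingerprint(data: bytes) -> str:
    """Digital fingerprint of a block (SHA-256 is an assumption of this sketch)."""
    return hashlib.sha256(data).hexdigest()


def blocks_to_transfer(
    current_blocks: List[bytes],
    unchanged: Set[int],
    previous_fingerprints: Dict[int, str],
) -> List[int]:
    """Return the indices of blocks that should be sent to backup.

    current_blocks: block contents on the compute infrastructure (hypothetical layout).
    unchanged: indices tagged as unchanged since the previous snapshot.
    previous_fingerprints: fingerprints of the previously backed-up blocks.
    """
    to_send = []
    for index, data in enumerate(current_blocks):
        if index in unchanged:
            continue  # tagged unchanged: skip without hashing or transferring
        if previous_fingerprints.get(index) == fingerprint(data):
            continue  # possibly written, but content matches the last backup
        to_send.append(index)  # content differs, or no previous fingerprint exists
    return to_send
```

In this sketch a block with no previous fingerprint is always transferred, which is one plausible way to handle blocks added since the last backup.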

First claim

Opening claim text (preview).

1. In a data management and storage (DMS) system, a method for backup of a next snapshot of a fileset from a compute infrastructure serviced by the DMS system, the method comprising: running a file system filter in kernel mode on the compute infrastructure for a session that begins before the compute infrastructure took a previous snapshot of the fileset and continues until after the compute infrastructure takes a next snapshot of the fileset, the file system filter tracking which data blocks in a group of files have been write accessed during the session; after the end of the session, the file system filter providing the DMS system with tracking data that indicates which data blocks were write accessed during the session; and based on the tracking data, determining whether to transfer data blocks from the compute infrastructure to the DMS system for backup of the next snapshot.

2. The method of claim 1, wherein determining whether to transfer data blocks from the compute infrastructure to the DMS system comprises, for data blocks in the fileset: determining from the tracking data whether the data block was write accessed during the session; and if the data block was not write accessed according to the tracking data, then not transferring the data block from the compute infrastructure to the DMS system.

3. The method of claim 1, wherein determining whether to transfer data blocks from the compute infrastructure to the DMS system comprises, for data blocks in the fileset: determining from the tracking data whether the data block was write accessed during the session; and if the data block was write accessed according to the tracking data, then: transferring a digital fingerprint of the previous snapshot of the data block to the compute infrastructure; causing the compute infrastructure to calculate a digital fingerprint of the data block and to determine whether the digital fingerprints of the data block and of the previous snapshot of the data block are the same; and if the digital fingerprints are not the same, then transferring the data block from the compute infrastructure to the DMS system but, if the digital fingerprints are the same, then not transferring the data block.

4. The method of claim 1, wherein determining whether to transfer data blocks from the compute infrastructure to the DMS system comprises, for data blocks in the fileset: determining from the tracking data whether the data block was write accessed during the session; and if the data block was write accessed according to the tracking data, then transferring the data block from the compute infrastructure to the DMS system.

5. The method of claim 1, wherein the tracking data comprises a bitmap of bits, each bit representing one of the data blocks and indicating whether that data block was write accessed during the session.

6. The method of claim 5, wherein a size of the data block represented by each bit is configurable.

7. The method of claim 5, wherein, during the session, the bitmap is stored in kernel space memory.

8. The method of claim 1, wherein the tracking data comprises a linked list.

9. The method of claim 1, wherein the tracking data comprises a listing of files in the group of files, and a bitmap of bits for each of said files, each bit representing one of the data blocks in said file and indicating whether that data block was write accessed during the session.

10. The method of claim 1, wherein the file system filter maintains a list of sessions.

11. The method of claim 1, wherein the file system filter is automatically called by a file system when the file system makes a write access during the session.

12. The method of claim 1, wherein: the DMS system comprises a DMS cluster of peer DMS nodes, a distributed data store implemented across the peer DMS nodes, and a DMS agent installed on the compute infrastructure; the previous snapshot is stored in the distributed data store; the DMS agent determines whether to transfer data blocks from the compute infrastructure to the DMS system for backup of the next snapshot; jobs to transfer data blocks from the compute infrastructure to the distributed data store are posted to a job queue accessible by the peer DMS nodes; and the peer DMS nodes autonomously fetch and execute jobs from the job queue.

13. The method of claim 12, wherein the DMS agent starts the session and/or stops the session.

14. The method of claim 12, wherein the DMS agent instructs the compute infrastructure to take the next snapshot of the fileset and then stops the session.

15. The method of claim 12, wherein the DMS agent starts the session and then instructs the compute infrastructure to take the previous snapshot of the fileset.

16. The method of claim 12, wherein, after the end of the session, the file system filter provides the DMS agent with the tracking data.

17. The method of claim 12, wherein the DMS agent runs in user mode on the compute infrastructure.

18. In a data management and storage (DMS) system, a method for pulling a snapshot of a fileset from a compute infrastructure serviced by the DMS system, the method comprising: determining whether a data block in the fileset is currently tagged by the DMS system as unchanged; and if the data block is currently tagged as unchanged, then not transferring the data block from the compute infrastructure to the DMS system.

19. The method of claim 18, further comprising: if the data block is not currently tagged as unchanged, then: transferring a digital fingerprint of a previous snapshot of the data block to the compute infrastructure; causing the compute infrastructure to calculate a digital fingerprint of the data block and to determine whether the digital fingerprints of the data block and of the previous snapshot of the data block are the same; and if the digital fingerprints are not the same, then transferring the data block from the compute infrastructure to the DMS system but, if the digital fingerprints are the same, then not transferring the data block.

20. The method of claim 18, wherein: the DMS system comprises a DMS cluster of peer DMS nodes, a distributed data store implemented across the peer DMS nodes, and a DMS agent installed on the compute infrastructure; a previous snapshot is stored in the distributed data store; the DMS agent determines whether to transfer data blocks from the compute infrastructure to the DMS system for backup of the next snapshot; jobs to transfer data blocks from the compute infrastructure to the distributed data store are posted to a job queue accessible by the peer DMS nodes; and the peer DMS nodes autonomously fetch and execute jobs from the job queue.
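The tracking data recited in the claims (a per-file bitmap with one bit per data block and a configurable block size) might look roughly like the sketch below. It is a user-space stand-in only: the claimed file system filter runs in kernel mode, and the class name, method names, and default block size here are hypothetical.

```python
from typing import Dict, List


class WriteTracker:
    """Illustrative per-file write bitmap; not a kernel-mode filter."""

    def __init__(self, block_size: int = 4096) -> None:
        if block_size <= 0:
            raise ValueError("block_size must be positive")
        self.block_size = block_size  # bytes represented by each bit (assumed default)
        self.bitmaps: Dict[str, bytearray] = {}  # one bitmap per tracked file

    def record_write(self, path: str, offset: int, length: int) -> None:
        """Set the bit of every block overlapped by a write at `offset`."""
        if length <= 0:
            return
        first = offset // self.block_size
        last = (offset + length - 1) // self.block_size
        bitmap = self.bitmaps.setdefault(path, bytearray())
        needed = last // 8 + 1  # bytes required to hold bit number `last`
        if len(bitmap) < needed:
            bitmap.extend(bytes(needed - len(bitmap)))  # grow with zeroed bytes
        for block in range(first, last + 1):
            bitmap[block // 8] |= 1 << (block % 8)

    def was_written(self, path: str, block: int) -> bool:
        """Report whether `block` of `path` was write accessed in this session."""
        bitmap = self.bitmaps.get(path, b"")
        byte_index = block // 8
        if byte_index >= len(bitmap):
            return False  # never grown that far, so never written
        return bool(bitmap[byte_index] & (1 << (block % 8)))

    def written_blocks(self, path: str) -> List[int]:
        """List the block numbers of `path` that were write accessed."""
        bitmap = self.bitmaps.get(path, b"")
        return [b for b in range(len(bitmap) * 8) if self.was_written(path, b)]
```

For example, with the assumed 4096-byte block size, a 200-byte write at offset 4000 straddles blocks 0 and 1, so both bits are set. Blocks whose bits remain clear are the ones a backup could skip without reading them.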

Assignees

Rubrik Inc

Inventors

Classifications

  • G06F16/128 · Primary

    Details of file system snapshots on the file-level, e.g. snapshot creation, administration, deletion (error detection or correction of the data by redundancy in operations or in hardware G06F11/14, G06F11/16) · CPC title

  • G06F16/122 · Primary

    using management policies (point-in-time backing up or restoration of persistent data G06F11/1446; file migration policies for HSM systems G06F16/185) · CPC title

  • Using snapshots, i.e. a logical point-in-time copy of the data · CPC title

  • Distributed file systems · CPC title

  • for networked environments · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2020026777A1 cover?
In one approach, a set of data blocks or files is tracked for changes between snapshots. This may be done by a file system filter running in kernel mode. The data blocks or files that are tagged as unchanged are not transferred to backup because there is no need to update since the last backup. Other data blocks and files may be first tested for change, for example by comparing digital fingerpr…
Who is the assignee on this patent?
Rubrik Inc
What technology area does this patent fall under?
Primary CPC classification G06F16/128. Mapped technology areas include Physics.
When was this patent published?
Publication date Jan 23, 2020 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).