Self healing and restartable multi-steam data backup

US9600487B1 · US · B1

Patent metadata
Field               Value
Publication number  US-9600487-B1
Application number  US-201414320265-A
Country             US
Kind code           B1
Filing date         Jun 30, 2014
Priority date       Jun 30, 2014
Publication date    Mar 21, 2017
Grant date          Mar 21, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Implementations are provided herein for a self-healing scalable ring communication topology that enables a multi-stream restartable backup. A standard network management protocol tree-walk process can be altered to support parallel tree-walk. Parallel tree-walk provides for splitting a backup directory among multiple sessions, where each session, in parallel, can walk the tree of portions of a backup directory and stream the results to separate backup storage devices. By allowing multiple backup sessions to work together to back up a single root directory, the backup process becomes more scalable to very large scale data storage systems. In addition, if a single stream of a multi-stream backup experiences a failure, only that stream need be restarted and other streams of the backup can continue without or with very little interruption.
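
A minimal Python sketch of the parallel tree-walk idea in the abstract, under stated assumptions: directory entries are mapped into a 32-bit hash space, the space is cut into equal contiguous ranges, and each session walks its own range concurrently. The names `entry_hash`, `split_hash_space`, `walk_segment`, and `multi_stream_backup`, the SHA-256-based hash, and the use of in-memory lists in place of backup storage devices are all hypothetical and are not taken from the patent.

```python
from concurrent.futures import ThreadPoolExecutor
import hashlib

HASH_SPACE = 2**32  # assumption: size of the hash space that is split


def entry_hash(name: str) -> int:
    """Map an entry name to a position in the hash space (assumed scheme)."""
    return int.from_bytes(hashlib.sha256(name.encode()).digest()[:4], "big")


def split_hash_space(num_sessions: int) -> list[tuple[int, int]]:
    """Return contiguous, non-overlapping [start, end) ranges, one per session."""
    step = HASH_SPACE // num_sessions
    bounds = [i * step for i in range(num_sessions)] + [HASH_SPACE]
    return list(zip(bounds[:-1], bounds[1:]))


def walk_segment(entries: list[str], start: int, end: int) -> list[str]:
    """Walk one segment in sequential hash order and return its stream."""
    in_range = [(entry_hash(e), e) for e in entries if start <= entry_hash(e) < end]
    return [name for _, name in sorted(in_range)]


def multi_stream_backup(entries: list[str], num_sessions: int) -> list[list[str]]:
    """Run one walker per segment in parallel; each result stands in for one device."""
    segments = split_hash_space(num_sessions)
    with ThreadPoolExecutor(max_workers=num_sessions) as pool:
        futures = [pool.submit(walk_segment, entries, s, e) for s, e in segments]
        return [f.result() for f in futures]
```

Calling `multi_stream_backup(names, 4)` returns four streams whose union is the full entry list. Because every entry falls into exactly one range, a failed stream could in principle be rerun on its own range without touching the others.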

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: receiving a multi-stream backup request associated with a dataset; establishing a first session and a second session among a set of sessions; associating a first stream identifier with the first session and a second stream identifier with the second session; dynamically assigning sessions among the set of sessions with an upstream neighbor and a downstream neighbor based at least in part on associated stream identifiers, wherein each session in the set of sessions has one upstream neighbor and one downstream neighbor; splitting the dataset into a first segment and a second segment among a set of segments based on sequential hash values, wherein segments in the set of segments are associated with at least a segment identifier, a starting hash value, and an ending hash value; storing at least the segment identifier, the starting hash value, and the ending hash value for segments in the set of segments in a segment location table; assigning the first segment to the first session and the second segment to the second session; in parallel, streaming the first segment by the first session to a first backup storage drive among a set of backup storage drives and the second segment by the second session to a second backup storage drive among the set of backup storage drives, wherein the streaming is based on tree-walking a segment, and wherein the tree-walking the segment is based on sequential hash values of the segment; and in response to streaming an entirety of a segment among the set of segments by a session among the set of sessions: communicating among sessions in the set of sessions that the segment cannot be split; and requesting a new segment by the session from at least one of the upstream neighbor of the session or the downstream neighbor of the session.

2. The method of claim 1, further comprising: storing an identifier indicating the assigned upstream neighbor and downstream neighbor for each session in the set of sessions in a session status file.

3. The method of claim 2, wherein the dynamically assigning sessions within the set of sessions with an upstream neighbor and a downstream neighbor includes updating the session status file based on the dynamic assigning.

4. The method of claim 2, further comprising: establishing a third session among the set of sessions, wherein the dynamically assigning sessions within the set of sessions with an upstream neighbor and a downstream neighbor is further based on the session status file.

5. The method of claim 4, dynamically splitting from a segment among the set of segments a new segment among the set of segments based on at least a current entry hash value associated with the streaming of the segment and the ending hash value of the segment; updating the segment location table based on the dynamic splitting; assigning the new segment to the third session; streaming the new segment by the third session to a third backup storage drive among the set of backup storage drives wherein the streaming is based on tree-walking the new segment, and wherein the tree-walking the new segment is based on sequential hash values of the new segment.

6. The method of claim 2, further comprising: detecting change events associated with the session status file; and in response to detecting change events, determining, by a session among the set of sessions, an upstream neighbor status and a downstream neighbor status.

7. The method of claim 6, further comprising: in response to the upstream neighbor status indicating a failure: disconnecting an upstream communication channel of the session with the upstream neighbor; and connecting a new communication channel of the session with a new upstream neighbor.

8. The method of claim 6, further comprising: in response to the downstream neighbor status indicating a failure: disconnecting a downstream communication channel of the session with the downstream neighbor; establishing a new downstream communication channel of the session; and connecting with a new downstream neighbor of the session using the new downstream communication channel.

9. The method of claim 6, further comprising: updating the segment location table based on the detecting change events.

10. The method of claim 1, further comprising: in response to receiving a request for a new segment by a session among the set of sessions from a requesting session, dynamically splitting from a current segment among the set of segments the new segment based on at least a current entry hash value associated with the streaming of the current segment and the ending hash value of the current segment; updating the segment location table based on the dynamic splitting; and sending an identifier associated with the new segment to the requesting session.

11. A system, comprising: a memory for storing data and instructions; and at least one processor that executes the instructions to enable actions, including: receiving a multi-stream backup request associated with a dataset; establishing a first session and a second session among a set of sessions; associating a first stream identifier with the first session and a second stream identifier with the second session; dynamically assigning sessions among the set of sessions with an upstream neighbor and a downstream neighbor based at least in part on associated stream identifiers, wherein each session in the set of sessions has one upstream neighbor and one downstream neighbor; splitting the dataset into a first segment and a second segment among a set of segments based on sequential hash values, wherein segments in the set of segments are associated with at least a segment identifier, a starting hash value, and an ending hash value; storing at least the segment identifier, the starting hash value, and the ending hash value for segments in the set of segments in a segment location table; assigning the first segment to the first session and the second segment to the second session; in parallel, streaming the first segment by the first session to a first backup storage drive among a set of backup storage drives and the second segment by the second session to a second backup storage drive among the set of backup storage drives, wherein the streaming is based on tree-walking a segment, and wherein the tree-walking the segment is based on sequential hash values of the segment; and in response to streaming an entirety of a segment among the set of segments by a session among the set of sessions: communicating among sessions in the set of sessions that the segment cannot be split; and requesting a new segment by the session from at least one of the upstream neighbor of the session or the downstream neighbor of the session.

12. The system of claim 11, wherein the at least one processor further enables actions comprising: storing an identifier indicating the assigned upstream neighbor and downstream neighbor for each session in the set of sessions in a session status file.

13. The system of claim 12, wherein the dynamically assigning sessions within the set of sessions with an upstream neighbor and a downstream neighbor includes updating the session status file based on the dynamic assigning.

14. The system of claim 12, wherein the at least one processor further enables actions comprising: establishing a third session among the set of sessions, wherein the dynamically assigning sessions within the set of sessions with an upstream neighbor and a downstream neighbor is further based on the session status file.

15. The system of claim 14,
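
The segment location table, dynamic splitting, and ring of neighbors recited in the claims can be sketched in Python as follows. This is an illustration under assumptions, not the patented implementation: the class and function names are hypothetical, the split point is chosen as the midpoint between the current entry hash and the ending hash (the claims only say the split is based on those two values), and "upstream" is taken to mean the previous session when sessions are ordered by stream identifier.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    segment_id: int
    start_hash: int
    end_hash: int      # exclusive upper bound (assumption)
    owner: int         # stream identifier of the session walking this segment
    current_hash: int  # hash of the entry currently being streamed


class SegmentLocationTable:
    """In-memory stand-in for the segment location table (hypothetical)."""

    def __init__(self):
        self.segments = {}
        self._next_id = 0

    def add(self, start_hash, end_hash, owner):
        seg = Segment(self._next_id, start_hash, end_hash, owner, start_hash)
        self.segments[seg.segment_id] = seg
        self._next_id += 1
        return seg

    def split(self, segment_id, new_owner):
        """Hand the upper half of the unwalked range to new_owner.

        Returns the new segment, or None when the segment cannot be split.
        """
        seg = self.segments[segment_id]
        remaining = seg.end_hash - seg.current_hash
        if remaining < 2:
            return None
        midpoint = seg.current_hash + remaining // 2  # assumed split policy
        new_seg = self.add(midpoint, seg.end_hash, new_owner)
        seg.end_hash = midpoint  # the original session now stops earlier
        return new_seg


def ring_neighbors(live_stream_ids):
    """Map each live stream id to (upstream, downstream) in a ring by id order."""
    ring = sorted(live_stream_ids)
    n = len(ring)
    return {sid: (ring[(i - 1) % n], ring[(i + 1) % n]) for i, sid in enumerate(ring)}
```

Recomputing `ring_neighbors` over the surviving stream identifiers after a session fails gives each remaining session its new upstream and downstream neighbor, which is one simple way to model the reconnection steps of claims 7 and 8. A session that finishes its segment would ask a neighbor to call `split` on the neighbor's current segment and take over the returned range.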

Assignees

EMC Corp, EMC IP Holding Co LLC

Inventors

Classifications

  • Details of archiving (lifecycle management in storage systems G06F3/0649; point-in-time backing up or restoration of persistent data G06F11/1446) · CPC title

  • to make the backup process non-disruptive · CPC title

  • Management of the backup or restore process · CPC title

  • G06F16/128 (Primary)

    Details of file system snapshots on the file-level, e.g. snapshot creation, administration, deletion (error detection or correction of the data by redundancy in operations or in hardware G06F11/14, G06F11/16) · CPC title

  • Hash-based (content-based indexing of textual data G06F16/31) · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9600487B1 cover?
Implementations are provided herein for a self-healing scalable ring communication topology that enables a multi-stream restartable backup. A standard network management protocol tree-walk process can be altered to support parallel tree-walk. Parallel tree-walk provides for splitting a backup directory among multiple sessions, where each session, in parallel, can walk the tree of portions of a …
Who is the assignee on this patent?
EMC Corp, EMC IP Holding Co LLC
What technology area does this patent fall under?
Primary CPC classification G06F16/128. Mapped technology areas include Physics.
When was this patent published?
Publication date Mar 21, 2017 (B1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).