Archival storage and retrieval system

US9785498B2 · US · B2

Patent metadata
Field: Value
Publication number: US-9785498-B2
Application number: US-201114113806-A
Country: US
Kind code: B2
Filing date: Jun 17, 2011
Priority date: Apr 29, 2011
Publication date: Oct 10, 2017
Grant date: Oct 10, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A highly reliable data archival and retrieval method that enables fine grained control over data availability is implemented across a Quality of Service driven archival system, configured to fragment the data into data and parity chunks for storing onto the storage node. The technique employed by the archival system enables files to be read without having need to access any metadata, thereby tolerating complete loss of such metadata. Further, the Quality of Service driven system architecture improves upon the system performance and throughput by means of a storage node regeneration process which ensures balanced load on participating storage node during various storage, retrieval and regeneration operations.
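To make "data and parity chunks" concrete, here is a minimal sketch of the idea using a single XOR parity chunk. This is a deliberately simplified stand-in: the patent refers to erasure encoding with m parity chunks, whereas this example tolerates the loss of only one chunk. The function names (`fragment`, `recover`, `xor_all`) and the zero-padding rule are assumptions made for illustration and do not come from the patent.

```python
from functools import reduce
from typing import List, Optional, Tuple


def xor_all(chunks: List[bytes]) -> bytes:
    """XOR equal-length byte strings together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*chunks))


def fragment(data: bytes, k: int) -> Tuple[List[bytes], bytes]:
    """Split data into k equal-size chunks (zero-padded) plus one XOR parity chunk."""
    size = -(-len(data) // k)  # ceiling division
    padded = data.ljust(size * k, b"\0")
    chunks = [padded[i * size:(i + 1) * size] for i in range(k)]
    return chunks, xor_all(chunks)


def recover(chunks: List[Optional[bytes]], parity: bytes) -> List[bytes]:
    """Rebuild at most one missing data chunk (marked None) from the parity chunk."""
    missing = [i for i, chunk in enumerate(chunks) if chunk is None]
    if len(missing) > 1:
        raise ValueError("a single XOR parity chunk can repair only one lost chunk")
    repaired = list(chunks)
    if missing:
        present = [chunk for chunk in chunks if chunk is not None]
        repaired[missing[0]] = xor_all(present + [parity])
    return repaired


if __name__ == "__main__":
    data_chunks, parity_chunk = fragment(b"archival storage", k=4)
    damaged = list(data_chunks)
    damaged[2] = None  # simulate a lost chunk
    print(b"".join(recover(damaged, parity_chunk)))
```

A production erasure code (for example Reed-Solomon) would generalize this to m parity chunks, so that any k of the k+m chunks suffice to rebuild the file.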

First claim

Opening claim text (preview).

We claim:

1. A file storage and retrieval system comprising: a processing unit; a plurality of storage nodes; and a memory storing instructions, wherein the processing unit is configured to execute the instructions to: receive a Quality of Service (QoS) levels and uniform resource name for a file enabling fine grained control over file availability stored across the plurality of storage nodes; determine k data chunks and m parity chunks fragmented from file chunks by a file encoding and placement scheme wherein the QoS level associated with the each file specifies presence of desired number m_d of parity chunks and ensures total number of available chunks for any file to be above k+m_min chunks wherein m_min < m_d; determine a file chunk Uniform Resource Name (URN) for the k data chunks and m parity chunks; determine a hash for each of the file chunk URNs; determine a node key for each of the plurality of storage nodes using a hash function; and store the k data chunks and m parity chunks across the plurality of storage nodes based on the node keys and the hashs for the file chunks; a monitoring engine to track the status of lost file chunks stored on any one of the storage nodes for their participation in input/output operations performed on the system and regeneration mechanism on the lost chunks wherein regeneration is delayed as long as minimum number of available chunks is greater than k+m_min.

2. The system of claim 1, wherein the file is fragmented into data chunks and parity chunks by erasure encoding technique, and wherein the data chunks and parity chunks are used to reconstruct the file during file retrieval.

3. The system of claim 1, wherein the QoS level is specified by number of parity chunks and a minimum number of chunks that must always be available in the system.

4. The system of claim 1, wherein the number of data chunks remains fixed for all files of the system while the number of parity chunks vary based on the QoS level.

5. The system of claim 1, wherein a load balancer distributes encoding and decoding load uniformly across of the plurality of front end nodes.

6. The system of claim 1, wherein storing the file chunks includes comparing the hash of a file chunk URN with a node key of a storage node for placing the file chunk on the storage node such that no two data chunks reside on the same storage node.

7. The system of claim 1, wherein the storage node includes at least one of a physical machine with direct attached disks, a physical machine with network attached disks, and virtual machines with virtual disks or a program that access a cloud storage device.

8. The system of claim 1, wherein the storage nodes are further configured to perform regeneration of lost file chunks for subsequent storage of regenerated chunks.

9. The system of claim 1, wherein metadata corresponding to the file chunks is stored in one or more metadata servers and at least a portion of the metadata is also stored in the file chunks.

10. The system of claim 9, wherein the metadata includes at least one of file URN, desired QoS level, object owner and its creation time, checksum of original file, fragmented chunks, and the chunk header.

11. The system of claim 9, wherein the metadata is used when performing a lookup operation for the files.

12. The system of claim 1, wherein hash values are used as a checksum for the data chunks and the parity chunks during retrieval and regeneration.

13. The system of claim 9, wherein the metadata is stored in a hierarchical directory structure in the metadata server.

14. The system of claim 1, wherein a status of the file chunks is reported or dynamically updated as active, inactive, degraded or dead to trigger subsequent regeneration mechanism.
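The placement and delayed-regeneration steps in claims 1 and 6 can be sketched roughly as follows. This is one possible reading, not the patented implementation: it assumes a consistent-hashing style ring in which a chunk goes to the first unused node whose key is at or after the chunk's hash. The chunk URN format, the choice of SHA-256, and the names `place_chunks` and `should_regenerate` are all hypothetical. Claim 6 only requires that no two data chunks share a node; the sketch is stricter and gives every chunk, data or parity, its own node.

```python
import hashlib
from bisect import bisect_left
from typing import Dict, List


def hash_key(text: str) -> int:
    """Map a string (chunk URN or node name) to a 64-bit integer key."""
    return int.from_bytes(hashlib.sha256(text.encode()).digest()[:8], "big")


def place_chunks(file_urn: str, k: int, m: int, nodes: List[str]) -> Dict[str, str]:
    """Assign k data and m parity chunks to distinct nodes by comparing hashes."""
    if k + m > len(nodes):
        raise ValueError("need at least k + m storage nodes")
    ring = sorted((hash_key(node), node) for node in nodes)  # node keys
    keys = [key for key, _ in ring]
    placement: Dict[str, str] = {}
    used = set()
    for index in range(k + m):
        kind = "data" if index < k else "parity"
        chunk_urn = f"{file_urn}/{kind}/{index}"  # hypothetical URN format
        start = bisect_left(keys, hash_key(chunk_urn))
        # Walk the ring from the first node key >= chunk hash, skipping used nodes.
        for step in range(len(ring)):
            node = ring[(start + step) % len(ring)][1]
            if node not in used:
                used.add(node)
                placement[chunk_urn] = node
                break
    return placement


def should_regenerate(available_chunks: int, k: int, m_min: int) -> bool:
    """Regeneration is delayed while available chunks stay above k + m_min."""
    return available_chunks <= k + m_min


if __name__ == "__main__":
    storage_nodes = [f"node-{i}" for i in range(8)]
    for urn, node in place_chunks("urn:example:file1", k=4, m=2, nodes=storage_nodes).items():
        print(urn, "->", node)
    print(should_regenerate(available_chunks=5, k=4, m_min=1))
```

In this sketch the placement depends only on the file URN and the node names, so it can be recomputed without consulting a metadata server, which is in the spirit of the abstract's claim that files can be read even if metadata is lost.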

Assignees

Inventors

Classifications

  • Hash-based (content-based indexing of textual data G06F16/31)

  • Monitoring arrangements for monitoring the status of the computing system or of the computing system component, e.g. monitoring if the computing system is on, off, available, not available (error or fault processing without redundancy G06F11/0703; error detection or correction by redundancy in data representation G06F11/08; error detection or correction of the data by redundancy in operations G06F11/14; error detection or correction by redundancy in hardware G06F11/16)

  • Management specially adapted to peer-to-peer storage networks (topology management mechanisms of peer-to-peer networks H04L67/1042)

  • Details of archiving (lifecycle management in storage systems G06F3/0649; point-in-time backing up or restoration of persistent data G06F11/1446)

  • Distributed file systems

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

Who is the assignee on this patent?
Misra Prateep, Roy Nilanjan, Naskar Soumitra, and 1 more
What technology area does this patent fall under?
Primary CPC classification G06F11/1004. Mapped technology areas include Physics.
When was this patent published?
Publication date Oct 10, 2017 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).