RAID-based globally resource-shared data storage system

US10997025B2 · US · B2

Patent metadata
FieldValue
Publication numberUS-10997025-B2
Application numberUS-202016856133-A
CountryUS
Kind codeB2
Filing dateApr 23, 2020
Priority dateNov 13, 2017
Publication dateMay 4, 2021
Grant dateMay 4, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The data storage system is a RAID-based data storage system in which resources are globally shared. This storage system includes the first number of disks, and the RAID mechanism is used to store data on each disk. The blocks on different disks form stripes, and at least one of the blocks on the stripe stores the parity information, wherein the width of the stripe is less than the first number. The data layout of the data storage system satisfies the following characteristics: any two physical blocks in the stripe are distributed on different disks; the data blocks distributed on each disk are the same, and the distributed parity blocks are also the same; other data in the stripe associated with any piece of disk data is evenly distributed across all the remaining disks. Normal data layout and degraded data layout can be implemented by orthogonal Latin squares. This system can remove the limitation that the number of disks in the normal data storage system is equal to the stripe width, and break the resource isolation between the disk groups. And in the event of a disk failure, this invention can achieve a complete equalization of the reconstructed read load.

First claim

Opening claim text (preview).

The invention claimed is: 1. A Redundant Array of Independent Disks (RAID)-based data storage system in which resources are globally shared, including a first number of disks, wherein a RAID mechanism is used to store data on each disk, blocks on different disks form stripes, and at least one of the blocks in a stripe stores parity information, wherein a width of a stripe is less than the first number, and a data layout of the data storage system satisfies the following characteristics: any two physical blocks in a stripe are distributed on different disks; numbers of data blocks distributed on each disk are the same, and numbers of parity blocks distributed on each disk are the same; and blocks sharing stripes with blocks on any given disk are distributed evenly among all other disks; wherein the first number is a power of a prime number, and the first number is represented by a number n, which is greater than or equal to 4, stripe width is indicated by a number k, and for the number n, (n−1) orthogonal Latin squares being capable of being obtained, and the data layout of the data storage system is generated as follows: k mutual orthogonal Latin squares in (n−1) mutual orthogonal Latin squares are obtained with rows having same element values in the k mutual orthogonal Latin squares ignored, and then all remaining positions in the k mutual orthogonal Latin squares are traversed in a row-first order, combining element values in same row and column into a mapping group, and each mapping group corresponds to one stripe, and a value of each element in each mapping group indicates an ordinal number of a disk on which each block in a corresponding stripe is placed. 2. The data storage system of claim 1 , wherein the k mutual orthogonal Latin squares are generated according to the following theorem: a first row among all the rows of each generated Latin square is ignored and a first mutual orthogonal Latin square in k generated squares is represented by L 0 , assuming the element on the i th row and j th column of the m th orthogonal Latin square to be L m−1 [ij], the mapping group (L 0 [ij], . . . , L m−1 [ij], . . . , L k−1 [ij]) indicates ordinal numbers of disks on which respective blocks on the ((i−1)*n+j) th stripe are placed, wherein a first block is placed on the L 0 [ij] th disk, an m th block is placed on the L m−1 [ij] th disk, and a k th block is placed on the L k−1 [ij] th disk, wherein data for each disk are placed in blocks, theorem: for a complete set of mutually orthogonal Latin squares with its order being a power of a prime number, the i th Latin square f i (i∈[ 1 , n−1]) has an element value fi[x, y]=i·x+y in the x th row and y th column, where the operator “·” and ‘+’ are the multiplication and addition in a finite field. 3. The data storage system of claim 1 , wherein when one disk fails, for each failed stripe associated with the failed disk, data from other disks associated with the failed stripe for calculating reconstructed data are concurrently read, and the reconstructed data are stored in free space reserved on all other disks, and an ordinal number of the disk on which the reconstructed data is written is determined as follows: selecting a Latin square from (n−1) mutual orthogonal Latin squares other than the k mutual orthogonal Latin squares, and referring the selected Latin square as the (k+1) th Latin square, for each failed stripe associated with the failed disk, identifying a position on the Latin squares corresponding to the failed stripe, and obtaining a first element value at the position on the (k+1) th Latin square, the first element value indicating an ordinal number of the disk on which the reconstructed data is placed, and storing the reconstructed data in free space of the disk indicated by the ordinal number. 4. The data storage system of claim 3 , wherein when another disk fails, a Latin square of (n−1) mutual orthogonal Latin squares other than the k mutual orthogonal Latin squares and the (k+1) th Latin square is selected and referred to as the (k+2) th Latin square, for each failed stripe associated with the failed disk, a position on the Latin squares corresponding to the each failed stripe is identified, and a second element value at the position on the (k+2) th Latin square is obtained, the second element value indicating the ordinal number of the disk on which the reconstructed data block is placed, and the reconstructed data block is stored in free space of the disk indicated by the number. 5. The data storage system of claim 1 , wherein when p disks fail simultaneously, a stripe associated with any one of the p disks is determined; for any stripes associated with any of the p failed disks, a number of data blocks in stripes that locate on the p failed disks is determined; a higher recovery priority for a stripe having a larger number of data blocks located in the p failed disks is assigned; and the stripe with higher priority is recovered with priority. 6. The data storage system of claim 1 , wherein the data storage system stores data with different storage templates; for first data to be stored according to a first template; a first corresponding space in the first number of disks is allocated to the first data; and for the first corresponding space in the first number of disks, mapping a relationship between data stripes in the first template; for second data to be stored according to a second template; a second corresponding space in the first number of disks is allocated to the second data; and for the second corresponding space in the first number of disks, mapping a relationship between the data stripes in the second template. 7. The data storage system of claim 6 , wherein the first template and the second template at least differ in one aspect of RAID levels, stripe width, physical block size, and inter-stripe addressing policy. 8. The data storage system of claim 6 , wherein the first corresponding space is denoted as a logical volume, and each logical volume uses a same type of data template as granularity of storage space allocation, wherein an indexing technique is used to track a physical location of each data template in a logical volume, metadata is maintained to realize a map between user volumes and physical space, and the metadata is cached in memory. 9. The data storage system of claim 8 , wherein when a user request arrives, a specific physical access location is located by querying index tables, locating data templates, locating stripes, locating internal block locations of stripes, and calculating global locations. 10. The data storage system of claim 1 , wherein when data to store is desired to be stored in a read-friendly ordering; determining a mapping relationship between stripes and disks wherein a parity block in one stripe is a last data block of the one stripe; and individual stripes are sorted so that an ordinal number of a disk on which the last block in a stripe is located is an ordinal number of a disk on which the first data block in the next stripe is located. 11. The data storage system of claim 1 , wherein when data to store is desired to be stored in a write-friendly ordering; a parity block in one stripe is a last block of the one stripe; individual stripes are sorted so that an ordinal number of a disk on which the last block in a stripe is located is an amount less than an ordinal number of a disk on which the first block in the next stripe is located; and the amount is a row number of a position of a mapping group corresponding to the stripe in the k mutual orthogonal Latin squares.

Assignees

Inventors

Classifications

  • Rebuilding, e.g. when physically replacing a failing disk · CPC title

  • Redundant storage control functionality · CPC title

  • Parity data used in redundant arrays of independent storages, e.g. in RAID systems · CPC title

  • Digital input from, or digital output to, record carriers {, e.g. RAID, emulated record carriers or networked record carriers} · CPC title

  • Parity calculation or recalculation after configuration or reconfiguration of the system · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10997025B2 cover?
The data storage system is a RAID-based data storage system in which resources are globally shared. This storage system includes the first number of disks, and the RAID mechanism is used to store data on each disk. The blocks on different disks form stripes, and at least one of the blocks on the stripe stores the parity information, wherein the width of the stripe is less than the first number.…
Who is the assignee on this patent?
Univ Tsinghua
What technology area does this patent fall under?
Primary CPC classification G06F11/1092. Mapped technology areas include Physics.
When was this patent published?
Publication date Tue May 04 2021 00:00:00 GMT+0000 (Coordinated Universal Time) (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).