Data distribution method, data storage method, related apparatus, and system

US2016357440A1 · US · A1

Patent metadata
Publication number: US-2016357440-A1
Application number: US-201615171794-A
Country: US
Kind code: A1
Filing date: Jun 2, 2016
Priority date: Jun 4, 2015
Publication date: Dec 8, 2016
Grant date: (none listed)

Abstract

Official abstract text for this publication.

A data distribution method for improving performance of a distributed storage system includes: receiving, by a data distribution apparatus, a storage instruction of a user, dividing to-be-stored data that the storage instruction instructs to store, into P data segments, determining a storage node group corresponding to each data segment, and finally distributing the data segment to a primary node in the corresponding storage node group.
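
The division and routing steps in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: it borrows the partition size Z and the per-partition key value from the claims, while the names `Segment`, `split_into_segments`, and `group_for`, and the modulo placement rule, are assumptions introduced only for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Segment:
    start: int   # logical start address of the segment
    data: bytes  # payload, never longer than Z
    key: int     # hypothetical key value: index of the logical partition


def split_into_segments(start: int, data: bytes, z: int) -> List[Segment]:
    """Cut a write at partition boundaries so each segment lies in one
    logical partition of size z (the Z of the claims)."""
    if z <= 0:
        raise ValueError("z must be positive")
    segments = []
    offset = 0
    while offset < len(data):
        addr = start + offset
        partition_end = (addr // z + 1) * z  # first address of the next partition
        length = min(partition_end - addr, len(data) - offset)
        segments.append(Segment(addr, data[offset:offset + length], addr // z))
        offset += length
    return segments


def group_for(segment: Segment, num_groups: int) -> int:
    """Assumed placement rule (not from the patent): key modulo group count."""
    return segment.key % num_groups


# A 14-byte write at logical address 5 with Z = 8 yields P = 3 segments.
for seg in split_into_segments(5, bytes(range(14)), 8):
    print(seg.start, len(seg.data), "-> primary node of group", group_for(seg, 2))
```

Only the first segment starts at the address of the write itself; every later segment starts at a partition boundary, so each one maps to exactly one storage node group.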

First claim

Opening claim text (preview).

What is claimed is:

1. A data distribution method, comprising: receiving, by a distributed storage system, a storage instruction, wherein the storage instruction carries to-be-stored data, wherein the distributed storage system stores data by using erasure coding (EC) stripes, each EC stripe comprises a data part and a parity part, the data part of each EC stripe comprises m data blocks, and the parity part of each EC stripe comprises k parity blocks that are obtained after parity coding is performed on the m data blocks, wherein the distributed storage system comprises multiple storage nodes, the multiple storage nodes constitute multiple storage node groups, a quantity of storage nodes comprised in each storage node group is not less than m+k, and one primary storage node is specified in each storage node group, wherein the m and k are both positive integers; dividing, by the distributed storage system, the to-be-stored data into P data segments, wherein each data segment corresponds to one EC stripe, a size of each data segment is not greater than Z, the Z is a size of m data blocks, and the P is a positive integer; determining, by the distributed storage system, a storage node group corresponding to each data segment; and distributing, by the distributed storage system, the data segment to a primary storage node in the determined storage node group corresponding to the data segment.

2. The data distribution method according to claim 1, wherein a logical volume of the distributed storage system comprises multiple logical partitions, each of which has a size of Z and does not overlap with each other, and wherein the dividing, by the distributed storage system, the to-be-stored data into the P data segments comprises: dividing the to-be-stored data into the P data segments according to logical addresses of the to-be-stored data, wherein each data segment falls within one of the logical partitions.

3. The data distribution method according to claim 2, wherein a start address of a first data segment in the P data segments is a start address of the to-be-stored data, and a start address of a p-th data segment in the P data segments is a start address of a logical partition that the p-th data segment falls within, wherein 2≦p≦P.

4. The data distribution method according to claim 2, wherein the method further comprises: presetting and recording a correspondence between each logical partition and each storage node group, wherein the determining the storage node group corresponding to each data segment comprises: determining, according to the correspondence between each logical partition and each storage node group, the storage node group corresponding to a logical partition that the data segment falls within.

5. The data distribution method according to claim 4, wherein each logical partition uniquely corresponds to one key value, and the presetting and recording the correspondence between each logical partition and each storage node group comprises: presetting and recording a key value corresponding to each storage node group, wherein the determining, according to the correspondence between each logical partition and each storage node group, the storage node group corresponding to a logical partition that the data segment falls within, comprises: determining a key value of the data segment according to the logical partition that the data segment falls within; and determining, according to the key value of the data segment, the storage node group corresponding to the data segment.

6. A data storage method, comprising: receiving, by a primary storage node, a first data segment, wherein a size of the first data segment is not greater than Z that is a size of m data blocks, wherein the distributed storage system stores data by using erasure coding (EC) stripes, each EC stripe comprises a data part and a parity part, the data part of each EC stripe comprises m data blocks, and the parity part of each EC stripe comprises k parity blocks that are obtained after parity coding is performed on the m data blocks, wherein the distributed storage system comprises multiple storage nodes, the multiple storage nodes constitute multiple storage node groups, a quantity of storage nodes comprised in each storage node group is not less than m+k, and one primary storage node is specified in each storage node group, and the m and k are both positive integers, and wherein the primary storage node is in any storage node group in the multiple storage node groups; performing, by the primary storage node, erasure coding according to the first data segment to obtain a first EC stripe, wherein the first EC stripe comprises m first data blocks and k first parity blocks; and distributing, by the primary storage node, the first EC stripe to m+k storage nodes to execute storage, wherein each storage node in the m+k storage nodes is responsible for storing any one of the m first data blocks or the k first parity blocks of the first EC stripe.

7. The data storage method according to claim 6, wherein a logical volume of the distributed storage system comprises multiple logical partitions, each of the multiple logical partitions has a size of Z and does not overlap with each other, and the first data segment falls within one of the logical partitions.

8. The data storage method according to claim 7, wherein before the distributing the first EC stripe to m+k storage nodes to execute storage, the method further comprises: receiving a second data segment, wherein the second data segment and the first data segment fall within same logical partition, and logical addresses of the second data segment overlap logical addresses of the first data segment; and performing erasure coding according to the second data segment to obtain a second EC stripe, wherein the second EC stripe comprises m second data blocks and k second parity blocks, wherein the distributing the first EC stripe to m+k storage nodes to execute storage further comprises: determining a serial distribution sequence of the first EC stripe and the second EC stripe; and distributing the first EC stripe and the second EC stripe to the m+k storage nodes in series according to the serial distribution sequence.

9. The data storage method according to claim 8, wherein the determining the serial distribution sequence of the first EC stripe and the second EC stripe, and the distributing the first EC stripe and the second EC stripe to the m+k storage nodes in series according to the serial distribution sequence comprises: if a receiving time of the first data segment is earlier than a receiving time of the second data segment, distributing the first EC stripe to the m+k storage nodes to execute storage, and after receiving a response message indicating that the m+k storage nodes successfully store the first EC stripe, distributing the second EC stripe to the m+k storage nodes to execute storage; or if a receiving time of the first data segment is later than a receiving time of the second data segment, distributing the second EC stripe to the m+k storage nodes to execute storage, and after receiving a response message indicating that the m+k storage nodes successfully store the second EC stripe, distributing the first EC stripe to the m+k storage nodes to execute storage.

10. The data storage method according to claim 7, wherein before the performing erasure coding according to the first data segment to obtain a first EC stripe, the method further comprises: receiving a third data segment, wherein the third data segment and the first data segment fall within the same logical partition, and logical addresses of the third data segment do not overlap logical a
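
The primary-node steps of claim 6 (encode a segment into an EC stripe, then hand one block to each of m+k nodes) might look like the sketch below. The claims do not name a particular erasure code, so this example assumes the simplest case, k = 1 with a single XOR parity block, and zero-pads short segments up to Z. The function names `xor_blocks`, `encode_stripe`, `recover_block`, and `place_stripe` are hypothetical.

```python
from functools import reduce
from typing import Dict, List, Optional


def xor_blocks(blocks: List[bytes]) -> bytes:
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def encode_stripe(segment: bytes, m: int, block_size: int) -> List[bytes]:
    """Return m data blocks followed by one XOR parity block (assumes k = 1)."""
    z = m * block_size  # Z: the size of m data blocks
    if len(segment) > z:
        raise ValueError("segment larger than Z = m * block_size")
    padded = segment.ljust(z, b"\x00")  # assumption: short segments are zero-padded
    data_blocks = [padded[i * block_size:(i + 1) * block_size] for i in range(m)]
    return data_blocks + [xor_blocks(data_blocks)]


def recover_block(stripe: List[Optional[bytes]]) -> List[bytes]:
    """Rebuild at most one missing block (marked None) from the survivors."""
    missing = [i for i, block in enumerate(stripe) if block is None]
    if len(missing) > 1:
        raise ValueError("single XOR parity tolerates only one lost block")
    repaired = list(stripe)
    if missing:
        survivors = [block for block in stripe if block is not None]
        repaired[missing[0]] = xor_blocks(survivors)
    return repaired


def place_stripe(stripe: List[bytes], node_ids: List[str]) -> Dict[str, bytes]:
    """Assign one block to each of the first m+k nodes of the group."""
    if len(node_ids) < len(stripe):
        raise ValueError("group needs at least m+k storage nodes")
    return dict(zip(node_ids, stripe))


stripe = encode_stripe(b"hello world", m=3, block_size=4)
placement = place_stripe(stripe, ["n0", "n1", "n2", "n3", "n4"])
print(sorted(placement))  # the four nodes that each hold one block
```

A production system would more likely use a code such as Reed-Solomon to support k > 1; XOR parity is used here only because it keeps the stripe layout easy to follow.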

Assignees

Huawei Tech Co Ltd

Inventors

Classifications

  • G06F3/064 · Primary

    Management of blocks · CPC title

  • Management of space entities, e.g. partitions, extents, pools · CPC title

  • Improving or facilitating administration, e.g. storage management · CPC title

  • Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS] · CPC title

  • in relation to data integrity, e.g. data losses, bit errors · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

Who is the assignee on this patent?
Huawei Tech Co Ltd
What technology area does this patent fall under?
Primary CPC classification G06F3/064. Mapped technology areas include Physics.
When was this patent published?
Publication date: Dec 8, 2016 (kind code A1). Legal status and post-grant events are not shown on this page.