What technology area does this patent fall under?

Primary CPC classification G06F3/061. Mapped technology areas include Physics.

When was this patent published?

Publication date: Aug 22, 2017 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).

Methods for managing storage in a data storage cluster with distributed zones based on parity values and devices thereof

US9740403B2 · US · B2

Patent metadata
Field	Value
Publication number	US-9740403-B2
Application number	US-201514636055-A
Country	US
Kind code	B2
Filing date	Mar 2, 2015
Priority date	May 23, 2012
Publication date	Aug 22, 2017
Grant date	Aug 22, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title: What the patent document calls the invention.
Abstract: A short plain-language summary of the technical disclosure.
Assignees and inventors: Who owns or filed the patent and who is credited as inventor.
Key dates: Filing, priority, publication, and grant dates set the timeline.
First independent claim: The legal scope of protection. Read this for what is actually claimed.
CPC / IPC classifications: Technology tags used to group this patent with similar filings.
Citations and related patents: Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Techniques for a data storage cluster and a method for maintaining and updating reliability data and reducing data communication between nodes, are disclosed herein. Each data object is written to a single data zone on a data node within the data storage cluster. Each data object includes one or more data chunks, and the data chunks of a data object are written to a data node in an append-only log format. When parity is determined for a reliability group including the data zone, there is no need to transmit data from other data nodes where the rest of data zones of the reliability group reside. Thus, inter-node data communication for determining reliability data is reduced.
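The abstract's central point, that parity for a reliability group can be updated without reading the other data zones, can be illustrated with a small sketch. The text on this page does not name the parity function, so the sketch assumes single-parity XOR and assumes unwritten zone space reads as zeros. Under those assumptions, a chunk appended at a fresh offset replaces zeros, so the new parity is simply the old parity XORed with the new chunk. The names `DataZone`, `ParityZone`, `store_chunk`, and `rebuild` are hypothetical and do not come from the patent.

```python
def xor_bytes(a, b):
    """Byte-wise XOR of two equal-length byte sequences."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


class DataZone:
    """Append-only data zone (assumption: unwritten space is zero-filled)."""

    def __init__(self, size):
        self.buf = bytearray(size)
        self.write_ptr = 0  # next free offset; only ever increases

    def append(self, chunk):
        """Write chunk at the current end of the log and return its offset."""
        offset = self.write_ptr
        end = offset + len(chunk)
        if end > len(self.buf):
            raise ValueError("zone is full")
        self.buf[offset:end] = chunk
        self.write_ptr = end
        return offset


class ParityZone:
    """Parity zone held on a different node than the data zones."""

    def __init__(self, size):
        self.buf = bytearray(size)

    def fold_in(self, offset, chunk):
        """XOR a newly written chunk into the parity at the same offset."""
        end = offset + len(chunk)
        if end > len(self.buf):
            raise ValueError("offset outside parity zone")
        self.buf[offset:end] = xor_bytes(self.buf[offset:end], chunk)


def store_chunk(zone, parity, chunk):
    """Append to the data zone, then send only this chunk to the parity node."""
    offset = zone.append(chunk)
    parity.fold_in(offset, chunk)  # no other data zone is read here
    return offset


def rebuild(parity, surviving_zones, offset, length):
    """Recover a lost zone's bytes from the parity and the surviving zones."""
    out = bytes(parity.buf[offset:offset + length])
    for zone in surviving_zones:
        out = xor_bytes(out, zone.buf[offset:offset + length])
    return out


# Two data zones on different nodes share one parity zone on a third node.
zone_a, zone_c = DataZone(8), DataZone(8)
parity_b = ParityZone(8)
store_chunk(zone_a, parity_b, b"AAAA")
store_chunk(zone_c, parity_b, b"\x01\x02\x03\x04")
recovered = rebuild(parity_b, [zone_c], 0, 4)  # pretend zone_a was lost
```

In this sketch each `store_chunk` call touches only the writing node and the parity node, which mirrors the reduced inter-node traffic the abstract describes. Reading the other zones is needed only in `rebuild`, after a failure.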

First claim

Opening claim text (preview).

What is claimed is:
1. A method, comprising: receiving, at a first node of a plurality of nodes within a data storage cluster, a request for storing a data object including one or more data chunks, wherein a signature of each of the one or more data chunks is determined and is sent to a metadata server of the data storage cluster; writing, by the first node, the received one or more data chunks to a data zone in an append-only log format upon determining the data zone to write the received one or more data chunks, wherein the data zone is assigned to a reliability group defined across more than one of the plurality of nodes within the data storage cluster; sending, by the first node, the written one or more data chunks of the data object to a second node of the plurality of nodes within the data storage cluster, wherein the second node includes a parity zone assigned to the reliability group to which the data zone of the first node is assigned; determining parity chunks for the reliability group at the second node based on the sent one or more data chunks wherein the determining of the parity values does not require use of information from nodes other than the first and second nodes; and writing, by the first node, the determined parity chunks to a parity zone of the second node in the append-only log format.
2. The method of claim 1, wherein each parity chunk of the parity chunks is written to the parity zone of the second node at an offset at which a corresponding data chunk of the data chunks is written to the data zone of the first node.
3. The method of claim 1, wherein the signature of the data chunks are determined by a hash function.
4. The method of claim 1, further comprising: deduplicating, by the first node, one or more data chunks when a metadata server matches the signature of the deduplicated data chunks with one or more entries in a global chunk map; transmitting to the first node the locations of the deduplicated data chunks according to the global chunk map; and instructing, by the first node, to write the data chunks other than the deduplicated data chunks to the data zone of the first node.
5. The method of claim 4, further comprising: recording by first node, the locations of the deduplicated data chunks to an object record of the data object in the first node, wherein the object record is an inode.
6. The method of claim 1, further comprising: determining, by the first node, to clean a portion of the data zone of the first node, when the portion is no longer allocated to any data objects stored in the data storage cluster; sending, by the first node, data in the portion of the data zone of the first node to the second node; cleaning, by the first node, the portion of the data zone by marking the portion with a predetermined value; and updating, by the first node, a corresponding portion of the parity zone of the second node, by combining the data in the portion of the data zone from data of the corresponding portion of the parity zone.
7. The method of claim 1, wherein the data chunks are written to the data zone of the first node in an append-only log format so that the data zone is being written in an increasing order.
8. The method of claim 1, wherein a second data zone of a third node of the plurality of nodes within the data storage cluster is assigned to the reliability group, along with the data zone of the first node and the parity zone of the second node, and wherein the parity chunks determination on the second node after receiving data chunks from the first node does not require use of data from the second data zone of the third node assigned to the reliability group.
9. The method of claim 1, further comprising: receiving, by first node, a request for storing a second data object including one or more data chunks; writing, by the first node, the data chunks of the second data object to a second data zone of a third node in an append-only log format, wherein the second data zone is assigned to the reliability group to which the data zone of the first node and the parity zone of the second node are assigned; sending, by the first node, the data chunks of the second data object to the second node of the plurality of nodes within the data storage cluster; and updating, by the first node, parity values for the reliability group at the second node based on the data chunks of the second data object received by the second node, wherein the updating of the parity values does not require use of information from nodes other than the second and third nodes.
10. A non-transitory computer readable medium having stored thereon instructions for managing storage comprising executable code which when executed by one or more processors, causes the processors to perform steps comprising: receiving a request for storing a data object including one or more data chunks, wherein a signature of each of the one or more data chunks is determined and is sent to a metadata server of the data storage cluster; writing the received one or more data chunks to a data zone of a first node in an append-only log format upon determining the data zone to write the received one or more data chunks, wherein the data zone is assigned to a reliability group defined across more than one of the plurality of nodes within the data storage cluster; sending the written one or more data chunks of the data object to a second node of the plurality of data nodes within the data storage cluster, wherein the second node includes a parity zone assigned to the reliability group to which the data zone of the first node is assigned; determining parity chunks for the reliability group at the second node based on the sent one or more data chunks wherein the determining of the parity values does not require use of information from nodes other than the first and second nodes; and writing the determined parity chunks to a parity zone of the second node in the append-only log format.
11. The medium as set forth in claim 10 wherein each parity chunk of the parity chunks is written to the parity zone of the second node at an offset at which a corresponding data chunk of the data chunks is written to the data zone of the first node.
12. The medium as set forth in claim 10 further comprising: wherein the signature of the data chunks are determined by a hash function; deduplicating one or more data chunks when a metadata server matches the signature of the deduplicated data chunks with one or more entries in a global chunk map; transmitting to the first node the locations of the deduplicated data chunks according to the global chunk map; and instructing to write the data chunks other than the deduplicated data chunks to the data zone of the first node.
13. The medium as set forth in claim 10 wherein: the data chunks are written to the data zone of the first node in an append-only log format so that the data zone is being written in an increasing order; and wherein a second data zone of a third node of the plurality of nodes within the data storage cluster is assigned to the reliability group, along with the data zone of the first node and the parity zone of the second node, and wherein the parity chunks determination on the second node after receiving data chunks from the first node does not require use of data from the second data zone of the third node assigned to the reliability group.
14. The medium as set forth in claim 10 further comprising: receiving a request for storing a second data object including one or more data chunks; writing the data chunks of the second data object to a second data zone of a third data node
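The signature-based deduplication in claims 3, 4, and 12 can be sketched roughly as follows. The claims only say "a hash function", so SHA-256 is an assumption here, as are the in-memory dictionary standing in for the global chunk map and the Python list standing in for the append-only data zone. `MetadataServer`, `signature`, and `store_object` are hypothetical names for illustration only.

```python
import hashlib


def signature(chunk):
    """Chunk signature (assumption: SHA-256; the claims name no specific hash)."""
    return hashlib.sha256(chunk).hexdigest()


class MetadataServer:
    """Hypothetical stand-in for the metadata server and its global chunk map."""

    def __init__(self):
        # signature -> (node_id, index of the chunk in that node's zone log)
        self.global_chunk_map = {}

    def lookup(self, signatures):
        """Return locations for the signatures that are already stored."""
        return {s: self.global_chunk_map[s]
                for s in signatures if s in self.global_chunk_map}

    def register(self, sig, location):
        """Record where a newly written chunk lives."""
        self.global_chunk_map.setdefault(sig, location)


def store_object(node_id, zone_log, chunks, meta):
    """Store an object's chunks and return its object record.

    The record is a list of (node_id, index) locations, one per chunk.
    Only chunks whose signatures are unknown are appended to zone_log.
    """
    sigs = [signature(c) for c in chunks]
    known = meta.lookup(sigs)  # the server matches signatures against its map
    record = []
    for sig, chunk in zip(sigs, chunks):
        if sig not in known:
            zone_log.append(chunk)  # append-only write of a new chunk
            location = (node_id, len(zone_log) - 1)
            meta.register(sig, location)
            known[sig] = location  # also covers repeats within this object
        record.append(known[sig])
    return record


meta = MetadataServer()
log = []
first = store_object("node1", log, [b"aa", b"bb"], meta)
second = store_object("node1", log, [b"bb", b"cc"], meta)  # b"bb" is reused
```

After both calls in this sketch, `log` holds three chunks rather than four, and the second object's record points at the existing copy of `b"bb"`. This corresponds to recording the locations of deduplicated chunks in the object record, as in claim 5.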

Assignees

Netapp Inc

Inventors

Classifications

G06F2201/84 · Using snapshots, i.e. a logical point-in-time copy of the data
G06F3/0619 · in relation to data integrity, e.g. data losses, bit errors
G06F3/061 (Primary) · Improving I/O performance
G06F3/064 · Management of blocks
G06F3/067 · Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]

Patent family

Related publications grouped by family.

View patent family 52575200

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9740403B2 cover?: Techniques for a data storage cluster and a method for maintaining and updating reliability data and reducing data communication between nodes, are disclosed herein. Each data object is written to a single data zone on a data node within the data storage cluster. Each data object includes one or more data chunks, and the data chunks of a data object are written to a data node in an append-only …
Who is the assignee on this patent?: Netapp Inc