De-duplicating distributed file system using cloud-based object store

US2016292178A1 · US · A1

Patent metadata
Field: Value
Publication number: US-2016292178-A1
Application number: US-201514675425-A
Country: US
Kind code: A1
Filing date: Mar 31, 2015
Priority date: Mar 31, 2015
Publication date: Oct 6, 2016
Grant date: not listed

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Techniques to provide a de-duplicating distributed file system using a cloud-based object store are disclosed. In various embodiments, a request to store a file comprising a plurality of chunks of file data is received. A determination to store at least a subset of the plurality of chunks is made. The request is responded to at least in part by providing an indication to store two or more chunks comprising the at least a subset of the plurality of chunks comprising the file as a single stored object that includes the combined chunk data of said two or more chunks.
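
The request/response flow in the abstract can be sketched roughly as below. This is an illustration under stated assumptions, not the patent's implementation: the names `MetadataServer`, `handle_store_request`, `StoreResponse`, and `assemble_blob`, the in-memory hash index, the choice of SHA-256, and the `blob-<n>` locator format are all hypothetical.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict, List, Optional


def chunk_hash(data: bytes) -> str:
    """Identify a chunk by its content hash (SHA-256 is an assumption)."""
    return hashlib.sha256(data).hexdigest()


@dataclass
class StoreResponse:
    blob_id: str                # hypothetical locator for the single stored object
    chunks_to_store: List[str]  # hashes of the chunks to combine into that object


@dataclass
class MetadataServer:
    # Hypothetical in-memory index: chunk hash -> blob that holds the chunk.
    known_chunks: Dict[str, str] = field(default_factory=dict)
    next_blob: int = 0

    def handle_store_request(self, chunk_hashes: List[str]) -> Optional[StoreResponse]:
        """Decide which chunks are new and tell the client to store them as one blob."""
        new: List[str] = []
        for h in chunk_hashes:
            # Skip chunks already stored, and duplicates within this request.
            if h not in self.known_chunks and h not in new:
                new.append(h)
        if not new:
            return None  # every chunk is already stored; nothing to upload
        blob_id = f"blob-{self.next_blob}"
        self.next_blob += 1
        for h in new:
            self.known_chunks[h] = blob_id
        return StoreResponse(blob_id, new)


def assemble_blob(chunks_by_hash: Dict[str, bytes], response: StoreResponse) -> bytes:
    """Client side: concatenate the requested chunks into a single object."""
    return b"".join(chunks_by_hash[h] for h in response.chunks_to_store)
```

In this sketch a client would send the hashes of a file's chunks, receive a `StoreResponse`, and upload the result of `assemble_blob` to the object store under `blob_id`. A later file that shares chunks with an earlier one would be asked to upload only the chunks the server has not yet seen.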

First claim

Opening claim text (preview).

What is claimed is:

1. A method of storing file system data, comprising: receiving a request to store a file comprising a plurality of chunks of file data; determining to store at least a subset of the plurality of chunks; and responding to the request at least in part by providing an indication to store two or more chunks comprising the at least a subset of the plurality of chunks comprising the file as a single stored object that includes the combined chunk data of said two or more chunks.

2. The method of claim 1, wherein the request is received from a file system client.

3. The method of claim 1, wherein the request includes a hash or other representation of the chunks comprising the file.

4. The method of claim 3, wherein the hash or other representation is used to determine whether a chunk comprising the file has already been stored by the file system.

5. The method of claim 4, wherein determining to store at least a subset of the plurality of chunks includes using the respective hash or other representation of the chunks comprising the at least a subset of the plurality of chunks to determine that said chunks have not yet been stored by the file system.

6. The method of claim 1, wherein said indication to store two or more chunks comprising the at least a subset of the plurality of chunks comprising the file as a single stored object includes a locator or other identifier associated with the single stored object and data associating said at least a subset of the plurality of chunks comprising the file with the locator or other identifier.

7. The method of claim 6, wherein the locator or other identifier indicates a location to which the single stored object should be sent to be stored.

8. The method of claim 1, wherein the request is sent by a file system client that is configured to respond to said response to the request by assembling the single stored object using said two or more chunks.

9. The method of claim 1, further comprising storing metadata that associates said two or more chunks with said single stored object.

10. The method of claim 9, wherein said metadata includes for each chunk a location of data comprising that chunk with the single stored object.

11. The method of claim 1, further comprising verifying storage of said two or more chunks at least in part by retrieving said single stored object and using the chunk data of the respective chunks comprising the single stored object to verify the chunk data as stored corresponds to the chunk data that was expected to be stored.

12. The method of claim 11, further comprising using a “chunks remaining to be verified” counter to determine how long to maintain a copy of the single stored object in a local cache.

13. The method of claim 1, further comprising storing for the single stored object a reference count that reflects a number of chunks comprising the blob that remain subject to being retained by the file system.

14. The method of claim 1, further comprising decrementing the reference count based at least in part on a determination that a chunk comprising the blob no longer remains subject to being retained by the file system.

15. A system to store file system data, comprising: a communication interface; and a processor coupled to the communication interface and configured to: receive via the communication interface a request to store a file comprising a plurality of chunks of file data; determine to store at least a subset of the plurality of chunks; and respond to the request at least in part by providing an indication to store two or more chunks comprising the at least a subset of the plurality of chunks comprising the file as a single stored object that includes the combined chunk data of said two or more chunks.

16. The system of claim 15, wherein the request includes a hash or other representation of the chunks comprising the file.

17. The system of claim 16, wherein the hash or other representation is used to determine whether a chunk comprising the file has already been stored by the file system.

18. The system of claim 17, wherein determining to store at least a subset of the plurality of chunks includes using the respective hash or other representation of the chunks comprising the at least a subset of the plurality of chunks to determine that said chunks have not yet been stored by the file system.

19. The system of claim 15, wherein the processor is further configured to verify storage of said two or more chunks at least in part by retrieving said single stored object and using the chunk data of the respective chunks comprising the single stored object to verify the chunk data as stored corresponds to the chunk data that was expected to be stored.

20. A computer program product to store file system data, the computer program product being embodied in a non-transitory computer readable storage medium and comprising computer instructions for: receiving a request to store a file comprising a plurality of chunks of file data; determining to store at least a subset of the plurality of chunks; and responding to the request at least in part by providing an indication to store two or more chunks comprising the at least a subset of the plurality of chunks comprising the file as a single stored object that includes the combined chunk data of said two or more chunks.
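
The bookkeeping described in the dependent claims (per-chunk locations within the stored object, verification after retrieval, a "chunks remaining to be verified" counter, and a reference count) might look like the sketch below. It is one possible reading, not the claimed implementation: `BlobRecord`, `pack`, `verify`, and `release_chunk` are hypothetical names, and the use of byte offsets and SHA-256 is an assumption.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class BlobRecord:
    # chunk hash -> (offset, length) of that chunk's data inside the blob
    extents: Dict[str, Tuple[int, int]] = field(default_factory=dict)
    ref_count: int = 0   # chunks in this blob that are still retained
    to_verify: int = 0   # "chunks remaining to be verified" counter


def pack(chunks: List[bytes]) -> Tuple[bytes, BlobRecord]:
    """Concatenate distinct chunks and record where each one lands."""
    record = BlobRecord()
    parts: List[bytes] = []
    offset = 0
    for data in chunks:
        h = hashlib.sha256(data).hexdigest()
        if h in record.extents:
            continue  # duplicate chunk: store its data only once
        record.extents[h] = (offset, len(data))
        parts.append(data)
        offset += len(data)
    record.ref_count = record.to_verify = len(record.extents)
    return b"".join(parts), record


def verify(blob: bytes, record: BlobRecord) -> bool:
    """Re-hash each chunk's extent in a retrieved blob and compare to the expected hash."""
    record.to_verify = len(record.extents)
    for h, (off, length) in record.extents.items():
        if hashlib.sha256(blob[off:off + length]).hexdigest() != h:
            return False  # stored data does not match what was expected
        record.to_verify -= 1
    # Counter at zero: a locally cached copy of the blob is no longer needed.
    return True


def release_chunk(record: BlobRecord, h: str) -> bool:
    """Drop one chunk's reference; return True when no retained chunks remain."""
    if record.extents.pop(h, None) is not None:
        record.ref_count -= 1
    return record.ref_count == 0
```

When `release_chunk` returns `True`, no chunk in the blob is still retained, so under this reading the whole stored object could be deleted in one operation rather than chunk by chunk.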

Assignees

Inventors

Classifications

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2016292178A1 cover?
Techniques to provide a de-duplicating distributed file system using a cloud-based object store are disclosed. In various embodiments, a request to store a file comprising a plurality of chunks of file data is received. A determination to store at least a subset of the plurality of chunks is made. The request is responded to at least in part by providing an indication to store two or more chunk…
Who is the assignee on this patent?
EMC Corp
What technology area does this patent fall under?
Primary CPC classification G06F17/30159. Mapped technology areas include Physics.
When was this patent published?
Publication date: Oct 6, 2016 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).