Redistributing data in a distributed storage system based on the attributes of the data

US9811262B1 · US · B1

Patent metadata
FieldValue
Publication numberUS-9811262-B1
Application numberUS-201615357632-A
CountryUS
Kind codeB1
Filing dateNov 21, 2016
Priority dateNov 1, 2012
Publication dateNov 7, 2017
Grant dateNov 7, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Accesses to a number of data blocks stored in a distributed storage are observed. Following observation of the accesses, the stored data blocks are redistributed. In one aspect, redistribution of the data blocks includes determining the access patterns for one or more of the data blocks based on the observed accesses, and determining the storage sizes for the one or more data blocks. Thereafter, based on the determined access patterns and determined storage sizes, the one or more data blocks are sorted. Subsequently, the one or more data blocks are redistributed or rebalanced across a number of storage devices of the distributed storage based on the sorting. In one aspect, the one or more data blocks are redistributed according to either a uniform distribution scheme or a proportional distribution scheme.

First claim

Opening claim text (preview).

What is claimed is: 1. A method comprising: sorting a plurality of data blocks into a plurality of categories based at least in part on an access pattern and a size corresponding to respective data blocks of the plurality of data blocks, wherein each category of the plurality of categories is associated with a respective access pattern level and respective data block storage size requirement; redistributing the plurality of data blocks across a plurality of storage devices of a distributed storage based on the sorting of the plurality of data blocks, the redistributing comprising: calculating a target number of data blocks for each of the plurality of storage devices for a first category by dividing a total number of data blocks in the first category by a number of the plurality of storage devices; and redistributing the data blocks in the first category across the plurality of storage devices based on the calculated target number of data blocks for each of the plurality of storage devices. 2. The method of claim 1 , wherein redistributing the plurality of data blocks across the plurality of storage devices based on the sorting comprises uniformly redistributing data blocks having similar access patterns and storage sizes across the plurality of storage devices. 3. The method of claim 1 , wherein redistributing the plurality of data blocks across the plurality of storage devices is further based on a determined performance characteristic for the plurality of storage devices. 4. The method of claim 3 , wherein redistributing the plurality of data blocks across the plurality of storage devices comprises redistributing the plurality of data blocks across the plurality of storage devices in proportion to a determined performance characteristic for the plurality of storage devices. 5. The method of claim 4 , wherein the determined performance characteristic comprises bandwidth of each of the plurality of storage devices. 6. The method of claim 4 , wherein the determined performance characteristic comprises storage capacity of each of the plurality of storage devices. 7. A non-transitory computer readable storage medium executing computer program instructions, the computer program instructions comprising instructions for: sorting a plurality of data blocks into a plurality of categories based at least in part on an access pattern and a size corresponding to respective data blocks of the plurality of data blocks, wherein each category of the plurality of categories is associated with a respective access pattern level and respective data block storage size requirement; redistributing the plurality of data blocks across a plurality of storage devices of a distributed storage based on the sorting of the plurality of data blocks, the redistributing comprising: calculating a target number of data blocks for each of the plurality of storage devices for a first category by dividing a total number of data blocks in the first category by a number of the plurality of storage devices; and redistributing the data blocks in the first category across the plurality of storage devices based on the calculated target number of data blocks for each of the plurality of storage devices. 8. The medium of claim 7 , wherein redistributing the plurality of data blocks across the plurality of storage devices based on the sorting comprises uniformly redistributing data blocks having similar access patterns and storage sizes across the plurality of storage devices. 9. The medium of claim 7 , wherein redistributing the plurality of data blocks across the plurality of storage devices is further based on a determined performance characteristic for the plurality of storage devices. 10. The medium of claim 9 , wherein redistributing the plurality of data blocks across the plurality of storage devices comprises redistributing the plurality of data blocks across the plurality of storage devices in proportion to the determined performance characteristic for the plurality of storage devices. 11. The medium of claim 10 , wherein the determined performance characteristic comprises bandwidth of each of the plurality of storage devices. 12. The medium of claim 10 , wherein the determined performance characteristic comprises storage capacity of each of the plurality of storage devices. 13. A system comprising: a non-transitory computer readable storage medium storing processor-executable computer program instructions, the instructions comprising instructions for: sorting a plurality of data blocks into a plurality of categories based at least in part on an access pattern and a size corresponding to respective data blocks of the plurality of data blocks, wherein each category of the plurality of categories is associated with a respective access pattern level and respective data block storage size requirement; redistributing the plurality of data blocks across a plurality of storage devices of a distributed storage based on the sorting of the plurality of data blocks, the redistributing comprising: calculating a target number of data blocks for each of the plurality of storage devices for a first category by dividing a total number of data blocks in the first category by a number of the plurality of storage devices; and redistributing the data blocks in the first category across the plurality of storage devices based on the calculated target number of data blocks for each of the plurality of storage devices; and a processor for executing the computer program instructions. 14. The system of claim 13 , wherein redistributing the plurality of data blocks across the plurality of storage devices based on the sorting comprises uniformly redistributing data blocks having similar access patterns and storage sizes across the plurality of storage devices. 15. The system of claim 13 , wherein redistributing the plurality of data blocks across the plurality of storage devices is further based on a determined performance characteristic for the plurality of storage devices. 16. The system of claim 15 , wherein redistributing the plurality of data blocks across the plurality of storage devices comprises redistributing the plurality of data blocks across the plurality of storage devices in proportion to the determined performance characteristic for the plurality of storage devices. 17. The system of claim 16 , wherein the determined performance characteristic comprises bandwidth of each of the plurality of storage devices. 18. The system of claim 16 , wherein the determined performance characteristic comprises storage capacity of each of the plurality of storage devices.

Assignees

Inventors

Classifications

  • G06F3/067Primary

    Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS] · CPC title

  • G06F3/0608Primary

    Saving storage space on storage systems · CPC title

  • Free address space management · CPC title

  • by allocating resources to storage systems · CPC title

  • G06F3/064Primary

    Management of blocks · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9811262B1 cover?
Accesses to a number of data blocks stored in a distributed storage are observed. Following observation of the accesses, the stored data blocks are redistributed. In one aspect, redistribution of the data blocks includes determining the access patterns for one or more of the data blocks based on the observed accesses, and determining the storage sizes for the one or more data blocks. Thereafter…
Who is the assignee on this patent?
Quantcast Corp
What technology area does this patent fall under?
Primary CPC classification G06F3/067. Mapped technology areas include Physics.
When was this patent published?
Publication date Tue Nov 07 2017 00:00:00 GMT+0000 (Coordinated Universal Time) (B1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).