Filesystem block sampling to identify user consumption of storage resources

US10346355B2 · US · B2

Patent metadata
Field               Value
Publication number  US-10346355-B2
Application number  US-201715854447-A
Country             US
Kind code           B2
Filing date         Dec 26, 2017
Priority date       Dec 23, 2016
Publication date    Jul 9, 2019
Grant date          Jul 9, 2019

Abstract

Official abstract text for this publication.

Providing a statistical analysis of all files in a file system based on random sampling of data blocks to identify individual user consumption of file system resources and characteristics of the files stored in the file system. In one or more of the various embodiments, the file system is based on information for a plurality of cylinder groups. Also, each cylinder group may include at one or more known locations at least three types of data structures that enable reverse mapping of data blocks to root directories.
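
To illustrate the idea in the abstract, here is a minimal sketch (not the patented implementation) of how a uniform random sample of allocated blocks could yield an estimate of each user's share of storage. The function name `estimate_shares`, the owner labels, and the normal-approximation margin of error are assumptions made for illustration; the text on this page does not specify which statistical method is used.

```python
import math
from collections import Counter
from statistics import NormalDist


def estimate_shares(sampled_owners, confidence=0.95):
    """Estimate each owner's share of allocated blocks from a uniform sample.

    sampled_owners: one owner label per sampled block (hypothetical input).
    Returns {owner: (estimated_share, margin_of_error)}, where the margin
    comes from a normal approximation to the binomial proportion.
    """
    n = len(sampled_owners)
    if n == 0:
        raise ValueError("at least one sampled block is required")
    # Two-sided z-score for the requested confidence level.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    result = {}
    for owner, count in Counter(sampled_owners).items():
        p = count / n
        margin = z * math.sqrt(p * (1 - p) / n)
        result[owner] = (p, margin)
    return result


# Hypothetical sample: 60 sampled blocks belong to "alice", 40 to "bob".
shares = estimate_shares(["alice"] * 60 + ["bob"] * 40)
```

Under these assumptions the margin of error shrinks with the square root of the number of sampled blocks, so the sample size can be chosen from the desired confidence level rather than from the total size of the file system.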

First claim

Opening claim text (preview).

What is claimed as new and desired to be protected by Letters Patent of the United States is:

1. A method for managing consumption of data storage resources in a file system, wherein one or more processors execute instructions that perform the method comprising: instantiating a sampling engine to perform actions, including:

    (i) determining a total amount of blocks of data in each of a plurality of cylinder groups in the file system;
    (ii) determining a total amount of allocated blocks in the plurality of cylinder groups;
    (iii) providing a confidence level based on selecting a defined amount of blocks in the plurality of cylinder groups to be sampled;
    (iv) randomly selecting and sampling one block in a cylinder group;
    (v) determining a file ID for a file associated with sampled block based on a reverse block map;
    (vi) employing an inode tree to find an inode for the file, wherein the file's parent pointer is employed to identify the file's parent directory;
    (vii) storing a file name for the file associated with the sampled block;
    (viii) when the file ID is non-equivalent to a root node, determining the file's child file in its parent directory and associate the file ID with the parent directory, wherein the method loops returns to performing actions starting at step (iv) again;
    (ix) when the file ID is equivalent to the root node and one or more of the selected blocks remain to be sampled, the method loops back to performing starting at step (iv); and
    (x) when the file ID is equivalent to the root node and all of the selected blocks have been sampled, performing statistical analysis of the stored file names and sampled blocks.

2. The method of claim 1, wherein the actions of the sampling engine further comprise: employing the stored file names to identify sampled blocks.

3. The method of claim 1, wherein the actions of the sampling engine further comprise identifying one or more users of the files associated with the sampled blocks.

4. The method of claim 1, wherein the actions of the sampling engine further comprise identifying characteristics of the files associated with the sampled blocks, wherein the characteristics include one or more of: size of file, type of file, author, last user that accessed the file, last time a file was accessed, or other copies of the file.

5. The method of claim 1, wherein the actions of the sampling engine further comprise employing statistical analysis of sampled blocks and file characteristics to identify data storage resource consumption by identified users of the file system.

6. The method of claim 1, further comprising instantiating the file system engine to perform actions, including providing reports, alerts, or messages that present information regarding percentage of data storage resource consumption by identified users of the file system.

7. The method of claim 1, further comprising employing a global positioning systems transceiver to provide geolocation information that is employed to localize information presented to one or more users of the file system.

8. A system for managing consumption of data storage resources in a file system over a network, comprising: one or more server computers that include: a memory for storing instructions; one or more processors, wherein the one or more processors execute the instructions that perform a method comprising: instantiating a sampling engine to perform actions, including:

    (i) determining a total amount of blocks of data in each of a plurality of cylinder groups in the file system;
    (ii) determining a total amount of allocated blocks in the plurality of cylinder groups;
    (iii) providing a confidence level based on selecting a defined amount of blocks in the plurality of cylinder groups to be sampled;
    (iv) randomly selecting and sampling one block in a cylinder group;
    (v) determining a file ID for a file associated with sampled block based on a reverse block map;
    (vi) employing an inode tree to find an inode for the file, wherein the file's parent pointer is employed to identify the file's parent directory;
    (vii) storing a file name for the file associated with the sampled block;
    (viii) when the file ID is non-equivalent to a root node, determining the file's child file in its parent directory and associate the file ID with the parent directory, wherein the method loops returns to performing actions starting at step (iv) again;
    (ix) when the file ID is equivalent to the root node and one or more of the selected blocks remain to be sampled, the method loops back to performing starting at step (iv); and
    (x) when the file ID is equivalent to the root node and all of the selected blocks have been sampled, performing statistical analysis of the stored file names and sampled blocks.

9. The system of claim 8, wherein the actions of the sampling engine further comprise employing the stored file names to identify sampled blocks.

10. The system of claim 8, wherein the actions of the sampling engine further comprise identifying one or more users of the files associated with the sampled blocks.

11. The system of claim 8, wherein the actions of the sampling engine further comprise identifying characteristics of the files associated with the sampled blocks, wherein the characteristics include one or more of: size of file, type of file, author, last user that accessed the file, last time a file was accessed, or other copies of the file.

12. The system of claim 8, wherein the actions of the sampling engine further comprise employing statistical analysis of sampled blocks and file characteristics to identify data storage resource consumption by identified users of the file system.

13. The system of claim 8, further comprising instantiating the file system engine to perform actions, including providing reports, alerts, or messages that present information regarding percentage of data storage resource consumption by identified users of the file system.

14. The system of claim 8, further comprising employing a global positioning systems transceiver to provide geolocation information that is employed to localize information presented to one or more users of the file system.

15. A non-transitory computer readable storage media that includes instructions for managing consumption of data storage resources in a file system, wherein one or more processors execute instructions that perform the method comprising: instantiating a sampling engine to perform actions, including:

    (i) determining a total amount of blocks of data in each of a plurality of cylinder groups in the file system;
    (ii) determining a total amount of allocated blocks in the plurality of cylinder groups;
    (iii) providing a confidence level based on selecting a defined amount of blocks in the plurality of cylinder groups to be sampled;
    (iv) randomly selecting and sampling one block in a cylinder group;
    (v) determining a file ID for a file associated with sampled block based on a reverse block map;
    (vi) employing an inode tree to find an inode for the file, wherein the file's parent pointer is employed to identify the file's parent directory;
    (vii) storing a file name for the file associated with the sampled block;
    (viii) when the file ID is non-equivalent to a root node, determining the file's child file in its parent directory and associate the file ID with the parent directory, wherein the method loops returns to performing actions starting at step (iv) again;
    (ix) when the file ID is equivalent to the root node and one or more of the selected blocks remain to be sampled, the method loops back to performing starting at step (iv); and
    (x)
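
The sampling loop in steps (iv) through (x) of the independent claims can be sketched as follows. This is a simplified in-memory model under stated assumptions, not the claimed implementation: `reverse_block_map` (block number to file ID), `parents` (file ID to parent directory ID), `names`, and `ROOT_ID` are hypothetical stand-ins for the on-disk reverse block map and inode tree. The sketch samples with replacement and credits each sampled block to the file and to every ancestor directory up to the root.

```python
import random
from collections import Counter

ROOT_ID = 0  # hypothetical ID of the root directory


def sample_usage(allocated_blocks, reverse_block_map, parents, names,
                 sample_size, seed=None):
    """Credit randomly sampled blocks to their file and all ancestor directories.

    All arguments are illustrative in-memory stand-ins:
      allocated_blocks  -- list of allocated block numbers
      reverse_block_map -- dict: block number -> file ID
      parents           -- dict: file ID -> parent directory ID
      names             -- dict: file ID -> file or directory name
    Returns a Counter mapping path -> number of sampled blocks under that path.
    """
    rng = random.Random(seed)
    hits = Counter()
    for _ in range(sample_size):
        block = rng.choice(allocated_blocks)   # randomly select one block
        node = reverse_block_map[block]        # block -> file ID
        # Follow parent pointers until the root is reached.
        chain = [node]
        while node != ROOT_ID:
            node = parents[node]
            chain.append(node)
        # Credit the block to the root and to every level below it.
        hits["/"] += 1
        path = ""
        for file_id in reversed(chain[:-1]):
            path = path + "/" + names[file_id]
            hits[path] += 1
    return hits


# Hypothetical tree: /alice/video.bin owns blocks 0-74, /bob/notes.txt owns 75-99.
parents = {1: 0, 2: 0, 3: 1, 4: 2}
names = {1: "alice", 2: "bob", 3: "video.bin", 4: "notes.txt"}
reverse_block_map = {b: (3 if b < 75 else 4) for b in range(100)}
hits = sample_usage(list(range(100)), reverse_block_map, parents, names,
                    sample_size=1000, seed=1)
```

Dividing each count by `sample_size` gives an estimated fraction of allocated blocks under that path, which is the kind of input the statistical analysis in step (x) would consume.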

Assignees

  • Qumulo Inc

Inventors

Classifications

  • Details of monitoring file system events, e.g. by the use of hooks, filter drivers, logs · CPC title

  • G06F16/13 (Primary)

    File access structures, e.g. distributed indices (arrangements of input from, or output to, record carriers G06F3/06) · CPC title

  • Techniques for file synchronisation in file systems · CPC title

  • Management of blocks · CPC title

  • for evaluating statistical data {, e.g. average values, frequency distributions, probability functions, regression analysis (forecasting specially adapted for a specific administrative, business or logistic context G06Q10/04)} · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

Who is the assignee on this patent?
Qumulo Inc
What technology area does this patent fall under?
Primary CPC classification G06F16/13. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jul 9, 2019 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).