Compression-based detection of inefficiency in local storage

US9952772B2 · US · B2

Patent metadata
Field: Value
Publication number: US-9952772-B2
Application number: US-201615160898-A
Country: US
Kind code: B2
Filing date: May 20, 2016
Priority date: May 20, 2016
Publication date: Apr 24, 2018
Grant date: Apr 24, 2018

Abstract

Official abstract text for this publication.

The disclosed embodiments provide a system for detecting and managing inefficiency in local storage. During operation, the system obtains a first snapshot of data in local storage of a computer system, wherein the first snapshot comprises a first set of data elements in the local storage at a first time. Next, the system applies a compression technique to the first snapshot to obtain a first set of inefficiency metrics for the first set of data elements. The system then outputs the first set of inefficiency metrics with additional attributes of the data to improve management of inefficiency in the data.
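The abstract does not name a specific compression technique or metric formula, so the following is only a minimal sketch under stated assumptions: the snapshot is modeled as a dictionary mapping an element name to its bytes, `zlib` stands in for "a compression technique", and redundancy is taken to be one minus the compressed-to-original size ratio. The function names and sample data are hypothetical.

```python
import os
import zlib


def inefficiency_metrics(data: bytes) -> dict:
    """Compress one data element and derive two illustrative metrics.

    Assumption: redundancy = 1 - compressed_size / original_size, floored at 0.
    """
    if not data:
        return {"compression_ratio": 1.0, "redundancy": 0.0}
    compressed = zlib.compress(data, 6)
    return {
        "compression_ratio": len(data) / len(compressed),
        "redundancy": max(0.0, 1.0 - len(compressed) / len(data)),
    }


def rank_snapshot(snapshot: dict) -> list:
    """Return (name, size_in_bytes, metrics) rows, most redundant first."""
    rows = [(name, len(data), inefficiency_metrics(data))
            for name, data in snapshot.items()]
    rows.sort(key=lambda row: row[2]["redundancy"], reverse=True)
    return rows


# Hypothetical snapshot: a repetitive log and a block of random bytes.
snapshot = {
    "app.log": b"INFO request handled\n" * 500,
    "blob.bin": os.urandom(4096),
}

for name, size, metrics in rank_snapshot(snapshot):
    print(f"{name}: {size} bytes, redundancy {metrics['redundancy']:.2f}")
```

The size printed beside each metric plays the role of an "additional attribute" here. A highly repetitive log should rank far above random data, which is the kind of signal that would flag it as a candidate for cleanup.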

First claim

Opening claim text (preview).

What is claimed is:

1. A method, comprising: obtaining a first snapshot of data in local storage of a computer system, wherein the first snapshot comprises a first set of data elements in the local storage at a first time; applying, by a processor, a compression technique to the first snapshot to obtain a first set of inefficiency metrics for the first set of data elements; outputting the first set of inefficiency metrics with additional attributes of the data to improve management of inefficiency in the data; obtaining a difference between the first snapshot and a second snapshot of the data in the local storage, wherein the second snapshot comprises a second set of data elements in the local storage at a second time; applying the compression technique to the difference to obtain a second set of inefficiency metrics for the difference; and analyzing the first and second sets of inefficiency metrics to identify a type of inefficiency in the data.

2. The method of claim 1, further comprising: grouping a subset of the first set of data elements by an attribute; applying the compression technique to the grouped subset to obtain a group inefficiency metric for the grouped subset; and including the group inefficiency metric in the outputted first set of inefficiency metrics.

3. The method of claim 2, wherein the attribute comprises at least one of: a file name; a file type; a data type; a directory; a device; a service; and an executable.

4. The method of claim 1, further comprising: adjusting an interval between the first and second snapshots based on the first or second sets of inefficiency metrics.

5. The method of claim 1, wherein the type of inefficiency is at least one of: data fragmentation; a logging inefficiency; an input/output (I/O) inefficiency; and a schema inefficiency.

6. The method of claim 1, wherein the first set of data elements comprises at least one of: a file; a log; a record; a write; and a read.

7. The method of claim 1, wherein the local storage is at least one of: a hard disk drive (HDD); a solid-state drive; an optical drive; and a tape drive.

8. The method of claim 1, wherein the set of inefficiency metrics comprises at least one of: a redundancy; and a compression ratio.

9. The method of claim 1, wherein outputting the first set of inefficiency metrics with the additional attributes of the first set of data elements comprises at least one of: displaying a ranking of the first set of data elements by the first set of inefficiency metrics; and identifying a subset of the first set of data elements as candidates for improving the inefficiency.

10. An apparatus, comprising: one or more processors; and memory storing instructions that, when executed by the one or more processors, cause the apparatus to: obtain a first snapshot of data in local storage of a computer system, wherein the first snapshot comprises a first set of data elements in the local storage at a first time; apply a compression technique to the first snapshot to obtain a first set of inefficiency metrics for the first set of data elements; output the first set of inefficiency metrics with additional attributes of the data to improve management of inefficiency in the data; obtain a difference between the first snapshot and a second snapshot of the data in the local storage, wherein the second snapshot comprises a second set of data elements in the local storage at a second time; apply the compression technique to the difference to obtain a second set of inefficiency metrics for the difference; and analyze the first and second sets of inefficiency metrics to identify a type of inefficiency in the data.

11. The apparatus of claim 10, wherein the memory further stores instructions that, when executed by the one or more processors, cause the apparatus to: group a subset of the first set of data elements by an attribute; apply the compression technique to the grouped subset to obtain a group inefficiency metric for the grouped subset; and include the group inefficiency metric in the outputted first set of inefficiency metrics.

12. The apparatus of claim 11, wherein the attribute comprises at least one of: a file name; a file type; a data type; a directory; a device; a service; and an executable.

13. The apparatus of claim 10, wherein the memory further stores instructions that, when executed by the one or more processors, cause the apparatus to: adjust an interval between the first and second snapshots based on the first or second sets of inefficiency metrics.

14. The apparatus of claim 10, wherein the type of inefficiency is at least one of: data fragmentation; a logging inefficiency; an input/output (I/O) inefficiency; and a schema inefficiency.

15. The apparatus of claim 10, wherein the first set of data elements comprises at least one of: a file; a log; a record; a write; and a read.

16. A system, comprising: an analysis module comprising a non-transitory computer-readable medium storing computer-executable instructions that, when executed by the system, cause the system to: obtain a first snapshot of data in local storage of a computer system, wherein the first snapshot comprises a first set of data elements in the local storage at a first time; apply a compression technique to the first snapshot to obtain a first set of inefficiency metrics for the first set of data elements; obtain a difference between the first snapshot and a second snapshot of the data in the local storage, wherein the second snapshot comprises a second set of data elements in the local storage at a second time; apply the compression technique to the difference to obtain a second set of inefficiency metrics for the difference; and analyze the first and second sets of inefficiency metrics to identify a type of inefficiency in the data; and a management module comprising a non-transitory computer-readable medium storing instructions that, when executed, cause the system to output the first set of inefficiency metrics with additional attributes of the data to improve management of inefficiency in the data.

17. The system of claim 16, wherein the non-transitory computer-readable medium of the analysis module further stores computer-executable instructions that, when executed by the system, cause the system to: adjust an interval between the first and second snapshots based on the first or second sets of inefficiency metrics.

18. The method of claim 4, further comprising: repeating the adjusting of the interval between the first and second snapshots based on the first or second sets of inefficiency metrics, until a source of inefficiency is identified.

19. The apparatus of claim 13, wherein the memory further stores instructions that, when executed by the one or more processors, cause the apparatus to: repeat the adjusting of the interval between the first and second snapshots based on the first or second sets of inefficiency metrics, until a source of inefficiency is identified.

20. The system of claim 17, wherein the non-transitory computer-readable medium of the analysis module further stores computer-executable instructions that, when executed by the system, cause the system to: repeat the adjusting of the interval between the first and second snapshots based on the first or second sets of inefficiency metrics, until a source of inefficiency is identified.
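The snapshot difference, attribute grouping, and interval adjustment recited in the claims can be illustrated with a rough sketch. Everything concrete here is an assumption made for illustration, not the patented implementation: snapshots are dictionaries of name to bytes, the difference keeps only new or changed bytes, the grouping attribute is the file extension, `zlib` is the compression technique, and the halve-or-double interval policy with a 0.8 threshold is invented for the example.

```python
import zlib
from collections import defaultdict
from pathlib import PurePosixPath


def redundancy(data: bytes) -> float:
    """Assumed metric: 1 - compressed_size / original_size, floored at 0."""
    if not data:
        return 0.0
    return max(0.0, 1.0 - len(zlib.compress(data)) / len(data))


def snapshot_difference(first: dict, second: dict) -> dict:
    """Keep elements that are new or changed in the second snapshot.

    For elements that only grew by appending, keep just the appended bytes.
    """
    diff = {}
    for name, new in second.items():
        old = first.get(name)
        if old is None:
            diff[name] = new
        elif new != old:
            diff[name] = new[len(old):] if new.startswith(old) else new
    return diff


def group_redundancy(elements: dict) -> dict:
    """Concatenate elements that share a file extension, then compress each group."""
    groups = defaultdict(bytearray)
    for name, data in elements.items():
        groups[PurePosixPath(name).suffix or "(none)"] += data
    return {ext: redundancy(bytes(buf)) for ext, buf in groups.items()}


def next_interval(current_s: float, diff_redundancy: float,
                  threshold: float = 0.8,
                  min_s: float = 1.0, max_s: float = 3600.0) -> float:
    """Hypothetical policy: snapshot more often while the difference looks wasteful."""
    factor = 0.5 if diff_redundancy >= threshold else 2.0
    return min(max_s, max(min_s, current_s * factor))


# Hypothetical snapshots taken at a first and a second time.
first = {"svc/a.log": b"start\n", "svc/b.json": b'{"k": 1}'}
second = {
    "svc/a.log": b"start\n" + b"WARN retry connection\n" * 200,
    "svc/b.json": b'{"k": 1}',
    "svc/c.log": b"ERROR timeout\n" * 100,
}

diff = snapshot_difference(first, second)
by_type = group_redundancy(diff)
print(by_type)
print(next_interval(60.0, by_type[".log"]))
```

In this toy run only the log files change between snapshots and their new bytes are almost entirely repetitive, which is the sort of pattern that might be read as a logging inefficiency. Repeating the interval adjustment narrows the window in which the redundant writes occur, in the spirit of claims 18 to 20.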

Assignees

LinkedIn Corp, Microsoft Technology Licensing LLC

Classifications

  • Single storage device

  • Point-in-time backing up or restoration of persistent data

  • Format or protocol conversion arrangements

  • Securing storage systems

  • Free address space management

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9952772B2 cover?
The disclosed embodiments provide a system for detecting and managing inefficiency in local storage. During operation, the system obtains a first snapshot of data in local storage of a computer system, wherein the first snapshot comprises a first set of data elements in the local storage at a first time. Next, the system applies a compression technique to the first snapshot to obtain a first se…
Who is the assignee on this patent?
LinkedIn Corp, Microsoft Technology Licensing LLC
What technology area does this patent fall under?
Primary CPC classification G06F3/0608. Mapped technology areas include Physics.
When was this patent published?
Publication date Apr 24, 2018 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).