Compression sampling in tiered storage
US-2017090776-A1 · Mar 30, 2017 · US
US10694002B1 · US · B1
| Field | Value |
|---|---|
| Publication number | US-10694002-B1 |
| Application number | US-201715498995-A |
| Country | US |
| Kind code | B1 |
| Filing date | Apr 27, 2017 |
| Priority date | Apr 27, 2017 |
| Publication date | Jun 23, 2020 |
| Grant date | Jun 23, 2020 |
Official abstract text for this publication.
Data compression optimization based on client clusters is described. A system identifies a cluster of similar client devices in a group of client devices, by comparing data compression factors that correspond to each client device in the group of client devices. The system identifies a relationship between data compression factors corresponding to the cluster and data compression ratios corresponding to the cluster. The system identifies a client device, in the cluster, which corresponds to a data compression ratio that is inefficient relative to other compression ratios corresponding to other client devices in the cluster. The system outputs a data compression recommendation for the client device, based on data compression factors corresponding to the client device and the identified relationship between the data compression factors corresponding to the cluster and the data compression ratios corresponding to the cluster.
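The four steps in the abstract (cluster, relate factors to ratios, find the inefficient client, recommend) can be illustrated with a minimal sketch. It is not the patented implementation. Everything specific here is an assumption made for illustration: the record fields (`id`, `os`, `data_type`, `method`, `ratio`), the exact-match grouping used as a similarity function, the `min_size` threshold, the use of a per-method average as the "relationship", and the convention that a higher compression ratio is better.

```python
from collections import defaultdict
from statistics import mean


def cluster_clients(clients, keys=("os", "data_type"), min_size=3):
    """Group clients whose chosen factors match exactly (a stand-in
    similarity function) and keep only groups of at least min_size."""
    groups = defaultdict(list)
    for client in clients:
        groups[tuple(client[k] for k in keys)].append(client)
    return {k: v for k, v in groups.items() if len(v) >= min_size}


def recommend(cluster):
    """Relate one factor (compression method) to the ratios in the cluster,
    pick the client with the lowest ratio, and suggest the method that
    performs best on average among its peers."""
    ratios_by_method = defaultdict(list)
    for client in cluster:
        ratios_by_method[client["method"]].append(client["ratio"])
    avg_by_method = {m: mean(r) for m, r in ratios_by_method.items()}
    best_method = max(avg_by_method, key=avg_by_method.get)

    worst = min(cluster, key=lambda c: c["ratio"])
    if worst["method"] != best_method:
        return worst["id"], f"switch from {worst['method']} to {best_method}"
    return worst["id"], None  # already on the best-performing method


# Hypothetical data: four similar Linux database clients and one unrelated client.
clients = [
    {"id": "a", "os": "linux", "data_type": "db", "method": "lz4", "ratio": 2.0},
    {"id": "b", "os": "linux", "data_type": "db", "method": "zstd", "ratio": 4.0},
    {"id": "c", "os": "linux", "data_type": "db", "method": "zstd", "ratio": 4.2},
    {"id": "d", "os": "linux", "data_type": "db", "method": "lz4", "ratio": 1.8},
    {"id": "e", "os": "windows", "data_type": "media", "method": "lz4", "ratio": 1.1},
]

clusters = cluster_clients(clients)
result = recommend(clusters[("linux", "db")])
print(result)
```

Client `e` is dropped because its group falls below the size threshold, which mirrors the claim language about a cluster count greater than a threshold. A real system would likely use a richer similarity measure than exact matching.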
Opening claim text (preview).
What is claimed is:
1. A system for data compression optimization based on client clusters, the system comprising: a processor-based application stored on a non-transitory computer-readable medium, which when executed on a computer, will cause one or more processors to: identify a cluster of similar client devices in a group of client devices, by comparing data compression factors that correspond to each client device in the group of client devices; identify a relationship between data compression factors corresponding to the cluster and data compression ratios corresponding to the cluster; identify a client device, in the cluster, which corresponds to a data compression ratio that is inefficient relative to other compression ratios corresponding to other client devices in the cluster; and output a data compression recommendation for the client device, based on data compression factors corresponding to the client device and the identified relationship between the data compression factors corresponding to the cluster and the data compression ratios corresponding to the cluster.
2. The system of claim 1, wherein the processor-based application further causes the one or more processors to identify the data compression factors that correspond to each client device in the group of client devices; wherein a count of client devices in the cluster of similar client devices is greater than a threshold.
3. The system of claim 1, wherein one of the data compression factors comprises one of an amount of data, a type of data, an age of data, a data compression method, an operating system, a software application, hardware, an enterprise size, a geographical location, and a client/server side of data compression.
4. The system of claim 1, wherein identifying the cluster of similar client devices in the group of client devices comprises applying one of a clustering algorithm and a similarity function to each client device in the group of client devices.
5. The system of claim 1, wherein identifying the relationship between the data compression factors corresponding to the cluster and the data compression ratios corresponding to the cluster comprises one of determining a correlation between one of the data compression factors corresponding to the cluster and the data compression ratios corresponding to the cluster, and generating a regression model based on the data compression factors corresponding to the cluster and the data compression ratios corresponding to the cluster.
6. The system of claim 1, wherein identifying the client device, in the cluster, which corresponds to the data compression ratio that is inefficient relative to the other compression ratios corresponding to the other client devices in the cluster comprises determining an average value and a standard deviation based on the data compression ratios corresponding to the cluster, and identifying the client device which corresponds to the data compression ratio that is a specified amount of the standard deviation from the average value.
7. The system of claim 1, further comprising: wherein identify a cluster of similar client devices in a group of client devices, by comparing data compression factors that correspond to each client device in the group of client devices further causes the one or more processors to: identify the cluster of similar client devices based on a similar first storage capacity for one or more types of data available at the similar client devices, wherein the other client devices outside of the cluster correspond with a second storage capacity different than the first storage capacity and different types of data than the one or more types of data in the cluster; and wherein identify a relationship between data compression factors corresponding to the cluster and data compression ratios corresponding to the cluster further causes the one or more processors to: identify that a correlation exists between a number of types of data stored among all the client devices in the cluster and respective compression ratios of the client devices in the cluster.
8. A computer-implemented method for data compression optimization based on client clusters, the method comprising: identifying a cluster of similar client devices in a group of client devices, by comparing data compression factors that correspond to each client device in the group of client devices; identifying a relationship between data compression factors corresponding to the cluster and data compression ratios corresponding to the cluster; identifying a client device, in the cluster, which corresponds to a data compression ratio that is inefficient relative to other compression ratios corresponding to other client devices in the cluster; and outputting a data compression recommendation for the client device, based on data compression factors corresponding to the client device and the identified relationship between the data compression factors corresponding to the cluster and the data compression ratios corresponding to the cluster.
9. The method of claim 8, wherein the method further comprises identifying the data compression factors that correspond to each client device in the group of client devices.
10. The method of claim 8, wherein a count of client devices in the cluster of similar client devices is greater than a threshold, and one of the data compression factors comprises one of an amount of data, a type of data, an age of data, a data compression method, an operating system, a software application, hardware, an enterprise size, a geographical location, and a client/server side of data compression.
11. The method of claim 8, wherein identifying the cluster of similar client devices in the group of client devices comprises applying one of a clustering algorithm and a similarity function to each client device in the group of client devices.
12. The method of claim 8, wherein identifying the relationship between the data compression factors corresponding to the cluster and the data compression ratios corresponding to the cluster comprises one of determining a correlation between one of the data compression factors corresponding to the cluster and the data compression ratios corresponding to the cluster, and generating a regression model based on the data compression factors corresponding to the cluster and the data compression ratios corresponding to the cluster.
13. The method of claim 8, wherein identifying the client device, in the cluster, which corresponds to the data compression ratio that is inefficient relative to the other compression ratios corresponding to the other client devices in the cluster comprises determining an average value and a standard deviation based on the data compression ratios corresponding to the cluster, and identifying the client device which corresponds to the data compression ratio that is a specified amount of the standard deviation from the average value.
14. A computer program product, comprising a non-transitory computer-readable medium having a computer-readable program code embodied therein to be executed by one or more processors, the program code including instructions to: identify a cluster of similar client devices in a group of client devices, by comparing data compression factors that correspond to each client device in the group of client devices; identify a relationship between data compression factors corresponding to the cluster and data compression ratios corresponding to the cluster; identify a client device, in the cluster, which corresponds to a data compression ratio that is inefficient relative to other compression ratios corresponding to o
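Claims 5, 6, 12, and 13 name two concrete statistical techniques: a correlation between a compression factor and the compression ratios, and an average/standard-deviation rule for spotting the inefficient client. The following is a rough sketch of both, under stated assumptions: the function names, the sample values, and the multiplier `k` (the "specified amount" of standard deviation) are hypothetical, and the sketch treats only ratios below the average as inefficient, which assumes a higher ratio is better.

```python
from math import sqrt
from statistics import mean, pstdev


def pearson(xs, ys):
    """Pearson correlation between one factor (xs) and the ratios (ys)."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def flag_inefficient(ratios, k=1.5):
    """Return ids whose ratio is more than k population standard
    deviations below the cluster average (k is an assumed parameter)."""
    values = list(ratios.values())
    mu = mean(values)
    sigma = pstdev(values)
    return [cid for cid, r in ratios.items() if r < mu - k * sigma]


# Hypothetical cluster: number of data types stored per client vs. ratio.
type_counts = [1, 2, 3, 4]
type_ratios = [4.0, 3.5, 3.0, 2.5]
r = pearson(type_counts, type_ratios)

# Hypothetical cluster ratios keyed by client id.
cluster_ratios = {"a": 4.0, "b": 4.2, "c": 3.9, "d": 4.1, "e": 1.5}
flagged = flag_inefficient(cluster_ratios)
print(r, flagged)
```

In this sample the factor and the ratios are perfectly negatively correlated, and only client `e` falls below the cutoff. The claims also allow a regression model in place of a correlation; that alternative is not shown here.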
Reducing the amount or size of exchanged application data · CPC title
Protocols · CPC title
Protocols for data compression, e.g. ROHC · CPC title
Hypervisor-specific management and integration aspects · CPC title
for short real-time information, e.g. alarms, notifications, alerts, updates · CPC title