Data compression optimization based on client clusters

US10694002B1 · US · B1

Patent metadata
Field: Value
Publication number: US-10694002-B1
Application number: US-201715498995-A
Country: US
Kind code: B1
Filing date: Apr 27, 2017
Priority date: Apr 27, 2017
Publication date: Jun 23, 2020
Grant date: Jun 23, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Data compression optimization based on client clusters is described. A system identifies a cluster of similar client devices in a group of client devices, by comparing data compression factors that correspond to each client device in the group of client devices. The system identifies a relationship between data compression factors corresponding to the cluster and data compression ratios corresponding to the cluster. The system identifies a client device, in the cluster, which corresponds to a data compression ratio that is inefficient relative to other compression ratios corresponding to other client devices in the cluster. The system outputs a data compression recommendation for the client device, based on data compression factors corresponding to the client device and the identified relationship between the data compression factors corresponding to the cluster and the data compression ratios corresponding to the cluster.
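The four steps in the abstract can be read as a small pipeline: group similar devices, relate factors to ratios within each group, flag the laggards, and suggest a change. The sketch below is only one possible reading, not the patented implementation. The `Client` fields, the same-OS-and-data-type similarity rule, the `size_threshold` and `tolerance` parameters, and the per-method average used as the "relationship" are all assumptions made for illustration. The compression ratio is assumed to be original size divided by compressed size, so a lower value is worse.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Client:
    name: str
    os: str         # factor: operating system
    data_type: str  # factor: type of data
    method: str     # factor: data compression method
    ratio: float    # observed ratio, assumed original size / compressed size


def cluster_key(client):
    # Hypothetical similarity rule: same OS and same type of data.
    return (client.os, client.data_type)


def recommend(clients, size_threshold=2, tolerance=0.8):
    """Return {client name: recommendation} for inefficient clients."""
    # Step 1: group similar clients into clusters.
    clusters = defaultdict(list)
    for c in clients:
        clusters[cluster_key(c)].append(c)

    recommendations = {}
    for members in clusters.values():
        # Only use clusters whose device count exceeds the threshold.
        if len(members) <= size_threshold:
            continue

        # Step 2: relate a factor (method) to ratios via per-method averages.
        by_method = defaultdict(list)
        for c in members:
            by_method[c.method].append(c.ratio)
        method_mean = {m: mean(r) for m, r in by_method.items()}
        best_method = max(method_mean, key=method_mean.get)

        # Step 3: flag clients well below the cluster average.
        cluster_mean = mean(c.ratio for c in members)
        for c in members:
            if c.ratio < tolerance * cluster_mean and c.method != best_method:
                # Step 4: recommend the method that does best among peers.
                recommendations[c.name] = (
                    f"switch from {c.method} to {best_method}"
                )
    return recommendations


clients = [
    Client("a", "linux", "logs", "lz4", 2.0),
    Client("b", "linux", "logs", "zstd", 4.0),
    Client("c", "linux", "logs", "zstd", 4.2),
    Client("d", "linux", "logs", "zstd", 3.8),
    Client("e", "windows", "video", "lz4", 1.1),
]
print(recommend(clients))  # {'a': 'switch from lz4 to zstd'}
```

Client "e" gets no recommendation because its cluster has a single member, which is too few peers to compare against.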

First claim

Opening claim text (preview).

What is claimed is:

1. A system for data compression optimization based on client clusters, the system comprising: a processor-based application stored on a non-transitory computer-readable medium, which when executed on a computer, will cause one or more processors to: identify a cluster of similar client devices in a group of client devices, by comparing data compression factors that correspond to each client device in the group of client devices; identify a relationship between data compression factors corresponding to the cluster and data compression ratios corresponding to the cluster; identify a client device, in the cluster, which corresponds to a data compression ratio that is inefficient relative to other compression ratios corresponding to other client devices in the cluster; and output a data compression recommendation for the client device, based on data compression factors corresponding to the client device and the identified relationship between the data compression factors corresponding to the cluster and the data compression ratios corresponding to the cluster.

2. The system of claim 1, wherein the processor-based application further causes the one or more processors to identify the data compression factors that correspond to each client device in the group of client devices; wherein a count of client devices in the cluster of similar client devices is greater than a threshold.

3. The system of claim 1, wherein one of the data compression factors comprises one of an amount of data, a type of data, an age of data, a data compression method, an operating system, a software application, hardware, an enterprise size, a geographical location, and a client/server side of data compression.

4. The system of claim 1, wherein identifying the cluster of similar client devices in the group of client devices comprises applying one of a clustering algorithm and a similarity function to each client device in the group of client devices.

5. The system of claim 1, wherein identifying the relationship between the data compression factors corresponding to the cluster and the data compression ratios corresponding to the cluster comprises one of determining a correlation between one of the data compression factors corresponding to the cluster and the data compression ratios corresponding to the cluster, and generating a regression model based on the data compression factors corresponding to the cluster and the data compression ratios corresponding to the cluster.

6. The system of claim 1, wherein identifying the client device, in the cluster, which corresponds to the data compression ratio that is inefficient relative to the other compression ratios corresponding to the other client devices in the cluster comprises determining an average value and a standard deviation based on the data compression ratios corresponding to the cluster, and identifying the client device which corresponds to the data compression ratio that is a specified amount of the standard deviation from the average value.

7. The system of claim 1, further comprising: wherein identify a cluster of similar client devices in a group of client devices, by comparing data compression factors that correspond to each client device in the group of client devices further causes the one or more processors to: identify the cluster of similar client devices based on a similar first storage capacity for one or more types of data available at the similar client devices, wherein the other client devices outside of the cluster correspond with a second storage capacity different than the first storage capacity and different types of data than the one or more types of data in the cluster; and wherein identify a relationship between data compression factors corresponding to the cluster and data compression ratios corresponding to the cluster further causes the one or more processors to: identify that a correlation exists between a number of types of data stored among all the client devices in the cluster and respective compression ratios of the client devices in the cluster.

8. A computer-implemented method for data compression optimization based on client clusters, the method comprising: identifying a cluster of similar client devices in a group of client devices, by comparing data compression factors that correspond to each client device in the group of client devices; identifying a relationship between data compression factors corresponding to the cluster and data compression ratios corresponding to the cluster; identifying a client device, in the cluster, which corresponds to a data compression ratio that is inefficient relative to other compression ratios corresponding to other client devices in the cluster; and outputting a data compression recommendation for the client device, based on data compression factors corresponding to the client device and the identified relationship between the data compression factors corresponding to the cluster and the data compression ratios corresponding to the cluster.

9. The method of claim 8, wherein the method further comprises identifying the data compression factors that correspond to each client device in the group of client devices.

10. The method of claim 8, wherein a count of client devices in the cluster of similar client devices is greater than a threshold, and one of the data compression factors comprises one of an amount of data, a type of data, an age of data, a data compression method, an operating system, a software application, hardware, an enterprise size, a geographical location, and a client/server side of data compression.

11. The method of claim 8, wherein identifying the cluster of similar client devices in the group of client devices comprises applying one of a clustering algorithm and a similarity function to each client device in the group of client devices.

12. The method of claim 8, wherein identifying the relationship between the data compression factors corresponding to the cluster and the data compression ratios corresponding to the cluster comprises one of determining a correlation between one of the data compression factors corresponding to the cluster and the data compression ratios corresponding to the cluster, and generating a regression model based on the data compression factors corresponding to the cluster and the data compression ratios corresponding to the cluster.

13. The method of claim 8, wherein identifying the client device, in the cluster, which corresponds to the data compression ratio that is inefficient relative to the other compression ratios corresponding to the other client devices in the cluster comprises determining an average value and a standard deviation based on the data compression ratios corresponding to the cluster, and identifying the client device which corresponds to the data compression ratio that is a specified amount of the standard deviation from the average value.

14. A computer program product, comprising a non-transitory computer-readable medium having a computer-readable program code embodied therein to be executed by one or more processors, the program code including instructions to: identify a cluster of similar client devices in a group of client devices, by comparing data compression factors that correspond to each client device in the group of client devices; identify a relationship between data compression factors corresponding to the cluster and data compression ratios corresponding to the cluster; identify a client device, in the cluster, which corresponds to a data compression ratio that is inefficient relative to other compression ratios corresponding to o
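Two of the dependent claims name concrete statistics: a correlation between a factor and the compression ratios (claims 5 and 12), and an average plus standard deviation used to single out an inefficient device (claims 6 and 13). The following is a minimal sketch of both, assuming a Pearson correlation, a population standard deviation, and a one-sided test that flags only ratios below the average. None of these choices is specified by the claims, and the function names and the `k` parameter are hypothetical.

```python
from math import sqrt
from statistics import mean, pstdev


def pearson(xs, ys):
    """Pearson correlation between a numeric factor and compression ratios."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0  # a constant series has no defined correlation
    return cov / (sx * sy)


def inefficient_devices(ratios, k=1.5):
    """Names of devices whose ratio is more than k standard deviations
    below the cluster average (lower ratio assumed to mean worse)."""
    values = list(ratios.values())
    cutoff = mean(values) - k * pstdev(values)
    return sorted(name for name, r in ratios.items() if r < cutoff)


# Illustrative data only: number of data types stored vs. observed ratio.
type_counts = [1, 2, 3, 4]
cluster_ratios = [4.0, 3.5, 3.0, 2.5]
print(round(pearson(type_counts, cluster_ratios), 6))  # -1.0

ratios = {"a": 4.0, "b": 4.2, "c": 3.8, "d": 4.1, "e": 1.9}
print(inefficient_devices(ratios))  # ['e']
```

In the second example the average is 3.6 and the standard deviation is about 0.86, so the cutoff sits near 2.31 and only device "e" falls below it.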

Assignees

Inventors

Classifications

  • Reducing the amount or size of exchanged application data · CPC title

  • Protocols · CPC title

  • H04L69/04 · Primary

    Protocols for data compression, e.g. ROHC · CPC title

  • Hypervisor-specific management and integration aspects · CPC title

  • for short real-time information, e.g. alarms, notifications, alerts, updates · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10694002B1 cover?
Data compression optimization based on client clusters is described. A system identifies a cluster of similar client devices in a group of client devices, by comparing data compression factors that correspond to each client device in the group of client devices. The system identifies a relationship between data compression factors corresponding to the cluster and data compression ratios corresp…
Who is the assignee on this patent?
EMC IP Holding Co LLC
What technology area does this patent fall under?
Primary CPC classification H04L69/04. Mapped technology areas include Electricity.
When was this patent published?
Publication date: Jun 23, 2020 (kind code B1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).