What technology area does this patent fall under?

Primary CPC classification H03M7/4037. Mapped technology areas include Electricity.

When was this patent published?

Publication date Tue Feb 07 2017 00:00:00 GMT+0000 (Coordinated Universal Time) (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Real-time reduction of CPU overhead for data compression

US9564918B2 · US · B2

Patent metadata
Field	Value
Publication number	US-9564918-B2
Application number	US-201313738300-A
Country	US
Kind code	B2
Filing date	Jan 10, 2013
Priority date	Jan 10, 2013
Publication date	Feb 7, 2017
Grant date	Feb 7, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Real-time reduction of CPU overhead for data compression is performed by a processor device in a computing environment. Non-compressing heuristics are applied on a randomly selected data sample from data sequences for determining whether to compress the data sequences. A compression potential is calculated based on the non-compressing heuristics. The compression potential is compared to a threshold value. The data sequences are either compressed if the compress threshold is matched, compressed using Huffman coding if Huffman coding threshold is matched, or stored without compression.

First claim

Opening claim text (preview).

What is claimed is: 1. A method for real-time reduction of CPU overhead for data compression by a processor device in a computing environment, the method comprising: applying non-compressing heuristics on a randomly selected data sample from data sequences for determining whether to compress the data sequences by calculating a compression potential based on the non-compressing heuristics, wherein the compression potential is compared to a threshold value; and performing: compressing the data sequences if the compress threshold is matched, compressing the data sequences using Huffman coding if Huffman coding threshold is matched, storing the data sequences without compression, using at least one of the non-compressing heuristics, selecting a randomly selected data sample, and computing core characters that compose a predefined percentage of bytes in the randomly selected data sample, an entropy of the data, a relation between appearances of character pairs and a random distribution of the character pairs, and the entropy of the character pairs for determining whether to compress the data sequences, for real-time reduction of CPU overhead for data compression. 2. The method of claim 1 , further including calculating the non-compressing heuristics one after another for developing a heuristic score for making a decision for determining whether to compress the data sequences, wherein the CPU speed is optimized. 3. The method of claim 2 , further including calculating the compression potential based on the heuristic score of the non-compressing heuristics. 4. The method of claim 1 , further including: estimating a relation between appearances of character pairs and a random distribution of the character pairs by: comparing a number of pairs from core characters in the randomly selected data sample against the number of pairs from the core characters expected to appear in a random distribution in the randomly selected data sample, and calculating a L 2 norm distance between a first vector of distributions representing an observed distribution of the number of pairs from the core characters in the randomly selected data sample and a second vector of distributions representing an expected distribution of single values assuming there is no correlation between subsequent pairs from the core characters. 5. The method of claim 1 , further including performing one of turning off and turning on the calculating the compression potential, and determining whether to compress the data sequences according to a predefined setting. 6. The method of claim 1 , wherein the predefined setting, further includes at least one of: continuously applying the non-compressing heuristics and providing an indication as to whether to compress the data sequences, applying the non-compressing heuristics on demand when a compression ratio is above the predetermined threshold for a predefined number of the data sequences, applying the non-compressing heuristics according to a size of a buffer, and applying a prefix compression estimation and deciding whether to compress the data sequences based on a prefix compression ratio when the prefix compression ratio of the data sequences is below a threshold. 7. The method of claim 1 , further including applying the non-compressing heuristics to a relation between character pairs of core characters and random distributions of the character pairs of the core characters from the randomly selected data sample from data sequences for determining whether to compress the data sequences by calculating the compression potential based on the non-compressing heuristics. 8. A system for real-time reduction of CPU overhead for data compression in a computing environment, the system comprising: a processor device operable in the computing storage environment, wherein the processor device: applies non-compressing heuristics on a randomly selected data sample from data sequences for determining whether to compress the data sequences by calculating a compression potential based on the non-compressing heuristics, wherein the compression potential is compared to a threshold value, and performs: compressing the data sequences if the compress threshold is matched, compressing the data sequences using Huffman coding if Huffman coding threshold is matched, storing the data sequences without compression, using at least one of the non-compressing heuristics, selecting a randomly selected data sample, and computing core characters that compose a predefined percentage of bytes in the randomly selected data sample, an entropy of the data, a relation between appearances of character pairs and a random distribution of the character pairs, and the entropy of the character pairs for determining whether to compress the data sequences, for real-time reduction of CPU overhead for data compression. 9. The system of claim 8 , wherein the processor device calculates the non-compressing heuristics one after another for developing a heuristic score for making a decision for determining whether to compress the data sequences, wherein the CPU speed is optimized. 10. The system of claim 9 , wherein the processor device calculates the compression potential based on the heuristic score of the non-compressing heuristics. 11. The system of claim 8 , wherein the processor device: estimates a relation between appearances of character pairs and a random distribution of the character pairs by: comparing a number of pairs from core characters in the randomly selected data sample against the number of pairs from the core characters expected to appear in a random distribution in the randomly selected data sample, and calculating a L 2 norm distance between a first vector of distributions representing an observed distribution of the number of pairs from the core characters in the randomly selected data sample and a second vector of distributions representing an expected distribution of single values assuming there is no correlation between subsequent pairs from the core characters. 12. The system of claim 8 , wherein the processor device performs one of turning off and turning on the calculating the compression potential, and determining whether to compress the data sequences according to a predefined setting. 13. The system of claim 8 , wherein the processor device performs at least one of: continuously applying the non-compressing heuristics and providing an indication as to whether to compress the data sequences, applying the non-compressing heuristics on demand when a compression ratio is above the predetermined threshold for a predefined number of the data sequences, applying the non-compressing heuristics according to a size of a buffer, and applying a prefix compression estimation and deciding whether to compress the data sequences based on a prefix compression ratio when the prefix compression ratio of the data sequences is below a threshold. 14. The system of claim 8 , wherein the processor device applies the non-compressing heuristics to a relation between character pairs of core characters and random distributions of the character pairs of the core characters from the randomly selected data sample from data sequences for determining whether to compress the data sequences by calculating the compression potential based on the non-compressing heuristics. 15. A computer program product for real-time reduction of CPU overhead for data compression by a processor device by a processor device, the computer program product comprising a non-transitory computer-readable storage medium having computer-readable program code port

Assignees

Inventors

Classifications

H03M7/607
Selection between different types of compressors · CPC title
H03M7/6088
according to the data type · CPC title
H03M7/4037Primary
Prefix coding · CPC title

Patent family

Related publications grouped by family.

View patent family 51061790

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9564918B2 cover?: Real-time reduction of CPU overhead for data compression is performed by a processor device in a computing environment. Non-compressing heuristics are applied on a randomly selected data sample from data sequences for determining whether to compress the data sequences. A compression potential is calculated based on the non-compressing heuristics. The compression potential is compared to a thres…
Who is the assignee on this patent?: IBM
What technology area does this patent fall under?: Primary CPC classification H03M7/4037. Mapped technology areas include Electricity.
When was this patent published?: Publication date Tue Feb 07 2017 00:00:00 GMT+0000 (Coordinated Universal Time) (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).