Adaptive compression and transmission for big data migration

US9521218B1 · US · B1

Patent metadata
Field               Value
Publication number  US-9521218-B1
Application number  US-201615002421-A
Country             US
Kind code           B1
Filing date         Jan 21, 2016
Priority date       Jan 21, 2016
Publication date    Dec 13, 2016
Grant date          Dec 13, 2016

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method for optimizing migration efficiency of a data file over a network is provided. Specifically, the total of the compression time of the data file, the transfer time of the data file over the network, and the decompression time of the data file is minimized by adaptively selecting compression methods to compress each data block of the data file. For selecting a compression method for a data block, the information entropy of the data block is analyzed, and the real status of computing and system resources is considered. Further, a trade-off among resource usage, compression speed, and compression ratio is made to calculate an optimized transmission solution over the network for each data block of the data file.
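The selection step described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the `MethodProfile` class, the three example profiles, their throughput numbers, and the ratio model (an `efficiency` factor applied to the entropy bound) are all assumptions made for illustration, and the resource status is reduced to the available bandwidth for brevity.

```python
import math
from collections import Counter
from dataclasses import dataclass


def shannon_entropy(block: bytes) -> float:
    """Return the Shannon entropy of a block in bits per byte (0.0 to 8.0)."""
    if not block:
        return 0.0
    total = len(block)
    counts = Counter(block)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


@dataclass(frozen=True)
class MethodProfile:
    """Hypothetical profile of one compression method (illustrative numbers)."""
    name: str
    compress_bps: float    # input bytes compressed per second
    decompress_bps: float  # output bytes restored per second
    efficiency: float      # 0..1, share of the entropy-bound saving achieved


def predicted_total_time(size: int, entropy: float,
                         bandwidth_bps: float, m: MethodProfile) -> float:
    """Predicted compression + transfer + decompression time in seconds."""
    # Assumed model: entropy/8 is the best possible ratio; the method
    # realizes only a fraction `efficiency` of that saving.
    ratio = 1.0 - m.efficiency * (1.0 - entropy / 8.0)
    compress_time = size / m.compress_bps      # 0.0 when throughput is inf
    transfer_time = size * ratio / bandwidth_bps
    decompress_time = size / m.decompress_bps
    return compress_time + transfer_time + decompress_time


def choose_method(size: int, entropy: float, bandwidth_bps: float,
                  profiles: list[MethodProfile]) -> MethodProfile:
    """Pick the profile with the smallest predicted total time."""
    return min(profiles,
               key=lambda m: predicted_total_time(size, entropy, bandwidth_bps, m))


# Hypothetical profiles: sending uncompressed, a fast method, a strong method.
PROFILES = [
    MethodProfile("none", math.inf, math.inf, 0.0),
    MethodProfile("fast", 200e6, 800e6, 0.6),
    MethodProfile("strong", 20e6, 300e6, 0.9),
]

if __name__ == "__main__":
    block = b"abab" * 262_144  # 1 MiB of highly repetitive data
    h = shannon_entropy(block)
    best = choose_method(len(block), h, bandwidth_bps=1e6, profiles=PROFILES)
    print(f"entropy={h:.2f} bits/byte, method={best.name}")
```

Under these assumed numbers, a low-entropy block on a slow link favors the strong method, while the same block on a very fast link, or a high-entropy block on any link, is cheapest to send uncompressed. A fuller model would also scale the throughput figures by the CPU and memory available on each computer.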

First claim

Opening claim text (preview).

What is claimed is:

1. A method, the method comprising: identifying an information entropy of a first data block having a block size; receiving a real-time resource status of a system, wherein the system includes a first computer, a second computer, and a communication channel having a bandwidth between the first computer and the second computer; determining a first preferred compression method to compress the first data block based at least in part on the information entropy and the real-time resource status; generating a compressed first data block according to the first preferred compression method; transferring the compressed first data block over the communication channel from the first computer to the second computer; and decompressing the compressed first data block upon arriving at the second computer.

2. The method of claim 1, wherein the real-time resource status of a system includes available resource of central processing unit (CPU) of the first computer, available resource of memory of the first computer, available resource of CPU of the second computer, available resource of memory of the second computer, and available resource of the bandwidth of the communication channel.

3. The method of claim 1, wherein the first preferred compression method includes an algorithm and an algorithm parameter.

4. The method of claim 1, wherein the step of determining a first preferred compression method is performed by using a member of the group consisting of: the information entropy of the first data block, the block size of the first data block, the real-time status of the system, the bandwidth of the communication channel, a predicted compression rate (PCR) of the first data block, a predicted compression time (PCT) of the first data block, a predicted transfer time (PTT) of the first data block, and a predicted decompress time (PDT) of the first data block.

5. The method of claim 4, further comprising: identifying, for the first data block, a real compression rate (RCR), a real compression time (RCT), a real transfer time (RTT), and a real decompression time (RDT).

6. The method of claim 5, further comprising: determining a distance (D) between the PCT, the PTT, the PDT, and the RCT, the RTT, the RDT.

7. The method of claim 6, further comprising: determining, for a second data block, a second preferred compression method in response to the distance being greater than a preset threshold.

8. A computer program product comprising a non-transitory computer readable storage medium having a set of instructions stored therein which, when executed by a processor, causes the processor to compress adaptively and transmit big data by: identifying an information entropy of a first data block having a block size; receiving a real-time resource status of a system, wherein the system includes a first computer, a second computer, and a communication channel having a bandwidth between the first computer and the second computer; determining a first preferred compression method to compress the first data block based at least in part on the information entropy and the real-time resource status; generating a compressed first data block according to the first preferred compression method; transferring the compressed first data block over the communication channel from the first computer to the second computer; and decompressing the compressed first data block upon arriving at the second computer.

9. The computer program product of claim 8, wherein the real-time resource status of a system includes available resource of central processing unit (CPU) of the first computer, available resource of memory of the first computer, available resource of CPU of the second computer, available resource of memory of the second computer, and available resource of the bandwidth of the communication channel.

10. The computer program product of claim 8, wherein the first preferred compression method includes an algorithm and an algorithm parameter.

11. The computer program product of claim 8, wherein the step of determining a first preferred compression method is performed by using a member of the group consisting of: the information entropy of the first data block, the block size of the first data block, the real-time status of the system, the bandwidth of the communication channel, a predicted compression rate (PCR) of the first data block, a predicted compression time (PCT) of the first data block, a predicted transfer time (PTT) of the first data block, and a predicted decompress time (PDT) of the first data block.

12. The computer program product of claim 11, further comprising: identifying for the first data block a real compression rate (RCR), a real compression time (RCT), a real transfer time (RTT), and a real decompression time (RDT).

13. The computer program product of claim 12, further comprising: determining a distance (D) between the PCT, the PTT, the PDT, and the RCT, the RTT, the RDT.

14. The computer program product of claim 13, further comprising: determining for a second data block a second preferred compression method in response to the distance being greater than a preset threshold.

15. A computer system comprising: a processor set; and a computer readable storage medium; wherein: the processor set is structured, located, connected, and/or programmed to run program instructions stored on the computer readable storage medium; and the program instructions which, when executed by the processor set, cause the processor set to compress adaptively and transmit big data by: identifying an information entropy of a first data block having a block size; receiving a real-time resource status of a system, wherein the system includes a first computer, a second computer, and a communication channel having a bandwidth between the first computer and the second computer; determining a first preferred compression method to compress the first data block based at least in part on the information entropy and the real-time resource status; generating a compressed first data block according to the first preferred compression method; transferring the compressed first data block over the communication channel from the first computer to the second computer; and decompressing the compressed first data block upon arriving at the second computer.

16. The computer system of claim 15, wherein the first preferred compression method includes an algorithm and an algorithm parameter.

17. The computer system of claim 15, wherein the step of determining a first preferred compression method is performed by using a member of the group consisting of: the information entropy of the first data block, the block size of the first data block, the real-time status of the system, the bandwidth of the communication channel, a predicted compression rate (PCR) of the first data block, a predicted compression time (PCT) of the first data block, a predicted transfer time (PTT) of the first data block, and a predicted decompress time (PDT) of the first data block.

18. The computer system of claim 17, further comprising: identifying for the first data block a real compression rate (RCR), a real compression time (RCT), a real transfer time (RTT), and a real decompression time (RDT).

19. The computer system of claim 18, further comprising: determining a distance (D) between the PCT, the PTT, the PDT, and the RCT, the RTT, the RDT.

20. The computer system of claim 19, fur
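The feedback step in claims 5 to 7 (compare predicted times with measured times, then re-select a method for the next block when they diverge) can be sketched as follows. The claims do not state which distance metric is used, so the Euclidean distance here is an assumption, as are the `Timings` type, the function names, and the example threshold.

```python
import math
from typing import NamedTuple


class Timings(NamedTuple):
    """Per-block times in seconds (hypothetical container type)."""
    compress: float    # PCT when predicted, RCT when measured
    transfer: float    # PTT when predicted, RTT when measured
    decompress: float  # PDT when predicted, RDT when measured


def distance(predicted: Timings, real: Timings) -> float:
    """Distance D between predicted and real timings.

    Assumption: Euclidean distance over the three time components.
    """
    return math.sqrt(sum((p - r) ** 2 for p, r in zip(predicted, real)))


def needs_reselection(predicted: Timings, real: Timings, threshold: float) -> bool:
    """True when D exceeds the preset threshold, so that a second preferred
    compression method should be determined for the next data block."""
    return distance(predicted, real) > threshold


if __name__ == "__main__":
    predicted = Timings(compress=0.05, transfer=0.34, decompress=0.01)
    real = Timings(compress=0.06, transfer=0.90, decompress=0.01)
    # Transfer took far longer than predicted (for example, bandwidth dropped).
    print(needs_reselection(predicted, real, threshold=0.25))
```

When the check returns `False`, the current method can simply be reused for the next block, which avoids repeating the selection step while the predictions remain accurate.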

Assignees

Inventors

Classifications

  • H04L69/04 (Primary)

    Protocols for data compression, e.g. ROHC · CPC title

  • specially adapted for file transfer, e.g. file transfer protocol [FTP] · CPC title

  • Digital compression and data reduction techniques where the original information is represented by a subset or similar information, e.g. lossy compression · CPC title

  • H03M7/40 (Primary)

    Conversion to or from variable length codes, e.g. Shannon-Fano code, Huffman code, Morse code · CPC title

  • H03M7/607 (Primary)

    Selection between different types of compressors · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9521218B1 cover?
A method for optimizing migration efficiency of a data file over a network is provided. Specifically, the total of the compression time of the data file, the transfer time of the data file over the network, and the decompression time of the data file is minimized by adaptively selecting compression methods to compress each data block of the data file. For selecting a compression method for a data block, …
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification H04L69/04. Mapped technology areas include Electricity.
When was this patent published?
Publication date: Dec 13, 2016 (B1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).