Method of storing data in distributed manner based on technique of predicting data compression ratio, and storage device and system using same

US9606750B2 · US · B2

Patent metadata
Publication number: US-9606750-B2
Application number: US-201414446806-A
Country: US
Kind code: B2
Filing date: Jul 30, 2014
Priority date: Nov 25, 2013
Publication date: Mar 28, 2017
Grant date: Mar 28, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method of storing data in a distributed manner based on data compression ratio prediction, and a mass storage device and system using the method are disclosed. The device includes a compression ratio predicting unit, a compressing unit, and a control unit. When an address and first unit sized data are received, the compression ratio predicting unit estimates the predicted compression ratio of the first unit sized data. The compressing unit generates compressed data. The control unit calculates the benefit of compression based on at least the estimated predicted compression ratio, stores the compressed data in a first storage area if the calculated benefit of compression is higher than a predetermined benefit threshold value, and stores the first unit sized data in the second storage area if the calculated benefit of compression is equal to or lower than the predetermined benefit threshold value.
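The write path described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the class `TwoAreaStore`, the benefit definition (predicted fraction of space saved), the threshold value of 0.2, the use of `zlib` as the compressor, and the crude stand-in predictor `distinct_byte_fraction` are all assumptions made for the example. The abstract only says that the benefit is calculated from at least the predicted compression ratio.

```python
import zlib
from dataclasses import dataclass, field
from typing import Callable, Dict


def distinct_byte_fraction(data: bytes) -> float:
    """Crude stand-in predictor (an assumption, not the patent's estimator):
    fraction of the 256 possible byte values that occur in the data."""
    return len(set(data)) / 256.0


@dataclass
class TwoAreaStore:
    """Toy model of a device with a compressed area and an uncompressed area."""
    # Returns the predicted compressed/original size ratio, expected in [0, 1].
    predict_ratio: Callable[[bytes], float]
    # Hypothetical threshold; the source does not give a value.
    benefit_threshold: float = 0.2
    first_area: Dict[int, bytes] = field(default_factory=dict)   # compressed data
    second_area: Dict[int, bytes] = field(default_factory=dict)  # raw data

    def benefit(self, data: bytes) -> float:
        # Assumption: benefit is the predicted fraction of space saved.
        return 1.0 - self.predict_ratio(data)

    def write(self, address: int, data: bytes) -> str:
        """Route one unit of data and report which area received it."""
        if self.benefit(data) > self.benefit_threshold:
            # Benefit above threshold: compress and store in the first area.
            self.first_area[address] = zlib.compress(data)
            self.second_area.pop(address, None)
            return "first"
        # Benefit equal to or below threshold: store the data uncompressed.
        self.second_area[address] = data
        self.first_area.pop(address, None)
        return "second"


store = TwoAreaStore(predict_ratio=distinct_byte_fraction)
print(store.write(0, bytes(4096)))             # highly repetitive block
print(store.write(1, bytes(range(256)) * 16))  # every byte value present
```

Passing the predictor in as a callable keeps the routing logic independent of how the ratio is estimated, so an entropy-based estimator could be substituted without changing `write`.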

First claim

Opening claim text (preview).

What is claimed is:

1. A method of distributively storing data in a mass storage device having logically or physically defined first and second storage areas, based on data compression ratio prediction, the method comprising: receiving an address and a first unit sized data, together with a write command, from a host device; estimating a predicted compression ratio of the first unit sized data, based on a Shannon byte entropy; calculating a benefit of compression, based on the predicted compression ratio; comparing the calculated benefit of compression with a predetermined benefit threshold value, and in response to the calculated benefit of compression being higher than the predetermined benefit threshold value, compressing the first unit sized data so as to store the compressed data in the first storage area, and in response to the calculated benefit of compression being equal to or lower than the predetermined benefit threshold value, storing the first unit sized data in the second storage area; wherein the predicted compression ratio is estimated using the following predicted compression ratio estimation formula:

$$C(X) = H(X)^{c};$$

wherein C(X) is a predicted compression ratio of sample data X, c is a predicted compression index that is empirically given, based on a compression method, and H(X) is the Shannon byte entropy of the sample data X, estimated using the following formula:

$$H(X) = -\sum_i P(x_i)\,\log_b P(x_i) = -\sum_i \frac{n_i}{N}\,\log_b \frac{n_i}{N} = \log_b N - \frac{1}{N}\sum_i n_i \log_b n_i$$

wherein sample data X includes a data symbol $x_i$, $n_i$ is the frequency of appearance of each data symbol $x_i$ in the sample data X, N is an overall frequency of appearance of all of the data symbols in the sample data X, and $P(x_i)$ is the probability mass function of the data symbol $x_i$.

2. A method of distributively storing data in a mass storage device having logically or physically defined first and second storage areas, based on data compression ratio prediction, the method comprising: receiving an address and a first unit sized data, together with a write command, from a host device; estimating a predicted compression ratio of the first unit sized data, based on a Shannon byte entropy; calculating a benefit of compression, based on the predicted compression ratio; comparing the calculated benefit of compression with a predetermined benefit threshold value, and in response to the calculated benefit of compression being higher than the predetermined benefit threshold value, compressing the first unit sized data so as to store the compressed data in the first storage area, and in response to the calculated benefit of compression being equal to or lower than the predetermined benefit threshold value, storing the first unit sized data in the second storage area; wherein the predicted compression ratio is estimated using the following predicted compression ratio estimation formula:

$$C(X) = 2^{H(X)^{2}} - 1,$$

wherein C(X) is the predicted compression ratio of sample data X, and H(X) is the Shannon byte entropy of the sample data X.

3. The method of claim 1, wherein the predicted compression ratio is estimated by referring to a look-up table that is constructed by mapping values of Shannon entropy to values of the predicted compression ratio based on actual compression ratio obtained using the compression method.

4. The method of claim 1, wherein the benefit of compression is calculated, based on at least one of: a remaining storage capacity of the first storage area, which is configured to store: compressed data, an
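The entropy-based estimate in claims 1 and 2 can be illustrated with a short sketch. It computes the Shannon byte entropy from byte counts using the count form log_b N - (1/N) Σ n_i log_b n_i, then applies two candidate ratio formulas. Several points are assumptions rather than statements of the claims: the logarithm base defaults to 256 so that H(X) falls in [0, 1] for byte data (the preview does not fix b), the claim 1 formula is read as H(X) raised to the power c, the claim 2 formula is read as 2 raised to H(X) squared, minus 1, and the function names are hypothetical.

```python
import math
from collections import Counter


def shannon_byte_entropy(data: bytes, base: float = 256.0) -> float:
    """Entropy of the byte distribution: log_b N - (1/N) * sum(n_i * log_b n_i).

    With base 256 (an assumption) the result lies in [0, 1] for byte data.
    """
    n_total = len(data)
    if n_total == 0:
        return 0.0
    counts = Counter(data)  # n_i for each byte value that occurs
    weighted = sum(n * math.log(n, base) for n in counts.values())
    h = math.log(n_total, base) - weighted / n_total
    return max(0.0, h)  # guard against tiny negative rounding errors


def predicted_ratio_power(data: bytes, c: float) -> float:
    """One reading of the claim 1 formula: C(X) = H(X) ** c.

    c is an empirically chosen index; no value is given in the source.
    """
    return shannon_byte_entropy(data) ** c


def predicted_ratio_exp(data: bytes) -> float:
    """One reading of the claim 2 formula: C(X) = 2 ** (H(X) ** 2) - 1."""
    h = shannon_byte_entropy(data)
    return 2.0 ** (h ** 2) - 1.0


if __name__ == "__main__":
    uniform = bytes(range(256)) * 16  # every byte value equally frequent
    constant = bytes(4096)            # a single repeated byte value
    for name, block in (("uniform", uniform), ("constant", constant)):
        print(name,
              round(shannon_byte_entropy(block), 6),
              round(predicted_ratio_power(block, c=2.0), 6),
              round(predicted_ratio_exp(block), 6))
```

Under these readings, both formulas map zero entropy to a predicted ratio of 0 and maximal entropy to a predicted ratio of 1, which matches the intuition that uniformly distributed bytes are not expected to compress.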

Assignees

Inventors

Classifications

  • G06F3/0685 (Primary)

    Hybrid storage combining heterogeneous device types, e.g. hierarchical storage, hybrid arrays · CPC title

  • Management of files · CPC title

  • Saving storage space on storage systems · CPC title

  • Hybrid storage device · CPC title

  • Management of blocks · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9606750B2 cover?
A method of storing data in a distributed manner based on data compression ratio prediction, and a mass storage device and system using the method are disclosed. The device includes a compression ratio predicting unit, a compressing unit, and a control unit. When an address and first unit sized data are received, the compression ratio predicting unit estimates the predicted compression ratio of…
Who is the assignee on this patent?
Univ Sungkyunkwan Res & Bus, Res And Business Found Sungkyunkwan Univ
What technology area does this patent fall under?
Primary CPC classification G06F3/0685. Mapped technology areas include Physics.
When was this patent published?
Publication date Mar 28, 2017 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).