Method, device, and computer program product for recognizing reducible contents in data to be written

US10936227B2 · US · B2

Patent metadata
Field: Value
Publication number: US-10936227-B2
Application number: US-201916272605-A
Country: US
Kind code: B2
Filing date: Feb 11, 2019
Priority date: Apr 28, 2018
Publication date: Mar 2, 2021
Grant date: Mar 2, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Techniques recognize reducible contents in data to be written. The techniques involve receiving information related to data to be written, the information indicating that the data to be written comprises reducible contents, the reducible contents comprising data with a first reduction pattern. The techniques further involve recognizing the reducible contents in the data to be written based on the information. The techniques further involve reducing the reducible contents based on the first reduction pattern. With such techniques, active I/O pattern recognition with communication between applications and storage devices may be accomplished. In addition, with such techniques, it is easy/simple to expand recognizable new patterns, and I/O pattern limitations in standard approaches no longer exist.
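The write path summarized in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the names `WriteCommand`, `StoredBlock`, `reducible_hint`, `handle_write`, and `handle_read` are hypothetical, and the only reduction pattern assumed here is a buffer made of one repeated unit.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class WriteCommand:
    data: bytes
    reducible_hint: bool  # hypothetical flag: the application says the data is reducible


@dataclass
class StoredBlock:
    payload: bytes       # raw data, or the single repeated unit
    length: int          # original length in bytes
    needs_restore: bool  # indication that a read must expand the payload


def smallest_cycle(data: bytes) -> Optional[bytes]:
    """Return the shortest unit whose repetition equals data, or None."""
    n = len(data)
    for size in range(1, n // 2 + 1):
        if n % size == 0 and data[:size] * (n // size) == data:
            return data[:size]
    return None


def handle_write(cmd: WriteCommand) -> StoredBlock:
    # No hint: write the data as-is, without any reduction.
    if not cmd.reducible_hint:
        return StoredBlock(cmd.data, len(cmd.data), False)
    unit = smallest_cycle(cmd.data)
    if unit is None:
        # Hint was set, but this sketch knows no pattern that fits.
        return StoredBlock(cmd.data, len(cmd.data), False)
    # Store only the repeated unit and mark the block for restoring on read.
    return StoredBlock(unit, len(cmd.data), True)


def handle_read(block: StoredBlock) -> bytes:
    if not block.needs_restore:
        return block.payload
    return block.payload * (block.length // len(block.payload))
```

In this sketch the hint only decides whether recognition is attempted at all, which mirrors the abstract's point that the application tells the storage side about reducible contents rather than the storage side scanning every write.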

First claim

Opening claim text (preview).

We claim:

1. A method for recognizing reducible contents in data to be written, the method comprising: receiving a write data command containing information related to data to be written, the information indicating whether the data to be written contains reducible contents, the reducible contents including data with a first reduction pattern; based on the information indicating that the data to be written does not contain reducible contents, writing the data to storage without reducing; based on the information indicating that the data to written does contain reducible contents, then: 1) recognizing the reducible contents in the data to be written; 2) reducing the reducible contents to reduced data based on the first reduction pattern; and 3) writing the reduced data to storage with an indication that the reduced data requires restoring upon a subsequent read of the data, and maintaining a hash list usable to identify sampled data as reducible data, wherein recognizing the reducible contents includes sampling the data to be written and determining whether a sampling result indicates that the data to be written has a pattern requiring use of the hash list to identify the data to be written as having the reducible contents, and if so then performing a hash lookup in the hash list to determine (1) in case of a hit, that the reducible contents are known and the write command is then committed, and (2) in case of a miss, calculate a new hash value for the reducible contents and insert the calculated hash value into the hash list for subsequent use, and wherein the hash list is a two-level hash list in which (1) a first level is a primary level capturing first information about reducible contents, and (2) a second level is a secondary level referenced by elements of the first level, the second level capturing distinct second information about reducible contents.

2. The method according to claim 1, wherein recognizing the reducible contents in the data to be written comprises: sampling the data to be written to obtain a data sample; and recognizing the reducible contents from the data to be written based on the data sample.

3. The method according to claim 1, wherein reducing the reducible contents comprises: determining the first reduction pattern; looking up the first reduction pattern in a predetermined set of reduction patterns; in response to the first reduction pattern being found in the set of reduction patterns, determining a reduction operation corresponding to the first reduction pattern; and reducing the reducible contents based on the determined reduction operation.

4. The method according to claim 3, further comprising: in response to the first reduction pattern not being found in the set of reduction patterns, including the first reduction pattern into the set of reduction patterns.

5. The method according to claim 3, wherein the information related to data to be written indicates a first reduction pattern of the reducible contents, wherein determining the first reduction pattern comprises: extracting the first reduction pattern from the information related to data to be written.

6. The method according to claim 1, wherein reducing the reducible contents comprises: in response to the first reduction pattern being a non-single character repetitive type, determining from the reducible content least reserved data and description data associated with the least reserved data so as to store the least reserved data and the descriptive data, wherein the least reserved data may be restored to the reducible contents based on the descriptive data.

7. The method according to claim 1, wherein the write data command is contained in an I/O request packet having a flag indicating whether the data to be written contains reducible contents.

8. The method according to claim 7, wherein the flag is an indicator whether the write data command is either a Write Same command or a Write Scattered, both the Write Same command and the Write Scattered command being used by an application to write data having reducible contents to storage.

9. The method according to claim 1, wherein the first information is cycle information describing a size of a cycle of repetition of data in the reducible contents, and the second information is type information describing data contents of repeated cycles of the reducible contents.

10. A device for recognizing reducible contents in data to be written, comprising: at least one processing unit; and at least one memory coupled to the at least one processing unit and storing instructions to be executed by the at least one processing unit, the instructions, when being executed by the at least one processing unit, causing the device to perform acts comprising: receiving a write data command containing information related to data to be written, the information indicating whether the data to be written contains reducible contents, the reducible contents including data with a first reduction pattern, based on the information indicating that the data to be written does not contain reducible contents, writing the data to storage without reducing; based on the information indicating that the data to written does contain reducible contents, then: 1) recognizing the reducible contents in the data to be written, 2) reducing the reducible contents to reduced data based on the first reduction pattern, and 3) writing the reduced data to storage with an indication that the reduced data requires restoring upon a subsequent read of the data, and maintaining a hash list usable to identify sampled data as reducible data, wherein recognizing the reducible contents includes sampling the data to be written and determining whether a sampling result indicates that the data to be written has a pattern requiring use of the hash list to identify the data to be written as having the reducible contents, and if so then performing a hash lookup in the hash list to determine (1) in case of a hit, that the reducible contents are known and the write command is then committed, and (2) in case of a miss, calculate a new hash value for the reducible contents and insert the calculated hash value into the hash list for subsequent use, and wherein the hash list is a two-level hash list in which (1) a first level is a primary level capturing first information about reducible contents, and (2) a second level is a secondary level referenced by elements of the first level, the second level capturing distinct second information about reducible contents.

11. The device according to claim 10, wherein recognizing the reducible contents in the data to be written comprises: sampling the data to be written to obtain a data sample; and recognizing the reducible contents from the data to be written based on the data sample.

12. The device according to claim 10, wherein reducing the reducible contents comprises: determining the first reduction pattern; looking up the first reduction pattern in a predetermined set of reduction patterns; in response to the first reduction pattern being found in the set of reduction patterns, determining a reduction operation corresponding to the first reduction pattern; and reducing the reducible contents based on the determined reduction operation.

13. The device according to claim 12, wherein the acts further comprise: in response to the first reduction pattern not being found in the set of reduction patterns, including the first reduction pattern into the set of reduction patterns.

14. The device according to claim 12, wherein the information related to data to be written indicates
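The two-level hash list described in claims 1 and 9 can be sketched as follows. This is only an illustration under stated assumptions: the first level is keyed by cycle size and the second level holds digests of the cycle contents, SHA-256 is an arbitrary choice of hash, and `TwoLevelHashList`, `find_cycle`, and `recognize` are hypothetical names. For simplicity the sketch inspects the whole buffer rather than a sample of it.

```python
import hashlib
from typing import Dict, Optional, Set


class TwoLevelHashList:
    def __init__(self) -> None:
        # Level 1: cycle size. Level 2: digests of the repeated contents.
        self._levels: Dict[int, Set[bytes]] = {}

    def lookup_or_insert(self, cycle: bytes) -> bool:
        """Return True on a hit; on a miss, insert the new hash and return False."""
        digest = hashlib.sha256(cycle).digest()
        second_level = self._levels.setdefault(len(cycle), set())
        if digest in second_level:
            return True
        second_level.add(digest)
        return False


def find_cycle(data: bytes, max_cycle: int = 64) -> Optional[bytes]:
    """Return the shortest repeating unit of at most max_cycle bytes, or None."""
    n = len(data)
    for size in range(1, min(max_cycle, n // 2) + 1):
        if n % size == 0 and data[:size] * (n // size) == data:
            return data[:size]
    return None


def recognize(data: bytes, hash_list: TwoLevelHashList) -> str:
    cycle = find_cycle(data)
    if cycle is None:
        return "not-reducible"
    if len(cycle) == 1:
        # A single repeated character needs no hash list lookup in this sketch.
        return "single-character"
    return "hit" if hash_list.lookup_or_insert(cycle) else "miss"
```

Keying the first level by cycle size means a lookup only compares against patterns of the same period, while the second level distinguishes patterns with the same period but different contents.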

Assignees

  • EMC IP Holding Co LLC

Inventors

Classifications

  • Single storage device · CPC title

  • G06F3/0608 · Primary

    Saving storage space on storage systems · CPC title

  • String search, i.e. pattern matching, e.g. find identical word or best match in a string · CPC title

  • Data buffering arrangements · CPC title

  • Disk arrays, e.g. RAID, JBOD · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10936227B2 cover?
Techniques recognize reducible contents in data to be written. The techniques involve receiving information related to data to be written, the information indicating that the data to be written comprises reducible contents, the reducible contents comprising data with a first reduction pattern. The techniques further involve recognizing the reducible contents in the data to be written based on t…
Who is the assignee on this patent?
EMC IP Holding Co LLC
What technology area does this patent fall under?
Primary CPC classification G06F3/0608. Mapped technology areas include Physics.
When was this patent published?
Publication date: Mar 2, 2021 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).