Managing data sets of a storage system

US10169394B2 · US · B2

Patent metadata
Field               Value
Publication number  US-10169394-B2
Application number  US-201414297128-A
Country             US
Kind code           B2
Filing date         Jun 5, 2014
Priority date       Jun 5, 2014
Publication date    Jan 1, 2019
Grant date          Jan 1, 2019

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method, system, and computer program product for managing data sets of a storage facility is disclosed. The method, system, and computer program product include determining, by analyzing a first data set, that the first data set includes a first record having padded data. To identify the padded data, the method, system, and computer program product include comparing at least a portion of the first record of the first data set with a second record of a second data set. Next, the method, system, and computer program product include removing, from the first record of the first data set, the padded data.
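
The determine, compare, and remove sequence in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes that padding is a run of one repeated byte at the back end of a record and that a second record ending in the same byte is enough to confirm the pattern. The function names and the `min_run` threshold are hypothetical and do not come from the patent.

```python
def trailing_pattern(record: bytes) -> bytes:
    """Scan from the back end toward the front until the repeated byte stops."""
    if not record:
        return b""
    pad_byte = record[-1]
    i = len(record)
    while i > 0 and record[i - 1] == pad_byte:
        i -= 1
    return record[i:]


def strip_padding(first: bytes, second: bytes, min_run: int = 2) -> bytes:
    """Remove the trailing run from `first` only if `second` confirms it.

    Assumption: a run shorter than `min_run` is treated as ordinary data.
    """
    pattern = trailing_pattern(first)
    if len(pattern) < min_run:
        return first
    pad_byte = pattern[:1]
    # Compare against the second record: it must end with the same pad byte.
    if not second.endswith(pad_byte * min_run):
        return first
    # Deleting the segment also shortens the record, so len() reflects the
    # updated record length.
    return first[: len(first) - len(pattern)]
```

For example, `strip_padding(b"ALICE\x00\x00\x00", b"BOB\x00\x00\x00\x00\x00")` returns `b"ALICE"`, while a record whose trailing run is not confirmed by the second record is returned unchanged.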

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method for managing data sets of a storage system by repairing a data set that includes padded data such that the data set can be read by an application that cannot read a data set including padded data, the method comprising: determining, by analyzing a first data set, that the first data set includes a first record having padded data, wherein padded data represents data added to a record to make a variable length record a fixed length record; comparing, to identify the padded data, at least a portion of the first record of the first data set with a second record of a second data set; and removing, from the first record of the first data set, the padded data identified in response to comparing at least the portion of the first record of the first data set with the second record of the second data set wherein the removing includes: deleting a segment of the first record matching a mask derived from a character pattern, and updating a record length for the first record; loading the first record, without padded data, into a temporary file with the first data set; storing an original file, with padded data, as a retained file with the first data set; storing the temporary file, without padded data, with a name of the original file; and accessing the temporary file by the application, wherein the application could not read the first record with padded data.

2. The method of claim 1, wherein determining, by analyzing the first data set, that the first data set includes the first record having padded data includes: determining the first record is a fixed length record.

3. The method of claim 2, further comprising: determining the first record is expected to be a variable length record.

4. The method of claim 1, wherein determining, by analyzing the first data set, that the first data set includes the first record having padded data includes: determining the first record has been converted to a fixed length record from a variable length record.

5. The method of claim 1, wherein determining, by analyzing the first data set, that the first data set includes the first record having padded data includes: determining the first data set is without a backup data set; and scanning at least the portion of the first record to resolve a character pattern.

6. The method of claim 5, wherein comparing, to identify the padded data, at least the portion of the first record of the first data set with the second record of the second data set includes: comparing the character pattern of the first record of the first data set with the second record of the second data set, wherein the second data set is the first data set and the first record is different from the second record.

7. The method of claim 6, wherein: scanning at least the portion of the first record for the character pattern includes scanning from a back end of the first record toward a front end of the first record until the character pattern stops; and comparing, to identify the padded data, at least the portion of the first record of the first data set with the second record of the second data set includes storing the character pattern and determining a mask derived from the character pattern matches at least a segment of a subsequent record of the first data set.

8. The method of claim 1, wherein determining, by analyzing the first data set, that the first data set includes the first record having padded data includes: determining the second data set backs-up the first data set; and determining that both the first data set and the second data set include a type of record that is keyed.

9. The method of claim 8, wherein comparing, to identify the padded data, at least the portion of the first record of the first data set with the second record of the second data set includes: searching, using a key from the second data set, the first data set for the key; and determining the key in the second record matches a like key in the first record.

10. The method of claim 9, wherein comparing, to identify the padded data, at least the portion of the first record of the first data set with the second record of the second data set includes: scanning from a back end of the first record toward a front end of the first record to resolve a character pattern configured to identify the padded data as a segment which mismatches the second record; and storing a mask derived from the character pattern to identify the padded data.

11. The method of claim 1, wherein determining, by analyzing the first data set, that the first data set includes the first record having padded data includes: determining the second data set backs-up the first data set; determining that both the first data set and the second data set include a type of record that is non-keyed; and scanning at least the portion of the first record to resolve a segment other than a character pattern.

12. The method of claim 11, wherein comparing, to identify the padded data, at least the portion of the first record of the first data set with the second record of the second data set includes: searching, using the segment from the first record of the first data set, the second data set for the segment; and determining the segment in the first record matches a like segment in the second record.

13. The method of claim 12, wherein comparing, to identify the padded data, at least the portion of the first record of the first data set with the second record of the second data set includes: scanning at least the portion of the first record to resolve the character pattern; and determining, to identify the padded data, a mask derived from the character pattern represents a feature in which the first record mismatches the second record.

14. A system for managing data sets in a storage facility, comprising: a remote device; and a host device, at least one of the remote device and the host device including a managing module, the managing module comprising: a determining module to determine, by analyzing a first data set, that the first data set includes a first record having padded data, wherein padded data represents data added to a record to make a variable length record a fixed length record; a comparing module to compare, to identify the padded data, at least a portion of the first record of the first data set with a second record of a second data set; and a removing module to remove, from the first record of the first data set, the padded data identified in response to comparing at least the portion of the first record of the first data set with the second record of the second data set; wherein removing includes; deleting a segment of the first record matching a mask derived from a character pattern, and updating record length for the first record; loading the first record, without padded data, into a temporary file with the first data set; storing an original file, with padded data, as a retained file with the first data set; storing the temporary file, without padded data, with a name of the original file; and accessing the temporary file by the application, wherein the application could not read the first record with padded data.

15. A computer program product comprising a computer readable storage medium having a computer readable program stored therein, wherein the computer readable program, when executed on a first computing device, causes the first computing device to: determine, by analyzing a first data set, that the first data set includes a first record having padded data, wherein the padded data was added to the f
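
The file-handling steps recited in claim 1 (load the unpadded records into a temporary file, retain the padded original, then store the temporary file under the original name) might look roughly like the following Python sketch. It is illustrative only and makes several assumptions that are not in the patent: records are newline-delimited, padding is a trailing run of the byte(s) in `pad`, and the retained copy gets a hypothetical `.retained` suffix. The function name `repair_file` is likewise hypothetical. With newline-delimited records the record length is implicit, so no separate length field is updated here.

```python
import os
import tempfile
from pathlib import Path


def repair_file(original: Path, pad: bytes = b" ") -> Path:
    """Rewrite `original` without trailing pad bytes; return the retained copy."""
    # Hypothetical naming scheme for the retained (still padded) original.
    retained = original.with_name(original.name + ".retained")

    # Load each record, without padded data, into a temporary file that
    # lives next to the original so the later rename stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=str(original.parent))
    with os.fdopen(fd, "wb") as tmp, original.open("rb") as src:
        for line in src:
            record = line.rstrip(b"\n")
            tmp.write(record.rstrip(pad) + b"\n")

    # Store the original file, with padded data, as the retained file.
    os.replace(original, retained)
    # Store the temporary file, without padded data, under the original name.
    os.replace(tmp_name, original)
    return retained
```

After the call, an application that opens the original path reads the unpadded records, while the padded version remains available at the returned path.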

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10169394B2 cover?
A method, system, and computer program product for managing data sets of a storage facility is disclosed. The method, system, and computer program product include determining, by analyzing a first data set, that the first data set includes a first record having padded data. To identify the padded data, the method, system, and computer program product include comparing at least a portion of the …
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06F17/30371. Mapped technology areas include Physics.
When was this patent published?
Publication date Jan 1, 2019 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).