Storing files in a parallel computing system based on user or application specification
US-9298733-B1 · Mar 29, 2016 · US
US9811545B1 · US · B1
| Field | Value |
|---|---|
| Publication number | US-9811545-B1 |
| Application number | US-201313921719-A |
| Country | US |
| Kind code | B1 |
| Filing date | Jun 19, 2013 |
| Priority date | Jun 19, 2013 |
| Publication date | Nov 7, 2017 |
| Grant date | Nov 7, 2017 |
Official abstract text for this publication.
A sparse file is stored without holes by storing a data portion of the sparse file using a parallel log-structured file system; and generating an index entry for the data portion, the index entry comprising a logical offset, physical offset and length of the data portion. The holes can be restored to the sparse file upon a reading of the sparse file. The data portion can be stored at a logical end of the sparse file. Additional storage efficiency can optionally be achieved by (i) detecting a write pattern for a plurality of the data portions and generating a single patterned index entry for the plurality of the patterned data portions; and/or (ii) storing the patterned index entries for a plurality of the sparse files in a single directory, wherein each entry in the single directory comprises an identifier of a corresponding sparse file.
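The core idea in the abstract can be illustrated with a minimal sketch. This is not the patented implementation: the class `HoleFreeStore`, its method names, and the in-memory `bytearray` standing in for a log-structured data file are all hypothetical, chosen only to show how an index entry of (logical offset, physical offset, length) lets data be stored densely and the holes be restored on read.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class IndexEntry:
    logical_offset: int   # where the data lives in the sparse file as the writer sees it
    physical_offset: int  # where the data actually sits in the dense log
    length: int           # number of bytes in this data portion


class HoleFreeStore:
    """Hypothetical in-memory stand-in for a log-structured data file plus its index."""

    def __init__(self) -> None:
        self.log = bytearray()             # dense log: data portions only, no holes
        self.index: List[IndexEntry] = []  # one entry per written data portion
        self.logical_size = 0              # size of the sparse file as the writer sees it

    def write(self, logical_offset: int, data: bytes) -> None:
        # Append the data at the end of the log and record where it belongs logically.
        self.index.append(IndexEntry(logical_offset, len(self.log), len(data)))
        self.log += data
        self.logical_size = max(self.logical_size, logical_offset + len(data))

    def read_all(self) -> bytes:
        # Start from a zero-filled buffer so unwritten regions (holes) read back as zeros.
        out = bytearray(self.logical_size)
        for e in self.index:  # later entries overwrite earlier ones
            chunk = self.log[e.physical_offset:e.physical_offset + e.length]
            out[e.logical_offset:e.logical_offset + e.length] = chunk
        return bytes(out)


store = HoleFreeStore()
store.write(0, b"abc")
store.write(10, b"xyz")   # leaves a 7-byte hole at logical offsets 3..9
restored = store.read_all()
```

In this sketch the log holds only the six written bytes, while `read_all` rebuilds the 13-byte logical file with the hole filled by zeros, which mirrors the abstract's statement that holes can be restored upon reading.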
Opening claim text (preview).
What is claimed is:

1. A method for storing a sparse file, comprising the steps of: obtaining, using at least one processing device, at least a portion of said sparse file, wherein said sparse file portion comprises a plurality of data portions and a corresponding plurality of holes, wherein each of said plurality of data portions has been written with data and wherein remainder portions of said sparse file portion associated with each of said holes have not been written with data; detecting a write pattern for a plurality of said data portions of a plurality of said sparse files; generating, using at least one processing device, a patterned index entry for each of said sparse files only for said patterned data portions of said plurality of said sparse files, each of said patterned index entries comprising a logical offset, physical offset and length of each of said data portions; and storing, using at least one processing device, said plurality of data portions of said sparse file in a single file in a storage device of a file system using a parallel log-structured file system without storing said hole associated with each of said data portions, wherein said patterned index entries for said plurality of said sparse files are stored as a file in a directory, wherein each patterned index entry in said file comprises an identifier of a corresponding sparse file.

2. The method of claim 1, wherein said hole is restored to said sparse file upon a reading of said sparse file.

3. The method of claim 1, wherein said storing step further comprises the step of storing said data portion at a logical end of said sparse file.

4. The method of claim 1, wherein said sparse file is generated by a process running on a compute node in a parallel computing system.

5. The method of claim 1, wherein said sparse file is provided to a middleware virtual file system for storage.

6. The method of claim 1, wherein said sparse file is stored on a parallel file system comprised of one or more disks.

7. A computer program product comprising a tangible machine-readable recordable storage medium having encoded therein executable code of one or more software programs, wherein the one or more software programs when executed by the processor of the processing device implement the steps of the method of claim 1.

8. An apparatus for storing a sparse file, comprising: a memory; and at least one processing device operatively coupled to the memory and configured to: obtain, using said at least one processing device, at least a portion of said sparse file, wherein said sparse file portion comprises a plurality of data portions and a corresponding plurality of holes, wherein each of said plurality of data portions has been written with data and wherein remainder portions of said sparse file portion associated with each of said holes have not been written with data; detecting a write pattern for a plurality of said data portions of a plurality of said sparse files; generate, using said at least one processing device, a patterned index entry for each of said sparse files only for said patterned data portions of said plurality of said sparse files, each of said patterned index entries comprising a logical offset, physical offset and length of each of said data portions; and store, using said at least one processing device, said plurality of data portions of said sparse file in a single file in a storage device of a file system using a parallel log-structured file system without storing said hole associated with each of said data portions, wherein said patterned index entries for said plurality of said sparse files are stored as a file in a directory, wherein each patterned index entry in said file comprises an identifier of a corresponding sparse file.

9. The apparatus of claim 8, wherein said hole is restored to said sparse file upon a reading of said sparse file.

10. The apparatus of claim 8, wherein said at least one hardware device is further configured to store said data portion at a logical end of said sparse file.

11. The apparatus of claim 8, wherein said sparse file is generated by a process running on a compute node in a parallel computing system.

12. The apparatus of claim 8, wherein said sparse file is provided to a middleware virtual file system for storage.

13. The apparatus of claim 8, wherein said sparse file is stored on a parallel file system comprised of one or more disks.

14. A data storage system for storing a sparse file, comprising: a hardware processing unit for obtaining at least a portion of said sparse file, wherein said sparse file portion comprises a plurality of data portions and a corresponding plurality of holes, wherein each of said plurality of data portions has been written with data and wherein remainder portions of said sparse file portion associated with each of said holes have not been written with data; detecting a write pattern for a plurality of said data portions of a plurality of said sparse files; generating, using at least one processing device, a patterned index entry for each of said sparse files only for said patterned data portions of said plurality of said sparse files, each of said patterned index entries comprising a logical offset, physical offset and length of each of said data portions; and storing, using said at least one processing device, said plurality of data portions of said sparse file in a single file of a file system using a parallel log-structured file system without storing said hole associated with each of said data portions, wherein said patterned index entries for said plurality of said sparse files are stored as a file in a directory, wherein each patterned index entry in said file comprises an identifier of a corresponding sparse file; and a storage device for storing said sparse files and said patterned index entries.

15. The data storage system of claim 14, wherein said hole is restored to said sparse file upon a reading of said sparse file.

16. The data storage system of claim 14, wherein said storing step further comprises the step of storing said data portions at a logical end of said sparse files.

17. The data storage system of claim 14, wherein said sparse files are generated by a process running on a compute node in a parallel computing system.

18. The data storage system of claim 14, wherein said sparse files are provided to a middleware virtual file system for storage.

19. The data storage system of claim 14, wherein said sparse files are stored on a parallel file system comprised of one or more disks.
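The patterned index entries recited in the independent claims can be sketched as follows. The claims do not say what form a "write pattern" takes, so this example assumes one plausible case: equal-length data portions written at a constant logical stride and laid out contiguously in the log. The names `PatternedEntry`, `detect_pattern`, `expand`, and `shared_index`, as well as the file identifiers, are hypothetical and serve only as illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# A plain index entry: (logical_offset, physical_offset, length)
Extent = Tuple[int, int, int]


@dataclass(frozen=True)
class PatternedEntry:
    file_id: str         # identifier of the sparse file this entry describes
    logical_start: int   # logical offset of the first data portion
    physical_start: int  # physical offset of the first data portion in the log
    length: int          # length shared by every data portion in the pattern
    logical_stride: int  # assumed pattern: constant gap between logical offsets
    count: int           # number of data portions the entry stands for


def detect_pattern(file_id: str, extents: List[Extent]) -> Optional[PatternedEntry]:
    """Return one patterned entry if the extents form a regular stride, else None."""
    if len(extents) < 2:
        return None
    l0, p0, n = extents[0]
    stride = extents[1][0] - l0
    for i, (logical, physical, length) in enumerate(extents):
        if length != n or logical != l0 + i * stride or physical != p0 + i * n:
            return None
    return PatternedEntry(file_id, l0, p0, n, stride, len(extents))


def expand(entry: PatternedEntry) -> List[Extent]:
    """Recover the individual index entries that a patterned entry summarizes."""
    return [
        (entry.logical_start + i * entry.logical_stride,
         entry.physical_start + i * entry.length,
         entry.length)
        for i in range(entry.count)
    ]


# One shared index for several sparse files; each entry carries its file identifier.
writes = {
    "rank0.dat": [(0, 0, 4), (16, 4, 4), (32, 8, 4)],
    "rank1.dat": [(4, 0, 4), (20, 4, 4), (36, 8, 4)],
}
shared_index = [detect_pattern(fid, ext) for fid, ext in writes.items()]
```

Under these assumptions, three index entries per file collapse into a single patterned entry, and the `file_id` field lets entries for many sparse files live together in one index file, as the claims describe.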
Physics · mapped topic
File access structures, e.g. distributed indices (arrangements of input from, or output to, record carriers G06F3/06) · CPC title
of structured data, e.g. relational data · CPC title