Storage of sparse files using parallel log-structured file system

US9811545B1 · US · B1

Patent metadata
Publication number: US-9811545-B1
Application number: US-201313921719-A
Country: US
Kind code: B1
Filing date: Jun 19, 2013
Priority date: Jun 19, 2013
Publication date: Nov 7, 2017
Grant date: Nov 7, 2017

Abstract

A sparse file is stored without holes by storing a data portion of the sparse file using a parallel log-structured file system; and generating an index entry for the data portion, the index entry comprising a logical offset, physical offset and length of the data portion. The holes can be restored to the sparse file upon a reading of the sparse file. The data portion can be stored at a logical end of the sparse file. Additional storage efficiency can optionally be achieved by (i) detecting a write pattern for a plurality of the data portions and generating a single patterned index entry for the plurality of the patterned data portions; and/or (ii) storing the patterned index entries for a plurality of the sparse files in a single directory, wherein each entry in the single directory comprises an identifier of a corresponding sparse file.
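
The indexing scheme described in the abstract can be illustrated with a minimal sketch. This is not the patented implementation: the class and method names (`SparseLog`, `write`, `read_all`) are hypothetical, the log is held in memory rather than on a parallel file system, and holes are assumed to read back as zero bytes, as is conventional for sparse files.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class IndexEntry:
    logical_offset: int   # position of the data portion in the sparse file
    physical_offset: int  # position of the data portion in the log
    length: int           # number of bytes in the data portion


class SparseLog:
    """Toy in-memory log (hypothetical): data is appended, holes are never stored."""

    def __init__(self) -> None:
        self.log = bytearray()
        self.index: List[IndexEntry] = []

    def write(self, logical_offset: int, data: bytes) -> None:
        # Append the data portion to the end of the log and record where it went.
        self.index.append(IndexEntry(logical_offset, len(self.log), len(data)))
        self.log += data

    def read_all(self) -> bytes:
        # Rebuild the logical file; unwritten ranges (holes) come back as zeros.
        size = max((e.logical_offset + e.length for e in self.index), default=0)
        out = bytearray(size)
        for e in self.index:  # later writes overwrite earlier ones
            chunk = self.log[e.physical_offset:e.physical_offset + e.length]
            out[e.logical_offset:e.logical_offset + e.length] = chunk
        return bytes(out)


store = SparseLog()
store.write(0, b"AAAA")
store.write(1_000_000, b"BBBB")  # leaves a hole of almost 1 MB
restored = store.read_all()
```

In this sketch the log holds only the 8 bytes actually written, while `restored` has the full logical size with the hole filled in on read.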

First claim

What is claimed is:

1. A method for storing a sparse file, comprising the steps of: obtaining, using at least one processing device, at least a portion of said sparse file, wherein said sparse file portion comprises a plurality of data portions and a corresponding plurality of holes, wherein each of said plurality of data portions has been written with data and wherein remainder portions of said sparse file portion associated with each of said holes have not been written with data; detecting a write pattern for a plurality of said data portions of a plurality of said sparse files; generating, using at least one processing device, a patterned index entry for each of said sparse files only for said patterned data portions of said plurality of said sparse files, each of said patterned index entries comprising a logical offset, physical offset and length of each of said data portions; and storing, using at least one processing device, said plurality of data portions of said sparse file in a single file in a storage device of a file system using a parallel log-structured file system without storing said hole associated with each of said data portions, wherein said patterned index entries for said plurality of said sparse files are stored as a file in a directory, wherein each patterned index entry in said file comprises an identifier of a corresponding sparse file.

2. The method of claim 1, wherein said hole is restored to said sparse file upon a reading of said sparse file.

3. The method of claim 1, wherein said storing step further comprises the step of storing said data portion at a logical end of said sparse file.

4. The method of claim 1, wherein said sparse file is generated by a process running on a compute node in a parallel computing system.

5. The method of claim 1, wherein said sparse file is provided to a middleware virtual file system for storage.

6. The method of claim 1, wherein said sparse file is stored on a parallel file system comprised of one or more disks.

7. A computer program product comprising a tangible machine-readable recordable storage medium having encoded therein executable code of one or more software programs, wherein the one or more software programs when executed by the processor of the processing device implement the steps of the method of claim 1.

8. An apparatus for storing a sparse file, comprising: a memory; and at least one processing device operatively coupled to the memory and configured to: obtain, using said at least one processing device, at least a portion of said sparse file, wherein said sparse file portion comprises a plurality of data portions and a corresponding plurality of holes, wherein each of said plurality of data portions has been written with data and wherein remainder portions of said sparse file portion associated with each of said holes have not been written with data; detecting a write pattern for a plurality of said data portions of a plurality of said sparse files; generate, using said at least one processing device, a patterned index entry for each of said sparse files only for said patterned data portions of said plurality of said sparse files, each of said patterned index entries comprising a logical offset, physical offset and length of each of said data portions; and store, using said at least one processing device, said plurality of data portions of said sparse file in a single file in a storage device of a file system using a parallel log-structured file system without storing said hole associated with each of said data portions, wherein said patterned index entries for said plurality of said sparse files are stored as a file in a directory, wherein each patterned index entry in said file comprises an identifier of a corresponding sparse file.

9. The apparatus of claim 8, wherein said hole is restored to said sparse file upon a reading of said sparse file.

10. The apparatus of claim 8, wherein said at least one hardware device is further configured to store said data portion at a logical end of said sparse file.

11. The apparatus of claim 8, wherein said sparse file is generated by a process running on a compute node in a parallel computing system.

12. The apparatus of claim 8, wherein said sparse file is provided to a middleware virtual file system for storage.

13. The apparatus of claim 8, wherein said sparse file is stored on a parallel file system comprised of one or more disks.

14. A data storage system for storing a sparse file, comprising: a hardware processing unit for obtaining at least a portion of said sparse file, wherein said sparse file portion comprises a plurality of data portions and a corresponding plurality of holes, wherein each of said plurality of data portions has been written with data and wherein remainder portions of said sparse file portion associated with each of said holes have not been written with data; detecting a write pattern for a plurality of said data portions of a plurality of said sparse files; generating, using at least one processing device, a patterned index entry for each of said sparse files only for said patterned data portions of said plurality of said sparse files, each of said patterned index entries comprising a logical offset, physical offset and length of each of said data portions; and storing, using said at least one processing device, said plurality of data portions of said sparse file in a single file of a file system using a parallel log-structured file system without storing said hole associated with each of said data portions, wherein said patterned index entries for said plurality of said sparse files are stored as a file in a directory, wherein each patterned index entry in said file comprises an identifier of a corresponding sparse file; and a storage device for storing said sparse files and said patterned index entries.

15. The data storage system of claim 14, wherein said hole is restored to said sparse file upon a reading of said sparse file.

16. The data storage system of claim 14, wherein said storing step further comprises the step of storing said data portions at a logical end of said sparse files.

17. The data storage system of claim 14, wherein said sparse files are generated by a process running on a compute node in a parallel computing system.

18. The data storage system of claim 14, wherein said sparse files are provided to a middleware virtual file system for storage.

19. The data storage system of claim 14, wherein said sparse files are stored on a parallel file system comprised of one or more disks.
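
The patterned index entry recited in the independent claims can be sketched as follows. The claims do not specify how a write pattern is detected, so this example assumes one simple case: equal-length data portions written at a constant logical stride and packed back to back in the log. The names `PatternedEntry`, `compress`, `expand`, and the `stride` and `count` fields are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Entry = Tuple[int, int, int]  # (logical_offset, physical_offset, length)


@dataclass(frozen=True)
class PatternedEntry:
    file_id: str          # identifier of the corresponding sparse file
    logical_offset: int   # logical offset of the first data portion
    physical_offset: int  # physical offset of the first data portion
    length: int           # length shared by every data portion
    stride: int           # assumed: logical distance between portions
    count: int            # assumed: number of data portions covered


def compress(file_id: str, entries: List[Entry]) -> Optional[PatternedEntry]:
    """Return one patterned entry if the writes are regularly strided, else None."""
    if len(entries) < 2:
        return None
    (l0, p0, n0), (l1, _, _) = entries[0], entries[1]
    stride = l1 - l0
    for i, (lo, po, n) in enumerate(entries):
        # Every portion must match the predicted offsets and length.
        if n != n0 or lo != l0 + i * stride or po != p0 + i * n0:
            return None
    return PatternedEntry(file_id, l0, p0, n0, stride, len(entries))


def expand(p: PatternedEntry) -> List[Entry]:
    """Recover the individual index entries from a patterned entry."""
    return [
        (p.logical_offset + i * p.stride, p.physical_offset + i * p.length, p.length)
        for i in range(p.count)
    ]


# Four 512-byte writes, one every 4096 bytes, for each of two sparse files.
writes = [(i * 4096, i * 512, 512) for i in range(4)]

# One shared index holds the patterned entries of both files, keyed by file_id.
shared_index = [compress("fileA", writes), compress("fileB", writes)]
```

Here four per-portion entries collapse into one entry per file, and the file identifier is what lets entries for several sparse files share a single index.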

Assignees

EMC IP Holding Co LLC, Los Alamos Nat Security LLC

Inventors

Classifications

  • Physics · mapped topic

  • G06F16/13 · Primary

    File access structures, e.g. distributed indices (arrangements of input from, or output to, record carriers G06F3/06) · CPC title

  • of structured data, e.g. relational data · CPC title

Frequently asked questions

What does patent US9811545B1 cover?
A sparse file is stored without holes by storing a data portion of the sparse file using a parallel log-structured file system; and generating an index entry for the data portion, the index entry comprising a logical offset, physical offset and length of the data portion. The holes can be restored to the sparse file upon a reading of the sparse file. The data portion can be stored at a logical …
Who is the assignee on this patent?
EMC IP Holding Co LLC, Los Alamos Nat Security LLC
What technology area does this patent fall under?
Primary CPC classification G06F17/30321. Mapped technology areas include Physics.
When was this patent published?
Publication date Nov 7, 2017 (B1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).