Who is the assignee on this patent?

Emc Corp, Los Alamos Nat Security Llc

What technology area does this patent fall under?

Primary CPC classification G06F16/11. Mapped technology areas include Physics.

When was this patent published?

Publication date Apr 16, 2019 (B1). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Global distributed file append using log-structured file system

US10262000B1 · US · B1

Patent metadata
Field	Value
Publication number	US-10262000-B1
Application number	US-201313921657-A
Country	US
Kind code	B1
Filing date	Jun 19, 2013
Priority date	Jun 19, 2013
Publication date	Apr 16, 2019
Grant date	Apr 16, 2019

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Techniques are provided for globally appending data from a group of distributed processes to a shared file using a log-structured file system. Data generated by a plurality of processes in a parallel computing system are appended to a shared file by storing the data to the shared file using a log-structured file system (such as a Parallel Log-Structured File System (PLFS)); and generating an index entry for the data, the index entry comprising a logical offset entry and a timestamp entry indicating a time of the storage, wherein the logical offset entry is resolved at read time. The logical offset entry can be populated with an append placeholder that is resolved when the shared file is read. At read time, a plurality of the index entries associated with the shared file can be sorted using the timestamp entry to deliver the requested shared file to a requesting application.
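The read-time resolution described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation or the PLFS index format: the `IndexEntry` fields, the `APPEND` placeholder value, and the tie-break on writer id are all assumptions made for illustration. Each writer records a chunk length and a timestamp while leaving the logical offset as a placeholder; the reader sorts the entries by timestamp and assigns concrete offsets.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical placeholder meaning "logical offset not yet known".
APPEND = None


@dataclass
class IndexEntry:
    writer: int                              # id of the writing process (assumed field)
    length: int                              # chunk length in bytes
    timestamp: float                         # time the chunk was stored
    logical_offset: Optional[int] = APPEND   # filled in only at read time


def resolve_offsets(entries: List[IndexEntry]) -> List[IndexEntry]:
    """Sort entries by timestamp and assign each chunk a concrete offset.

    Ties are broken by writer id; that rule is an assumption of this
    sketch, not something stated in the abstract.
    """
    ordered = sorted(entries, key=lambda e: (e.timestamp, e.writer))
    offset = 0
    resolved = []
    for e in ordered:
        resolved.append(IndexEntry(e.writer, e.length, e.timestamp, offset))
        offset += e.length
    return resolved


# Three processes append without coordinating on an offset.
entries = [
    IndexEntry(writer=1, length=4, timestamp=2.0),
    IndexEntry(writer=0, length=3, timestamp=1.0),
    IndexEntry(writer=2, length=5, timestamp=3.0),
]
resolved = resolve_offsets(entries)
offsets = [(e.writer, e.logical_offset) for e in resolved]
print(offsets)  # [(0, 0), (1, 3), (2, 7)]
```

The original entries keep their placeholder, so writers never need to agree on where the end of the file is; only the reader computes a layout.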

First claim

Opening claim text (preview).

What is claimed is:
1. A method for appending data generated by a plurality of processes in a parallel computing system to a shared file, comprising the steps of: storing, using at least one processing device, said data from said plurality of processes to a non-deterministic logical end of said shared file in a storage medium using a log-structured file system; generating, using at least one processing device, an index entry for said data, said index entry comprising a logical offset entry and a timestamp entry indicating a time of said storage into said shared file in said storage medium; and constructing a view of the shared file at read time by (i) sorting, at said read time, a plurality of said timestamp entries for said shared file indicating said time of said storage of said data from said plurality of processes into said shared file in said storage medium, and (ii) determining, at said read time, a deterministic location for each of a plurality of data chunks in the shared file based on the sorted timestamp entries, wherein the shared file is shared by said plurality of processes.
2. The method of claim 1, further comprising the step of populating said logical offset entry with an append placeholder that is resolved when said shared file is read.
3. The method of claim 1, wherein said sorting further comprises the step of reconstructing multiple write streams from said plurality of processes to a single logical file in a single read stream.
4. The method of claim 1, wherein said sorting defers a mapping of the deterministic location for each of a plurality of data chunks in said shared file until a reading application opens said shared file.
5. The method of claim 1, wherein said log-structured file system comprises a Parallel Log-Structured File System.
6. The method of claim 1, wherein said storing step further comprises the step of storing said data at a logical end of said shared file.
7. The method of claim 1, wherein said storing step creates a write stream for each of said plurality of processes.
8. The method of claim 7, wherein said write streams for said plurality of processes are reassembled into a single read stream at read time.
9. The method of claim 1, wherein said plurality of processes are running on a plurality of compute nodes.
10. The method of claim 1, wherein shared file is provided to a middleware virtual file system for storage.
11. The method of claim 1, wherein said shared file is stored on a parallel file system comprised of one or more disks.
12. A computer program product comprising a tangible machine-readable recordable storage medium having encoded therein executable code of one or more software programs, wherein the one or more software programs when executed by the processor of the processing device implement the steps of the method of claim 1.
13. An apparatus for appending data generated by a plurality of processes in a parallel computing system to a shared file, comprising: a memory; and at least one processing device operatively coupled to the memory and configured to: store, using at least one processing device, said data from said plurality of processes to a non-deterministic logical end of said shared file in a storage medium using a log-structured file system; generate, using at least one processing device, an index entry for said data, said index entry comprising a logical offset entry and a timestamp entry indicating a time of said storage into said shared file in said storage medium; and construct a view of the shared file at read time by (i) sorting, at said read time, a plurality of said timestamp entries for said shared file indicating said time of said storage of said data from said plurality of processes into said shared file in said storage medium, and (ii) determining, at said read time, a deterministic location for each of a plurality of data chunks in the shared file based on the sorted timestamp entries, wherein the shared file is shared by said plurality of processes.
14. The apparatus of claim 13, wherein said at least one hardware device is further configured to populate said logical offset entry with an append placeholder that is resolved when said shared file is read.
15. The apparatus of claim 13, wherein said sorting further comprises reconstructing multiple write streams from said plurality of processes to a single logical file in a single read stream.
16. The apparatus of claim 13, wherein said sorting defers a mapping of the deterministic location for each of a plurality of data chunks in said shared file until a reading application opens said shared file.
17. The apparatus of claim 13, wherein said log-structured file system comprises a Parallel Log-Structured File System.
18. The apparatus of claim 13, wherein said data is stored at a logical end of said shared file.
19. The apparatus of claim 13, wherein a write stream is created for each of said plurality of processes.
20. The apparatus of claim 13, wherein said plurality of processes are running on a plurality of compute nodes.
21. The apparatus of claim 13, wherein shared file is stored on one or more of a middleware virtual file system one or more disks of a parallel file system.
22. A data storage system for appending data generated by a plurality of processes in a parallel computing system to a shared file, comprising: a storage medium for storing said shared file and an index entry; and a hardware processing unit for (i) storing said data from said plurality of processes to a non-deterministic logical end of said shared file using a log-structured file system; and generating, using said hardware processing unit, said index entry for said data, said index entry comprising a logical offset entry and a timestamp entry indicating a time of said storage into said shared file in said storage medium, and (ii) constructing a view of the shared file at read time by (a) sorting, at said read time, a plurality of said timestamp entries for said shared file indicating said time of said storage of said data from said plurality of processes into said shared file in said storage medium, and (b) determining, at said read time, a deterministic location for each of a plurality of data chunks in the shared file based on the sorted timestamp entries, wherein the shared file is shared by said plurality of processes.
23. The data storage system of claim 22, wherein said sorting further comprises reconstructing multiple write streams from said plurality of processes to a single logical file in a single read stream.
24. The data storage system of claim 22, wherein said sorting defers a mapping of the deterministic location for each of a plurality of data chunks in said shared file until a reading application opens said shared file.
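
Claims 7 and 8 describe one write stream per process, reassembled into a single read stream at read time. The Python sketch below shows one way that could look; it is an illustration under stated assumptions, not the claimed implementation. The in-memory `data_logs` and `index_logs` dictionaries, the `append` and `read_shared_file` helpers, and the tuple layout of the index records are all hypothetical. The sketch also assumes each writer's timestamps are non-decreasing, so each index log is already sorted and a k-way merge is sufficient.

```python
import heapq
from typing import Dict, List, Tuple

# Hypothetical in-memory stand-ins for per-process logs. Each writer owns
# a data log (raw bytes) and an index log of
# (timestamp, physical_offset, length) records kept in write order.
data_logs: Dict[int, bytearray] = {}
index_logs: Dict[int, List[Tuple[float, int, int]]] = {}


def append(writer: int, payload: bytes, timestamp: float) -> None:
    """Append to the writer's own log; no shared offset is consulted."""
    log = data_logs.setdefault(writer, bytearray())
    index_logs.setdefault(writer, []).append((timestamp, len(log), len(payload)))
    log.extend(payload)


def read_shared_file() -> bytes:
    """Merge all per-writer index logs by timestamp into one read stream."""
    streams = (
        [(ts, writer, off, n) for ts, off, n in records]
        for writer, records in index_logs.items()
    )
    out = bytearray()
    # heapq.merge expects each input to be sorted already (assumed above).
    for _ts, writer, off, n in heapq.merge(*streams):
        out += data_logs[writer][off:off + n]
    return bytes(out)


# Two processes interleave their appends.
append(0, b"a0 ", 1.0)
append(1, b"b0 ", 2.0)
append(0, b"a1 ", 3.0)
append(1, b"b1 ", 4.0)
content = read_shared_file()
print(content)  # b'a0 b0 a1 b1 '
```

The physical offset in each record points into the writer's own log, while the position in the merged output plays the role of the deterministic logical location that is only determined at read time.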

Assignees

Emc Corp, Los Alamos Nat Security Llc

Inventors

Classifications

G06F16/11 · Primary
File system administration, e.g. details of archiving or snapshots (error detection or correction of the data by redundancy in operations G06F11/14) · CPC title
G06F17/3007 · Primary
Physics · mapped topic

Patent family

Related publications grouped by family.

View patent family 66098521
