US9424271B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9424271-B2 |
| Application number | US-201213600181-A |
| Country | US |
| Kind code | B2 |
| Filing date | Aug 30, 2012 |
| Priority date | Aug 30, 2012 |
| Publication date | Aug 23, 2016 |
| Grant date | Aug 23, 2016 |
Official abstract text for this publication.
Augmenting data files in a repository of an append-only file system comprises maintaining metadata corresponding to each data file for tracking a logical end-of-file (EOF) for each data file for appending. A global versioning mechanism for the metadata allows selecting the current version of the metadata to read for performing an append job for a set of data files. Each append job comprises multiple append tasks. For each successful append job, the global versioning mechanism increments a valid metadata version to use for each data file appended. Said valid metadata version indicates the logical EOF corresponding to a new physical EOF for each of the data files appended.
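The mechanism in the abstract can be illustrated with a small in-memory model. This is a minimal sketch under stated assumptions, not the patented implementation: the `Repository` class, its method names, the `(name, payload, fail_after)` task tuple, and the rule that a reader falls back to the newest metadata version at or below the global version are all hypothetical choices made for illustration. It models only the all-or-nothing commit of a job and does not model skipping of regions left behind by failed tasks, which the claims treat separately.

```python
class Repository:
    """Hypothetical in-memory stand-in for an append-only repository."""

    def __init__(self):
        self.data = {}           # file name -> bytearray of physical bytes
        self.meta = {}           # (file name, version) -> logical EOF
        self.global_version = 0  # last committed metadata version

    def create(self, name):
        self.data[name] = bytearray()
        self.meta[(name, self.global_version)] = 0

    def physical_eof(self, name):
        return len(self.data[name])

    def logical_eof(self, name):
        # Assumption: use the newest metadata version not above the global one.
        for version in range(self.global_version, -1, -1):
            if (name, version) in self.meta:
                return self.meta[(name, version)]
        raise KeyError(name)

    def read(self, name):
        # Modified read: stop at the logical EOF, not the physical EOF.
        return bytes(self.data[name][: self.logical_eof(name)])

    def append_job(self, tasks):
        """Run tasks given as (name, payload, fail_after) tuples.

        fail_after=None means the task succeeds; an int means the task
        writes that many bytes and then fails.
        """
        job_ok = True
        appended = set()
        for name, payload, fail_after in tasks:
            if fail_after is None:
                self.data[name].extend(payload)
                appended.add(name)
            else:
                self.data[name].extend(payload[:fail_after])  # partial bytes
                job_ok = False
        if not job_ok:
            return False  # no metadata written, global version unchanged
        new_version = self.global_version + 1
        for name in appended:
            self.meta[(name, new_version)] = self.physical_eof(name)
        self.global_version = new_version  # commit point for the whole job
        return True


repo = Repository()
repo.create("a.dat")
repo.create("b.dat")
repo.append_job([("a.dat", b"AAAA", None), ("b.dat", b"BB", None)])
repo.append_job([("a.dat", b"CCCC", None), ("b.dat", b"DDDD", 1)])
print(repo.read("a.dat"), repo.physical_eof("a.dat"))  # b'AAAA' 8
print(repo.read("b.dat"), repo.physical_eof("b.dat"))  # b'BB' 3
```

In the second job the task on `a.dat` succeeds, but because the task on `b.dat` fails, neither file's logical EOF advances: readers still see only the bytes committed by the first job, even though both physical files have grown.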
Opening claim text (preview).
What is claimed is:
1. A method of augmenting data files in a repository of an append-only file system, comprising: maintaining a companion metadata file for each corresponding data file in a map-reduce system using the append-only file system, wherein each companion metadata file tracks a logical end-of-file (EOF) for each data file; maintaining global versioning of each companion metadata file for selecting a current version of EOF metadata to read for a corresponding data file; performing an append job for a set of data files using a modified read protocol for each reading task of the repository using a current global version number for the companion metadata file, wherein the append job comprises a map-reduce job including multiple append tasks; and for each successful append job, incrementing a logical EOF for each appended file to a new physical EOF, wherein the global versioning is used to increment a valid companion metadata file version for each data file appended, and said valid companion metadata file version indicates the logical EOF corresponding to the new physical EOF for each of the data files appended; and for each failed append task of the append job, maintaining a logical EOF for each failed append task by not incrementing the logical EOF for each failed append task, wherein subsequent append tasks that read a data file for retrying failed append tasks use metadata to stop reading upon reaching the logical EOF for the failed append task even when a current physical EOF is not reached.
2. The method of claim 1, further comprising: for a failed data file append task, maintaining a current companion metadata file version for the data file, wherein partially appended bytes are ignored.
3. The method of claim 1, further comprising: for a failed append task, in a next successful append task updating the companion metadata file to skip a region corresponding to a failed append task.
4. The method of claim 3, further comprising: for a failed append task, in subsequent tasks, referring to said region as an invalid region.
5. The method of claim 4, further comprising: after a failed append task, in a subsequent append task, incrementing the logical EOF to a new physical EOF.
6. The method of claim 5, further comprising: for subsequent successful append tasks, updating the companion metadata file for skipping the invalid regions corresponding to a failed append task.
7. The method of claim 6, further comprising: updating a global version of a companion metadata file when the append job comprising multiple append tasks succeeds, wherein a modified write protocol is used for writing to the repository, the modified write protocol augments data files with metadata files, and the global version number for each current metadata file is stored in the repository in a separate file.
8. The method of claim 7, further comprising: not updating the global version of the companion metadata file if the append job fails even if one or more of the constituent tasks of the job succeeded.
9. The method of claim 1, wherein the file system comprises an HDFS file system.
10. A method of data storage, comprising: augmenting data files in a repository of an append-only file system as a map-reduce job in a map-reduce system by atomic incremental load, including: maintaining a separate end-of-file (EOF) metadata file for each corresponding data file, wherein each EOF metadata file tracks a logical EOF for each data file; maintaining global versioning of the EOF metadata files for selecting the current version of an EOF metadata file to read, wherein different versions of EOF metadata files replace a previous versioned EOF metadata file; performing an append task of an append job for a data file using a modified read protocol for each reading task of the repository using a current global version number for the EOF metadata files, wherein the append job comprises a map-reduce job including multiple append tasks; and for each successful append job, incrementing a logical EOF for each appended file to a new physical EOF, wherein the global versioning is used to increment a valid companion metadata file version for each data file appended, and the valid companion metadata file version indicates the logical EOF corresponding to the new physical EOF for each data file appended; and for each failed append task of the append job, maintaining a logical EOF for each failed append task by not incrementing the logical EOF for each failed append task, wherein subsequent append tasks that read a data file for retrying failed append tasks use metadata to stop reading upon reaching the logical EOF for the failed append task even when a current physical EOF is not reached.
11. A computer program product for augmenting data files in a repository of an append-only file system, the computer program product comprising a non-transitory computer readable storage medium having program instructions embodied therewith, wherein the program instructions executable by a computer to cause the computer to perform a method comprising: maintaining a companion metadata file for each corresponding data file in a map-reduce system using the append-only file system, wherein each companion metadata file tracks a logical end-of-file (EOF) for each data file; maintaining global versioning of each companion metadata file for selecting a current version of a companion metadata file to read EOF metadata, wherein different versioned companion metadata files replace previous versioned companion metadata files; performing an append job for a set of data files using a modified read protocol for each reading task of the repository using a current global version number for the companion metadata file, wherein the append job comprises a map-reduce job including multiple append tasks; and for each successful append job, incrementing a logical EOF for each appended file to a new physical EOF, wherein the global versioning is used to increment a valid companion metadata file version to use for each data file appended, and the valid companion metadata file version indicates the logical EOF corresponding to the new physical EOF for each of the data files appended; and for each failed append task of the append job, maintaining a logical EOF for each failed append task by not incrementing the logical EOF for each failed append task, wherein subsequent append tasks that read a data file for retrying failed append tasks use metadata to stop reading upon reaching the logical EOF for the failed append task even when a current physical EOF is not reached.
12. The computer program product of claim 11, further comprising: for a failed append task, in a next successful append task updating the companion metadata file to skip a region corresponding to a failed append task.
13. The computer program product of claim 12, further comprising: for a failed append task, in subsequent tasks, referring to said region as an invalid region.
14. The computer program product of claim 13, further comprising: after a failed append task, in a subsequent append task, incrementing the logical EOF to a new physical EOF.
15. The computer program product of claim 14, further comprising: for subsequent successful append tasks, updating the companion metadata file for skipping the invalid regions corresponding to a failed append task.
16. The computer program product of claim 15, further comprising: updating a global version for a companion metadata file when the append job comprising multiple append tasks succeeds, wherein a modified write protocol is
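The invalid-region handling recited in the dependent claims (a failed task leaves partial bytes, the next successful task records that region as invalid and advances the logical EOF to the new physical EOF) can be sketched as follows. This is an illustrative model only: `CompanionMeta`, `failed_append`, `successful_append`, and `read_valid` are hypothetical names, the half-open `(start, end)` representation of an invalid region is an assumption, and job-level global versioning is deliberately left out.

```python
from dataclasses import dataclass, field


@dataclass
class CompanionMeta:
    """Hypothetical companion metadata for one data file."""
    logical_eof: int = 0
    # Half-open byte ranges [start, end) left behind by failed appends.
    invalid_regions: list = field(default_factory=list)


def failed_append(data: bytearray, partial: bytes) -> None:
    # A failed task may leave partial bytes; the metadata is not touched.
    data.extend(partial)


def successful_append(data: bytearray, meta: CompanionMeta, payload: bytes) -> None:
    physical_eof = len(data)
    if physical_eof > meta.logical_eof:
        # Bytes between logical and physical EOF came from a failed task.
        meta.invalid_regions.append((meta.logical_eof, physical_eof))
    data.extend(payload)
    meta.logical_eof = len(data)  # logical EOF moves to the new physical EOF


def read_valid(data: bytearray, meta: CompanionMeta) -> bytes:
    # Read up to the logical EOF, skipping every recorded invalid region.
    out = bytearray()
    pos = 0
    for start, end in meta.invalid_regions:
        out += data[pos:start]
        pos = end
    out += data[pos:meta.logical_eof]
    return bytes(out)


data, meta = bytearray(), CompanionMeta()
successful_append(data, meta, b"good1")
failed_append(data, b"xx")
print(read_valid(data, meta))   # b'good1'
successful_append(data, meta, b"good2")
print(meta.invalid_regions)     # [(5, 7)]
print(read_valid(data, meta))   # b'good1good2'
```

Because the file is append-only, the partial bytes `xx` can never be removed; recording their range in the metadata is what lets later readers pass over them while still treating the bytes after them as valid.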
File meta data generation · CPC title
Versioning file systems, temporal file systems, e.g. file system supporting different historic versions of files · CPC title
Management specifically adapted to NAS (management of storage area networks [SAN] G06F3/067) · CPC title
Append-only file systems, e.g. using logs or journals to store data · CPC title
Physics · mapped topic