Distributed data deduplication reference counting
US-2020142974-A1 · May 7, 2020 · US
US11403261B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11403261-B2 |
| Application number | US-201816213561-A |
| Country | US |
| Kind code | B2 |
| Filing date | Dec 7, 2018 |
| Priority date | Dec 7, 2018 |
| Publication date | Aug 2, 2022 |
| Grant date | Aug 2, 2022 |
Official abstract text for this publication.
The disclosure provides for isolation of concurrent read and write transactions on the same file, thereby enabling higher file system throughput relative to serial-only transactions. Race conditions and lock contentions in multi-writer scenarios are avoided in file stat (metadata) updates by the use of an aggregator to merge updates of committed transactions to maintain file stat truth, and an upgrade lock that enforces atomicity of file stat access, even while still permitting multiple processes to concurrently read from and/or write to the file data. The disclosure is applicable to generic file systems, whether native or virtualized, and may be used, for example, to speed access to database files that require prolonged input/output (I/O) transaction time periods.
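The flow in the abstract can be sketched in a few lines: writers work concurrently from a snapshot of the file stat, each records its own update privately, and an aggregator later merges the committed updates while holding an exclusive lock. This is a minimal sketch under stated assumptions, not the patented implementation. The names `FileStat`, `StatUpdate`, `StatAggregator`, `writer`, and `run_demo` are hypothetical. The merge rule (maximum size and timestamp, summed block deltas) follows the one recited in the claims. Python's standard library has no shared/upgrade lock, so the shared lock is not modeled, and a plain `threading.Lock` stands in for the upgrade lock.

```python
import threading
from dataclasses import dataclass


@dataclass(frozen=True)
class FileStat:
    size: int
    mtime: float
    blocks: int


@dataclass(frozen=True)
class StatUpdate:
    size: int         # file size seen by the transaction when it finished
    mtime: float      # modification time set by the transaction
    block_delta: int  # blocks added by the transaction


class StatAggregator:
    """Hypothetical aggregator; holds the shared stat and pending updates."""

    def __init__(self, initial: FileStat) -> None:
        self._stat = initial                   # shared copy ("first storage area")
        self._pending: list[StatUpdate] = []   # committed, not yet merged
        self._pending_lock = threading.Lock()
        self._upgrade_lock = threading.Lock()  # assumption: stand-in for the upgrade lock

    def snapshot(self) -> FileStat:
        # FileStat is immutable, so handing out the reference is safe.
        return self._stat

    def commit(self, update: StatUpdate) -> None:
        with self._pending_lock:
            self._pending.append(update)

    def merge(self) -> FileStat:
        with self._upgrade_lock:
            with self._pending_lock:
                updates, self._pending = self._pending, []
            merged = self._stat
            for u in updates:
                merged = FileStat(
                    size=max(merged.size, u.size),
                    mtime=max(merged.mtime, u.mtime),
                    blocks=merged.blocks + u.block_delta,
                )
            self._stat = merged  # publish with a single reference swap
            return merged


def writer(agg: StatAggregator, new_size: int, mtime: float, block_delta: int) -> None:
    base = agg.snapshot()  # private working copy of the stat
    # The write to the file data itself would happen here.
    agg.commit(StatUpdate(max(base.size, new_size), mtime, block_delta))


def run_demo() -> FileStat:
    agg = StatAggregator(FileStat(size=100, mtime=1.0, blocks=1))
    jobs = [(4096, 5.0, 7), (2048, 9.0, 3), (8192, 7.0, 15)]
    threads = [threading.Thread(target=writer, args=(agg, *job)) for job in jobs]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return agg.merge()
```

Because every merge operation here is commutative (maximum and addition), the merged result does not depend on the order in which the writers commit, which is what lets the writers proceed without serializing on the stat.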
Opening claim text (preview).
What is claimed is:
1. A method of isolating concurrent read and write transactions on a file by a plurality of processes, the method comprising: providing, by a processor, a shared lock of file stat data for the file to the plurality of processes, the file stat data representing metadata for the file, the file stat data and the file being stored in a first storage area, wherein each of the plurality of processes is configured to concurrently copy at least a portion of the file stat data from the first storage area into a respective second storage area associated with the process, complete a transaction associated with the file, and store an update of the file stat data for the completed transaction to the respective second storage area associated with the process; after each of the plurality of processes has stored the update of the file stat data for the completed transaction to the respective second storage area associated with the process: providing, by the processor, an upgrade lock of the file stat data; obtaining each stored update of the file stat data for each of the plurality of processes from the respective second storage area; merging each obtained stored update of the file stat data from the respective second storage area with the file stat data from the first storage area; and atomically storing the merged file stat data in the file stat data in the first storage area.
2. The method of claim 1, wherein each respective second storage area is a private storage.
3. The method of claim 1, wherein merging each obtained stored update with the file stat data from the first storage area comprises: selecting a maximum timestamp value for at least one timestamp selected from a list consisting of: access time (atime), change time (ctime), and modification time (mtime).
4. The method of claim 1, wherein copying at least a portion of the file stat data comprises copying at least a portion of a latest version of the file stat data from the first storage area.
5. The method of claim 1, wherein each second storage area corresponds to a respective one of the plurality of processes.
6. The method of claim 1, the method further comprising: obtaining, by a reading process, a first stat pointer to access the file and setting a value of a first reference counter at one corresponding to the first stat pointer; while the merging is being executed, maintaining the value of the first reference counter at one and creating a second stat pointer and a second reference counter with count value set at one; and upon the reading process releasing the first stat pointer, decrementing the first reference counter to zero, deleting the first stat pointer and maintaining the second reference counter at one, wherein the second reference counter points to the merged file stat data.
7. The method of claim 1, wherein atomically storing the merged file stat data in the file stat data in the first storage area comprises: creating a new pointer for the file stat data.
8. The method of claim 1, further comprising: while a first writing process is writing to the file in a first transaction, obtaining, by a second writing process, the shared lock of the file stat data; obtaining, by the second writing process, a shared pointer of at least a portion of the file stat data; writing, by the second writing process, to the file in a second transaction; storing an update of the file stat data for the second transaction in a second storage area associated with the second writing process; releasing, by the second writing process, the shared pointer of the at least a portion of the file stat data; committing the second transaction; and releasing, by the second writing process, the shared lock of the file stat data.
9. The method of claim 8, further comprising: while the second writing process is writing to the file, reading from the file by a reading process.
10. A computer system for isolating concurrent read and write transactions on a file by a plurality of processes, the computer system comprising: a processor; a computer-readable medium storing instructions that are operative when executed by the processor to: provide a shared lock of file stat data for the file to the plurality of processes, the file stat data representing metadata for the file, the file stat data and the file being stored in a first storage area, wherein each of the plurality of processes is configured to concurrently copy at least a portion of the file stat data from the first storage area into a respective second storage area associated with the process, complete a transaction associated with the file, and store an update of the file stat data for the completed transaction to the respective second storage area associated with the process; after each of the plurality of processes has stored the update of the file stat data for the completed transaction to the respective second storage area associated with the process: provide an upgrade lock of the file stat data; obtain each stored update of the file stat data for each of the plurality of processes from the respective second storage area; merge each obtained stored update of the file stat data from the respective second storage area with the file stat data from the first storage area; and atomically store the merged file stat data in the file stat data in the first storage area.
11. The computer system of claim 10, wherein copying at least a portion of the file stat data into the respective second storage area comprises: copying at least a portion of the file stat data directly from the first storage area into a second storage area.
12. The computer system of claim 10, wherein merging each obtained stored update with the file stat data from the first storage area comprises: selecting a maximum timestamp value for at least one timestamp selected from a list consisting of: access time (atime), change time (ctime), and modification time (mtime); selecting a maximum file size value as a final file size value for the file; and adding a delta of a number of blocks to an initial number of blocks to determine a final number of blocks.
13. The computer system of claim 10, wherein the instructions are further operative to: obtain a first stat pointer to access the file and set a value of a first reference counter at one corresponding to the first stat pointer; while the merging is being executed, maintain the value of the first reference counter at one and create a second stat pointer and a second reference counter with count value set at one; and upon a reading process releasing the first stat pointer, decrement the first reference counter to zero, delete the first stat pointer and maintain the second reference counter at one, wherein the second reference counter points to the merged file stat data.
14. The computer system of claim 10, wherein atomically storing the merged file stat data in the file stat data in the first storage area comprises: creating a new pointer for the file stat data.
15. The computer system of claim 10, wherein the instructions are further operative to: while a first writing process is writing to the file in a first transaction, obtain, by a second writing process, the shared lock of the file stat data; obtain, by the second writing process, a shared pointer of at least a portion of the file stat data; write, by the second writing process, to the file in a second transaction; store an update of the file stat data for the second transaction in a second storage area associated with the second writing process; release, by the second
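The reference-counter sequence recited in claims 6 and 13 can be walked through step by step. The sketch below is one possible reading of that sequence, not the patented implementation. The class `StatPointer`, the function `claim6_walkthrough`, and the sample stat values are all hypothetical, and the counters are plain integers without the locking a real multi-process system would need.

```python
class StatPointer:
    """Hypothetical stat pointer: one stat version plus its reference counter."""

    def __init__(self, stat: dict) -> None:
        self.stat = stat
        self.refcount = 1     # counter starts at one, as in the claim
        self.deleted = False

    def release(self) -> None:
        if self.deleted:
            raise RuntimeError("stat pointer already deleted")
        self.refcount -= 1
        if self.refcount == 0:
            # Last holder is gone: drop the old version.
            self.stat = None
            self.deleted = True


def claim6_walkthrough():
    # Step 1: a reading process obtains the first stat pointer (counter = 1).
    first = StatPointer({"size": 100, "mtime": 1.0})

    # Step 2: while the merge runs, the first counter stays at one and a
    # second pointer is created for the merged stat, also with counter = 1.
    merged = {"size": 8192, "mtime": 9.0}  # assumed result of the merge
    second = StatPointer(merged)
    during_merge = (first.refcount, second.refcount)

    # Step 3: the reader releases the first pointer. Its counter drops to
    # zero and it is deleted; the second pointer remains at one.
    first.release()
    return first, second, during_merge
```

The point of the sequence is that a reader that started before the merge keeps a stable view of the old stat until it finishes, while new accesses go through the second pointer to the merged stat.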
Transactional file systems · CPC title
Locking methods, e.g. locking methods for file systems allowing shared and concurrent access to files · CPC title