Data migration in a distributed file system
US-12135695-B2 · Nov 5, 2024 · US
US9875262B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9875262-B2 |
| Application number | US-201514733897-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jun 8, 2015 |
| Priority date | Dec 23, 2010 |
| Publication date | Jan 23, 2018 |
| Grant date | Jan 23, 2018 |
Official abstract text for this publication.
A distributed storage system may store data object instances in persistent storage and may cache keymap information for those data object instances. The system may cache a latest symbolic key entry for some user keys of the data object instances. When a request is made for the latest version of stored data object instances having a specified user key, the latest version may be determined dependent on whether a latest symbolic key entry exists for the specified user key, and keymap information for the latest version may be returned. When storing keymap information, a flag may be set to indicate that a corresponding latest symbolic key entry should be updated. The system may delete a latest symbolic key entry for a particular user key from the cache in response to determining that no other requests involving the keymap information for data object instances having the particular user key are pending.
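The caching behaviour described in the abstract can be illustrated with a minimal, single-process sketch. It is not the patented implementation. The class name `KeymapCoordinator`, the `LATEST` marker, the `begin`/`end` request tracking, the in-memory `store` standing in for the storage nodes, and the assumption that a larger integer version identifier means a newer version are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field

LATEST = "LATEST"  # hypothetical symbolic version marker used as a cache key


@dataclass
class KeymapCoordinator:
    # Stand-in for the storage nodes: user_key -> {version_id: locator}.
    store: dict = field(default_factory=dict)
    # Cache of keymap entries and latest symbolic key entries.
    cache: dict = field(default_factory=dict)
    # Number of in-flight requests per user key.
    pending: dict = field(default_factory=dict)

    def begin(self, user_key):
        """Record that a request involving this user key has started."""
        self.pending[user_key] = self.pending.get(user_key, 0) + 1

    def end(self, user_key):
        """Record that a request finished; evict when none remain."""
        remaining = self.pending.get(user_key, 0) - 1
        if remaining <= 0:
            self.pending.pop(user_key, None)
            # No other request is pending: drop the latest symbolic key entry.
            self.cache.pop((user_key, LATEST), None)
        else:
            self.pending[user_key] = remaining

    def put(self, user_key, version_id, locator, update_latest=True):
        """Store keymap information; the flag asks for a latest-entry refresh."""
        self.store.setdefault(user_key, {})[version_id] = locator
        self.cache[(user_key, version_id)] = locator
        if update_latest:
            # Assumption: the largest version identifier is the newest one.
            newest = max(self.store[user_key])
            self.cache[(user_key, newest)] = self.store[user_key][newest]
            self.cache[(user_key, LATEST)] = newest

    def get_latest(self, user_key):
        """Return (version_id, locator, served_from_cache)."""
        version_id = self.cache.get((user_key, LATEST))
        if version_id is not None:
            return version_id, self.cache[(user_key, version_id)], True
        versions = self.store.get(user_key)
        if not versions:
            raise KeyError(user_key)
        version_id = max(versions)  # assumption: larger id means newer
        locator = versions[version_id]
        self.cache[(user_key, version_id)] = locator
        self.cache[(user_key, LATEST)] = version_id
        return version_id, locator, False


kc = KeymapCoordinator()
kc.put("photo.jpg", 1, "node-a/blob-17", update_latest=False)
kc.put("photo.jpg", 2, "node-b/blob-42", update_latest=False)
kc.begin("photo.jpg")
first = kc.get_latest("photo.jpg")   # miss: asks the store, then caches
second = kc.get_latest("photo.jpg")  # hit: served from the cache
kc.end("photo.jpg")                  # last pending request: entry evicted
```

In this sketch the first lookup falls through to the store and the second is answered from the cached latest symbolic key entry. Once the last pending request ends, the entry is evicted.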
Opening claim text (preview).
What is claimed is:

1. A method, comprising: performing, by a computer system that stores a plurality of data objects in a distributed storage system: receiving a request to retrieve a latest version of a data object, wherein the request comprises a user key for the data object but does not include a version identifier of the latest version of the data object; determining if a latest symbolic key entry is stored in a cache of a keymap coordinator of the computer system for the user key; in response to determining a latest symbolic key entry is not stored in the cache for the user key, sending a request specifying the user key to one or more storage nodes associated with the keymap coordinator; determining, by the one or more storage nodes, the version identifier of the latest version of the data object for the user key; returning, by the one or more storage nodes to the keymap coordinator, the version identifier of the latest version of the data object for the user key and an indication that the returned version identifier is the latest version of the data object for the user key; and returning keymap information for the latest version of the data object.

2. The method of claim 1, further comprising retrieving the latest version of the data object from the distributed storage system in accordance with the returned keymap information.

3. The method of claim 1, further comprising: caching the keymap information for the latest version of the data object at the keymap coordinator.

4. The method of claim 1, further comprising: generating a latest symbolic key entry that comprises the version identifier of the latest version of the data object for the user key and that indicates that the version identifier is the latest version of the data object for the user key; and caching the latest symbolic key entry at the keymap coordinator.

5. The method of claim 4, further comprising: receiving another request to retrieve a latest version of the data object, wherein the request comprises the user key for the data object but does not include a version identifier of the latest version of the data object; and in response to determining the latest symbolic key entry is stored in the cache for the user key, returning keymap information stored in the cache for the latest version of the data object.

6. The method of claim 1, wherein said determining, by the one or more storage nodes, the version identifier of the latest version of the data object for the user key comprises: examining two or more keymap entries stored in the one or more storage nodes that correspond with the user key to determine the version identifier of the latest version of the data object for the user key.

7. The method of claim 6, wherein said examining the two or more keymap entries comprises: comparing sequencer portions of the two or more keymap entries; comparing timestamp portions of version identifiers of the two or more keymap entries; or comparing leading entries of keymap entries sorted for the user key that include the two or more keymap entries.

8. The method of claim 1, further comprising: receiving another request to retrieve the data object, wherein the request comprises the user key for the data object and a specified version identifier of the data object; and returning a version of the data object having the specified user key and the specified version identifier.

9. The method of claim 1, wherein said determining if a latest symbolic key entry is stored in the cache of the keymap coordinator comprises: determining the latest version of the data object is a delete marker object; and wherein said returning keymap information for the latest version of the data object comprises returning an error indication.

10. The method of claim 1, further comprising: in response to receiving the request to retrieve the latest version of the data object, selecting the keymap coordinator from a plurality of keymap coordinators of the distributed storage system, wherein selecting the keymap coordinator comprises: identifying the keymap coordinator from among the plurality of keymap coordinators according to values stored in a hash table that follows a consistent hashing scheme based, at least in part, on a hash value of the user key.

11. A non-transitory, computer-readable storage medium storing program instructions that when executed on one or more computers cause the one or more computers to: determine, in response to receiving a request to retrieve a latest version of a data object stored in a distributed storage system, wherein the request comprises a user key for the data object but does not include a version identifier of a latest version of the data object, if a latest symbolic key entry for the user key is stored in a cache of the distributed storage system; in response to determining a latest symbolic key entry is not stored in the cache for the user key, send a request specifying the user key to one or more storage nodes of a persistent storage of the distributed storage system; receive, from the one or more storage nodes, the version identifier of the latest version of the data object for the user key and an indication that the returned version identifier is the latest version of the data object for the user key; and return keymap information for the latest version of the data object.

12. The non-transitory, computer-readable storage medium of claim 11, wherein the program instructions when executed on the one or more computers further cause the one or more computers to: cache the keymap information for the latest version of the data object in the cache of the distributed storage system.

13. The non-transitory, computer-readable storage medium of claim 12, wherein the program instructions when executed on the one or more computers further cause the one or more computers to: generate a latest symbolic key entry that comprises the version identifier of the latest version of the data object for the user key and that indicates that the version identifier is the latest version of the data object for the user key; and store the latest symbolic key entry in the cache of the distributed storage system.

14. The non-transitory, computer-readable storage medium of claim 11, wherein the program instructions when executed on the one or more computers further cause the one or more computers to: determine, in response to receiving another request to retrieve a latest version of another data object stored in the distributed storage system, wherein the request comprises another user key for the other data object but does not include a version identifier of a latest version of the other data object, if a latest symbolic key entry for the other user key is stored in the cache of the distributed storage system; in response to determining a latest symbolic key entry is stored in the cache for the other user key, return keymap information for the latest version of the other data object.

15. A system comprising: a persistent data store that stores a plurality of data objects, wherein each of the plurality of data objects comprises a user key and a version identifier; a cache; one or more processors; and a memory coupled to the one or more processors and storing program instructions that when executed by the one or more processors cause the one or more processors to: receive a request to retrieve a latest version of a data object, wherein the request comprises a user key for the data object but does not include a version identifier of the latest version of the data object; determine if a latest symbolic key entry for the us
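Claim 10 selects a keymap coordinator through a consistent hashing scheme keyed on a hash of the user key. The following is a rough sketch of one common way to do that, a hash ring with several points per coordinator. The claims do not specify the ring layout, the hash function, or the replica count, so `CoordinatorRing`, the use of SHA-256, the `replicas` parameter, and the coordinator names are assumptions for illustration only.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    # Stable 64-bit hash (Python's built-in hash() is salted per process).
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class CoordinatorRing:
    def __init__(self, coordinators, replicas=64):
        # Each coordinator gets several points on the ring to even out load.
        self._points = sorted(
            (_hash(f"{name}#{i}"), name)
            for name in coordinators
            for i in range(replicas)
        )
        if not self._points:
            raise ValueError("at least one coordinator is required")
        self._hashes = [h for h, _ in self._points]

    def select(self, user_key: str) -> str:
        # First ring point at or after the key's hash, wrapping around.
        idx = bisect.bisect_left(self._hashes, _hash(user_key))
        return self._points[idx % len(self._points)][1]


ring = CoordinatorRing(["kc-1", "kc-2", "kc-3"])
owner = ring.select("photo.jpg")

# Adding a coordinator only moves keys onto the new coordinator.
bigger = CoordinatorRing(["kc-1", "kc-2", "kc-3", "kc-4"])
keys = [f"object-{n}" for n in range(200)]
moved = [k for k in keys if bigger.select(k) != ring.select(k)]
```

The point of the ring in this sketch is that adding `kc-4` leaves most keys with their previous coordinator, so cached latest symbolic key entries for those keys would remain useful.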
Distributed file systems · CPC title
Versioning file systems, temporal file systems, e.g. file system supporting different historic versions of files · CPC title
Replacement control · CPC title
Caching, prefetching or hoarding of files · CPC title
Managing data history or versioning (querying versioned data G06F16/2474; querying temporal data G06F16/2477) · CPC title