Method and system for maintaining consistency for I/O operations on metadata distributed amongst nodes in a ring structure
US-9286344-B1 · Mar 15, 2016 · US
US10372685B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10372685-B2 |
| Application number | US-201414231088-A |
| Country | US |
| Kind code | B2 |
| Filing date | Mar 31, 2014 |
| Priority date | Mar 31, 2014 |
| Publication date | Aug 6, 2019 |
| Grant date | Aug 6, 2019 |
Official abstract text for this publication.
A client request, formatted in accordance with a file system interface, is received at an access subsystem of a distributed multi-tenant storage service. After the request is authenticated at the access subsystem, an atomic metadata operation comprising a group of file system metadata modifications is initiated, including a first metadata modification at a first node of a metadata subsystem of the storage service and a second metadata modification at a second node of the metadata subsystem. A plurality of replicas of at least one data modification corresponding to the request are saved at respective storage nodes of the service.
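The flow in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not say how atomicity across metadata nodes is achieved, so a simple prepare/commit scheme is swapped in, and the names `MetadataNode`, `atomic_metadata_operation`, and `handle_request` are hypothetical, not terms from the patent.

```python
from dataclasses import dataclass, field


@dataclass
class MetadataNode:
    """Hypothetical metadata node holding key/value metadata entries."""
    name: str
    entries: dict = field(default_factory=dict)
    staged: dict = field(default_factory=dict)
    fail_on_prepare: bool = False  # illustration hook: simulate a failing node

    def prepare(self, key, value):
        # Stage the change without making it visible yet.
        if self.fail_on_prepare:
            return False
        self.staged[key] = value
        return True

    def commit(self):
        # Make all staged changes visible, then clear the staging area.
        self.entries.update(self.staged)
        self.staged.clear()

    def abort(self):
        # Discard staged changes; committed entries are untouched.
        self.staged.clear()


def atomic_metadata_operation(modifications):
    """Apply every (node, key, value) modification, or none of them.

    Assumed prepare/commit scheme, not the patent's stated mechanism.
    """
    prepared = []
    for node, key, value in modifications:
        if not node.prepare(key, value):
            for done in prepared:
                done.abort()
            node.abort()
            return False
        prepared.append(node)
    for node in prepared:
        node.commit()
    return True


def handle_request(authenticated, modifications, storage_nodes, extent_id, payload):
    """Authenticate, update metadata atomically, then save one replica per storage node."""
    if not authenticated:
        return "rejected"
    if not atomic_metadata_operation(modifications):
        return "metadata-aborted"
    for store in storage_nodes:  # each store is a plain dict standing in for a storage node
        store[extent_id] = payload
    return "ok"
```

In this sketch a failed prepare on any node discards the staged changes on every node, so a reader of the metadata never observes only one of the two modifications. A production system would also need durable logging and recovery from a crash between prepare and commit, which the sketch omits.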
Opening claim text (preview).
What is claimed is:
1. A system, comprising: a plurality of computing devices that implement, using resources of a provider network that includes a plurality of availability containers with independent failure profiles: a service access subsystem configured to receive client requests formatted according to one or more industry-standard file system interfaces from a plurality of compute instances of a virtual computing service implemented at the provider network; a metadata subsystem configured to implement sequential consistency semantics on at least a subset of file store operations; and a storage subsystem configured to store at least respective data portions of one or more file stores, wherein a particular data portion of a particular file store of the one or more file stores is organized as a replica group comprising a plurality of extent replicas including a first extent replica at a first availability container of the provider network and a second extent replica at a second availability container of the provider network; wherein, in response to a particular client request received at the service access subsystem and directed to a file store object, the plurality of computing devices is configured to: perform an atomic metadata operation comprising a group of file system metadata modifications to metadata that corresponds to data of the file store object, including a first metadata modification at a first node of the metadata subsystem and a second metadata modification at a second node of the metadata subsystem; and apply at least one modification to the data of the file store object at a plurality of extent replicas at the storage subsystem prior to a transmission of a response to the particular client request, wherein the data of the file store object is stored on a plurality of nodes of the storage subsystem and the metadata that corresponds to the data of the file store object is stored on one or more other nodes of the metadata subsystem separate from the data of the particular file store object.
2. The system as recited in claim 1, wherein the plurality of computing devices is configured to: utilize a replicated state machine to generate a response to a particular read request for which respective physical read operations are performed at a plurality of storage devices.
3. The system as recited in claim 1, wherein the service access subsystem, the metadata subsystem and the storage subsystem are each implemented using respective sets of resources of the provider network, wherein the plurality of computing devices is further configured to: detect one or more of: (a) a potential performance bottleneck at a particular subsystem of a set of subsystems comprising the service access subsystem, the metadata subsystem and the storage subsystem or (b) a node health state change requiring additional resources to be deployed at the particular subsystem; and initiate a deployment of additional resources of the provider network to the particular subsystem, without modifying the number of resources used for remaining subsystems of the set.
4. The system as recited in claim 1, wherein the plurality of computing devices are further configured to: utilize a consensus-based protocol to replicate log records of changes to a state of the particular file store; and store a representation of the state of the particular file store as a plurality of erasure-coded replicas.
5. The system as recited in claim 1, wherein the plurality of computing devices are further configured to: store, at a particular node of the storage subsystem, a particular extent replica belonging to a second replica group that includes at least a subset of data content of one or more file stores including the particular file store; and store, at the particular node of the storage subsystem, a particular extent replica of a different replica group that includes at least a subset of metadata of one or more file stores including the particular file store.
6. The system as recited in claim 1, wherein the plurality of computing devices are further configured to: distribute metadata and data of the particular file store among a plurality of physical storage devices including at least one solid-state disk (SSD device) and one rotating disk device.
7. A method, comprising: performing, by one or more computing devices: receiving a particular client request directed to a file store object, formatted in accordance with an industry-standard file system interface, at an access subsystem of a multi-tenant storage service; determining, at the access subsystem, that the client request meets authentication and authorization requirements; initiating, in response to the particular client request, an atomic metadata operation comprising a group of file system metadata modifications to metadata that corresponds to data of the file store object, including a first metadata modification at a first node of a metadata subsystem of the storage service and a second metadata modification at a second node of the metadata subsystem, wherein the data of the file store object is stored on one or more nodes of a storage subsystem of the storage service separate from the metadata that corresponds to the data of the file store object; verifying, in response to the particular client request, that a plurality of replicas of at least one data modification at a storage subsystem of the storage service have been saved; and storing a record of completion of the particular client request, wherein the record is to be used, asynchronously with respect to the particular client request, to generate a billing amount to a customer of the storage service in accordance with a usage-based pricing policy.
8. The method as recited in claim 7, wherein the access subsystem, the metadata subsystem and the storage subsystem are each implemented using respective sets of resources of a provider network, further comprising performing, by one or more computing devices of the plurality of computing devices: initiating, in response to a detection of a triggering condition, a deployment of additional resources of the provider network to a particular subsystem of a set of subsystems comprising the access subsystem, the metadata subsystem and the storage subsystem, without modifying the number of resources used for remaining subsystems of the set.
9. The method as recited in claim 7, further comprising performing, by the plurality of computing devices: utilizing a consensus-based protocol to replicate log records of changes to a state of the particular file store; and storing a representation of the state of the particular file store as a plurality of erasure-coded replicas.
10. The method as recited in claim 7, further comprising performing, by the plurality of computing devices: storing, at a particular node of the storage subsystem, a particular replica belonging to a replica group storing data content of one or more file stores; and storing, at the particular node of the storage subsystem, a particular replica of a different replica group storing metadata associated with one or more file stores.
11. The method as recited in claim 7, further comprising performing, by the plurality of computing devices: allocating, in response to one or more write requests directed to a particular file store object, a first set of blocks of storage for write contents indicated in the write requests, and a second set of blocks of storage for metadata associated with the file store object, wherein sizes of blocks of the first set are selected according to a data block sizing policy, wherein sizes of blocks of the second set are selected according to a metadata block sizing polic
Distributed file systems · CPC title
Techniques for file synchronisation in file systems · CPC title