Technologies for dividing work across accelerator devices
US-2024143410-A1 · May 2, 2024 · US
US12014218B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12014218-B2 |
| Application number | US-201916530898-A |
| Country | US |
| Kind code | B2 |
| Filing date | Aug 2, 2019 |
| Priority date | Mar 29, 2010 |
| Publication date | Jun 18, 2024 |
| Grant date | Jun 18, 2024 |
Official abstract text for this publication.
Customers of a shared-resource environment can provision resources in a fine-grained manner that meets specific performance requirements. A customer can provision a data volume with a committed rate of Input/Output Operations Per Second (IOPS) and pay only for that commitment (plus any overage), and the amount of storage requested. The customer will then at any time be able to complete at least the committed rate of IOPS. If the customer generates submissions at a rate that exceeds the committed rate, the resource can still process at the higher rate when the system is not under pressure. Even under pressure, the system will deliver at least the committed rate. Multiple customers can be provisioned on the same resource, and more than one customer can have a committed rate on that resource. Customers without committed or guaranteed rates can utilize the uncommitted portion, or committed portions that are not being used.
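The behavior described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation. The function name `allocate_iops`, the dictionary layout, and the two-pass policy are all assumptions made for illustration, and the sketch further assumes that the commitments actually being exercised in a given interval fit within the resource's capacity.

```python
def allocate_iops(capacity, committed, demand):
    """Grant IOPS for one scheduling interval (hypothetical helper).

    capacity  -- total IOPS the resource can serve in the interval
    committed -- customer -> committed IOPS (missing means no commitment)
    demand    -- customer -> IOPS the customer is trying to submit
    """
    # Pass 1: honor each commitment, but only up to what is actually demanded.
    granted = {c: min(demand[c], committed.get(c, 0)) for c in demand}
    spare = capacity - sum(granted.values())

    # Pass 2: hand uncommitted or unused committed capacity to remaining
    # demand. The ordering here is arbitrary; a real system might share
    # the spare capacity proportionally or by priority.
    for c in sorted(demand):
        if spare <= 0:
            break
        extra = min(demand[c] - granted[c], spare)
        granted[c] += extra
        spare -= extra
    return granted


result = allocate_iops(
    capacity=1000,
    committed={"A": 500, "B": 300},
    demand={"A": 800, "B": 100, "C": 400},  # "C" has no committed rate
)
print(result)
```

In this run, customer A exceeds its commitment because B is not using all of its committed rate, and uncommitted customer C receives whatever remains.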
Opening claim text (preview).
What is claimed is:
1. A system, comprising: a plurality of storage servers, respectively comprising a first one or more processors and a first memory, that provide committed rates of input/output (IO) operations for a plurality of data volumes on behalf of different users of a network-available data storage service, wherein individual ones of the IO operations are submitted to respective ones of the plurality of storage servers by respective ones of the different users, and wherein the plurality of data volumes comprise a first data volume provisioned for a user of the different users and a second data volume provisioned for another user of the different users; a plurality of management servers, respectively comprising a second one or more processors and a second memory, that implement a control plane for the network-available data storage service, the control plane configured to: receive, via an interface for the control plane, a request to commit a rate of IO operations for the first data volume to the user; evaluate, responsive to receipt of the request, committed rates of IO operations of the plurality of storage servers to: determine respective rates of unused IO operations committed to the other user for the second data volume based on respective predicted rates of IO operations, submitted by the other user for the second data volume at one or more of the plurality of storage servers, that are less than the respective committed rates of IO operations for the second data volume to predict available IO capacities of the respective one or more of the plurality of storage servers; and identify a storage server of the plurality of storage servers that: provides a committed rate of IO operations for the second data volume provisioned for the other user; and has predicted available IO capacity sufficient to additionally provide at least a portion of the requested rate of IO operations for the first data volume; and commit, according to the respective predicted available IO capacity, the identified storage server to provide, to the user, the portion of the requested rate of IO operations for the first data volume in addition to the committed rate of IO operations for the second data volume, wherein the committed portion of the requested rate of IO operations for the first data volume and the committed rate of IO operations for the second data volume are cumulatively above a total amount of available IO operations capacity of the identified storage server, and wherein the committed portion of the requested rate of IO operations for the first data volume is based on the respective determined rate of unused IO operations committed to the other user for the second data volume.
2. The system of claim 1, wherein the predicted usage of IO operations is determined based on an average usage of IO operations for the second data volume over a period of time.
3. The system of claim 1, wherein the control plane is further configured to: evaluate the committed rates of IO operations of the storage servers to identify another storage server of the storage servers that has sufficient capacity in uncommitted IO operations to provide another portion of the committed rate of IO operations for the first data volume; and commit the other storage server to provide the other portion of the committed rate of IO operations for the first data volume.
4. The system of claim 1, wherein the request that specifies the committed rate of IO operations for the first data volume is a request to create the first data volume.
5. The system of claim 1, wherein the control plane is configured to send a response to the request indicating that commitment of the requested committed rate of IO operations for the first data volume is complete at the network-available data storage service.
6. The system of claim 1, wherein to evaluate the committed rates of IO operations of the plurality of storage servers, the control plane is configured to request at least one of commitment information and capacity information from individual ones of the plurality of storage servers.
7. The system of claim 1, wherein the identified storage server is configured to perform a request for the first data volume in excess of the committed rate of IO operations for the first data volume at a rate for requests without rate commitments or at a blended rate between rates for requests with and without rate commitments.
8. A method, comprising: receiving, via an interface for a control plane of a network-available data storage service, a request to commit a rate of input/output (IO) operations for a first data volume on behalf of a user of the network-available data storage service, wherein individual ones of the IO operations are submitted for the first data volume by the user of the network-available data storage service; evaluating, by the control plane responsive to receipt of the request, one or more of a plurality of storage servers that provide committed rates of IO operations to a plurality of data volumes, including at least the first data volume provisioned for a user of the different users and a second data volume provisioned for another user of the different users including the user, the rates committed on behalf of the different users, to determine respective rates of unused IO operations committed to the other user for the second data volume based on respective predicted rates of IO operations, submitted by the other user for the second data volume at one or more of the plurality of storage servers, that are less than the respective committed rates of IO operations for the second data volume; predict, according to the respective rates of unused IO operations, available IO capacities of the respective one or more of the plurality of storage servers to identify a storage server of the plurality of storage servers that: provides a committed rate of IO operations for the second data volume provisioned for the other user; and has predicted available IO capacity sufficient to additionally provide at least a portion of the requested rate of IO operations for the first data volume; and assigning, by the control plane according to the respective predicted available IO capacity, the identified storage server to provide, to the user, the portion of the requested rate of IO operations as a committed rate of IO operations for the first data volume in addition to the committed rate of IO operations for the second data volume, wherein the committed portion of the requested rate of IO operations for the first data volume and the committed rate of IO operations for the second data volume are cumulatively above a total amount of available IO operations capacity of the identified storage server, and wherein the committed portion of the requested rate of IO operations for the first data volume is based on the respective determined rate of unused IO operations committed to the other user for the second data volume.
9. The method of claim 8, wherein the predicted usage of IO operations is determined based on an average usage of IO operations for the second data volume over a period of time.
10. The method of claim 8, further comprising: evaluating, by the control plane, the committed rates of IO operations of the storage servers to identify another storage server of the storage servers that has sufficient capacity in uncommitted IO operations to provide another portion of the committed rate of IO operations for the first data volume; and assigning, by the control plane, the other storage server to provide the other portion of the committed rate of IO operations for the first data volume.
11. The method of claim 8, wherein the request that specifies the com
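The placement logic recited in claims 1, 2, 8, and 9 can be sketched roughly as follows. This is an illustrative reading of the claim language, not the patentee's implementation. The names `StorageServer`, `predicted_usage`, `predicted_available`, and `place_commitment` are hypothetical, and the choice of a simple mean over observed usage and a first-fit server search are assumptions.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class StorageServer:
    name: str
    capacity_iops: int
    committed: dict = field(default_factory=dict)      # volume -> committed IOPS
    usage_history: dict = field(default_factory=dict)  # volume -> observed IOPS samples

    def predicted_usage(self, volume):
        """Predict a volume's usage as its average observed rate (claims 2 and 9),
        capped at its committed rate. With no history, assume full use."""
        history = self.usage_history.get(volume)
        if not history:
            return self.committed[volume]
        return min(mean(history), self.committed[volume])

    def predicted_available(self):
        """Capacity expected to be free once predicted usage is subtracted."""
        used = sum(self.predicted_usage(v) for v in self.committed)
        return self.capacity_iops - used


def place_commitment(servers, volume, requested_iops):
    """Commit at least a portion of the requested rate on the first server
    (first-fit, an assumption) whose predicted available capacity allows it."""
    for server in servers:
        portion = min(requested_iops, int(server.predicted_available()))
        if portion > 0:
            server.committed[volume] = portion
            return server.name, portion
    return None


server = StorageServer(
    name="s1",
    capacity_iops=1000,
    committed={"vol-b": 800},               # second data volume, another user
    usage_history={"vol-b": [300, 500]},    # averages to 400 IOPS
)
placement = place_commitment([server], "vol-a", requested_iops=500)
print(placement, sum(server.committed.values()))
```

Here the server ends up with 1300 committed IOPS against a capacity of 1000, which mirrors the claimed condition that the cumulative commitments exceed the server's total capacity because the second volume's commitment is predicted to be partly unused.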
Performance criteria · CPC title
considering hardware capabilities · CPC title