US11989186B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11989186-B2 |
| Application number | US-201816199078-A |
| Country | US |
| Kind code | B2 |
| Filing date | Nov 23, 2018 |
| Priority date | Nov 23, 2018 |
| Publication date | May 21, 2024 |
| Grant date | May 21, 2024 |
Official abstract text for this publication.
Methods, systems, and computer-readable media for a scalable architecture for a distributed time-series database are disclosed. Using a fleet of ingestion routers, time-series data generated by a plurality of client devices is stored into a plurality of durable partitions. The time-series data comprises a plurality of time series, and an amount of the ingestion routers is determined based at least in part on an ingestion rate of the time-series data. Using a fleet of stream processors, the time-series data from the durable partitions is stored into a plurality of storage tiers including a first storage tier and a second storage tier. A retention period for the first storage tier differs from a retention period for the second storage tier. An amount of the stream processors is determined based at least in part on the time-series data in the durable partitions.
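The flow in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the function names (`router_count`, `partition_for`, `route_to_tier`), the per-router capacity figure, the hash-based partitioning, and the rule of choosing a tier by data age are all hypothetical. The abstract only states that the number of routers depends on the ingestion rate, that data lands in durable partitions, and that the storage tiers have different retention periods.

```python
import hashlib
import math
from dataclasses import dataclass


def router_count(ingestion_rate: float, per_router_capacity: float, minimum: int = 1) -> int:
    """Size the ingestion-router fleet from the observed ingestion rate.

    Assumption: each router handles a fixed number of points per second.
    """
    if per_router_capacity <= 0:
        raise ValueError("per_router_capacity must be positive")
    return max(minimum, math.ceil(ingestion_rate / per_router_capacity))


def partition_for(series_key: str, num_partitions: int) -> int:
    """Map a time series to a durable partition with a stable hash.

    Assumption: hashing the series key; the abstract does not name a scheme.
    """
    digest = hashlib.sha256(series_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


@dataclass
class Tier:
    name: str
    retention_seconds: int


def route_to_tier(point_ts: int, now: int, tiers: list):
    """Pick the shortest-retention tier that still covers the point's age.

    Assumption: tier choice is driven by age alone. Returns None when the
    point is older than every tier's retention period.
    """
    age = now - point_ts
    for tier in sorted(tiers, key=lambda t: t.retention_seconds):
        if age <= tier.retention_seconds:
            return tier
    return None


if __name__ == "__main__":
    tiers = [Tier("hot", 3600), Tier("cold", 86400)]
    print(router_count(2500, 1000))
    print(partition_for("host-1/cpu", 8))
    print(route_to_tier(point_ts=990, now=1000, tiers=tiers))
```

In this sketch a control plane would call `router_count` periodically and resize the fleet; an analogous function over partition backlog could size the stream-processor fleet.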
Opening claim text (preview).
What is claimed is:
1. A system, comprising: one or more computing devices comprising respective processors and memory configured to implement a control plane; a plurality of computing devices comprising respective processors and memory configured to implement a fleet of ingestion routers, wherein the fleet of ingestion routers is configured to: receive time-series data generated by a plurality of client devices, wherein the time-series data is associated with a plurality of time series, and wherein an amount of the ingestion routers is determined by the control plane based at least in part on an ingestion rate of the time-series data; and partition the time-series data based at least in part on the plurality of time series to generate partitioned time-series data; one or more persistent storage resources comprising a plurality of durable partitions, wherein the one or more persistent storage resources are configured to store individual partitions of the partitioned time-series data sent from the fleet of ingestion routers in respective ones of the plurality of durable partitions; a plurality of computing devices comprising respective processors and memory configured to implement a fleet of stream processors, wherein an amount of the stream processors in the fleet is determined by the control plane based at least in part on the partitioned time-series data in the durable partitions, wherein the fleet of stream processors is configured to: retrieve the time-series data, stored by the fleet of ingestion routers, from the durable partitions maintained at one or more persistent storage resources of the streaming service; send a first one or more elements of the retrieved time-series data to a first storage tier; and send a different second one or more elements of the retrieved time-series data to a second storage tier; and a plurality of storage tiers, including the first storage tier and the second storage tier, respectively different from the one or more persistent storage resources, wherein individual ones of the plurality of storage tiers are different from and communicatively coupled over a network to respective ones of the fleet of stream processors, wherein a retention period for the first storage tier differs from a retention period for the second storage tier, wherein a performance characteristic for the first storage tier differs from a performance characteristic for the second storage tier, and wherein the individual ones of the plurality of storage tiers are configured to store the retrieved time-series data sent from the fleet of stream processors; and a plurality of computing devices comprising respective processors and memory configured to implement a fleet of query processors configured to access time-series data stored in the first storage tier and the second storage tier, wherein individual ones of the fleet of query processors are each different from individual ones of the fleet of stream processors.
2. The system as recited in claim 1, wherein the fleet of query processors is configured to: perform queries of the time-series data stored in the plurality of storage tiers, wherein an amount of the query processors is determined by the control plane based at least in part on the queries.
3. The system as recited in claim 1, wherein an amount of the durable partitions is determined by the control plane based at least in part on the time-series data.
4. The system as recited in claim 1, wherein an amount of storage resources in the first tier is determined by the control plane based at least in part on an amount of the time-series data within the retention period for the first storage tier, and wherein an amount of storage resources in the second tier is determined by the control plane based at least in part on an amount of the time-series data within the retention period for the second storage tier.
5. A method, comprising: storing, by a fleet of ingestion routers into a plurality of durable partitions maintained at one or more persistent storage resources, time-series data generated by a plurality of client devices, wherein the time-series data is associated with a plurality of time series, and wherein an amount of the ingestion routers is determined based at least in part on an ingestion rate of the time-series data; retrieving, by a fleet of stream processors, the time-series data from the durable partitions maintained at one or more persistent storage resources; storing, by the fleet of stream processors, the time-series data retrieved from the durable partitions into a plurality of storage tiers including a first storage tier and a second storage tier, wherein a first one or more elements of the retrieved time-series data is stored by the fleet of stream processors into the first storage tier and a different second one or more elements of the retrieved time-series data is stored by the fleet of stream processors into the second storage tier, wherein individual ones of the plurality of storage tiers are different from the one or more persistent storage resources from which the time-series data is retrieved, wherein the individual ones of the plurality of storage tiers are different from and communicatively coupled over a network to respective ones of the fleet of stream processors, wherein a retention period for the first storage tier differs from a retention period for the second storage tier, and wherein an amount of the stream processors is determined based at least in part on the time-series data in the durable partitions; and accessing, by a fleet of query processors, time-series data stored in first storage tier and the second storage tier, wherein individual ones of the fleet of query processors are each different from individual ones of the fleet of stream processors.
6. The method as recited in claim 5, further comprising: performing, by a fleet of query processors, queries of the time-series data stored in the plurality of storage tiers, wherein an amount of the query processors is determined based at least in part on the queries.
7. The method as recited in claim 5, wherein an amount of the durable partitions is determined based at least in part on the time-series data.
8. The method as recited in claim 5, wherein an amount of storage resources in the first tier is determined based at least in part on an amount of the time-series data within the retention period for the first storage tier, and wherein an amount of storage resources in the second tier is determined based at least in part on an amount of the time-series data within the retention period for the second storage tier.
9. The method as recited in claim 5, wherein a latency characteristic for the first storage tier differs from a latency characteristic for the second storage tier.
10. The method as recited in claim 5, wherein the time-series data is partitioned into the durable partitions based at least in part on a hierarchy of the time series.
11. The method as recited in claim 5, wherein the time-series data is stored in the first storage tier using a plurality of tiles, wherein the tiles are partitioned based at least in part on spatial boundaries and temporal boundaries.
12. The method as recited in claim 5, further comprising: organizing, by the fleet of stream processors, the time-series data from the durable partitions into a plurality of tables, wherein the tables are stored in the plurality of storage tiers; and transforming, by the fleet of stream processors, the time-series data from the tables into a plurality of additional tables, wherein the additional tables are stored in the plurality of storage tiers.
13. The method as recited in
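
Claims 10 and 11 mention partitioning by a hierarchy of the time series and storing data in tiles bounded spatially and temporally. One way to picture this is sketched below, under stated assumptions. The slash-separated series path, the reading of "spatial" as a bucket of the hashed series key, the fixed time window, and all function names are hypothetical; the claims do not specify any of them.

```python
import hashlib
from collections import defaultdict


def hierarchy_prefix(series_path: str, depth: int) -> str:
    """Return the first `depth` levels of a hierarchical series name.

    Assumption: series are named like "region/host/metric", so a shared
    prefix can serve as a partitioning key.
    """
    return "/".join(series_path.split("/")[:depth])


def tile_key(series_path: str, timestamp: int, spatial_buckets: int, window_seconds: int):
    """Compute a (spatial, temporal) tile identifier for one data point.

    Assumption: the spatial bucket is a hash of the series key and the
    temporal boundary is a fixed-width time window.
    """
    digest = hashlib.sha256(series_path.encode("utf-8")).digest()
    spatial = int.from_bytes(digest[:8], "big") % spatial_buckets
    temporal = timestamp // window_seconds
    return (spatial, temporal)


def group_into_tiles(points, spatial_buckets: int, window_seconds: int) -> dict:
    """Group (series_path, timestamp, value) tuples by their tile key."""
    tiles = defaultdict(list)
    for series_path, timestamp, value in points:
        key = tile_key(series_path, timestamp, spatial_buckets, window_seconds)
        tiles[key].append((series_path, timestamp, value))
    return dict(tiles)


if __name__ == "__main__":
    sample = [
        ("us-east/host1/cpu", 10, 0.5),
        ("us-east/host1/cpu", 70, 0.7),
        ("us-east/host2/cpu", 15, 0.2),
    ]
    print(hierarchy_prefix("us-east/host1/cpu", 2))
    print(group_into_tiles(sample, spatial_buckets=4, window_seconds=60))
```

Hashing discards locality between related series; a design that keeps neighboring series in the same tile would bucket on `hierarchy_prefix` instead.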
Data stream processing; Continuous queries · CPC title
between a Database Management System and a front-end application · CPC title
Data partitioning, e.g. horizontal or vertical partitioning · CPC title
Hybrid storage combining heterogeneous device types, e.g. hierarchical storage, hybrid arrays · CPC title
Lifecycle management · CPC title