Mapreduce implementation in an on-demand network code execution system and stream data processing system
US-2020104378-A1 · Apr 2, 2020 · US
US11550505B1 · US · B1
| Field | Value |
|---|---|
| Publication number | US-11550505-B1 |
| Application number | US-202017008998-A |
| Country | US |
| Kind code | B1 |
| Filing date | Sep 1, 2020 |
| Priority date | Sep 1, 2020 |
| Publication date | Jan 10, 2023 |
| Grant date | Jan 10, 2023 |
A practical reading order for non-experts. Skip the full description unless you need deep technical detail.
What the patent document calls the invention.
A short plain-language summary of the technical disclosure.
Who owns or filed the patent and who is credited as inventor.
Filing, priority, publication, and grant dates set the timeline.
The legal scope of protection — read this for what is actually claimed.
Technology tags used to group this patent with similar filings.
Prior art links and similar publications in this corpus.
Official abstract text for this publication.
A data stream may include a plurality of records that are ordered, and the plurality of records may be assigned to a processing shard. A first set of virtual shards may be formed, the first set of virtual shards having a first quantity of virtual shards that perform parallel processing operations on behalf of the processing shard. First records of the plurality of records may be processed using the first set of virtual shards. The first quantity of virtual shards may be modified, based at least in part on an observed record age, to a second quantity of virtual shards that perform parallel processing operations on behalf of the processing shard. A second set of virtual shards may be formed having the second quantity of virtual shards. Second records of the plurality of records may be processed using the second set of virtual shards.
Opening claim text (preview).
What is claimed is: 1. A computing system comprising: one or more processors; and one or more memories having stored therein computing instructions that, upon execution by the one or more processors, cause the computing system to perform operations comprising: receiving a data stream including a plurality of records that are ordered, wherein the plurality of records are assigned to a processing shard; receiving an indication of a target record age; forming a first set of virtual shards, the first set of virtual shards having a first quantity of virtual shards that execute in parallel on behalf of the processing shard; processing first records of the plurality of records using the first set of virtual shards; determining an observed record age associated with the processing shard; modifying, based at least in part on a relationship between the observed record age and the target record age, the first quantity of virtual shards to a second quantity of virtual shards that execute in parallel on behalf of the processing shard; forming a second set of virtual shards, the second set of virtual shards having the second quantity of virtual shards, wherein the second set of virtual shards is formed by adding or removing one or more virtual shards to or from the first set of virtual shards; and processing second records of the plurality of records using the second set of virtual shards. 2. The computing system of claim 1 , wherein the modifying of the first quantity of virtual shards to the second quantity of virtual shards comprises increasing the first quantity of virtual shards to the second quantity of virtual shards based at least in part on an increase in the observed record age. 3. The computing system of claim 1 , wherein the modifying of the first quantity of virtual shards to the second quantity of virtual shards comprises decreasing the first quantity of virtual shards to the second quantity of virtual shards based at least in part on a decrease in the observed record age. 4. The computing system of claim 1 , wherein the operations further comprise receiving an indication of a maximum parallelization factor, wherein the modifying of the first quantity of virtual shards to the second quantity of virtual shards is based in part on the maximum parallelization factor. 5. The computing system of claim 1 , wherein the operations further comprise receiving an indication of a minimum parallelization factor, wherein the modifying of the first quantity of virtual shards to the second quantity of virtual shards is based in part on the minimum parallelization factor. 6. A computer-implemented method comprising: receiving a data stream including a plurality of records that are ordered, wherein the plurality of records are assigned to a processing shard; forming a first set of virtual shards, the first set of virtual shards having a first quantity of virtual shards that execute in parallel on behalf of the processing shard; processing first records of the plurality of records using the first set of virtual shards; determining an observed record age associated with the processing shard; modifying, based at least in part on the observed record age, the first quantity of virtual shards to a second quantity of virtual shards that execute in parallel on behalf of the processing shard; forming a second set of virtual shards, the second set of virtual shards having the second quantity of virtual shards, wherein the second set of virtual shards is formed by adding or removing one or more virtual shards to or from the first set of virtual shards; and processing second records of the plurality of records using the second set of virtual shards. 7. The computer-implemented method of claim 6 , further comprising receiving an indication of a target record age, wherein the modifying of the first quantity of virtual shards to the second quantity of virtual shards is based at least in part on a relationship between the observed record age and the target record age. 8. The computer-implemented method of claim 6 , wherein the modifying of the first quantity of virtual shards to the second quantity of virtual shards comprises increasing the first quantity of virtual shards to the second quantity of virtual shards based at least in part on an increase in the observed record age. 9. The computer-implemented method of claim 6 , wherein the modifying of the first quantity of virtual shards to the second quantity of virtual shards comprises decreasing the first quantity of virtual shards to the second quantity of virtual shards based at least in part on a decrease in the observed record age. 10. The computer-implemented method of claim 6 , further comprising receiving an indication of a maximum parallelization factor, wherein the modifying of the first quantity of virtual shards to the second quantity of virtual shards is based in part on the maximum parallelization factor. 11. The computer-implemented method of claim 6 , further comprising receiving an indication of a minimum parallelization factor, wherein the modifying of the first quantity of virtual shards to the second quantity of virtual shards is based in part on the minimum parallelization factor. 12. The computer-implemented method of claim 6 , wherein the plurality of records are ordered based at least in part on a plurality of key values, and wherein each record of the plurality of records is assigned a respective key value of the plurality of key values. 13. The computer-implemented method of claim 12 , further comprising maintaining a mapping of the plurality of key values to virtual shards, wherein the mapping indicates, for each key value of the plurality of key values, a respective virtual shard for processing of the key value. 14. One or more non-transitory computer-readable storage media having stored thereon computing instructions that, upon execution by one or computing devices, cause the one or more computing devices to perform operations comprising: receiving a data stream including a plurality of records that are ordered, wherein the plurality of records are assigned to a processing shard; forming a first set of virtual shards, the first set of virtual shards having a first quantity of virtual shards that execute in parallel on behalf of the processing shard; processing first records of the plurality of records using the first set of virtual shards; determining an observed record age associated with the processing shard; modifying, based at least in part on the observed record age, the first quantity of virtual shards to a second quantity of virtual shards that execute in parallel on behalf of the processing shard; forming a second set of virtual shards, the second set of virtual shards having the second quantity of virtual shards, wherein the second set of virtual shards is formed by adding or removing one or more virtual shards to or from the first set of virtual shards; and processing second records of the plurality of records using the second set of virtual shards. 15. The one or more non-transitory computer-readable storage media of claim 14 , further comprising receiving an indication of a target record age, wherein the modifying of the first quantity of virtual shards to the second quantity of virtual shards is based at least in part on a relationship between the observed record age and the target record age. 16. The one or more non-transitory computer-readable storage media of claim 14 , wherein the modifying of the first quantity of virtual shards to the second quantity of virtual shards comprises increasing the first quantity of vi
Improving or facilitating administration, e.g. storage management · CPC title
using a plurality of independent parallel functional units · CPC title
Command handling arrangements, e.g. command buffers, queues, command scheduling · CPC title
by program, e.g. task dispatcher, supervisor, operating system · CPC title
Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues · CPC title
Related publications grouped by family.
Answers are generated from the same data shown on this page.