Coordinated suspension of replication groups
US-2015379100-A1 · Dec 31, 2015 · US
US9641580B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9641580-B2 |
| Application number | US-201414320706-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jul 1, 2014 |
| Priority date | Jul 1, 2014 |
| Publication date | May 2, 2017 |
| Grant date | May 2, 2017 |
A practical reading order for non-experts. Skip the full description unless you need deep technical detail.
What the patent document calls the invention.
A short plain-language summary of the technical disclosure.
Who owns or filed the patent and who is credited as inventor.
Filing, priority, publication, and grant dates set the timeline.
The legal scope of protection — read this for what is actually claimed.
Technology tags used to group this patent with similar filings.
Prior art links and similar publications in this corpus.
Official abstract text for this publication.
A low-latency cloud-scale computation environment includes a query language, optimization, scheduling, fault tolerance and fault recovery. An event model can be used to extend a declarative query language so that temporal analysis of event of an event stream can be performed. Extractors and outputters can be used to define and implement functions that extend the capabilities of the event-based query language. A script written in the extended query language can be translated into an optimal parallel continuous execution plan. Execution of the plan can be orchestrated by a streaming job manager which schedules vertices on available computing machines. The streaming job manager can monitor overall job execution. Fault tolerance can be provided by tracking execution progress and data dependencies in each vertex. In the event of a failure, another instance of the failed vertex can be scheduled. An optimal recovery point can be determined based on checkpoints and data dependencies.
Opening claim text (preview).
What is claimed: 1. A system comprising: at least one processor: a memory connected to the at least one processor; and a cloud-scale query execution platform that supports distributed stream processing comprising a streaming job manager that monitors execution information about streaming jobs executed by a plurality of vertices executing on a plurality of computing devices, the streaming job manager receiving execution progress information and data dependencies for the plurality of vertices, each vertex of the plurality of vertices configured to process events associated with one or more streaming jobs, and each event being labeled with a sequence number used by at least the streaming job manager to describe and track dependencies between input, output and state of a vertex. 2. The system of claim 1 , the streaming job manager scheduling a new vertex in response to detecting a failed vertex of the plurality of vertices, the streaming job manager determining a closest checkpoint from which to resume processing on the new vertex. 3. The system of claim 2 , the streaming job manager calculating a minimum sequence number of event sequence numbers from which the new vertex resumes processing. 4. The system of claim 1 , further comprising a script processor that receives a script written in a declarative query language, the declarative query language supporting distributed stream processing through temporal analysis of input event streams. 5. The system of claim 1 , further comprising a streaming execution plan optimizer that receives a compiled script written in a declarative query language, the declarative query language having a capability to receive user-defined functions to consume event streams. 6. The system of claim 1 , wherein the sequence number is assigned from a monotonically increasing sequence to an event of a plurality of events in an event stream. 7. The system of claim 1 , wherein the execution platform assigns the sequence number to each of the events in an event stream. 8. A method comprising: receiving by a processor of a computing device, execution progress information associated with a plurality of streaming jobs executed by a plurality of vertices executing on a plurality of computing devices, each vertex of the plurality of vertices configured to process events associated with one or more of the plurality of streaming jobs, and each event being assigned a sequence number that describes and tracks dependencies between input, output and state of at least one vertex of the plurality of vertices; in response to detecting a vertex failure among the plurality of vertices, scheduling a new vertex; and determining a closest checkpoint from which to resume processing on the new vertex from the sequence numbers assigned to the events in the streaming jobs. 9. The method of claim 8 , further comprising: performing failure recovery by calculating a minimum sequence number of event sequence numbers from which the new vertex resumes processing. 10. The method of claim 8 , further comprising: receiving a script in a query language extended to support distributed stream processing through temporal analysis of event streams; and generating an optimized streaming execution plan from the script, the script comprising a stream extractor that converts information from a continuous input source into event streams. 11. The method of claim 8 , further comprising: receiving a script in a query language extended to support distributed stream processing through temporal analysis of event streams; and generating an optimized streaming execution plan from the script, the script comprising a stream outputter that performs user-defined actions processing streaming output events. 12. The method of claim 8 , further comprising: receiving the sequence number associated with a last consumed or a last produced event from a vertex of the plurality of vertices. 13. The method of claim 8 , wherein the sequence numbers are monotonically increasing sequence numbers. 14. A computer-readable storage medium comprising computer-readable instructions which when executed cause at least one processor of a computing device to: receive data dependency information associated with a plurality of streaming jobs executed by a plurality of vertices executing on a plurality of computing devices, each vertex of the plurality of vertices configured to process events associated with one or more of the plurality of streaming jobs, and each event being assigned a sequence number that describes and tracks dependencies between input, output and state of at least one vertex of the plurality of vertices; in response to detecting a vertex failure among the plurality of vertices, perform job recovery by scheduling a new vertex; and determine a closest checkpoint from which to resume processing on the new vertex using the sequence numbers assigned to the events in one or more of the plurality of streaming jobs. 15. The computer-readable storage medium of claim 14 , comprising further computer-readable instructions which when executed cause the at least one processor to: calculate a minimum sequence number of event sequence numbers from which the new vertex resumes processing based on stored checkpointing data. 16. The computer-readable storage medium of claim 15 , comprising further computer-readable instructions which when executed cause the at least one processor to: generate an optimized streaming execution plan from a script written in a query language extended to support distributed stream processing through temporal analysis of input event streams. 17. The computer-readable storage medium of claim 14 , comprising further computer-readable instructions which when executed cause the at least one processor to: generate an optimized streaming execution plan from a script written in a query language having a capability to receive user-defined functions to consume event streams. 18. The computer-readable storage medium of claim 14 , comprising further computer-readable instructions which when executed cause the at least one processor to: generate an optimized streaming execution plan from a script written in a query language having a capability to receive user-defined functions to produce event streams. 19. The computer-readable storage medium of claim 14 , comprising further computer-readable instructions which when executed cause the at least one processor to: receive execution progress information comprising last event processed and last event produced from a vertex of the plurality of vertices. 20. The computer-readable storage medium of claim 19 , comprising further computer-readable instructions which when executed cause the at least one processor to: assign a monotonically increasing sequence number to each event in an event stream.
Plan optimisation · CPC title
Network streaming of media packets · CPC title
Restarting or rejuvenating · CPC title
Physics · mapped topic
Related publications grouped by family.
Answers are generated from the same data shown on this page.