Cache Based Efficient Access Scheduling for Super Scaled Stream Processing Systems
US-2017242889-A1 · Aug 24, 2017 · US
US10409658B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10409658-B2 |
| Application number | US-201715417418-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jan 27, 2017 |
| Priority date | Jan 27, 2017 |
| Publication date | Sep 10, 2019 |
| Grant date | Sep 10, 2019 |
A practical reading order for non-experts. Skip the full description unless you need deep technical detail.
What the patent document calls the invention.
A short plain-language summary of the technical disclosure.
Who owns or filed the patent and who is credited as inventor.
Filing, priority, publication, and grant dates set the timeline.
The legal scope of protection — read this for what is actually claimed.
Technology tags used to group this patent with similar filings.
Prior art links and similar publications in this corpus.
Official abstract text for this publication.
A method and associated system. For each message of a message batch, the message is assigned by a consumer to a partition of a log. Each partition of the log is associated with a respective processing engine. The message batch includes messages that includes the message. The consumer stores an offset value for each partition. Responsive to completion, by a distributor, of sending a last message in the message batch to the consumer, the distributor ascertains a target offset value for each partition based on a current offset value. In response to a processing engine completing processing of a message assigned to the partition associated with the processing engine, the consumer updates the stored offset value associated with the partition. The distributor determines when all messages of the batch have been processed based on the target offset values and the stored offset values.
Opening claim text (preview).
What is claimed is: 1. A method, said method comprising: producing, by a distributor, a message batch of messages to be sent, by the distributor, to a consumer for being processed by the consumer, wherein each message in the message batch is assigned to a partition of a log, wherein each partition is associated with a respective processing engine configured to process, for the consumer, the messages assigned to each partition, wherein a currently stored offset value for each partition is a monotonically increasing offset function f(n) of a number (n) of currently processed messages in each partition, and wherein the currently stored offset value for each partition is dynamically updated as each message in each partition is processed by the respective processing engine; responsive to completion, by the distributor, of sending a last message in the message batch to the consumer, ascertaining, by the distributor, a target offset value for each partition, said target offset value for each partition being a high watermark against which the currently stored offset values for each partition may be compared to determine whether all of the currently stored offset values for each partition have been processed; and determining; by the distributor, when all messages of the message batch have been processed based on the target offset values and the currently stored offset values. 2. The method of claim 1 , wherein said determining when all messages of the message batch have been processed comprises: obtaining the currently stored offset value of each partition; comparing the currently stored offset value with the target offset value for each partition; and determining that all of the messages of the message batch have not been processed if the currently stored offset value is less than the target offset value for at least one of the partitions or determining that all of the messages of the message batch have been processed if the currently stored offset value is greater than or equal to the target offset value for all of the partitions. 3. The method of claim 2 , wherein said obtaining the currently stored offset value for each partition comprises: polling the consumer for the currently stored offset value of each partition; and storing the offset value returned from said polling. 4. The method as claimed of claim 1 , wherein the consumer comprises a plurality of consumer entities. 5. The method of claim 1 , wherein said obtaining the currently stored offset value of each partition is implemented using an Application Programming Interface, API. 6. The method of claim 1 , wherein the log comprises a distributed commit log. 7. The method of claim 1 , wherein at least one processing engine of the respective processing engines is implemented by a cloud-based server. 8. The method of claim 1 , wherein f(n) is defined recursively via f(n+1)=Af(n)+B for n=0, 1, . . . , N−1, wherein f(0)=K, wherein N is a total number of messages assigned to each partition, and wherein A, B, and K are pre-defined constants having integer values such that A≥1, B>0, and K≥0. 9. The method of claim 1 , wherein for each partition, the high watermark is the offset function f(N) evaluated at the total number (N) of messages assigned to each partition. 10. A computer program product, comprising one or more computer readable hardware storage devices having computer readable program code stored therein, said program code containing instructions executable by one or more processors of a data processing system to implement a method, said method comprising: producing, by a distributor, a message batch of messages to be sent, by the distributor, to a consumer for being processed by the consumer, wherein each message in the message batch is assigned to a partition of a log, wherein each partition is associated with a respective processing engine configured to process, for the consumer, the messages assigned to each partition, wherein a currently stored offset value for each partition is a monotonically increasing offset function f(n) of a number (n) of currently processed messages in each partition, and wherein the currently stored offset value for each partition is dynamically updated as each message in each partition is processed by the respective processing engine; responsive to completion, by the distributor, of sending a last message in the message batch to the consumer, ascertaining, by the distributor, a target offset value for each partition, said target offset value for each partition being a high watermark against which the currently stored offset values for each partition may be compared to determine whether all of the currently stored offset values for each partition have been processed; and determining, by the distributor, when all messages of the message batch have been processed based on the target offset values and the currently stored offset values. 11. The computer program product of claim 10 , wherein said determining when all messages of the message batch have been processed comprises: obtaining the currently stored offset value of each partition; comparing the currently stored offset value with the target offset value for each partition; and determining that all of the messages of the message batch have not been processed if the currently stored offset value is less than the target offset value for at least one of the partitions or determining that all of the messages of the message batch have been processed if the currently stored offset value is greater than or equal to the target offset value for all of the partitions. 12. The computer program product of claim 11 , wherein said obtaining the currently stored offset value for each partition comprises: polling the consumer for the currently stored offset value of each partition; and storing the offset value returned from said polling. 13. The computer program product of claim 10 , wherein f(n) is defined recursively via f(n+1)=Af(n)+B for n=0, 1, . . . , N−1, wherein f(0)=K, wherein N is a total number of messages assigned to each partition, and wherein A, B, and K are pre-defined constants having integer values such that A≥1, B>0, and K≥0. 14. The computer program product of claim 10 , wherein for each partition, the high watermark is the offset function f(N) evaluated at the total number (N) of messages assigned to each partition. 15. A data processing system, comprising one or more processors, one or more memories, and one or more computer readable hardware storage devices, said one or more hardware storage device containing program code executable by the one or more processors via the one or more memories to implement a method, said method comprising: producing, by a distributor, a message batch of messages to be sent, by the distributor, to a consumer for being processed by the consumer, wherein each message in the message batch is assigned to a partition of a log, wherein each partition is associated with a respective processing engine configured to process, for the consumer, the messages assigned to each partition, wherein a currently stored offset value for each partition is a monotonically increasing offset function f(n) of a number (n) of currently processed messages in each partition, and wherein the currently stored offset value for each partition is dynamically updated as each message in each partition is processed by the respective processing engine; responsive to completion, by the distributor, of sending a last message in the message batch to the consumer, ascertaining, by the distributor, a target offset value for each partition, said target offset value for each partiti
Message passing systems or structures, e.g. queues · CPC title
Related publications grouped by family.
Answers are generated from the same data shown on this page.