US9940169B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9940169-B2 |
| Application number | US-201514806755-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jul 23, 2015 |
| Priority date | Jul 23, 2015 |
| Publication date | Apr 10, 2018 |
| Grant date | Apr 10, 2018 |
Official abstract text for this publication.
Embodiments relate to processing data sets in real-time by using a distributed network to generate and process partitioned streams. Messages are assigned to partition streams. Within each stream, each of a set of processors performs a designated task. Results from the task are transmitted (directly or indirectly) to another processor in the stream. The distributed and ordered processing can allow results to be transmitted while or before the results are stored.
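The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `MessageAllocator` and `run_stream`, the in-memory lists standing in for partition streams, and the string-based tasks are all assumptions made for illustration. The sketch shows messages being assigned to a partition stream by tag, processed in rank order by a chain of tasks, and each result being transmitted before it is stored.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

# Hypothetical names throughout; the source does not define an API.
Message = Tuple[int, str]  # (rank, payload)


class MessageAllocator:
    """Assigns each message to a partition stream keyed by its tag."""

    def __init__(self) -> None:
        # tag -> ordered list of (rank, payload)
        self.streams: Dict[str, List[Message]] = defaultdict(list)

    def append(self, tag: str, payload: str) -> int:
        stream = self.streams[tag]
        rank = len(stream)  # higher than every rank already in this stream
        stream.append((rank, payload))
        return rank


def run_stream(
    stream: List[Message],
    tasks: List[Callable[[str], str]],
    transmit: Callable[[int, str], None],
) -> List[Message]:
    """Process one partition stream in rank order.

    Each task hands its result to the next task in the chain. The final
    result is transmitted first and only then appended to the store.
    """
    store: List[Message] = []
    for rank, payload in sorted(stream):
        value = payload
        for task in tasks:
            value = task(value)
        transmit(rank, value)        # sent onward before storage
        store.append((rank, value))  # stored afterwards
    return store
```

For example, appending `" hello "` and `"world"` under one tag and running the stream with `[str.strip, str.upper]` transmits the results in the order the messages were appended, while messages under other tags stay in their own streams.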
Opening claim text (preview).
What is claimed is:

1. A system for processing data sets in real-time by using a distributed network to generate and process partitioned streams, the system comprising: a message allocator processor that: receives a plurality of data sets from one or more producer devices; for each of the plurality of data sets: identifies a tag or characteristic of the data set; identifies an initial partition stream from amongst a plurality of initial partition streams that corresponds to the tag or the characteristic; and appends the data set to the identified initial partition stream, such that the data set is associated with a rank that is higher than other ranks associated with other data sets in the identified initial partition stream; a partition controller processor that, for the identified initial partition stream of the plurality of initial partition streams, manages a set of task processors such that: each task processor in the set of task processors is designated to perform a task in a workflow so as to process data sets in the identified initial partition stream in an order that corresponds to the ranks, wherein the set of task processors includes: a first task processor designated to perform a first task; a second task processor designated to perform a second task; and a third task processor designated to perform a third task; the first task processor in the set of task processors is configured to: generate, via performance of the first task, processed data sets corresponding to the data sets in the identified initial partition stream; facilitate storing the processed data sets at a first data store; generate a processed partition stream that includes the processed data sets in the identified initial partition stream; and facilitate routing the processed partition stream for further processing of the processed data sets in accordance with one or more other tasks; the second task processor in the set of task processors is configured to: generate, via performance of the second task, a score corresponding to each data set in the identified initial partition stream; and facilitate storing the scores at a second data store; and the third task processor in the set of task processors is configured to repeatedly: retrieve a plurality of scores from the second data store, for each score in the plurality of scores; generate, via performance of the third task, a real-time analytic variable based on the plurality of scores; and facilitate providing the real-time analytic variable to a client device, wherein the repeated retrieval of the plurality of scores and the repeated generation of the real-time analytic variable causes the real-time analytic variable to be updated in response to appending and task-performance processing of new data appended to the identified initial partition stream.

2. The system for processing data sets in real-time by using the distributed network to generate and process partitioned streams as recited in claim 1, wherein each of the first task processor and the third task processor includes a virtual server.

3. The system for processing data sets in real-time by using the distributed network to generate and process partitioned streams as recited in claim 1, wherein the set of task processors is managed such that a single stream is sent to a plurality of task processors in the set of task processors for parallel performance of tasks designated to be performed by the plurality of task processors, wherein the single stream includes a particular processed version of the identified initial partition stream.

4. The system for processing data sets in real-time by using the distributed network to generate and process partitioned streams as recited in claim 1, wherein each of the first data store and the second data store is a part of a same network attached storage or storage area network.

5. The system for processing data sets in real-time by using the distributed network to generate and process partitioned streams as recited in claim 1, wherein the managing the set of task processors further includes: monitoring a latency of completing performance of one or more tasks using the data set relative to a time at which the data set was received or appended to the identified initial partition stream; comparing the latency to a threshold; and when it is determined that the latency exceeds the threshold: identifying a position in the workflow as a potential source of the latency exceeding the threshold, the position corresponding to a task processor designated to perform one or more tasks in the workflow; identifying a new task processor to be included in the set of task processors; designating the new task processor for performing part of the one or more tasks in the workflow; and modifying the designation of the new task processor so as to be designated to perform at least part of a remainder of the one or more tasks in the workflow.

6. The system for processing data sets in real-time by using the distributed network to generate and process partitioned streams as recited in claim 1, wherein the partition controller further updates the identified initial partition stream so as to remove the data sets in the identified initial partition stream that have been processed by the first task processor via performance of the first task to generate corresponding processed data sets.

7. The system for processing data sets in real-time by using the distributed network to generate and process partitioned streams as recited in claim 1, wherein the partition controller further streams the identified initial partition stream to the first task processor.

8. The system for processing data sets in real-time by using the distributed network to generate and process partitioned streams as recited in claim 1, wherein the third task processor is controlled so as to further generate a second real-time analytic variable based on a subset of the plurality of scores; and wherein the system further includes: a transceiver that transmits the second real-time analytic variable to the client device.

9. The system for processing data sets in real-time by using the distributed network to generate and process partitioned streams as recited in claim 1, wherein the real-time analytic variable does not depend on data sets included in any partition stream, other than the identified initial partition stream, of the plurality of initial partition streams such that the identified initial partition stream facilitates data isolation in workflow processing.

10. The system for processing data sets in real-time by using the distributed network to generate and process partitioned streams as recited in claim 1, wherein, for each of the plurality of data sets, the tag or the characteristic for the data set is identified based on an identifier associated with the producer device from which the data set was received.

11. A method for processing data sets in real-time by using a distributed network to generate and process partitioned streams, the method comprising: receiving, at a message allocator, a plurality of data sets from one or more producer devices; for each of the plurality of data sets, the message allocator is configured to: identifying a tag or characteristic of the data set; identifying an initial partition stream from amongst a plurality of initial partition streams that corresponds to the tag or the characteristic; and appending the data set to the identified initial partition stream, such that the data set is associated with a rank that is higher than other ranks associated with other data sets in the identified initial partition stream; for the identified initial partit
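The scoring and analytics roles recited in claim 1 (a second task processor that stores a score per data set, and a third task processor that repeatedly derives a real-time analytic variable from the stored scores) might be sketched as follows. This is an illustrative assumption, not the claimed system: `ScoreStore`, `score_task`, `analytic_task`, the length-based scoring rule, and the choice of the mean as the analytic variable are all hypothetical.

```python
from statistics import mean
from typing import Dict, List, Optional


class ScoreStore:
    """In-memory stand-in for the 'second data store' (hypothetical)."""

    def __init__(self) -> None:
        # Scores are keyed by partition stream, so analytics for one
        # stream never read scores from another (compare claim 9).
        self._scores: Dict[str, List[float]] = {}

    def save(self, stream_id: str, score: float) -> None:
        self._scores.setdefault(stream_id, []).append(score)

    def load(self, stream_id: str) -> List[float]:
        return list(self._scores.get(stream_id, []))


def score_task(data_set: str) -> float:
    """Second task (assumed rule): score a data set by its length."""
    return float(len(data_set))


def analytic_task(store: ScoreStore, stream_id: str) -> Optional[float]:
    """Third task (assumed rule): mean of all scores stored for one stream.

    Calling this repeatedly yields an updated value whenever new data
    sets have been appended and scored.
    """
    scores = store.load(stream_id)
    return mean(scores) if scores else None


store = ScoreStore()
for data_set in ["ab", "abcd"]:
    store.save("p0", score_task(data_set))
first = analytic_task(store, "p0")       # 3.0

store.save("p0", score_task("abcdefghi"))
updated = analytic_task(store, "p0")     # 5.0 after the new data set
```

Keying the store by stream identifier is one simple way to keep the analytic variable independent of other partition streams; a real deployment would presumably replace the in-memory dictionary with shared storage.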
the resource being a machine, e.g. CPUs, Servers, Terminals · CPC title