US9830188B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9830188-B2 |
| Application number | US-201414303449-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jun 12, 2014 |
| Priority date | Jun 12, 2014 |
| Publication date | Nov 28, 2017 |
| Grant date | Nov 28, 2017 |
Official abstract text for this publication.
This disclosure is directed to methods and systems for calculating statistical quantities of computational resources used by distributed data sources in a computing environment. In one aspect, a master node receives a query regarding use of computational resources used by distributed data sources of a computing environment. The data sources generate metric data that represents use of the computational resources and distribute the metric data to two or more worker nodes. The master node directs each worker node to generate worker-node data that represents the metric data received by each of the worker nodes and each worker node sends worker-node data to the master node. The master node receives the worker-node data and calculates a master-data structure based on the worker-node data, which may be used to estimate percentiles of the metric data in response to the query.
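As a rough illustration of the last step in the abstract, the sketch below estimates a percentile from a distribution of counts over intervals. The abstract does not say how the estimate is computed, so the linear interpolation inside an interval is an assumption, and the names `estimate_percentile`, `edges`, and `counts` are hypothetical rather than taken from the patent.

```python
from bisect import bisect_left
from itertools import accumulate


def estimate_percentile(edges, counts, p):
    """Estimate the p-th percentile (0 < p <= 100) from interval counts.

    edges has len(counts) + 1 entries; counts[i] is the number of metric
    values in the interval from edges[i] to edges[i + 1].
    Assumption: values are spread evenly inside each interval, so the
    estimate is found by linear interpolation within one interval.
    """
    if len(edges) != len(counts) + 1:
        raise ValueError("edges must have exactly one more entry than counts")
    if not 0 < p <= 100:
        raise ValueError("p must be in the range (0, 100]")
    total = sum(counts)
    if total == 0:
        raise ValueError("the distribution contains no metric values")

    # Rank of the requested percentile among all counted values.
    target = p * total / 100
    cumulative = list(accumulate(counts))

    # First interval whose cumulative count reaches the target rank.
    i = bisect_left(cumulative, target)
    below = cumulative[i] - counts[i]

    # Fraction of the way through interval i at which the target rank falls.
    fraction = (target - below) / counts[i]
    return edges[i] + fraction * (edges[i + 1] - edges[i])
```

With hypothetical edges `[0, 10, 20, 30, 40]` and counts `[10, 20, 50, 20]`, the median lands in the third interval because the first two intervals hold only 30 of the 100 values. The accuracy of any such estimate is limited by the interval width.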
Opening claim text (preview).
The invention claimed is:
1. A method stored in one or more data-storage devices and executed using one or more processors of a computing environment, the method comprising: receiving a query at a master node about use of computational resources of the computing environment; distributing metric data generated by data sources in the computing environment to one or more worker nodes, the metric data represents use of the computational resources; each worker node performs the following: receiving the query from the master node, generating worker-node data that represents the metric data received by the worker node in response to the query, and sending the worker-node data to the master node; and calculating at the master node a master-data structure that represents a distribution of the metric data over data-structure intervals based on the worker-node data sent from the worker nodes.
2. The method of claim 1 further comprises calculating estimated percentiles from the master-data structure based on the query.
3. The method of claim 1, wherein distributing the metric data further comprises: generating the metric data at the data sources of the computing environment that use the computational resources; partitioning the metric data into two or more unique subsets of the metric data; and sending a unique subset of the metric data to each of the two or more worker nodes.
4. The method of claim 1, wherein generating the worker-node data that represents the metric data received by the worker node further comprises when data size of the metric data received by the worker node is less than or equal to a memory bound of the worker node, the worker-node data is an array of the metric data.
5. The method of claim 1, wherein generating the worker-node data that represents the metric data received by the worker node further comprises when the data size of the metric data is greater than the memory bound of the worker node, generating a data structure that represents a distribution of the metric data received by the worker, the worker-node data is the data structure.
6. The method of claim 5, wherein generating the data structure that represents the distribution of the metric data further comprises: estimating a minimum-metric value and a maximum-metric value for the subset of metric data received by the worker node; forming data-structure intervals that combined cover values of the metric data; identifying data-structure intervals the estimated minimum-metric value and the estimated maximum-metric value are in; calculating an interval degree D based on the number of data-structure intervals and a number of intervals between and including the data-structure intervals that contain the estimated minimum-metric value and the estimated maximum-metric value; and counting each metric value of the metric data that lies in the data-structure intervals to form a frequency distribution the metric data over the data-structure intervals.
7. The method of claim 6, further comprises: when the interval degree D is greater than or equal to one, splitting each of the data-structure intervals into 2^D data-structure subintervals; and counting each metric value of the metric data that lies in the data-structure subintervals to form a frequency distribution the metric data over the data-structure subintervals.
8. The method of claim 1, wherein calculating the master-data structure further comprises: initializing current data as first worker-node data received; and for each worker-node data received after the first worker-node data, combining the worker-node data with the current data to update current data, the current data being the master-data structure when worker nodes have finished.
9. The method of claim 1, wherein combining the worker-node data with the current data to update the current data further comprises: when current data and the worker-node data are metric data, combining current data and worker-node data to update current data; when current data is metric data and the worker-node data is a data structure, converting the current data to a data structure and aggregating the current data with the worker-node data; when current data is a data structure and the worker-node data is metric data, converting the worker-node data to a data structure and aggregating the current data with the worker-node data; and when current data and the worker-node data are data structures, aggregating the current data with the worker-node data.
10. The method of claim 1, wherein a single worker node operates at the master node when a volume of the metric data output from the data sources is below a threshold.
11. A system for generating a data structure of metric data generated in a computing environment comprising: one or more processors; one or more data-storage devices; and a routine stored in the data-storage devices and executed using the one or more processors, the routine receiving a query at a master node about use of computational resources of the computing environment; distributing metric data generated by data sources in the computing environment to one or more worker nodes, the metric data represents use of the computational resources; each worker node performs the following: receiving the query from the master node, generating worker-node data that represents the metric data received by the worker node in response to the query, and sending the worker-node data to the master node; and calculating at the master node a master-data structure that represents a distribution of the metric data over data-structure intervals based on the worker-node data sent from the worker nodes.
12. The system of claim 11 further comprises calculating estimated percentiles from the master-data structure based on the query.
13. The system of claim 11, wherein distributing the metric data further comprises: generating the metric data at data sources of the computing environment that use the computational resources; partitioning the metric data into two or more unique subsets of the metric data; and sending a unique subset of the metric data to each of the two or more worker nodes.
14. The system of claim 11, wherein generating the worker-node data that represents the metric data received by the worker node further comprises when data size of the metric data received by the worker node is less than or equal to a memory bound of the worker node, the worker-node data is an array of the metric data.
15. The system of claim 11, wherein generating the worker-node data that represents the metric data received by the worker node further comprises when the data size of the metric data is greater than the memory bound of the worker node, generating a data structure that represents a distribution of the metric data received by the worker, the worker-node data is the data structure.
16. The system of claim 15, wherein generating the data structure that represents the distribution of the metric data further comprises: estimating a minimum-metric value and a maximum-metric value for the subset of metric data received by the worker node; forming data-structure intervals that combined cover values of the metric data; identifying data-structure intervals the estimated minimum-metric value and the estimated maximum-metric value are in; calculating an interval degree D based on the number of data-structure intervals and a number of intervals between and including the data-structure intervals that contain the estimated minimum-metric value and the estimated maximum-metric value; and counting ea
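The worker-side choice in claims 4 and 5 and the master-side combining cases in claims 8 and 9 can be pictured with a minimal sketch. It makes several simplifying assumptions that are not in the claims: the memory bound is measured as a number of values, every node shares one fixed set of interval edges (`EDGES`) so counts can be added interval by interval, and the interval degree D and subinterval splitting of claims 6 and 7 are left out. All class and function names are hypothetical.

```python
from bisect import bisect_right
from dataclasses import dataclass

# Hypothetical shared interval edges; counts[i] covers EDGES[i] up to EDGES[i + 1].
EDGES = [0, 25, 50, 75, 100]


@dataclass
class RawMetrics:
    """Worker-node data sent as a plain array of metric values."""
    values: list


@dataclass
class Histogram:
    """Worker-node data sent as counts over the shared intervals."""
    counts: list


def to_counts(values, edges=EDGES):
    """Count values per interval; values outside the range go to the end intervals."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        i = bisect_right(edges, v) - 1
        i = min(max(i, 0), len(counts) - 1)
        counts[i] += 1
    return counts


def worker_summarize(values, memory_bound):
    """Send the raw array if it fits the bound, otherwise a histogram."""
    if len(values) <= memory_bound:
        return RawMetrics(list(values))
    return Histogram(to_counts(values))


def as_counts(data):
    """Convert either kind of worker-node data to interval counts."""
    return to_counts(data.values) if isinstance(data, RawMetrics) else list(data.counts)


def combine(current, incoming):
    """Cover the four combining cases: raw+raw, raw+hist, hist+raw, hist+hist."""
    if isinstance(current, RawMetrics) and isinstance(incoming, RawMetrics):
        return RawMetrics(current.values + incoming.values)
    merged = [a + b for a, b in zip(as_counts(current), as_counts(incoming))]
    return Histogram(merged)


def master_combine(summaries):
    """Start from the first worker-node data and fold in each later one."""
    current = summaries[0]
    for incoming in summaries[1:]:
        current = combine(current, incoming)
    # Assumption: the final result is always reported as interval counts.
    if isinstance(current, RawMetrics):
        current = Histogram(to_counts(current.values))
    return current
```

For example, calling `master_combine` on the results of `worker_summarize` for three workers, one of which exceeds its bound, yields a single `Histogram` whose counts sum to the total number of metric values received. Sharing fixed edges keeps the merge a simple element-wise addition; the claimed approach instead derives intervals from estimated minimum and maximum values.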
Performance evaluation by statistical analysis · CPC title
Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues · CPC title
where the reporting involves data filtering, e.g. pattern matching, time or event triggered, adaptive or policy-based reporting · CPC title
Monitoring · CPC title
for performance assessment · CPC title