Performing automatic map reduce job optimization using a resource supply-demand based approach
US-2017315848-A1 · Nov 2, 2017 · US
US11036552B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11036552-B2 |
| Application number | US-201615334215-A |
| Country | US |
| Kind code | B2 |
| Filing date | Oct 25, 2016 |
| Priority date | Oct 25, 2016 |
| Publication date | Jun 15, 2021 |
| Grant date | Jun 15, 2021 |
Official abstract text for this publication.
A method and an apparatus of allocating available resources in a cluster system with learning models and tuning methods are provided. The learning model may be trained from historic performance data of previously executed jobs and used to project a suggested amount of resources for execution of a job. The tuning process may suggest a configuration for the projected amount of resources in the cluster system for an optimal operating point. An optimization may be performed with respect to a set of objective functions to improve resource utilization and system performance while suggesting the configuration. Through many executions and job characterization, the learning/tuning process for suggesting the configuration for the projected amount of resources may be improved by understanding correlations of historic data and the objective functions.
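The projection step described in the abstract can be illustrated with a minimal sketch. The abstract does not name a specific learning model, so this example substitutes a plain nearest-neighbor average, which is an assumption and not the patented method. The `PastJob` record, the numeric encoding of job characteristics, and the use of core counts as the resource unit are all hypothetical.

```python
from dataclasses import dataclass
from math import dist


@dataclass
class PastJob:
    # Hypothetical record of a previously executed job.
    features: tuple   # numeric job characteristics (assumed encoding)
    cores_used: int   # resources the job was recorded as using


def project_resources(features, history, k=3):
    """Project a resource amount as the mean usage of the k most similar past jobs."""
    if not history:
        raise ValueError("no historic jobs to learn from")
    # Identify similar past jobs by Euclidean distance between characteristics.
    nearest = sorted(history, key=lambda job: dist(features, job.features))[:k]
    # Average their recorded usage and round up to a whole number of cores.
    total = sum(job.cores_used for job in nearest)
    return -(-total // len(nearest))  # ceiling division


# Illustrative history; the values are made up for the example.
history = [
    PastJob((10.0, 2.0), 4),
    PastJob((12.0, 2.0), 6),
    PastJob((11.0, 3.0), 5),
    PastJob((100.0, 8.0), 64),
]
projected = project_resources((11.0, 2.0), history, k=3)
```

A real system would presumably retrain or update the model as new performance data is logged; here that would amount to appending to `history`.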
Opening claim text (preview).
What is claimed is:

1. A method, comprising:
classifying, by a job resource scheduler, a first job in a job queue comprising a plurality of jobs, the first job having a requested amount of resources, according to a set of job characteristics associated with one or more jobs of a plurality of previously executed jobs, the classifying comprising:
receiving job information of the job;
retrieving the job characteristics from the job information; and
identifying the one or more jobs of the previously executed jobs based on a similarity of the job characteristics;
projecting, by the job resource scheduler, an amount of resources based on a result of classifying and correlating the set of job characteristics with a learning model generated by machine learning, the machine learning comprising:
analyzing performance data associated with a respective set of job characteristics of the one or more jobs of the previously executed jobs; and
updating a learning model associated with the one or more jobs of the previously executed jobs according to a result of the analyzing;
suggesting a configuration for the projected amount of resources based on an amount of available resources in a cluster system by a tuning kernel of an optimization with respect to a set of objective functions;
allocating, by the job resource scheduler, the projected amount of resources in the cluster system for execution of the first job based on the suggested configuration;
updating an amount of currently available resources after the allocating;
in response to allocating the projected amount of resources in the cluster system, determining, for a next job in the job queue, whether a requested amount of resources of the next job is greater than an updated available amount of resources, moving another job later in the job queue to the front of the job queue if the another job's requested amount of resources is less than or equal to the updated amount of available resources; and
in response to the job in the front of the job queue having requested amount of resources less than or equal to the updated amount of available resources, classifying, projecting, suggesting, and allocating the job in the front of the job queue.

2. The method of claim 1, wherein the classifying of the job according to the job characteristics associated with the one or more jobs of the previously executed jobs comprises:
receiving job information of the job;
retrieving the job characteristics from job information; and
identifying the one or more jobs of the previously executed jobs based on a similarity of the job characteristics.

3. The method of claim 1, wherein: suggesting the configuration for the amount of resources in the cluster system by a tuning kernel is a result of an optimization with respect to a set of objective functions.

4. A method of scheduling a current job for execution in a cluster system by a job resource scheduler, comprising:
analyzing, by a cluster system management server, the current job in a job queue comprising a plurality of jobs, the current job being submitted with a requested amount of resources to the job resource scheduler, wherein the job resource scheduler has a policy regarding an order of the plurality of jobs in the job queue;
classifying, by the job resource scheduler, the current job in a job queue comprising a plurality of jobs, the current job having a requested amount of resources, according to a set of job characteristics associated with one or more jobs of a plurality of previously executed jobs, the classifying comprising:
receiving job information associated with the current job;
extracting the job characteristics of the current job from the system log based on a similarity of job information; and
classifying the current job by identifying the one or more jobs of the previously executed jobs based on a similarity of the job characteristics;
projecting, by the job resource scheduler, an amount of resources based on a result of classifying and correlating the set of job characteristics with a learning model generated by machine learning, the machine learning comprising:
analyzing performance data associated with a respective set of job characteristics of the one or more jobs of the previously executed jobs;
recording, by the cluster system management server, the configuration, a set of job characteristics, and performance data associated with the current job in a system log; and
updating, by the cluster system management server, the learning model and the tuning kernel based on recorded historic data;
suggesting a configuration for the projected amount of resources based on an amount of available resources in a cluster system by a tuning kernel of an optimization with respect to a set of objective functions the configuration for a projected amount of resources in the cluster system suggested by a learning model and a tuning kernel in the tuning server based on at least one of:
minimizing a job-queue waiting time and execution time; or
maximizing the resource utilization;
allocating, by the job resource scheduler, the projected amount of resources in the cluster system for execution of the current job based on the configuration;
updating an amount of currently available resources after the allocating;
in response to allocating the projected amount of resources in the cluster system, determining, for a next job in the job queue, whether a requested amount of resources of the next job is greater than an updated available amount of resources, moving another job later in the job queue to the front of the job queue if the another job's requested amount of resources is less than or equal to the updated amount of available resources; and
in response to the job in the front of the job queue having requested amount of resources less than or equal to the updated amount of available resources, classifying, projecting, suggesting, and allocating the job in the front of the job queue.

5. The method of claim 4, wherein the analyzing of the current job comprises:
receiving a request for execution of the current job from the job resource scheduler; and
collecting information describing availability of resources for the current job.

6. The method of claim 4, further comprising:
receiving, by the tuning server from the cluster system management server, information describing availability of resources and information describing the job characteristics associated with the current job;
identifying, by the tuning server from the system log, a plurality of previously executed jobs based on a result of analyzing of the current job;
retrieving, by the tuning server from the system log, performance data of the previously executed jobs; and
selecting, by the tuning server, a learning model associated with the previously executed jobs from a plurality of learning models.

7. The method of claim 4, further comprising: suggesting the projected amount of resources by the learning model based on performance data associated with the previously executed jobs in the system log.

8. The method of claim 4, wherein the optimization with respect to the set of objective functions comprises minimizing or maximizing the objective functions with consideration of resources available in the cluster system.

9. The method of claim 4, wherein the updating of the learning model and the tuning kernel comprises: incorporating the performance data and the job characteristics into one or more mathematical models and one or more tuning algorithms respectively.

10. The method of claim 4, further comprising: applying a plurality of software patches in an execution deck of the current job to measure the performance data.
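The queue-handling steps that close claims 1 and 4 (allocate, update the available amount, and promote a later job when the next job requests too much) can be sketched as follows. This is an illustrative reading of the claim language, not the patented implementation. The function `schedule`, the `(name, requested)` job tuples, and the `project` callback standing in for the classify, project, and suggest steps are all hypothetical, and capping the allocation at the available amount is an added assumption.

```python
from collections import deque


def schedule(queue, available, project):
    """Allocate jobs from the front of the queue, promoting a later job that
    fits whenever the job at the front requests more than is available.

    queue:   deque of (name, requested) pairs (hypothetical job encoding).
    project: callable (name, requested) -> projected amount; a stand-in for
             the classify/project/suggest steps.
    Returns (allocations, remaining_available); jobs that never fit stay queued.
    """
    allocations = []
    while queue:
        name, requested = queue[0]
        if requested > available:
            # Front job does not fit: look for the first later job that does.
            fit = next((j for j in list(queue)[1:] if j[1] <= available), None)
            if fit is None:
                break  # nothing fits; wait until resources are released
            queue.remove(fit)
            queue.appendleft(fit)  # move that job to the front of the queue
            continue
        queue.popleft()
        # Assumption: never allocate more than is currently available.
        amount = min(project(name, requested), available)
        allocations.append((name, amount))
        available -= amount  # update the currently available resources
    return allocations, available


# Illustrative run with an identity projection (projected == requested).
queue = deque([("a", 4), ("b", 8), ("c", 2)])
allocations, remaining = schedule(queue, 6, lambda name, requested: requested)
```

In this run job `b` requests more than remains after `a` is allocated, so `c` is moved ahead of it and allocated, and `b` stays in the queue.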
using kernel methods, e.g. support vector machines [SVM] · CPC title
Machine learning · CPC title
Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues · CPC title
to service a request · CPC title