Learning-based resource management in a data center cloud architecture
US-2018255122-A1 · Sep 6, 2018 · US
US10514958B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10514958-B2 |
| Application number | US-201815896911-A |
| Country | US |
| Kind code | B2 |
| Filing date | Feb 14, 2018 |
| Priority date | Feb 14, 2018 |
| Publication date | Dec 24, 2019 |
| Grant date | Dec 24, 2019 |
Official abstract text for this publication.
A device, that provides serverless computing, receives a request to execute multiple jobs, and determines criteria for each of the plurality of jobs, wherein the criteria for each of the multiple jobs includes at least one of job posting criteria, job validation criteria, job retry criteria, or a disaster recovery criteria. The device stores information associated with the multiple jobs in a repository, wherein the information associated with the multiple jobs includes the criteria for each of the multiple jobs. The device provides a particular job, of the multiple jobs, to a cluster computing framework for execution, determines modified criteria for the particular job, and provides the modified criteria for the particular job to the cluster computing framework. The device receives, from the cluster computing framework, information indicating that execution of the particular job is complete, and validates a success of completion of the execution of the particular job.
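The flow in the abstract (determine criteria, store them, provide a job, send modified criteria, validate completion) can be sketched as below. This is a minimal illustration under stated assumptions, not the patented implementation: the names `JobCriteria`, `Repository`, `FakeClusterFramework`, and `run_particular_job`, the field names, and the default values are all hypothetical, and the cluster framework is replaced by an in-memory stand-in that always reports success.

```python
from dataclasses import dataclass, replace
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class JobCriteria:
    # Field names and defaults are illustrative assumptions, not from the patent.
    posting: str = "immediate"
    validation: str = "exit_code_zero"
    max_retries: int = 1
    disaster_recovery: str = "reroute"


class Repository:
    """Hypothetical in-memory store for per-job criteria."""

    def __init__(self) -> None:
        self._criteria: Dict[str, JobCriteria] = {}

    def store(self, job_name: str, criteria: JobCriteria) -> None:
        self._criteria[job_name] = criteria

    def get(self, job_name: str) -> JobCriteria:
        return self._criteria[job_name]


class FakeClusterFramework:
    """Stand-in for a cluster computing framework; it only records requests."""

    def __init__(self) -> None:
        self.log: List[Tuple[str, str, JobCriteria]] = []

    def post_job(self, job_name: str, criteria: JobCriteria) -> None:
        self.log.append(("post", job_name, criteria))

    def update_criteria(self, job_name: str, criteria: JobCriteria) -> None:
        self.log.append(("update", job_name, criteria))

    def completion_info(self, job_name: str) -> Dict[str, object]:
        # A real framework would report actual status; this one always succeeds.
        return {"job": job_name, "complete": True, "exit_code": 0}


def run_particular_job(job_names: List[str], repo: Repository,
                       cluster: FakeClusterFramework) -> bool:
    # Determine and store criteria for every requested job.
    for name in job_names:
        repo.store(name, JobCriteria())
    # Provide one particular job to the framework for execution.
    particular = job_names[0]
    cluster.post_job(particular, repo.get(particular))
    # Determine modified criteria for that job and provide them as well.
    modified = replace(repo.get(particular), max_retries=3)
    repo.store(particular, modified)
    cluster.update_criteria(particular, modified)
    # Receive completion information and validate that the job succeeded.
    info = cluster.completion_info(particular)
    return bool(info["complete"]) and info["exit_code"] == 0


if __name__ == "__main__":
    repo, cluster = Repository(), FakeClusterFramework()
    print(run_particular_job(["etl", "report"], repo, cluster))
```

Keeping the criteria in a frozen dataclass means a modification produces a new value rather than mutating the stored one, which makes it easy to see which version was sent to the framework at each step.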
Opening claim text (preview).
What is claimed is:
1. A device, comprising: one or more memories; and one or more processors, communicatively coupled to the one or more memories, to: receive a request to execute a plurality of jobs; determine criteria for each of the plurality of jobs, the criteria for each of the plurality of jobs including: job execution criteria, job posting criteria, job validation criteria, and job retry criteria; store first information associated with the plurality of jobs in a repository, the first information associated with the plurality of jobs including the criteria for each of the plurality of jobs; provide a particular job, of the plurality of jobs, to a first cluster computing framework for execution, the device being remote from the first cluster computing framework; receive, after providing the particular job, a request for one or more additional jobs to be executed in parallel with the particular job; determine, based on receiving the request for the one or more additional jobs to be executed in parallel with the particular job, modified criteria for the particular job, the modified criteria for the particular job indicating that the particular job is to be executed in parallel with the one or more additional jobs; provide the modified criteria for the particular job and a request to retry execution of the particular job to the first cluster computing framework, the modified criteria for the particular job including: a modified job execution criteria, a modified job posting criteria, a modified job validation criteria, and a modified job retry criteria; receive, from the first cluster computing framework, second information indicating whether execution of the particular job is complete; and perform, when the second information indicates that the execution of the particular job failed, a disaster recovery technique, the disaster recovery technique including at least one of: a provision of a first instruction to the first cluster computing framework to re-route the particular job to another functional cluster of the first cluster computing framework, or a provision of a second instruction to a second cluster computing framework to execute the particular job.
2. The device of claim 1, wherein the one or more processors are further to: cause a job stack to be created for the plurality of jobs; and cause the job stack to be deleted after the plurality of jobs are executed.
3. The device of claim 1, wherein the one or more processors are further to: provide another particular job, of the plurality of jobs, to the first cluster computing framework for execution; receive, from the first cluster computing framework, third information indicating whether execution of the other particular job is complete; and validate a success of a completion of the other particular job based on the third information indicating that the execution of the other particular job is complete.
4. The device of claim 3, wherein the one or more processors, when validating the success of the completion of the other particular job, are to: perform analytics on a result of executing the other particular job.
5. The device of claim 3, wherein the one or more processors, when validating the success of the completion of the other particular job, are to: determine an elapsed time for executing the other particular job.
6. The device of claim 3, wherein the one or more processors are further to: correlate performance metrics of the other particular job with performance metrics of another job.
7. The device of claim 1, wherein the plurality of jobs are interdependent.
8. A method, comprising: receiving, by a device and from a client device, a request to execute a plurality of jobs; determining, by the device, criteria for each of the plurality of jobs, the criteria for each of the plurality of jobs including: job execution criteria, job posting criteria, job validation criteria, and job retry criteria; storing, by the device, first information associated with the plurality of jobs in a repository, the first information associated with the plurality of jobs including the criteria for each of the plurality of jobs; posting, by the device, a particular job, of the plurality of jobs, to a first cluster computing framework for execution; receiving, by the device and after posting the particular job, a request for one or more additional jobs to be executed in parallel with the particular job; determining, by the device and based on receiving the request for the one or more additional jobs to be executed in parallel with the particular job, modified criteria for the particular job, the modified criteria for the particular job indicating that the particular job is to be executed in parallel with the one or more additional jobs, and the modified criteria for the particular job including: a modified job execution criteria, a modified job posting criteria, a modified job validation criteria, and a modified job retry criteria; providing, by the device, the modified criteria for the particular job and a request to retry execution of the particular job to the first cluster computing framework; receiving, by the device and from the first cluster computing framework, second information indicating whether execution of the particular job failed; and performing, by the device, a disaster recovery technique for the particular job when the second information indicates that the execution of the particular job failed, performing the disaster recovery technique including at least one of: instructing the first cluster computing framework to re-route the particular job to another functional cluster of the first cluster computing framework, or instructing a second cluster computing framework to execute the particular job.
9. The method of claim 8, wherein the first information associated with the plurality of jobs further includes at least one of: information associated with the first cluster computing framework, information indicating names of the plurality of jobs, or information indicating execution criteria for the plurality of jobs.
10. The method of claim 8, further comprising: determining that a first functional cluster of the first cluster computing framework is non-operational; determining a second functional cluster of the first cluster computing framework that is operational; and wherein performing the disaster recovery technique further includes: instructing the first cluster computing framework to re-route the particular job to the second functional cluster.
11. The method of claim 8, further comprising: causing a job stack to be created for the plurality of jobs; and causing the job stack to be deleted after the plurality of jobs are executed.
12. The method of claim 8, further comprising: posting another particular job, of the plurality of jobs, to the first cluster computing framework for execution; determining modified criteria for the other particular job; providing the modified criteria for the other particular job to the first cluster computing framework; receiving, from the first cluster computing framework, third information indicating that execution of the other particular job is complete; and validating a success of a completion of the execution of the other particular job.
13. The method of claim 8, further comprising: providing, to the client device, fourth information indicating that the other particular job successfully executed after validating the success of the completion of the execution of the other particular job.
14
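The disaster recovery technique recited in claims 1, 8, and 10 can be illustrated with a small selection routine. This is a hedged sketch only: `ClusterFramework` and `plan_recovery` are hypothetical names, the operational flags are supplied by hand, and the preference for re-routing within the first framework before falling back to a second framework is an assumption (the claims only require at least one of the two options).

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class ClusterFramework:
    # Hypothetical model: a framework name plus cluster id -> operational flag.
    name: str
    clusters: Dict[str, bool]


def plan_recovery(failed_cluster: str,
                  first: ClusterFramework,
                  second: ClusterFramework) -> Optional[Tuple[str, str]]:
    """Pick where to run a failed job; returns (framework name, cluster id)."""
    # Option 1: re-route to another functional cluster of the first framework.
    for cluster_id, operational in first.clusters.items():
        if operational and cluster_id != failed_cluster:
            return (first.name, cluster_id)
    # Option 2: instruct a second framework to execute the job.
    for cluster_id, operational in second.clusters.items():
        if operational:
            return (second.name, cluster_id)
    # No functional cluster is available anywhere.
    return None


if __name__ == "__main__":
    primary = ClusterFramework("primary", {"a": False, "b": True})
    backup = ClusterFramework("backup", {"x": True})
    print(plan_recovery("a", primary, backup))
```

Returning `None` when nothing is operational leaves the caller to decide whether to retry later or report the failure to the client device.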
the resource being a machine, e.g. CPUs, Servers, Terminals · CPC title
Execution paradigms, e.g. implementations of programming paradigms · CPC title
considering the execution order of a plurality of tasks, e.g. taking priority or time dependency constraints into consideration (scheduling strategies G06F9/4881 and subgroups) · CPC title
Algorithms for mapping a plurality of inter-dependent sub-tasks onto a plurality of physical CPUs (mapping at compile time, see G06F8/451) · CPC title
Grid computing · CPC title