What technology area does this patent fall under?

Primary CPC classification G06F9/5061. Mapped technology areas include Physics.

When was this patent published?

Publication date: Nov 12, 2019 (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Serverless execution of code using cluster resources

US10474501B2 · US · B2

Patent metadata
Field	Value
Publication number	US-10474501-B2
Application number	US-201715581987-A
Country	US
Kind code	B2
Filing date	Apr 28, 2017
Priority date	Apr 28, 2017
Publication date	Nov 12, 2019
Grant date	Nov 12, 2019

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A system for cluster resource allocation includes an interface and a processor. The interface is configured to receive a process and input data. The processor is configured to determine an estimate for resources required for the process to process the input data; determine existing available resources in a cluster for running the process; determine whether the existing available resources are sufficient for running the process; in the event it is determined that the existing available resources are not sufficient for running the process, indicate to add new resources; determine an allocated share of resources in the cluster for running the process; and cause execution of the process using the share of resources.
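The sequence in the abstract (estimate, check availability, add resources if short, allocate a share) can be illustrated with a minimal sketch. This is not the patented implementation: the `Cluster` class, the `plan_allocation` function, the abstract "resource units", and the `max_share` cap are all hypothetical names and simplifications introduced here for illustration.

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    """Hypothetical cluster model; capacity is counted in abstract resource units."""
    total_units: int
    allocated_units: int = 0

    def available_units(self) -> int:
        # Treat "available" as units not allocated to other processes.
        return self.total_units - self.allocated_units


def plan_allocation(cluster: Cluster, estimated_units: int, max_share: int) -> dict:
    """Return how many units to add and what share to grant the process."""
    available = cluster.available_units()
    # If the existing resources are insufficient, indicate how many to add.
    units_to_add = max(0, estimated_units - available)
    # Assumed policy: grant the estimate, capped at a per-process maximum.
    share = min(estimated_units, max_share)
    return {"units_to_add": units_to_add, "share": share}


plan = plan_allocation(
    Cluster(total_units=10, allocated_units=7),
    estimated_units=5,
    max_share=8,
)
print(plan)  # {'units_to_add': 2, 'share': 5}
```

In this example the cluster has 3 free units against an estimate of 5, so the plan asks for 2 more units before the process runs with a share of 5.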

First claim

Opening claim text (preview).

What is claimed is:
1. A system for cluster resource allocation, comprising: an interface configured to: receive a process and input data; and a processor configured to: determine an estimate for resources required for the process to process the input data, comprising to: determine a parallelism level for one or more steps of the process, the parallelism level including a quantity of data per cluster worker, a number of cluster workers, or a quantity of data per processor; determine an implementation type for one or more steps of the process, the implementation type including a sort algorithm, a join algorithm, or a filter algorithm; and determine, based on historical usage data, the estimate for resources required for the process using the parallelism level and the implementation type; determine existing available resources in a cluster for running the process; determine whether the existing available resources are sufficient for running the process based at least in part on the estimate for resources required; in response to determining that the existing available resources are not sufficient for running the process, indicate to add new resources to a cluster; determine an allocated share of resources in the cluster for running the process; cause execution of the process using the allocated share of resources, comprising to: provide an indication of the share of resources, the process, and the input data to a cluster driver; and provide an indication to the cluster driver to execute the process of the input data using the share of resources.
2. The system of claim 1, wherein determining existing available resources in a cluster for running the process comprises determining resources that are not allocated to other processes.
3. The system of claim 1, wherein determining existing available resources in a cluster for running the process comprises determining resources that are not currently being used by other processes.
4. The system of claim 1, wherein determining existing available resources in a cluster for running the process comprises determining the estimated required resources for other processes running in the cluster.
5. The system of claim 1, wherein the estimate for resources required for the process to process the input data is based at least in part on one or more of the following: the process, the input data, or the historical usage data.
6. The system of claim 1, wherein the processor is further configured to: in response to determining that the existing available resources are not sufficient for running the process, indicate to add new resources if possible.
7. The system of claim 1, wherein the processor is further configured to: in response to determining that the existing available resources are more than sufficient for running the process, indicate to remove resources.
8. The system of claim 1, wherein the process comprises an associated scheduled running time.
9. The system of claim 8, wherein the processor is further configured to: in response to determining that the existing available resources are not sufficient for running the process, indicate to add new resources in advance of the scheduled running time.
10. The system of claim 1, wherein the allocated share of resources in the cluster comprises one or more of the following: a share determined by dividing the cluster resources evenly among users, a share determined by dividing the cluster resources according to a predetermined user allocation, an equal share for each running process, or a maximum amount of cluster resources.
11. The system of claim 1, wherein the allocated share of resources comprises one or more of the following: a number of worker machines, a number of processors, a number of processes, an amount of memory, a number of disks, a number of virtual machines, or a number of containers.
12. The system of claim 1, wherein the allocated share of resources comprises one of the following: an amount of resources, a fraction of resources, or a priority.
13. The system of claim 1, wherein the processor is further configured to: determine that the share of resources being used is greater than the allocated share of resources.
14. The system of claim 13, wherein the processor is further configured to: reduce a second process share of resources associated with a second process.
15. The system of claim 14, wherein the second process share of resources associated with the second process is not reduced below a threshold share.
16. The system of claim 13, wherein the processor is further configured to: indicate to add resources.
17. The system of claim 16, wherein resources are limited to a customer maximum allocation.
18. The system of claim 1 wherein the driver is shared between users of a customer.
19. The system of claim 1 wherein the driver is shared between users of different customers.
20. The system of claim 1, wherein the threshold comprises one or more of a percentage of the allocated resources or a number of resources.
21. A method for cluster resource allocation, comprising: receiving a process and input data; determining, using a processor, an estimate for resources required for the process to process the input data, comprising: determining a parallelism level for one or more steps of the process, the parallelism level including a quantity of data per cluster worker, a number of cluster workers, or a quantity of data per processor; determining an implementation type for one or more steps of the process, the implementation type including a sort algorithm a join algorithm or a filter algorithm; and determining, based on historical usage data, the estimate for resources required for the process using the parallelism level and the implementation type; determining existing available resources in a cluster for running the process; determining whether the existing available resources are sufficient for running the process based at least in part on the estimate for resources required; in response to determining that the existing available resources are not sufficient for running the process, indicating to add new resources to a cluster; determining an allocated share of resources in the cluster for running the process; causing execution of the process using the allocated share of resources, comprising: providing an indication of the share of resources, the process, and the input data to a cluster driver; and providing an indication to the cluster driver to execute the process of the input data using the share of resources.
22. A computer program product for cluster resource allocation, the computer program product being embodied in a non-transitory computer readable storage medium and comprising computer instructions for: receiving a process and input data; determining an estimate for resources required for the process to process the input data, comprising: determining a parallelism level for one or more steps of the process, the parallelism level including a quantity of data per cluster worker, a number of cluster workers, or a quantity of data per processor; determining an implementation type for one or more steps of the process, the implementation type including a sort algorithm a join algorithm or a filter algorithm; and determining, based on historical usage data, the estimate for resources required for the process using the parallelism level and the implementation type; determining existing available resource
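Two ideas from the claims lend themselves to a small illustration: estimating resources from a parallelism level and an implementation type using historical usage data (claim 1), and reducing a second process's share without going below a threshold (claims 14 and 15). The sketch below is only one possible reading. The `HISTORY` table, its numbers, the linear scaling model, and both function names are hypothetical and do not come from the patent.

```python
from statistics import mean

# Hypothetical historical usage data: observed memory (GB) needed per GB of
# input, keyed by the implementation type of a step. Values are made up.
HISTORY = {
    "sort": [2.0, 3.0, 2.5],
    "join": [4.0, 5.0],
    "filter": [1.0, 1.0],
}


def estimate_memory_gb(implementation: str, data_gb_per_worker: float, workers: int) -> float:
    """Estimate memory for one step, assuming usage scales linearly with input."""
    per_gb = mean(HISTORY[implementation])  # average of past observations
    # Parallelism level here is (data per worker, number of workers).
    return per_gb * data_gb_per_worker * workers


def reduce_share(current_share: float, reduction: float, threshold: float) -> float:
    """Reduce a second process's share, but never below the threshold share."""
    return max(threshold, current_share - reduction)


print(estimate_memory_gb("sort", data_gb_per_worker=4.0, workers=3))  # 30.0
print(reduce_share(current_share=10.0, reduction=8.0, threshold=4.0))  # 4.0
```

A real estimator would likely account for more than one step and for non-linear behavior (for example, a join's cost depending on both inputs); the linear model is used here only to keep the example short.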

Assignees

Databricks Inc

Inventors

Classifications

Code	CPC title
G06F2209/5011	Pool
G06F9/5061 (Primary)	Partitioning or combining of resources
G06F2209/505	Clust
G06F9/5005 (Primary)	to service a request

Patent family

Related publications grouped by family.

View patent family 63917222

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10474501B2 cover?: A system for cluster resource allocation includes an interface and a processor. The interface is configured to receive a process and input data. The processor is configured to determine an estimate for resources required for the process to process the input data; determine existing available resources in a cluster for running the process; determine whether the existing available resources are s…
Who is the assignee on this patent?: Databricks Inc