Scheduling applications in CPU and GPU hybrid environments

US11221876B2 · US · B2

Patent metadata
FieldValue
Publication numberUS-11221876-B2
Application numberUS-201816236546-A
CountryUS
Kind codeB2
Filing dateDec 30, 2018
Priority dateDec 30, 2018
Publication dateJan 11, 2022
Grant dateJan 11, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method may include receiving instructions to process a first application in response to a user request. The method also includes determining whether to store the first application in a first processing queue or a second processing queue based on a comparison between a CPU processing cost associated with the first application and a GPU processing cost associated with the first application. Further, the method includes grouping a first set of applications stored in the first processing queue according to CPU grouping criteria and grouping a second set of applications stored in the second processing queue according to GPU batching criteria. The method also includes causing a CPU to process the grouped first set of applications and a plurality of GPUs to process the grouped second set of applications.

First claim

Opening claim text (preview).

What is claimed is: 1. A system, comprising: a plurality of central processing units (CPUs), each of the plurality of CPUs having a plurality of CPU cores; a plurality of graphical processing units (GPUs); a non-transitory memory storing computer-executable instructions; and one or more hardware processors coupled to the non-transitory memory and configured to execute the computer-executable instructions to cause the system to perform operations comprising: receiving a request to process one or more applications; for each application in the one or more applications: determining a respective CPU cost corresponding to processing of the application by one of the plurality of CPUs and a respective GPU cost corresponding to processing of the application by one of the plurality of GPUs; and based on comparing the respective CPU cost and the respective GPU cost, determining whether to store the application in a first processing queue associated with the plurality of CPUs or in a second processing queue associated with the plurality of GPUs; grouping a first set of applications stored in the first processing queue into one or more CPU task groups, wherein the grouping the first set of applications minimizes a difference between a first total processing cost of a first task group of the one or more CPU task groups and a second total processing cost of a second task group of the one or more CPU task groups; causing the plurality of CPUs to process the one or more CPU task groups; grouping a second set of applications stored in the second processing queue into one or more GPU batches according to GPU batching criteria; and causing the plurality of GPUs to process the one or more GPU batches. 2. The system of claim 1 , wherein the respective CPU cost corresponds to a first amount of time associated with processing the application by one of the plurality of CPUs, wherein the respective GPU cost corresponds to a second amount of time associated with processing the application by one of the plurality of GPUs, and wherein the operations further comprise: receiving information related to an actual processing cost of the processing of the application; and adjusting one or more of the respective CPU cost or the respective GPU cost based on the information related to the actual processing cost. 3. The system of claim 1 , wherein the determining whether to store the application in the first processing queue or in the second processing queue further comprises: determining a first CPU cost and a first GPU cost associated with a first application of the one or more applications; and in response to determining that the first CPU cost is less than the first GPU cost, storing the first application in the first processing queue. 4. The system of claim 1 , wherein the determining whether to store the application in the first processing queue or in the second processing queue further comprises: determining a first CPU cost and a first GPU cost associated with a first application of the one or more applications; and in response to determining that the first GPU cost is less than the first CPU cost, storing the first application in the second processing queue. 5. The system of claim 1 , wherein the first processing queue includes applications that were previously assigned to the first processing queue based on one or more previous requests. 6. The system of claim 1 , wherein the second processing queue includes applications that were previously assigned to the second processing queue based on one or more previous requests. 7. The system of claim 1 , wherein the grouping the first set of applications into the one or more CPU task groups further comprises: determining a number of threads included in the first processing queue; and dividing the number of threads by a number of CPUs in the plurality of CPUs. 8. The system of claim 1 , wherein the grouping the second set of applications according to the GPU batching criteria further comprises: determining that first applications from the second set of applications are of a first application type; grouping the first applications into a first GPU batch of the one or more GPU batches; determining that second applications from the second set of applications are of a second application type that is different from the first application type; and grouping the second applications into a second GPU batch of the one or more GPU batches. 9. The system of claim 8 , wherein the causing the plurality of GPUs to process the one or more GPU batches further comprises: causing a first GPU of the plurality of GPUs to process a first GPU batch of the one or more GPU batches; and causing a second GPU of the plurality of GPUs to process a second GPU batch of the one or more GPU batches, the first GPU being different from the second GPU. 10. A method, comprising: assigning, by a system comprising one or more hardware processors, a first application to a central processing unit (CPU) processing queue based on a first CPU processing cost corresponding to a first application identifier of the first application being less than a first graphical processing unit (GPU) processing cost corresponding to the first application identifier; assigning, by the system, a second application to a GPU processing queue based on a second GPU processing cost corresponding to a second application identifier of the second application being less than a second CPU processing cost corresponding to the second application identifier; grouping, by the system, a first set of applications assigned to the CPU processing queue into one or more CPU task groups, wherein the grouping minimizes a difference between a first total processing cost of a first task group of the one or more CPU task groups and a second total processing cost of a second task group of the one or more CPU task groups; grouping, by the system, the second application with a third application assigned to the GPU processing queue in a batch based on determining that the second application identifier matches a third application identifier associated with the third application; causing, by the system, a first GPU of a plurality of GPUs to process the batch; and in response to determining an amount of time for the first GPU to process the batch, determining, by the system, whether to adjust the second GPU processing cost corresponding to the second application identifier. 11. The method of claim 10 , wherein the second GPU processing cost includes a previously determined amount of time taken for at least one of the plurality of GPUs to process data of an application associated with the second application identifier. 12. The method of claim 11 , wherein the determining whether to adjust the second GPU processing cost further comprises: determining whether the amount of time for the first GPU to process the batch is different from the previously determined amount of time. 13. The method of claim 10 , further comprising: causing a CPU to process first data associated with the first application; and in response to determining a second amount of time for the CPU to process the first data, determining whether to adjusting the first CPU processing cost corresponding to the first application identifier. 14. The method of claim 13 , wherein the first CPU processing cost includes a second previously determined amount of time taken for the CPU to process data of an application associated with the first application identifier, wherein the determining whether to adjust the first CPU processing cost the comprises: determining whether the second amount of time is different from t

Assignees

Inventors

Classifications

  • Combinations of networks · CPC title

  • by program, e.g. task dispatcher, supervisor, operating system · CPC title

  • G06T1/20Primary

    Processor architectures; Processor configuration, e.g. pipelining · CPC title

  • G06F9/4881Primary

    Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues · CPC title

  • the resource being a machine, e.g. CPUs, Servers, Terminals · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11221876B2 cover?
A method may include receiving instructions to process a first application in response to a user request. The method also includes determining whether to store the first application in a first processing queue or a second processing queue based on a comparison between a CPU processing cost associated with the first application and a GPU processing cost associated with the first application. Fur…
Who is the assignee on this patent?
Paypal Inc
What technology area does this patent fall under?
Primary CPC classification G06T1/20. Mapped technology areas include Physics.
When was this patent published?
Publication date Tue Jan 11 2022 00:00:00 GMT+0000 (Coordinated Universal Time) (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 7 related publications on this page (citations in our corpus or others sharing the same primary CPC).