Resource assignment for jobs in a system having a processing pipeline that satisfies a data freshness query constraint

US9389913B2 · US · B2

Patent metadata
Publication number: US-9389913-B2
Application number: US-201013383594-A
Country: US
Kind code: B2
Filing date: Jul 8, 2010
Priority date: Jul 8, 2010
Publication date: Jul 12, 2016
Grant date: Jul 12, 2016

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A set of jobs to be scheduled is identified (402) in a system including a processing pipeline having plural processing stages that apply corresponding different processing to a data update to allow the data update to be stored. The set of jobs is based on one or both of the data update and a query that is to access data in the system. The set of jobs is scheduled (404) by assigning resources to perform the set of jobs, where assigning the resources is subject to at least one constraint selected from at least one constraint associated with the data update and at least one constraint associated with the query.
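The two steps in the abstract (identify jobs, then schedule them by assigning resources under constraints) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the `Job` fields, the durations, the worker counts, and the greedy strategy are hypothetical and are not taken from the patent. Only the stage names follow claim 9. The sketch honors one precedence constraint (pipeline stages run in order) and one resource constraint (a fixed worker pool).

```python
from dataclasses import dataclass, field

# Stage names follow claim 9; the ordering is the pipeline order.
STAGES = ["ingest", "id_remap", "sort", "merge"]


@dataclass
class Job:
    name: str
    duration: int  # hypothetical execution-time estimate, in ticks
    workers: int   # hypothetical resource demand
    after: list = field(default_factory=list)  # precedence constraint


def identify_jobs(has_update: bool, has_query: bool) -> list:
    """Build the job set from a data update and/or a query (step 402)."""
    jobs = []
    if has_update:
        prev = None
        for stage in STAGES:
            # Each stage job must wait for the previous stage to finish.
            jobs.append(Job(stage, duration=2, workers=1,
                            after=[prev] if prev else []))
            prev = stage
    if has_query:
        # Assumption: the query job does not wait on the update jobs.
        jobs.append(Job("query", duration=1, workers=1))
    return jobs


def schedule(jobs: list, total_workers: int) -> dict:
    """Greedy scheduling (step 404): start each job at the earliest tick
    that satisfies its precedence and the worker-pool capacity.
    Jobs must be listed in dependency order."""
    end = {}
    plan = {}
    usage = {}  # tick -> workers already in use
    for job in jobs:
        if job.workers > total_workers:
            raise ValueError(f"{job.name} needs more workers than exist")
        t = max((end[dep] for dep in job.after), default=0)
        while any(usage.get(t + k, 0) + job.workers > total_workers
                  for k in range(job.duration)):
            t += 1
        for k in range(job.duration):
            usage[t + k] = usage.get(t + k, 0) + job.workers
        end[job.name] = t + job.duration
        plan[job.name] = (t, end[job.name])
    return plan


if __name__ == "__main__":
    for name, (start, stop) in schedule(identify_jobs(True, True), 2).items():
        print(f"{name}: ticks {start}-{stop}")
```

With two workers the query job can run alongside the ingest stage; with one worker it has to wait until the whole pipeline has drained, which is the kind of trade-off a scheduler subject to both update and query constraints has to weigh.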

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: receiving, by at least one processor, a query having at least one query constraint comprising a freshness constraint specifying how up-to-date data in a response to the query should be; identifying, by the at least one processor, a set of jobs to be scheduled in a system including a processing pipeline having plural processing stages that are to apply corresponding different processing to a data update to allow the data update to be stored, wherein the set of jobs is based on the data update and the query that requests access of data in the system, wherein the freshness constraint causes the set of jobs to access data of selected processing stages of the plural processing stages of the processing pipeline, the selected processing stages based on the freshness constraint; and scheduling, by the at least one processor, the set of jobs by assigning resources to perform the set of jobs, wherein assigning the resources is subject to at least one constraint associated with the data update and the at least one query constraint of the query, the resources assigned being based on the selected processing stages.

2. The method of claim 1, wherein assigning the resources comprises: assigning resources to at least one of the processing stages of the processing pipeline; and assigning resources to a query processing engine, wherein the query processing engine performs query processing in response to the query.

3. The method of claim 1, wherein assigning the resources to perform the set of jobs is based on the freshness constraint included in the query, the set of jobs comprising processing of the query using a portion of the assigned resources.

4. The method of claim 1, wherein the at least one constraint associated with the data update is selected from among an input data constraint relating to reading input data from one or more different computer nodes, a precedence constraint, an execution time constraint, and a resource constraint.

5. The method of claim 1, wherein assigning the resources comprises making decisions selected from the group consisting of: determining a degree of parallelism used for a given job; determining specific ones of the resources to allocate to the given job; determining a fraction of each of the resources to allocate to the given job; determining the given job's start time; and determining the given job's end time.

6. The method of claim 1, further comprising performing the data update in the processing pipeline that has a stage to transform the data update to allow content of the data update to be stored into a database.

7. The method of claim 6, wherein transforming the data update comprises at least one selected from among remapping identifiers of the data update, sorting the data update, and merging the data update.

8. The method of claim 1, wherein the system comprises the at least one processor, and wherein the identifying and the scheduling are performed by a resource allocation and scheduling mechanism in the system.

9. A computer system comprising: at least one central processing unit (CPU); and a scheduling module executable on the at least one CPU to: receive a query having at least one query constraint comprising a freshness constraint specifying how up-to-date data in a response to the query should be, wherein the query is for execution in a system having a processing pipeline including plural processing stages to process a data update, the plural processing stages selected from among: an ingest stage, an identifier remapping stage, a sorting stage, and a merging stage; identify a set of jobs to be scheduled based on the received query and the data update to be processed by the processing pipeline, wherein the freshness constraint causes the set of jobs to access data of selected processing stages of the plural processing stages of the processing pipeline, the selected processing stages based on the freshness constraint; and assign resources of the system having the processing pipeline to the set of jobs according to the at least one query constraint and at least one constraint associated with the data update, the resources assigned being based on the selected processing stages.

10. The method of claim 3, wherein the at least one query constraint of the query further comprises at least one selected from among a query performance goal, an input data constraint relating to reading input data from one or more different computer nodes, a precedence constraint, an execution time constraint, and a resource constraint.

11. The method of claim 3, wherein different levels of the freshness constraint included in the query cause the processing of the query to access different respective combinations of the plural processing stages.

12. The method of claim 3, wherein the query further includes a response time constraint specifying a target response time of the response to the query, wherein assigning the resources to perform the set of jobs is further based on the response time constraint.

13. The computer system of claim 9, wherein different levels of the freshness constraint included in the query cause the processing of the query to access different respective combinations of the plural processing stages.

14. The computer system of claim 9, wherein the query further includes a response time constraint specifying a target response time of the response to the query, wherein the assigning of the resources to the set of jobs is further based on the response time constraint.

15. An article comprising at least one non-transitory computer-readable storage medium storing instructions that upon execution cause a computer system to: receive a query having at least one query constraint comprising a freshness constraint specifying how up-to-date data in a response to the query should be; identify a set of jobs to be scheduled in a system that has a processing pipeline to process a data update received at the processing pipeline, wherein the processing pipeline has plural processing stages that apply corresponding different processing to the data update to allow the data update to be stored, wherein the set of jobs identified is based on the data update and the query, wherein the freshness constraint causes the set of jobs to access data of selected processing stages of the plural processing stages of the processing pipeline, the selected processing stages based on the freshness constraint; and allocate resources to perform the set of jobs that is based on a constraint associated with the data update and the at least one query constraint of the query, the resources assigned being based on the selected processing stages.

16. The article of claim 15, wherein the constraint associated with the data update comprises at least one selected from among an input data constraint relating to reading input data from one or more different computer nodes, a precedence constraint, an execution time constraint, and a resource constraint.

17. The article of claim 15, wherein allocating the resources comprises employing fitness tests that measure performance of different jobs, and wherein employing the fitness tests determines a level of parallelism to use for query processing and for each of the plural processing stages.

18. The article of claim 15, wherein different levels of the freshness constraint included in the query cause the processing of the query to access different respective combinations of the plural processing stages.
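Claims 1, 11, 13, and 18 state that the freshness constraint determines which processing stages a query reads from, and that the assigned resources depend on that selection. The claims do not say how the mapping is computed, so the following is only a sketch under stated assumptions: the per-stage lag values, the `select_stages` and `assign_query_workers` names, and the workers-per-stage rule are all hypothetical. The idea illustrated is that a loose freshness bound lets the query read only fully merged data, while a tight bound forces it to also read the outputs of earlier, less processed stages.

```python
# Stages ordered from most processed to least processed (names from claim 9).
# Each lag is an invented figure: how stale the answer may be, in seconds,
# if the query reads this stage's output and every stage listed before it.
STAGE_LAG_SECONDS = [
    ("merge", 3600),
    ("sort", 600),
    ("id_remap", 60),
    ("ingest", 5),
]


def select_stages(freshness_seconds: float) -> list:
    """Return the smallest set of stages whose data meets the freshness bound."""
    selected = []
    for stage, lag in STAGE_LAG_SECONDS:
        selected.append(stage)
        if lag <= freshness_seconds:
            return selected
    raise ValueError("freshness bound is tighter than the pipeline can provide")


def assign_query_workers(selected: list,
                         workers_per_stage: int = 2,
                         max_workers: int = 8) -> int:
    """Hypothetical rule: reading more stages costs more, so give the query
    more workers, capped by the pool size."""
    return min(max_workers, workers_per_stage * len(selected))


if __name__ == "__main__":
    for bound in (7200, 120, 5):
        stages = select_stages(bound)
        print(bound, stages, assign_query_workers(stages))
```

A real scheduler would presumably weigh this against the update-side constraints listed in claim 4 and the response time constraint of claims 12 and 14; the fixed workers-per-stage rule above merely stands in for that decision.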

Assignees

Inventors

Classifications

  • G06F9/5011 · Primary

    the resources being hardware resources other than CPUs, Servers and Terminals · CPC title

  • of structured data, e.g. relational data · CPC title

  • G06F9/50 · Primary

    Allocation of resources, e.g. of the central processing unit [CPU] · CPC title

  • Physics · mapped topic

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9389913B2 cover?
A set of jobs to be scheduled is identified (402) in a system including a processing pipeline having plural processing stages that apply corresponding different processing to a data update to allow the data update to be stored. The set of jobs is based on one or both of the data update and a query that is to access data in the system. The set of jobs is scheduled (404) by assigning resource…
Who is the assignee on this patent?
Keeton Kimberly, Morrey III Charles B, Soules Craig A, and 2 more
What technology area does this patent fall under?
Primary CPC classification G06F9/5011. Mapped technology areas include Physics.
When was this patent published?
Publication date Jul 12, 2016 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).