Job-processing systems and methods with inferred dependencies between jobs

US10372492B2 · US · B2

Patent metadata
Publication number: US-10372492-B2
Application number: US-201314103671-A
Country: US
Kind code: B2
Filing date: Dec 11, 2013
Priority date: Dec 11, 2013
Publication date: Aug 6, 2019
Grant date: Aug 6, 2019

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An analytics system that executes processing jobs infers dependencies between jobs to be executed based on identification of dependencies between a “sink” job and a source data object on which the sink job depends. Given a job definition for the sink job that identifies a source data object, the system can identify a “source” job that produces the source data object and can infer a dependency of the sink job on the source job. The system can schedule executions of the source and sink jobs such that the source job completes (or completes generation of the source data object) before the sink job is launched.
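
The inference described in the abstract can be illustrated with a small sketch. This is not the patented implementation; it assumes a hypothetical `JobDefinition` record holding the data objects a job reads and the one it writes, and it uses Python's standard `graphlib` module to order jobs so that each source job precedes its sink job. All job and data object names are made up for illustration.

```python
from dataclasses import dataclass
from graphlib import TopologicalSorter


@dataclass(frozen=True)
class JobDefinition:
    """Hypothetical job definition: what the job reads and what it writes."""
    name: str
    sources: tuple[str, ...]  # source data objects consumed by the job
    output: str               # output data object produced by the job


def infer_dependencies(definitions):
    """Map each sink job name to the names of the source jobs it depends on."""
    # Index jobs by the data object they produce.
    producers = {}
    for d in definitions:
        producers.setdefault(d.output, set()).add(d.name)

    deps = {}
    for d in definitions:
        upstream = set()
        for src in d.sources:
            # A job that produces one of our source objects is a source job.
            upstream |= producers.get(src, set())
        upstream.discard(d.name)  # ignore self-references
        deps[d.name] = upstream
    return deps


def schedule_order(deps):
    """Return an execution order in which source jobs come before sink jobs."""
    return list(TopologicalSorter(deps).static_order())


definitions = [
    JobDefinition("daily_report", ("clean_events",), "report"),
    JobDefinition("clean", ("raw_events",), "clean_events"),
]
deps = infer_dependencies(definitions)
order = schedule_order(deps)
```

Here `daily_report` never names `clean` directly; the dependency follows only from the shared data object `clean_events`, and the resulting order places `clean` first.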

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method comprising: receiving, at a computer system, via graphical user interface controls of a job creation graphical user interface, a plurality of job definitions for a plurality of jobs, each job definition of the plurality of job definitions specifying at least a source data object to provide input data to be consumed by the job, and an output data object to store output data produced by the job; wherein the plurality of job definitions includes a first job definition and a second job definition; wherein the plurality of jobs includes a first job and a second job; wherein the first job definition is for the first job; wherein the second job definition is for the second job; automatically inferring, by the computer system, a first dependency of the first job on the second job; wherein the automatically inferring the first dependency is based at least in part on automatically determining that the source data object specified in the first job definition is the output data object specified in the second job definition; based at least in part on the automatically inferring the first dependency, dispatching, by the computer system, a first plurality of job instances for execution, wherein the dispatching the first plurality of job instances is controlled such that a first instance of the first job, of the first plurality of job instances, is blocked from being dispatched until after a first instance of the second job, of the first plurality of job instances, completes; after the dispatching the first plurality of job instances for execution, receiving, at the computer system, via graphical user interface controls of the job creation graphical user interface, a third job definition; wherein the third job definition is for a third job; wherein the third job definition specifies at least a source data object to provide input data to be consumed by the third job, and an output data object to store output data produced by the third job; based at least on the third job definition, automatically inferring, by the computer system, a second dependency; wherein the second dependency is the first job on the third job; wherein the second dependency is automatically inferred based at least in part on automatically determining that the source data object specified in the first job definition is the output data object specified in the third job definition; and based at least in part on the automatically inferring the first dependency and the automatically inferring the second dependency, dispatching, by the computer system, a second plurality of job instances for execution, wherein the dispatching the second plurality of job instances is controlled such that a second instance of the first job, of the second plurality of job instances, is blocked from being dispatched until after both: (a) a second instance of the second job, of the second plurality of job instances, completes, and (b) a first instance of the third job, of the second plurality of job instances, completes.

2. The computer-implemented method of claim 1 wherein at least one job definition of the plurality of job definitions specifies whether a dependency on the source data object, of the at least one job definition, is an interval dependency such that the source data object, of the at least one job definition, contains data associated with a specified time interval or a snapshot dependency such that the source data object, of the at least one job definition, contains a snapshot of data reflective of a current condition at a specified snapshot time.

3. The computer-implemented method of claim 1 wherein at least one job definition of the plurality of job definitions specifies whether the output data object comprises a snapshot of data or data for a time interval.

4. The computer-implemented method of claim 1 wherein at least one job definition of the plurality of job definitions specifies that the output data object comprises data for a time interval, the at least one job definition of the plurality of job definitions further specifies a duration of the time interval.

5. The computer-implemented method of claim 1, wherein a database job definition, of the plurality of job definitions, of a database job, of the plurality of jobs, specifies the source data object to provide input data to be consumed by the database job, as part of a database query.

6. One or more non-transitory computer-readable media storing one or more programs for execution by one or more processing units, the one or more programs comprising instructions configured for: receiving, at a computer system, via graphical user interface controls of a job creation graphical user interface, a plurality of job definitions for a plurality of jobs, each job definition of the plurality of job definitions specifying at least a source data object to provide input data to be consumed by the job, and an output data object to store output data produced by the job; wherein the plurality of job definitions includes a first job definition and a second job definition; wherein the plurality of jobs includes a first job and a second job; wherein the first job definition is for the first job; wherein the second job definition is for the second job; automatically inferring, by the computer system, a first dependency of the first job on the second job; wherein the automatically inferring the first dependency is based at least in part on automatically determining that the source data object specified in the first job definition is the output data object specified in the second job definition; based at least in part on the automatically inferring the first dependency, dispatching, by the computer system, a first plurality of job instances for execution, wherein the dispatching the first plurality of job instances is controlled such that a first instance of the first job, of the first plurality of job instances, is blocked from being dispatched until after a first instance of the second job, of the first plurality of job instances, completes; after the dispatching the first plurality of job instances for execution, receiving, at the computer system, via graphical user interface controls of the job creation graphical user interface, a third job definition; wherein the third job definition is for a third job; wherein the third job definition specifies at least a source data object to provide input data to be consumed by the third job, and an output data object to store output data produced by the third job; based at least on the third job definition, automatically inferring, by the computer system, a second dependency; wherein the second dependency is the first job on the third job; wherein the second dependency is automatically inferred based at least in part on automatically determining that the source data object specified in the first job definition is the output data object specified in the third job definition; and based at least in part on the automatically inferring the first dependency and the automatically inferring the second dependency, dispatching, by the computer system, a second plurality of job instances for execution, wherein the dispatching the second plurality of job instances is controlled such that a second instance of the first job, of the second plurality of job instances, is blocked from being dispatched until after both: (a) a second instance of the second job, of the second plurality of job instances, completes, and (b) a first instance of the third job, of the second plurality of job instances, completes.

7. The one or more non-transitory computer-readable media of claim 6 wherein at least one job definition of the plurality of job definitions specifies whether a dependency on the source data
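
The two dispatch rounds recited in claim 1 can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not the claimed system: the `Dispatcher` class, its method names, and the data object names are all hypothetical. Each dispatched instance is treated as completing immediately, and the first job is assumed to list two source data objects so that it can depend on both the second and the third job.

```python
class Dispatcher:
    """Toy dispatcher: re-infers dependencies from definitions on every round."""

    def __init__(self):
        # job name -> (source data objects, output data object)
        self.definitions = {}

    def add_definition(self, name, sources, output):
        self.definitions[name] = (tuple(sources), output)

    def dependencies(self, name):
        """Jobs whose output data object is one of this job's source objects."""
        sources, _ = self.definitions[name]
        return {
            other
            for other, (_, out) in self.definitions.items()
            if other != name and out in sources
        }

    def run_round(self):
        """Dispatch one instance of every job; return the dispatch order."""
        completed, order = set(), []
        pending = set(self.definitions)
        while pending:
            # A job instance stays blocked until all upstream instances complete.
            ready = sorted(j for j in pending if self.dependencies(j) <= completed)
            if not ready:
                raise RuntimeError("dependency cycle among job definitions")
            for job in ready:
                order.append(job)
                completed.add(job)  # simulation: the instance completes at once
            pending -= set(ready)
        return order


d = Dispatcher()
d.add_definition("first", sources=["table_b", "table_c"], output="report")
d.add_definition("second", sources=["raw_b"], output="table_b")
round_one = d.run_round()   # ['second', 'first']

# A third definition arrives after the first round has been dispatched.
d.add_definition("third", sources=["raw_c"], output="table_c")
round_two = d.run_round()   # ['second', 'third', 'first']
```

In the second round the instance of `first` is held back until both `second` and `third` have completed, even though nothing in the definition of `first` was edited; the new dependency comes solely from the output data object named in the third definition.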

Assignees

Dropbox Inc

Inventors

Classifications

  • G06F9/4881 (Primary)

    Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

Who is the assignee on this patent?
Dropbox Inc
What technology area does this patent fall under?
Primary CPC classification G06F9/4881. Mapped technology areas include Physics.
When was this patent published?
Publication date: Aug 6, 2019 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).