Technology for extensible in-memory computing

US10380137B2 · US · B2

Patent metadata
Field: Value
Publication number: US-10380137-B2
Application number: US-201615290544-A
Country: US
Kind code: B2
Filing date: Oct 11, 2016
Priority date: Oct 11, 2016
Publication date: Aug 13, 2019
Grant date: Aug 13, 2019

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A user-defined function (UDF) is received in a central computer system, which causes registration of the UDF and distributes the UDF to a cluster of computer system nodes configured for performing, in volatile memory of the nodes, extract-transform-load processing of data cached in the volatile memory of the nodes. First and second job specifications that include the UDF are received by the central computer system, and the central computer system distributes instructions for the job specifications to the nodes including at least one instruction that invokes the UDF for loading and executing the UDF in the volatile memory of at least one of the nodes during runtime of the jobs. The central computer system does not cause registration of the UDF again after receiving the first job specification.
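The register-once flow in the abstract can be pictured with a small toy model. This is only a sketch under stated assumptions, not the patented implementation: `Node`, `CentralSystem`, `receive_udf`, `submit_job`, and the `double_amount` UDF are hypothetical names, and a Python dictionary stands in for each node's volatile memory.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Node:
    """Hypothetical worker node; `memory` stands in for volatile memory."""
    name: str
    memory: Dict[str, Callable] = field(default_factory=dict)
    cached_rows: List[dict] = field(default_factory=list)

    def run(self, udf_name: str) -> List[dict]:
        # Load the already-distributed UDF from memory and apply it at job runtime.
        udf = self.memory[udf_name]
        return [udf(row) for row in self.cached_rows]


class CentralSystem:
    """Hypothetical central system that registers a UDF once and reuses it."""

    def __init__(self, nodes: List[Node]) -> None:
        self.nodes = nodes
        self.registry: Dict[str, Callable] = {}
        self.registrations = 0  # counts how often registration happens

    def receive_udf(self, name: str, fn: Callable) -> None:
        # Register the UDF centrally, then distribute it to every node.
        self.registry[name] = fn
        self.registrations += 1
        for node in self.nodes:
            node.memory[name] = fn

    def submit_job(self, job_spec: dict) -> Dict[str, List[dict]]:
        # A job spec only names the UDF; no registration happens here.
        udf_name = job_spec["udf"]
        if udf_name not in self.registry:
            raise KeyError(f"UDF {udf_name!r} has not been registered")
        return {node.name: node.run(udf_name) for node in self.nodes}


nodes = [
    Node("n1", cached_rows=[{"amount": 10}]),
    Node("n2", cached_rows=[{"amount": 5}]),
]
central = CentralSystem(nodes)
central.receive_udf("double_amount", lambda row: {**row, "amount": row["amount"] * 2})

first = central.submit_job({"id": "job-1", "udf": "double_amount"})
second = central.submit_job({"id": "job-2", "udf": "double_amount"})
```

Both job specifications invoke the same UDF, yet `central.registrations` stays at 1, which mirrors the abstract's statement that registration is not repeated after the first job specification.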

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: receiving a user-defined function (UDF) in a central computer system, and the central computer system causing registration of the UDF and distributing the UDF to a cluster of computer system nodes configured for performing, in volatile memory of the nodes, extract-transform-load processing of data cached in the volatile memory of the nodes; and receiving by the central computer system first and second job specifications that specify the UDF, and the central computer system distributing instructions for the job specifications to the nodes, including at least one instruction that invokes the UDF for loading and executing the UDF in the volatile memory of at least one of the nodes during runtime of the job, where the central computer system does not cause registration of the UDF again after receiving the first job specification.

2. The method of claim 1, where the user-defined function provides a mapping function and the method includes executing the user-defined, mapping function in volatile memory of the nodes during extract-transform-load processing of data cached in the volatile memory of the nodes.

3. The method of claim 2, where executing the user-defined, mapping function performs at least one of the following actions during extract-transform-load processing of data cached in volatile memory of the nodes: accessing a database; calculating data quality of data from the database; and writing the data quality.

4. The method of claim 3, comprising: sharing resources among at least two software applications cached in nodes of the cluster while executing the user-defined, mapping function that performs at least one of the actions during extract-transform-load processing of data cached in volatile memory of the nodes.

5. The method of claim 1, where the at least one instruction that invokes the user-defined function includes a Spark SQL instruction.

6. The method of claim 1, comprising: compiling the received job specifications to generate the instructions distributed to the nodes, where the first job specification having the at least one instruction that invokes the user-defined function is added to an earlier received job specification having no instruction that invokes the user-defined function, and where the earlier received job specification is compiled before receiving the first job specification and is not recompiled after adding the first job specification.

7. The method of claim 1, comprising: applying a pin operation to cause the nodes to cache at least one class file for the user-defined function in volatile memory of the nodes for a next job run.

8. A system comprising: a processor; and a computer readable storage medium connected to the processor, where the computer readable storage medium has stored thereon a program for controlling the processor, and where the processor is operative with the program to execute the program for: receiving a user-defined function (UDF) in a central computer system, and the central computer system causing registration of the UDF and distributing the UDF to a cluster of computer system nodes configured for performing, in volatile memory of the nodes, extract-transform-load processing of data cached in the volatile memory of the nodes; and receiving by the central computer system first and second job specifications that specify the UDF, and the central computer system distributing instructions for the job specifications to the nodes, including at least one instruction that invokes the UDF for loading and executing the UDF in the volatile memory of at least one of the nodes during runtime of the job, where the central computer system does not cause registration of the UDF again after receiving the first job specification.

9. The system of claim 8, where the user-defined function provides a mapping function and the where the processor is operative with the program to execute the program for: executing the user-defined, mapping function in volatile memory of the nodes during extract-transform-load processing of data cached in the volatile memory of the nodes.

10. The system of claim 9, where executing the user-defined, mapping function performs at least one of the following actions during extract-transform-load processing of data cached in volatile memory of the nodes: accessing a database; calculating data quality of data from the database; and writing the data quality.

11. The system of claim 10, where the processor is operative with the program to execute the program for: sharing resources among at least two software applications cached in nodes of the cluster while executing the user-defined, mapping function that performs at least one of the actions during extract-transform-load processing of data cached in volatile memory of the nodes.

12. The system of claim 8, where the at least one instruction that invokes the user-defined function includes a Spark SQL instruction.

13. The system of claim 8, where the processor is operative with the program to execute the program for: compiling the received job specifications to generate the instructions distributed to the nodes, where the first job specification having the at least one instruction that invokes the user-defined function is added to an earlier received job specification having no instruction that invokes the user-defined function, and where the earlier received job specification is compiled before receiving the first job specification and is not recompiled after adding the first job specification.

14. The system of claim 8, where the processor is operative with the program to execute the program for: applying a pin operation to cause the nodes to cache at least one class file for the user-defined function in volatile memory of the nodes for a next job run.

15. A computer program product, including a computer readable storage medium having instructions stored thereon for execution by a computer system, where the instructions, when executed by the computer system, cause the computer system to implement a method comprising: receiving a user-defined function (UDF) in a central computer system, and the central computer system causing registration of the UDF and distributing the UDF to a cluster of computer system nodes configured for performing, in volatile memory of the nodes, extract-transform-load processing of data cached in the volatile memory of the nodes; and receiving by the central computer system first and second job specifications that specify the UDF, and the central computer system distributing instructions for the job specifications to the nodes, including at least one instruction that invokes the UDF for loading and executing the UDF in the volatile memory of at least one of the nodes during runtime of the job, where the central computer system does not cause registration of the UDF again after receiving the first job specification.

16. The computer program product of claim 15, where the user-defined function provides a mapping function and the where the instructions, when executed by the computer system, cause the computer system to implement a method comprising: executing the user-defined, mapping function in volatile memory of the nodes during extract-transform-load processing of data cached in the volatile memory of the nodes.

17. The computer program product of claim 16, where executing the user-defined, mapping function performs at least one of the following actions during extract-transform-load processing of data cached in volatile memory of the nodes: accessing a database; calcul…
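Claims 6 and 7 describe two further mechanisms: adding a job specification without recompiling an earlier one, and pinning UDF class files so they stay cached for the next job run. The sketch below is a rough illustration only. `JobCompiler`, `NodeCache`, the `EXEC` instruction format, and the file names are hypothetical, and the eviction of unpinned entries at the end of a job is an assumption made to show what pinning changes; the claims do not specify an eviction policy.

```python
from typing import Dict, List, Set


class JobCompiler:
    """Hypothetical compiler that caches compiled instructions per job spec."""

    def __init__(self) -> None:
        self.compiled: Dict[str, List[str]] = {}
        self.compile_calls: Dict[str, int] = {}

    def add_spec(self, spec_id: str, steps: List[str]) -> None:
        # Compile only specs not seen before; earlier specs are left untouched.
        if spec_id in self.compiled:
            return
        self.compiled[spec_id] = [f"EXEC {step}" for step in steps]
        self.compile_calls[spec_id] = self.compile_calls.get(spec_id, 0) + 1

    def instructions(self) -> List[str]:
        # Concatenate cached instructions in the order the specs were added.
        return [ins for spec in self.compiled.values() for ins in spec]


class NodeCache:
    """Hypothetical node-side cache of UDF class files in volatile memory."""

    def __init__(self) -> None:
        self.entries: Dict[str, bytes] = {}
        self.pinned: Set[str] = set()

    def put(self, name: str, class_bytes: bytes) -> None:
        self.entries[name] = class_bytes

    def pin(self, name: str) -> None:
        # Mark an entry so it survives into the next job run.
        self.pinned.add(name)

    def end_of_job(self) -> None:
        # Assumed policy: drop everything that was not pinned.
        self.entries = {k: v for k, v in self.entries.items() if k in self.pinned}


compiler = JobCompiler()
compiler.add_spec("earlier", ["load orders"])  # no UDF call
compiler.add_spec("first", ["SELECT my_udf(amount) FROM orders"])  # invokes the UDF

cache = NodeCache()
cache.put("MyUdf.class", b"\xca\xfe")
cache.put("Temp.class", b"\x00")
cache.pin("MyUdf.class")
cache.end_of_job()
```

After both specs are added, `compiler.compile_calls["earlier"]` is still 1, and only the pinned `MyUdf.class` remains in `cache.entries` for the next run.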

Assignees

IBM

Inventors

Classifications

  • Database cache management · CPC title

  • G06F16/254 · Primary

    Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10380137B2 cover?
A user-defined function (UDF) is received in a central computer system, which causes registration of the UDF and distributes the UDF to a cluster of computer system nodes configured for performing, in volatile memory of the nodes, extract-transform-load processing of data cached in the volatile memory of the nodes. First and second job specifications that include the UDF are received by the cen…
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06F16/254. Mapped technology areas include Physics.
When was this patent published?
Publication date: Aug 13, 2019 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 6 related publications on this page (citations in our corpus or others sharing the same primary CPC).