Automatically completing a pipeline graph in an internet of things network

US11675838B2 · US · B2

Patent metadata
Publication number: US-11675838-B2
Application number: US-202117302728-A
Country: US
Kind code: B2
Filing date: May 11, 2021
Priority date: May 11, 2021
Publication date: Jun 13, 2023
Grant date: Jun 13, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An approach is provided for completing a pipeline graph. Using a deep learning based sequence model, an initial data pipeline having a sequence of nodes is generated. Mismatch(es) between data formats required by input and output in the sequence of nodes is identified. Virtual gap node(s) that correct the mismatch(es) are added to the initial data pipeline. For a given virtual gap node, tentative graph structures are determined using knowledge graphs and a crowd sourced validation system. Reuse forecast scores and performance scores for the tentative graph structures are calculated. Based on the reuse forecast scores and the performance scores, a final graph structure for implementing the given virtual gap node is determined.
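The mismatch-detection and gap-node steps described in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the `Node` class, the format strings, and the helper names `find_mismatches` and `add_virtual_gap_nodes` are all hypothetical, and a data format is assumed to be comparable as a plain string. The deep learning based sequence model that produces the initial pipeline is not modeled here; the initial pipeline is simply given as a list.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Node:
    """Hypothetical pipeline node with declared input and output data formats."""
    name: str
    input_format: str
    output_format: str
    virtual: bool = False  # True for a placeholder gap node that has no code yet


def find_mismatches(pipeline: List[Node]) -> List[Tuple[int, Node, Node]]:
    """Return (index, first, second) for each consecutive pair whose formats differ."""
    mismatches = []
    for i in range(len(pipeline) - 1):
        first, second = pipeline[i], pipeline[i + 1]
        if first.output_format != second.input_format:
            mismatches.append((i, first, second))
    return mismatches


def add_virtual_gap_nodes(pipeline: List[Node]) -> List[Node]:
    """Insert a virtual gap node between every mismatched consecutive pair."""
    completed: List[Node] = []
    for i, node in enumerate(pipeline):
        completed.append(node)
        if i + 1 < len(pipeline):
            nxt = pipeline[i + 1]
            if node.output_format != nxt.input_format:
                # The gap node consumes what the first node emits and
                # produces what the second node expects.
                completed.append(Node(
                    name=f"gap_{node.name}_to_{nxt.name}",
                    input_format=node.output_format,
                    output_format=nxt.input_format,
                    virtual=True,
                ))
    return completed


# Assumed example pipeline: the reader emits CSV but the normalizer expects JSON.
initial = [
    Node("sensor_reader", "raw_bytes", "csv"),
    Node("normalizer", "json", "json"),
    Node("ai_model", "json", "prediction"),
]
completed = add_virtual_gap_nodes(initial)
```

In this sketch the gap node is only a typed placeholder. Choosing how to implement it (the tentative graph structures and their scores) is a separate step.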

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method comprising: using a deep learning based sequence model, generating, by one or more processors, an initial data pipeline having a sequence of nodes; identifying, by the one or more processors, one or more mismatches, a given mismatch included in the one or more mismatches indicates that a first data format specified by an output schema of a first node included in two consecutive nodes in the sequence of nodes does not match a second data format specified by an input schema of a second node included in the two consecutive nodes, and the second node following the first node in the sequence of nodes; in response to the identifying the one or more mismatches, adding to the initial data pipeline, by the one or more processors, one or more virtual gap nodes which correct the one or more mismatches; for a given virtual gap node included in the one or more virtual gap nodes, determining, by the one or more processors, tentative graph structures using knowledge graphs and a crowd sourced validation system and calculating, by the one or more processors, reuse forecast scores and performance scores for the tentative graph structures; based on the reuse forecast scores and the performance scores, determining, by the one or more processors, a final graph structure for implementing the given virtual gap node; and training, by the one or more processors, the deep learning based sequence model, wherein the generating the initial data pipeline includes generating a sequence of transformers using the trained deep learning based sequence model, and wherein the adding the one or more virtual gap nodes includes adding one or more new virtual transformers.

2. The method of claim 1, further comprising: determining, by the one or more processors, a measure of importance of mapping an input sensor stream to an artificial intelligence (AI) model in a cloud computing system; determining, by the one or more processors, an amount of an incentive for developers to develop code for one or more nodes in the final graph structure, the amount being based on a reuse forecast score of the final graph structure and the measure of the importance of mapping the input sensor stream to the AI model; sending, by the one or more processors and using an incentive system, an offer of the incentive to the developers to develop code for the one or more nodes in the final graph structure; receiving, by the one or more processors, code from a developer for the one or more nodes in the final graph structure as a response to the offer of the incentive; using test cases, validating, by the one or more processors, the code received from the developer; and in response to the code being validated, generating, by the one or more processors, one or more coded nodes using the validated code, adding, by the one or more processors, the one or more coded nodes to a pipeline repository, replacing, by the one or more processors, the given virtual gap node with the one or more coded nodes, and sending, by the one or more processors, the incentive to the developer.

3. The method of claim 2, further comprising generating, by the one or more processors, a complete pipeline graph that includes the initial data pipeline and the one or more coded nodes.

4. The method of claim 1, further comprising: refactoring, by the one or more processors, a virtual gap node into multiple pipeline nodes based on the reuse forecast scores; and splitting, by the one or more processors, an incentive program provided by the incentive system into multiple incentive programs for the multiple pipeline nodes, respectively.

5. The method of claim 1, wherein the identifying the one or more mismatches includes determining that an output schema of one transformer included in the sequence of transformers does not match an input schema of a next transformer included in the sequence of transformers.

6. The method of claim 1, further comprising: receiving, by the one or more processors, voting results from an automated voting system in an organization, wherein the calculating the reuse forecast scores is based on the voting results; determining, by the one or more processors, respective performance profiles for the tentative graph structures, wherein the calculating the performance scores is based on the performance profiles; ranking, by the one or more processors, the tentative graph structures based on the reuse forecast scores and the performance scores; determining, by the one or more processors, a top ranked graph structure included in the tentative graph structures based on the ranked tentative graph structures; and selecting, by the one or more processors, the top ranked graph structure as the final graph structure.

7. The method of claim 1, further comprising: providing at least one support service for at least one of creating, integrating, hosting, maintaining, and deploying computer readable program code in the computer, the program code being executed by a processor of the computer to implement the generating the initial data pipeline, the identifying the one or more mismatches, the adding to the initial data pipeline the one or more virtual gap nodes, the determining the tentative graph structures, the calculating the reuse forecast scores and the performance scores, the determining the final graph structure and the training the deep learning based sequence model.

8. A computer program product for automatically completing a pipeline graph, the computer program product comprising: one or more computer readable storage media having computer readable program code collectively stored on the one or more computer readable storage media, the computer readable program code being executed by a central processing unit (CPU) of a computer system to cause the computer system to perform a method comprising: using a deep learning based sequence model, the computer system generating an initial data pipeline having a sequence of nodes; the computer system identifying one or more mismatches, a given mismatch included in the one or more mismatches indicates that a first data format specified by an output schema of a first node included in two consecutive nodes in the sequence of nodes does not match a second data format specified by an input schema of a second node included in the two consecutive nodes, and the second node following the first node in the sequence of nodes; in response to the identifying the one or more mismatches, the computer system adding to the initial data pipeline one or more virtual gap nodes which correct the one or more mismatches; for a given virtual gap node included in the one or more virtual gap nodes, the computer system determining tentative graph structures using knowledge graphs and a crowd sourced validation system and the computer system calculating reuse forecast scores and performance scores for the tentative graph structures; based on the reuse forecast scores and the performance scores, the computer system determining a final graph structure for implementing the given virtual gap node; and the computer system training the deep learning based sequence model, wherein the generating the initial data pipeline includes generating a sequence of transformers using the trained deep learning based sequence model, and wherein the adding the one or more virtual gap nodes includes adding one or more new virtual transformers.

9. The computer program product of claim 8, wherein the method further comprises: the computer system determining a measure of importance of mapping an input sensor stream to an artificial intelligence (AI) model in a cloud computing system; the computer system determining an amount of an incentive for devel
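The ranking and selection step recited in claim 6 can be illustrated with a short sketch. The claims state only that the tentative graph structures are ranked based on the reuse forecast scores and the performance scores. They do not say how the two scores are combined. The weighted sum below, the `Candidate` class, the `reuse_weight` parameter, and the example scores are therefore assumptions made purely for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Candidate:
    """Hypothetical tentative graph structure with its two scores."""
    name: str
    reuse_forecast: float  # assumed to lie in [0, 1]
    performance: float     # assumed to lie in [0, 1]


def combined_score(candidate: Candidate, reuse_weight: float = 0.5) -> float:
    """Assumed combination rule: a weighted sum of the two scores."""
    return (reuse_weight * candidate.reuse_forecast
            + (1.0 - reuse_weight) * candidate.performance)


def rank_candidates(candidates: List[Candidate],
                    reuse_weight: float = 0.5) -> List[Candidate]:
    """Return the candidates ordered from highest to lowest combined score."""
    return sorted(
        candidates,
        key=lambda c: combined_score(c, reuse_weight),
        reverse=True,
    )


# Made-up candidates for implementing one virtual gap node.
candidates = [
    Candidate("single_converter", reuse_forecast=0.25, performance=0.75),
    Candidate("parse_then_serialize", reuse_forecast=0.75, performance=0.5),
    Candidate("lookup_table", reuse_forecast=0.5, performance=0.25),
]

ranked = rank_candidates(candidates)
final_structure = ranked[0]  # the top ranked structure becomes the final one
```

Changing `reuse_weight` shifts the outcome: with `reuse_weight=0.0` only performance counts, and `single_converter` would be selected instead.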

Assignees

Inventors

Classifications

  • Combinations of networks

  • Recurrent networks, e.g. Hopfield networks

  • Graphs; Linked lists (G06F16/9027 takes precedence)

  • for test design, e.g. generating new test cases

  • Learning methods

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11675838B2 cover?
An approach is provided for completing a pipeline graph. Using a deep learning based sequence model, an initial data pipeline having a sequence of nodes is generated. Mismatch(es) between data formats required by input and output in the sequence of nodes is identified. Virtual gap node(s) that correct the mismatch(es) are added to the initial data pipeline. For a given virtual gap node, tentati…
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06F16/9024. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jun 13, 2023 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 6 related publications on this page (citations in our corpus or others sharing the same primary CPC).