Systems and methods for determining machine learning training approaches based on identified impacts of one or more types of concept drift
US-2020151619-A1 · May 14, 2020 · US
US11675838B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11675838-B2 |
| Application number | US-202117302728-A |
| Country | US |
| Kind code | B2 |
| Filing date | May 11, 2021 |
| Priority date | May 11, 2021 |
| Publication date | Jun 13, 2023 |
| Grant date | Jun 13, 2023 |
Official abstract text for this publication.
An approach is provided for completing a pipeline graph. Using a deep learning based sequence model, an initial data pipeline having a sequence of nodes is generated. Mismatch(es) between data formats required by input and output in the sequence of nodes is identified. Virtual gap node(s) that correct the mismatch(es) are added to the initial data pipeline. For a given virtual gap node, tentative graph structures are determined using knowledge graphs and a crowd sourced validation system. Reuse forecast scores and performance scores for the tentative graph structures are calculated. Based on the reuse forecast scores and the performance scores, a final graph structure for implementing the given virtual gap node is determined.
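The mismatch check and gap-node insertion described in the abstract can be illustrated with a minimal sketch. It assumes each node carries a single input format and a single output format as plain strings; the `Node` class, the `add_virtual_gap_nodes` function, and the example node names are hypothetical and do not come from the patent, which does not disclose an implementation at this level.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Hypothetical pipeline node; formats are simplified to plain strings."""
    name: str
    input_format: str
    output_format: str
    virtual: bool = False


def add_virtual_gap_nodes(pipeline):
    """Return a new list with a virtual gap node inserted between any two
    consecutive nodes whose output and input formats do not match."""
    if not pipeline:
        return []
    completed = [pipeline[0]]
    for nxt in pipeline[1:]:
        prev = completed[-1]
        if prev.output_format != nxt.input_format:
            # The gap node is a placeholder: it records which conversion is
            # needed, not how the conversion is implemented.
            completed.append(Node(
                name=f"gap_{prev.name}_to_{nxt.name}",
                input_format=prev.output_format,
                output_format=nxt.input_format,
                virtual=True,
            ))
        completed.append(nxt)
    return completed


# Illustrative pipeline (assumed names and formats).
initial = [
    Node("sensor_reader", "raw_bytes", "csv"),
    Node("feature_extractor", "json", "tensor"),
    Node("model_scorer", "tensor", "prediction"),
]
completed = add_virtual_gap_nodes(initial)
names = [n.name for n in completed]
```

In this sketch only the `csv` to `json` boundary produces a gap node; the `tensor` to `tensor` boundary is left untouched. How a gap node is later turned into a concrete graph structure is the subject of the scoring steps in the abstract and is not modeled here.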
Opening claim text (preview).
What is claimed is:
1. A computer-implemented method comprising: using a deep learning based sequence model, generating, by one or more processors, an initial data pipeline having a sequence of nodes; identifying, by the one or more processors, one or more mismatches, a given mismatch included in the one or more mismatches indicates that a first data format specified by an output schema of a first node included in two consecutive nodes in the sequence of nodes does not match a second data format specified by an input schema of a second node included in the two consecutive nodes, and the second node following the first node in the sequence of nodes; in response to the identifying the one or more mismatches, adding to the initial data pipeline, by the one or more processors, one or more virtual gap nodes which correct the one or more mismatches; for a given virtual gap node included in the one or more virtual gap nodes, determining, by the one or more processors, tentative graph structures using knowledge graphs and a crowd sourced validation system and calculating, by the one or more processors, reuse forecast scores and performance scores for the tentative graph structures; based on the reuse forecast scores and the performance scores, determining, by the one or more processors, a final graph structure for implementing the given virtual gap node; and training, by the one or more processors, the deep learning based sequence model, wherein the generating the initial data pipeline includes generating a sequence of transformers using the trained deep learning based sequence model, and wherein the adding the one or more virtual gap nodes includes adding one or more new virtual transformers.
2. The method of claim 1, further comprising: determining, by the one or more processors, a measure of importance of mapping an input sensor stream to an artificial intelligence (AI) model in a cloud computing system; determining, by the one or more processors, an amount of an incentive for developers to develop code for one or more nodes in the final graph structure, the amount being based on a reuse forecast score of the final graph structure and the measure of the importance of mapping the input sensor stream to the AI model; sending, by the one or more processors and using an incentive system, an offer of the incentive to the developers to develop code for the one or more nodes in the final graph structure; receiving, by the one or more processors, code from a developer for the one or more nodes in the final graph structure as a response to the offer of the incentive; using test cases, validating, by the one or more processors, the code received from the developer; and in response to the code being validated, generating, by the one or more processors, one or more coded nodes using the validated code, adding, by the one or more processors, the one or more coded nodes to a pipeline repository, replacing, by the one or more processors, the given virtual gap node with the one or more coded nodes, and sending, by the one or more processors, the incentive to the developer.
3. The method of claim 2, further comprising generating, by the one or more processors, a complete pipeline graph that includes the initial data pipeline and the one or more coded nodes.
4. The method of claim 1, further comprising: refactoring, by the one or more processors, a virtual gap node into multiple pipeline nodes based on the reuse forecast scores; and splitting, by the one or more processors, an incentive program provided by the incentive system into multiple incentive programs for the multiple pipeline nodes, respectively.
5. The method of claim 1, wherein the identifying the one or more mismatches includes determining that an output schema of one transformer included in the sequence of transformers does not match an input schema of a next transformer included in the sequence of transformers.
6. The method of claim 1, further comprising: receiving, by the one or more processors, voting results from an automated voting system in an organization, wherein the calculating the reuse forecast scores is based on the voting results; determining, by the one or more processors, respective performance profiles for the tentative graph structures, wherein the calculating the performance scores is based on the performance profiles; ranking, by the one or more processors, the tentative graph structures based on the reuse forecast scores and the performance scores; determining, by the one or more processors, a top ranked graph structure included in the tentative graph structures based on the ranked tentative graph structures; and selecting, by the one or more processors, the top ranked graph structure as the final graph structure.
7. The method of claim 1, further comprising: providing at least one support service for at least one of creating, integrating, hosting, maintaining, and deploying computer readable program code in the computer, the program code being executed by a processor of the computer to implement the generating the initial data pipeline, the identifying the one or more mismatches, the adding to the initial data pipeline the one or more virtual gap nodes, the determining the tentative graph structures, the calculating the reuse forecast scores and the performance scores, the determining the final graph structure and the training the deep learning based sequence model.
8. A computer program product for automatically completing a pipeline graph, the computer program product comprising: one or more computer readable storage media having computer readable program code collectively stored on the one or more computer readable storage media, the computer readable program code being executed by a central processing unit (CPU) of a computer system to cause the computer system to perform a method comprising: using a deep learning based sequence model, the computer system generating an initial data pipeline having a sequence of nodes; the computer system identifying one or more mismatches, a given mismatch included in the one or more mismatches indicates that a first data format specified by an output schema of a first node included in two consecutive nodes in the sequence of nodes does not match a second data format specified by an input schema of a second node included in the two consecutive nodes, and the second node following the first node in the sequence of nodes; in response to the identifying the one or more mismatches, the computer system adding to the initial data pipeline one or more virtual gap nodes which correct the one or more mismatches; for a given virtual gap node included in the one or more virtual gap nodes, the computer system determining tentative graph structures using knowledge graphs and a crowd sourced validation system and the computer system calculating reuse forecast scores and performance scores for the tentative graph structures; based on the reuse forecast scores and the performance scores, the computer system determining a final graph structure for implementing the given virtual gap node; and the computer system training the deep learning based sequence model, wherein the generating the initial data pipeline includes generating a sequence of transformers using the trained deep learning based sequence model, and wherein the adding the one or more virtual gap nodes includes adding one or more new virtual transformers.
9. The computer program product of claim 8, wherein the method further comprises: the computer system determining a measure of importance of mapping an input sensor stream to an artificial intelligence (AI) model in a cloud computing system; the computer system determining an amount of an incentive for devel
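Claim 6 ranks the tentative graph structures by their reuse forecast scores and performance scores and selects the top ranked one as the final structure. The claims do not say how the two scores are combined, so the sketch below assumes a simple weighted sum with both scores normalized to the range 0 to 1. The `Candidate` class, the `reuse_weight` parameter, and the example values are hypothetical and serve only to illustrate the ranking step.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical tentative graph structure with its two scores."""
    name: str
    reuse_forecast: float  # assumed normalized to [0, 1]
    performance: float     # assumed normalized to [0, 1]


def rank_candidates(candidates, reuse_weight=0.5):
    """Sort candidates from best to worst by an assumed weighted sum."""
    def combined(c):
        return (reuse_weight * c.reuse_forecast
                + (1.0 - reuse_weight) * c.performance)
    return sorted(candidates, key=combined, reverse=True)


def select_final(candidates, reuse_weight=0.5):
    """Return the top ranked candidate as the final graph structure."""
    ranked = rank_candidates(candidates, reuse_weight)
    if not ranked:
        raise ValueError("no tentative graph structures to rank")
    return ranked[0]


# Illustrative values only; not taken from the patent.
tentative = [
    Candidate("single_converter", reuse_forecast=0.9, performance=0.4),
    Candidate("two_stage_converter", reuse_forecast=0.6, performance=0.8),
    Candidate("inline_parser", reuse_forecast=0.3, performance=0.9),
]
balanced_choice = select_final(tentative)
reuse_heavy_choice = select_final(tentative, reuse_weight=0.9)
```

With equal weights the sketch picks `two_stage_converter`; weighting reuse heavily shifts the choice to `single_converter`. Any monotone combination of the two scores would fit the claim language equally well, so the weighted sum should be read as one possible design choice.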
Combinations of networks · CPC title
Recurrent networks, e.g. Hopfield networks · CPC title
Graphs; Linked lists (G06F16/9027 takes precedence) · CPC title
for test design, e.g. generating new test cases · CPC title
Learning methods · CPC title