Pretraining utilizing software dependencies

US2021286598A1 · US · A1

Patent metadata
Publication number: US-2021286598-A1
Application number: US-202016813778-A
Country: US
Kind code: A1
Filing date: Mar 10, 2020
Priority date: Mar 10, 2020
Publication date: Sep 16, 2021
Grant date:

Abstract

Official abstract text for this publication.

In an approach to creating code snippet auto-commenting models utilizing a pre-training model leveraging dependency data, one or more computer processors create a generalized pre-training model trained with one or more dependencies and one or more associated dependency embeddings, wherein dependencies include frameworks, imported libraries, header files, and application programming interfaces associated with a software project. The one or more computer processors create a subsequent model with a model architecture identical to the created pre-training model. The one or more computer processors computationally reduce a training of the created subsequent model utilizing one or more trained parameters, activations, memory cells, and context vectors contained in the created pre-training model. The one or more computer processors deploy the subsequent model to one or more production environments.
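
The parameter-reuse step in the abstract can be illustrated with a minimal sketch. It is not the patent's implementation: the parameter layout and the names `init_params`, `shapes`, and `warm_start` are hypothetical, plain Python lists stand in for real tensors, and the actual pre-training on dependency data is omitted. It only shows the idea that a subsequent model with an identical architecture can start from the pre-trained parameters instead of from random values.

```python
import copy
import random


def init_params(vocab_size, embed_dim, hidden_dim, seed):
    """Randomly initialise a tiny recurrent-style parameter set (hypothetical layout)."""
    rng = random.Random(seed)

    def matrix(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

    return {
        "embedding": matrix(vocab_size, embed_dim),
        "recurrent": matrix(hidden_dim, embed_dim + hidden_dim),
        "output": matrix(vocab_size, hidden_dim),
    }


def shapes(params):
    """Describe an architecture by the shape of each parameter matrix."""
    return {name: (len(m), len(m[0])) for name, m in params.items()}


def warm_start(pretrained, fresh):
    """Return a copy of the pre-trained parameters for an identical architecture."""
    if shapes(pretrained) != shapes(fresh):
        raise ValueError("architectures differ; parameters cannot be reused directly")
    return copy.deepcopy(pretrained)


# Stand-in for the generalized pre-training model (training itself is omitted).
pretrained = init_params(vocab_size=50, embed_dim=8, hidden_dim=16, seed=0)

# Subsequent model: same architecture, different random initialisation.
subsequent = init_params(vocab_size=50, embed_dim=8, hidden_dim=16, seed=1)

# Reuse the pre-trained parameters so later training starts from them.
subsequent = warm_start(pretrained, subsequent)
```

The shape check mirrors the requirement that the two architectures be identical; with mismatched shapes a direct copy would not be meaningful, so the sketch raises an error instead.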

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method comprising: creating, by one or more computer processors, a generalized pre-training model trained with one or more dependencies and one or more associated dependency embeddings, wherein dependencies include frameworks, imported libraries, header files, and application programming interfaces associated with a software project; creating, by one or more computer processors, a subsequent model with a model architecture identical to the created pre-training model; computationally reducing, by one or more computer processors, a training of the created subsequent model utilizing one or more trained parameters, activations, memory cells, and context vectors contained in the created pre-training model; and deploying, by one or more computer processors, the subsequent model to one or more production environments.

2. The method of claim 1, wherein comprises: statically analyzing, by one or more computer processors, one or more dependencies associated with the software project; creating, by one or more computer processors, one or more embeddings mapping the one or more statically analyzed dependencies; and creating, by one or more computer processors, one or more training sets utilizing the one or more created embeddings.

3. The method of claim 1, wherein statically analyzing one or more dependencies associated with the software project comprises: extracting, by one or more computer processors, one or more imported dependencies; constructing, by one or more computer processors, one or more naming convention dependent tokens; extracting, by one or more computer processors, one or more contexts; and creating, by one or more computer processors, one or more code and comment pairs utilizing the one or more extracted imported dependencies, the one or more constructed naming convention dependent tokens, and the extracted one or more contexts.

4. The method of claim 1, further comprising: retraining, by one or more computer processors, one or more downstream software projects utilizing the created pre-training model.

5. The method of claim 1, wherein the created pre-training model is a recurrent neural network.

6. The method of claim 1, wherein the subsequent model is an auto-commenting model.

7. The method of claim 6, wherein the auto-commenting model is a recurrent neural network.

8. The method of claim 7, further comprises: integrating, by one or more computer processors, the auto-commenting model into one or more integrated development environments.

9. The method of claim 8, further comprising: concurrently creating, by one or more computer processors, one or more code comments utilizing the integrated auto-commenting model; and displaying, by one or more computer processors, one or more created code comments.

10. A computer program product comprising: one or more computer readable storage media and program instructions stored on the one or more computer readable storage media, the stored program instructions comprising: program instructions to create a generalized pre-training model trained with one or more dependencies and one or more associated dependency embeddings, wherein dependencies include frameworks, imported libraries, header files, and application programming interfaces associated with a software project; program instructions to create a subsequent model with a model architecture identical to the created pre-training model; program instructions to computationally reduce a training of the created subsequent model utilizing one or more trained parameters, activations, memory cells, and context vectors contained in the created pre-training model; and program instructions to deploy the subsequent model to one or more production environments.

11. The computer program product of claim 10, wherein the program instructions, stored on the one or more computer readable storage media, comprise: program instructions to statically analyze one or more dependencies associated with the software project; program instructions to create one or more embeddings mapping the one or more statically analyzed dependencies; and program instructions to create one or more training sets utilizing the one or more created embeddings.

12. The computer program product of claim 10, wherein the subsequent model is an auto-commenting model.

13. The computer program product of claim 12, wherein the auto-commenting model is a recurrent neural network.

14. The computer program product of claim 12, wherein the program instructions, stored on the one or more computer readable storage media, comprise: program instructions to integrate the auto-commenting model into one or more integrated development environments.

15. The computer program product of claim 14, wherein the program instructions, stored on the one or more computer readable storage media, comprise: program instructions to concurrently create one or more code comments utilizing the integrated auto-commenting model; and program instructions to display one or more created code comments.

16. A computer system comprising: one or more computer processors; one or more computer readable storage media; and program instructions stored on the computer readable storage media for execution by at least one of the one or more processors, the stored program instructions comprising: program instructions to create a generalized pre-training model trained with one or more dependencies and one or more associated dependency embeddings, wherein dependencies include frameworks, imported libraries, header files, and application programming interfaces associated with a software project; and program instructions to create a subsequent model with a model architecture identical to the created pre-training model; program instructions to computationally reduce a training of the created subsequent model utilizing one or more trained parameters, activations, memory cells, and context vectors contained in the created pre-training model; and program instructions to deploy the subsequent model to one or more production environments.

17. The computer system of claim 16, wherein the program instructions, stored on the one or more computer readable storage media, comprise: program instructions to statically analyze one or more dependencies associated with the software project; program instructions to create one or more embeddings mapping the one or more statically analyzed dependencies; and program instructions to create one or more training sets utilizing the one or more created embeddings.

18. The computer system of claim 16, wherein the subsequent model is an auto-commenting model.

19. The computer system of claim 18, wherein the auto-commenting model is a recurrent neural network.

20. The computer system of claim 18, wherein the program instructions, stored on the one or more computer readable storage media, comprise: program instructions to integrate the auto-commenting model into one or more integrated development environments.
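
The static-analysis steps named in the claims (extracting imported dependencies, constructing naming-convention-dependent tokens, and creating code and comment pairs) can be sketched as follows. This is an illustrative assumption, not the patent's method: it handles only Python source via the standard `ast` module, treats docstrings as the comments, and the function names `extract_imports`, `split_identifier`, and `code_comment_pairs` are hypothetical.

```python
import ast
import re


def extract_imports(source):
    """Return the sorted names of modules imported by a Python source string."""
    modules = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            modules.add(node.module)
    return sorted(modules)


def split_identifier(name):
    """Split snake_case and camelCase identifiers into lowercase tokens."""
    tokens = []
    for part in name.split("_"):
        # Acronym runs, capitalised or lowercase words, and digit runs.
        tokens.extend(re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", part))
    return [token.lower() for token in tokens]


def code_comment_pairs(source):
    """Pair each documented function with its docstring and the file's imports."""
    dependencies = extract_imports(source)
    pairs = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.FunctionDef):
            comment = ast.get_docstring(node)
            if comment:
                pairs.append({
                    "dependencies": dependencies,
                    "tokens": split_identifier(node.name),
                    # The segment still contains the docstring; a real
                    # pipeline would likely strip it from the code side.
                    "code": ast.get_source_segment(source, node),
                    "comment": comment,
                })
    return pairs


sample = '''
import json
from os import path

def loadJsonFile(file_name):
    """Read a JSON file and return its contents."""
    with open(file_name) as handle:
        return json.load(handle)
'''

pairs = code_comment_pairs(sample)
```

Each resulting record groups the file-level dependencies, the tokens derived from the function name, and the comment, which is the kind of tuple a training set for an auto-commenting model could be built from.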

Assignees

IBM

Inventors

Classifications

  • Recurrent networks, e.g. Hopfield networks

  • Transfer learning

  • Supervised learning

  • characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]

  • Backpropagation, e.g. using gradient descent

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2021286598A1 cover?
In an approach to creating code snippet auto-commenting models utilizing a pre-training model leveraging dependency data, one or more computer processors create a generalized pre-training model trained with one or more dependencies and one or more associated dependency embeddings, wherein dependencies include frameworks, imported libraries, header files, and application programming interfaces a…
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06F8/33. Mapped technology areas include Physics.
When was this patent published?
Publication date Sep 16, 2021 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).