Automated identification of code changes

US11048482B2 · US · B2

Patent metadata
Publication number: US-11048482-B2
Application number: US-201916523363-A
Country: US
Kind code: B2
Filing date: Jul 26, 2019
Priority date: Jul 26, 2019
Publication date: Jun 29, 2021
Grant date: Jun 29, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Implementations are described herein for automatically identifying, recommending, and/or automatically effecting changes to a source code base based on updates previously made to other similar code bases. Intuitively, multiple prior “migrations,” or mass updates, of complex software system code bases may be analyzed to identify changes that were made. More particularly, a particular portion or “snippet” of source code—which may include a whole source code file, a source code function, a portion of source code, or any other semantically-meaningful code unit—may undergo a sequence of edits over time. Techniques described herein leverage this sequence of edits to predict a next edit of the source code snippet. These techniques have a wide variety of applications, including but not limited to automatically updating of source code, source code completion, recommending changes to source code, etc.
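As a rough illustration of what "a sequence of edits over time" to a snippet might look like as data, the sketch below derives one diff per pair of consecutive revisions. The `edit_sequence` helper and the three sample revisions are hypothetical and are not taken from the patent; the patent does not state that edits are stored as unified diffs.

```python
import difflib


def edit_sequence(versions):
    """Return one unified diff per pair of consecutive snippet versions."""
    edits = []
    for before, after in zip(versions, versions[1:]):
        diff = difflib.unified_diff(
            before.splitlines(),
            after.splitlines(),
            fromfile="before",
            tofile="after",
            lineterm="",
        )
        edits.append("\n".join(diff))
    return edits


# Hypothetical history of one snippet across three revisions.
versions = [
    "def area(r):\n    return 3.14 * r * r",
    "import math\ndef area(r):\n    return math.pi * r * r",
    "import math\ndef area(radius):\n    return math.pi * radius ** 2",
]

edits = edit_sequence(versions)
for number, edit in enumerate(edits, start=1):
    print(f"edit {number}:\n{edit}\n")
```

A model of the kind described would consume such a history and propose the edit that is likely to come next.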

First claim

Opening claim text (preview).

What is claimed is:

1. A method implemented using one or more processors, comprising: accessing, from a code base, a sequence of edits made to a source code snippet over time; converting the sequence of edits to a corresponding sequence of graphs; independently applying each of the sequence of graphs as input across a graph neural network (GNN) model to generate a respective sequence of edit embeddings; iteratively applying each edit embedding of the sequence of edit embeddings as input across a recurrent neural network (RNN) to generate a respective sequence of outputs, wherein additional data generated from a previous iteration is applied as additional input across the RNN during the next iteration and each iteration produces an output that may represent a prediction of what the next edit to the source code snippet will be; once all of the edit embeddings have been iteratively applied across the RNN, determining the final output from the last iteration of the sequence of outputs generated based on the RNN, wherein the final output includes the predicted next edit of the source code snippet; and applying the next edit as a snippet in the code base or a different code base.

2. The method of claim 1, wherein each graph of the sequence of graphs comprises an abstract syntax tree.

3. The method of claim 1, wherein each of the outputs generated from the iteratively applying comprises a distribution over a set of candidate source code edits.

4. The method of claim 1, wherein the next edit is applied as the snippet in the different code base.

5. A method implemented using one or more processors, comprising: accessing, from a code base, a sequence of edits made to a source code snippet over time; converting the sequence of edits to a corresponding sequence of graphs; independently applying each of the sequence of graphs as input across a graph neural network (GNN) to generate a corresponding sequence of edit embeddings; iteratively applying each edit embedding of the sequence of edit embeddings as input across a recurrent neural network (RNN) to generate a corresponding sequence of outputs, wherein additional data generated from a previous iteration is applied as additional input across the RNN during the next iteration and each iteration produces an output that may represent a prediction of what the next edit to the source code snippet will be; based on the sequence of outputs, predicting a next edit of the source code snippet following a first subset of the sequence of edits; comparing the predicted next edit to an edit contained in a second subset of the sequence of edits to determine an error, wherein the second subset is disjoint from the first subset; and training the GNN or RNN based on the error.

6. The method of claim 5, wherein each graph of the sequence of graphs comprises an abstract syntax tree.

7. A system comprising one or more processors and memory storing instructions that, in response to execution of the instructions by the one or more processors, cause the one or more processors to: access, from a code base, a sequence of edits made to a source code snippet over time; convert the sequence of edits to a corresponding sequence of graphs; independently apply each of the sequence of graphs as input across a graph neural network (GNN) to generate a respective sequence of edit embeddings; iteratively apply each edit embedding of the sequence of edit embeddings as input across a recurrent neural network (RNN) to generate a respective sequence of outputs, wherein additional data generated from a previous iteration is applied as additional input across the RNN during the next iteration and each iteration produces an output that may represent a prediction of what the next edit to the source code snippet will be; once all of the edit embeddings have been iteratively applied across the RNN, determining the final output from the last iteration of the sequence of outputs generated based on the RNN, wherein the final output includes the predicted next edit of the source code snippet; and applying the next edit as a snippet in the code base or a different code base.

8. The system of claim 7, wherein the sequence of graphs comprises a sequence of abstract syntax trees.

9. The system of claim 7, wherein the final output comprises a distribution over a set of candidate source code edits.

10. The system of claim 7, wherein the next edit is applied as the snippet in the different code base.
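The data flow recited in the independent claims (edits to graphs, graphs to embeddings via a GNN, embeddings through an RNN, final output as a distribution over candidate edits) can be sketched in a few lines. This is only an illustrative toy under loud assumptions, not the patented implementation: each snippet version is parsed into an abstract syntax tree with Python's `ast` module, the "GNN" is a single untrained round of mean-aggregation message passing, the "RNN" is an Elman-style cell with random weights, and every function name, the `DIM` size, and the candidate edit labels are hypothetical.

```python
import ast
import math
import random
import zlib

DIM = 8  # embedding size; an arbitrary choice for this sketch


def snippet_to_graph(source):
    """Parse a snippet into an AST graph: node type names plus parent-child edges."""
    nodes, edges = [], []

    def visit(node, parent):
        idx = len(nodes)
        nodes.append(type(node).__name__)
        if parent is not None:
            edges.append((parent, idx))
        for child in ast.iter_child_nodes(node):
            visit(child, idx)

    visit(ast.parse(source), None)
    return nodes, edges


def gnn_embed(graph):
    """One round of mean-aggregation message passing, then mean pooling.
    Untrained stand-in for a GNN: there are no learned weights here."""
    nodes, edges = graph
    feats = [[0.0] * DIM for _ in nodes]
    for i, name in enumerate(nodes):
        feats[i][zlib.crc32(name.encode()) % DIM] = 1.0  # hashed one-hot node type
    neighbors = [[] for _ in nodes]
    for a, b in edges:
        neighbors[a].append(b)
        neighbors[b].append(a)
    updated = []
    for i, own in enumerate(feats):
        count = max(len(neighbors[i]), 1)
        msg = [sum(feats[j][k] for j in neighbors[i]) / count for k in range(DIM)]
        updated.append([own[k] + msg[k] for k in range(DIM)])
    return [sum(u[k] for u in updated) / len(updated) for k in range(DIM)]


def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(scores):
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_next_edit(versions, candidates, seed=0):
    """Feed each graph embedding through the RNN in order; use the last output."""
    rng = random.Random(seed)

    def rand_matrix(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

    w_x, w_h = rand_matrix(DIM, DIM), rand_matrix(DIM, DIM)
    w_o = rand_matrix(len(candidates), DIM)
    hidden = [0.0] * DIM
    outputs = []
    for source in versions:
        x = gnn_embed(snippet_to_graph(source))  # each graph is embedded independently
        pre = [a + b for a, b in zip(matvec(w_x, x), matvec(w_h, hidden))]
        hidden = [math.tanh(p) for p in pre]  # state carried into the next iteration
        outputs.append(softmax(matvec(w_o, hidden)))
    final = outputs[-1]  # output of the last iteration
    best = max(range(len(candidates)), key=final.__getitem__)
    return candidates[best], final


# Hypothetical snippet history and candidate edit labels.
versions = ["x = 1", "x = 1 + 2", "x = (1 + 2) * 3"]
candidates = ["rename variable", "add operand", "wrap in call"]
edit, dist = predict_next_edit(versions, candidates)
print(edit, [round(p, 3) for p in dist])
```

Because the weights are random, the chosen edit carries no meaning; the point is the shape of the computation. Training as recited in claim 5 would compare the prediction made from an earlier subset of edits against a held-out later edit and adjust the weights based on that error, which this sketch does not attempt.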

Assignees

X Dev Llc

Inventors

Classifications

  • Probabilistic or stochastic networks

  • Combinations of networks

  • Recurrent networks, e.g. Hopfield networks

  • Recurrent networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]

  • Supervised learning

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11048482B2 cover?
Implementations are described herein for automatically identifying, recommending, and/or automatically effecting changes to a source code base based on updates previously made to other similar code bases. Intuitively, multiple prior “migrations,” or mass updates, of complex software system code bases may be analyzed to identify changes that were made. More particularly, a particular portion or …
Who is the assignee on this patent?
X Dev Llc
What technology area does this patent fall under?
Primary CPC classification G06F8/33. Mapped technology areas include Physics.
When was this patent published?
Publication date Jun 29, 2021 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).