Automated identification of code changes

US2021026605A1 · US · A1

Patent metadata
Field               Value
Publication number  US-2021026605-A1
Application number  US-201916523363-A
Country             US
Kind code           A1
Filing date         Jul 26, 2019
Priority date       Jul 26, 2019
Publication date    Jan 28, 2021
Grant date          (none listed)

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Implementations are described herein for automatically identifying, recommending, and/or automatically effecting changes to a source code base based on updates previously made to other similar code bases. Intuitively, multiple prior “migrations,” or mass updates, of complex software system code bases may be analyzed to identify changes that were made. More particularly, a particular portion or “snippet” of source code—which may include a whole source code file, a source code function, a portion of source code, or any other semantically-meaningful code unit—may undergo a sequence of edits over time. Techniques described herein leverage this sequence of edits to predict a next edit of the source code snippet. These techniques have a wide variety of applications, including but not limited to automatically updating of source code, source code completion, recommending changes to source code, etc.
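As an illustration of what a "sequence of edits" to a snippet could look like as data, the sketch below derives line-level edits from successive versions of a snippet using Python's standard `difflib`. The abstract does not say how edits are represented; the `Edit` record, the `edits_between` and `edit_sequence` helpers, and the line-level granularity are all assumptions made for this example.

```python
import difflib
from dataclasses import dataclass


@dataclass(frozen=True)
class Edit:
    """Hypothetical record for one change between two versions of a snippet."""
    op: str         # 'replace', 'delete', or 'insert' (difflib opcode tags)
    removed: tuple  # lines removed from the older version
    added: tuple    # lines added in the newer version


def edits_between(old: str, new: str) -> list:
    """Return the line-level edits that turn `old` into `new`."""
    old_lines, new_lines = old.splitlines(), new.splitlines()
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    return [
        Edit(tag, tuple(old_lines[i1:i2]), tuple(new_lines[j1:j2]))
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"  # unchanged regions are not edits
    ]


def edit_sequence(versions: list) -> list:
    """Flatten the edits between consecutive versions into one ordered sequence."""
    sequence = []
    for old, new in zip(versions, versions[1:]):
        sequence.extend(edits_between(old, new))
    return sequence


# Three successive versions of a (made-up) snippet.
history = edit_sequence([
    "x = foo(1)\n",
    "x = bar(1)\n",
    "x = bar(1)\ny = x + 1\n",
])
for edit in history:
    print(edit)
```

A sequence like `history` is the kind of input a next-edit predictor would consume; the publication's claims describe graph-based representations (for example abstract syntax trees) rather than raw line diffs.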

First claim

Opening claim text (preview).

1. A method implemented using one or more processors, comprising: accessing a sequence of edits made to a source code snippet over time; applying data indicative of each edit of the sequence of edits as input across a first machine learning model to generate a corresponding sequence of edit embeddings; iteratively applying each edit embedding of the sequence of edit embeddings as input across a second machine learning model to generate a respective sequence of outputs; and based on a final output of the sequence of outputs generated from the applying, predicting a next edit of the source code snippet following the sequence of edits.
2. (canceled)
3. The method of claim 1, wherein the second machine learning model comprises a recurrent neural network.
4. The method of claim 1, wherein the data indicative of the sequence of edits comprises a respective sequence of graphs.
5. (canceled)
6. The method of claim 1, wherein the first machine learning model comprises a graph neural network (“GNN”).
7. The method of claim 4, wherein each graph of the sequence of graphs comprises an abstract syntax tree.
8. The method of claim 1, wherein the output generated from the applying comprises a distribution over a set of candidate source code edits, and the predicting is based on the distribution.
9. The method of claim 1, wherein the source code snippet is part of a to-be-updated code base, and the accessing comprises accessing, from a different code base than the to-be-updated code base, the sequence of edits made to the source code snippet over time.
10. A method implemented using one or more processors, comprising: accessing a sequence of edits made to a source code snippet over time; applying data indicative of each edit of a first subset of the sequence of edits as input across a first machine learning model to generate a corresponding sequence of edit embeddings; iteratively applying each edit embedding of the sequence of edit embeddings as input across a second machine learning model to generate a corresponding sequence of outputs; based on the sequence of outputs, predicting a next edit of the source code snippet following the first subset of the sequence of edits; comparing the predicted next edit to an edit contained in a second subset of the sequence of edits to determine an error, wherein the second subset is disjoint from the first subset; and training the machine learning model based on the error.
11. (canceled)
12. The method of claim 10, wherein the second machine learning model comprises a recurrent neural network.
13. The method of claim 10, wherein the data indicative of the sequence of edits comprises a respective sequence of graphs.
14. (canceled)
15. The method of claim 10, wherein the first machine learning model comprises a graph neural network (“GNN”).
16. The method of claim 13, wherein each graph of the sequence of graphs comprises an abstract syntax tree.
17. A system comprising one or more processors and memory storing instructions that, in response to execution of the instructions by the one or more processors, cause the one or more processors to: access a sequence of edits made to a source code snippet over time; apply data indicative of each edit of the sequence of edits as input across a first machine learning model to generate a corresponding sequence of edit embeddings; iteratively apply each edit embedding of the sequence of edit embeddings as input across a second machine learning model to generate a respective sequence of outputs; and based on a final output of the sequence of outputs generated from the applying, predict a next edit of the source code snippet following the sequence of edits.
18. (canceled)
19. The system of claim 17, wherein the second machine learning model comprises a recurrent neural network.
20. The system of claim 17, wherein the data indicative of the sequence of edits comprises a respective sequence of graphs.
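The data flow recited in claims 1, 3, and 8 (embed each edit, feed the embeddings one at a time through a recurrent model, read a distribution over candidate edits from the final output) can be sketched in plain Python. This is a toy under stated assumptions, not the claimed implementation: `embed_edit` is a hashed bag-of-tokens stand-in rather than a graph neural network over abstract syntax trees, `TinyRNN` uses random untrained weights, and every name, dimension, and candidate string below is hypothetical.

```python
import math
import random

EMBED_DIM, HIDDEN_DIM = 8, 6  # arbitrary sizes chosen for the sketch


def embed_edit(edit_text: str) -> list:
    """Stand-in for the first model: a hashed bag-of-tokens vector (NOT a GNN)."""
    vec = [0.0] * EMBED_DIM
    for token in edit_text.split():
        # Sum of code points gives a hash that is stable across runs.
        vec[sum(map(ord, token)) % EMBED_DIM] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def matvec(matrix: list, vector: list) -> list:
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(logits: list) -> list:
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


class TinyRNN:
    """Stand-in for the second model: an untrained Elman-style recurrent network."""

    def __init__(self, num_candidates: int, seed: int = 0):
        rng = random.Random(seed)

        def rand(rows, cols):
            return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

        self.w_in = rand(HIDDEN_DIM, EMBED_DIM)
        self.w_rec = rand(HIDDEN_DIM, HIDDEN_DIM)
        self.w_out = rand(num_candidates, HIDDEN_DIM)

    def run(self, embeddings: list) -> list:
        """Apply embeddings one at a time; return one output distribution per step."""
        hidden = [0.0] * HIDDEN_DIM
        outputs = []
        for emb in embeddings:
            pre = [a + b for a, b in zip(matvec(self.w_in, emb),
                                         matvec(self.w_rec, hidden))]
            hidden = [math.tanh(p) for p in pre]
            outputs.append(softmax(matvec(self.w_out, hidden)))
        return outputs


def predict_next_edit(edits: list, candidates: list, model: TinyRNN):
    """Pick the candidate with the highest probability in the final output."""
    if not edits:
        raise ValueError("need at least one prior edit")
    outputs = model.run([embed_edit(e) for e in edits])
    final = outputs[-1]
    best = max(range(len(candidates)), key=final.__getitem__)
    return candidates[best], final


candidates = ["rename call site", "add import", "delete dead code"]
model = TinyRNN(num_candidates=len(candidates))
edit, distribution = predict_next_edit(
    ["replace foo( with bar(", "insert y = x + 1"], candidates, model)
print(edit, [round(p, 3) for p in distribution])
```

Because the weights are random, the printed choice carries no meaning; the point is only the shape of the pipeline, in which the prediction depends on the last element of the recurrent model's output sequence.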

Assignees

Inventors

Classifications

  • Combinations of networks

  • Recurrent networks, e.g. Hopfield networks

  • Probabilistic or stochastic networks

  • characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]

  • Supervised learning
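The "Supervised learning" tag lines up with the training method of claim 10: predict from a first subset of the edit sequence, compare against an edit held out in a disjoint second subset, and train on the resulting error. The sketch below shows only that loop and is heavily simplified. Edits are reduced to integer type ids, `featurize` replaces the embedding and recurrent stages with normalized counts, and only a linear softmax layer is trained with a cross-entropy error; all names and data are hypothetical.

```python
import math


def softmax(logits: list) -> list:
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def split_history_and_target(edit_ids: list):
    """First subset: all but the last edit. Disjoint second subset: the last edit."""
    if len(edit_ids) < 2:
        raise ValueError("need at least two edits to hold one out")
    return edit_ids[:-1], edit_ids[-1]


def featurize(history: list, num_edit_types: int) -> list:
    """Assumed stand-in for the two model stages: normalized edit-type counts."""
    counts = [0.0] * num_edit_types
    for edit_id in history:
        counts[edit_id] += 1.0
    total = sum(counts)
    return [c / total for c in counts]


def scores(weights: list, features: list) -> list:
    return softmax([sum(w * x for w, x in zip(row, features)) for row in weights])


def train(sequences: list, num_edit_types: int, epochs: int = 200, lr: float = 0.5):
    """Gradient descent on the cross-entropy between prediction and held-out edit."""
    weights = [[0.0] * num_edit_types for _ in range(num_edit_types)]
    for _ in range(epochs):
        for seq in sequences:
            history, target = split_history_and_target(seq)
            x = featurize(history, num_edit_types)
            probs = scores(weights, x)
            for k, row in enumerate(weights):
                # Error signal: predicted probability minus one-hot target.
                error = probs[k] - (1.0 if k == target else 0.0)
                for j in range(num_edit_types):
                    row[j] -= lr * error * x[j]
    return weights


def predict(weights: list, history: list, num_edit_types: int) -> int:
    probs = scores(weights, featurize(history, num_edit_types))
    return max(range(len(probs)), key=probs.__getitem__)


# Made-up edit histories: type 0 twice is followed by type 1, type 2 twice by type 0.
training_data = [[0, 0, 1], [0, 0, 1], [2, 2, 0], [2, 2, 0]]
weights = train(training_data, num_edit_types=3)
print(predict(weights, [0, 0], 3), predict(weights, [2, 2], 3))
```

Holding out the last edit is one simple way to keep the two subsets disjoint; the claim itself does not say how the subsets are chosen.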

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2021026605A1 cover?
Implementations are described herein for automatically identifying, recommending, and/or automatically effecting changes to a source code base based on updates previously made to other similar code bases. Intuitively, multiple prior “migrations,” or mass updates, of complex software system code bases may be analyzed to identify changes that were made. More particularly, a particular portion or …
Who is the assignee on this patent?
X Dev Llc
What technology area does this patent fall under?
Primary CPC classification G06F8/70. Mapped technology areas include Physics.
When was this patent published?
Publication date Jan 28, 2021 (kind code A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).