Automated program repair tool

US2025355786A1 · US · A1

Patent metadata
Field               Value
Publication number  US-2025355786-A1
Application number  US-202519284252-A
Country             US
Kind code           A1
Filing date         Jul 29, 2025
Priority date       May 15, 2020
Publication date    Nov 20, 2025
Grant date          (not listed)

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An automated program repair tool utilizes a neural transformer model with attention to predict the contents of a bug repair in the context of source code having a bug of an identified bug type. The neural transformer model is trained on a large unsupervised corpus of source code using a span-masking denoising optimization objective, and fine-tuned on a large supervised dataset of triplets containing a bug-type annotation, software bug, and repair. The bug-type annotation is derived from an interprocedural static code analyzer. A bug type edit centroid is computed for each bug type and used in the inference decoding phase to generate the bug repair.
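
The span-masking denoising objective mentioned in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the patent text: the function name `span_mask`, the `<MASK_i>` sentinel format, the masking ratio, and the way span lengths are sampled. The idea shown is only the general one: contiguous spans of subtokens are hidden in the input, and the training target is the hidden subtokens.

```python
import random
from typing import List, Tuple


def span_mask(tokens: List[str], mask_ratio: float = 0.3,
              max_span: int = 3, seed: int = 0) -> Tuple[List[str], List[str]]:
    """Hide random contiguous spans of tokens behind sentinel tags.

    Returns (corrupted_input, target). The sentinel format and the sampling
    scheme are hypothetical choices, not the patent's stated procedure.
    """
    rng = random.Random(seed)
    n = len(tokens)
    budget = max(1, int(n * mask_ratio))  # how many tokens we try to mask
    masked = [False] * n
    attempts = 0
    while budget > 0 and attempts < 10 * n:
        attempts += 1
        span = min(budget, rng.randint(1, max_span), n)
        start = rng.randrange(0, n - span + 1)
        # Reject spans that overlap or touch an already masked span,
        # so that every span keeps its own sentinel.
        lo, hi = max(0, start - 1), min(n, start + span + 1)
        if any(masked[lo:hi]):
            continue
        for i in range(start, start + span):
            masked[i] = True
        budget -= span

    corrupted: List[str] = []
    target: List[str] = []
    sentinel = 0
    i = 0
    while i < n:
        if masked[i]:
            tag = f"<MASK_{sentinel}>"
            corrupted.append(tag)   # the model sees only the sentinel
            target.append(tag)      # the target lists sentinel + hidden tokens
            while i < n and masked[i]:
                target.append(tokens[i])
                i += 1
            sentinel += 1
        else:
            corrupted.append(tokens[i])
            i += 1
    return corrupted, target


if __name__ == "__main__":
    code = "if ( p != null ) { p . close ( ) ; }".split()
    noisy, labels = span_mask(code, seed=1)
    print(" ".join(noisy))
    print(" ".join(labels))
```

In a pre-training setup along these lines, the model would receive the corrupted sequence and be trained to emit the target sequence, which is enough to reconstruct the original subtokens.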

First claim

Opening claim text (preview).

What is claimed:

1. A system comprising: one or more processors; and a memory that stores one or more programs that are configured to be executed by the one or more processors, the one or more programs including instructions that: obtain a code snippet with a source code bug and an annotated bug type; predict a bug repair for the code snippet from a neural transformer model with attention based on the code snippet and the annotated bug type; and utilize the predicted bug repair to repair the code snippet.

2. The system of claim 1, wherein the one or more programs include further instructions that: identify the annotated bug type using an interprocedural static analyzer.

3. The system of claim 1, wherein the one or more programs include further instructions that: pre-train the neural transformer model with an unsupervised training dataset, the unsupervised training dataset including source code snippets.

4. The system of claim 3, wherein the one or more programs include further instructions that: fine-tune the pre-trained neural transformer model with a supervised training dataset, the supervised training dataset containing translation tasks, a translation task containing a source code with a bug, a bug type annotation of the bug, and a bug fix for the bug.

5. The system of claim 1, wherein the one or more programs include further instructions that: generate a bug edit representation for each bug within the supervised training dataset; and compute a bug edit centroid for each bug type based on bug edit representations of each bug type, to be used during inference in place of the edit representation.

6. The system of claim 5, wherein the neural transformer model includes one or more encoder blocks and one or more decoder blocks.

7. The system of claim 6, wherein the one or more programs include further instructions that: utilize the bug-fixing edit representation in at least one or more decoder blocks during training stage, and utilize bug edit centroid in at least one or more decoder blocks during inference stage, the bug edit centroid of a same bug type as the annotated bug type.

8. A computer-implemented method, comprising: pre-training a neural transformer model with an unsupervised training dataset, the unsupervised training dataset including a plurality of sequences of source code; fine-tuning the neural transformer model with a supervised training dataset, the supervised training dataset based a triplet including a code snippet with a bug, a code repair for the bug, and an annotated bug type; and applying the neural transformer model to generate a first code repair for a first code snippet having an identified bug and an identified bug type.

9. The method of claim 8, further comprising: applying a span masking function to each sequence of source code to mask out a subset of subtokens in a sequence; and wherein the neural transformer model learns original subtokens of the sequence.

10. The method of claim 8, wherein fine-tuning the neural transformer model with a supervised training dataset further comprises: generating a bug edit embedding representing edits made to correct a bug; and computing a bug type centroid for each bug type from the bug edit embeddings of a particular bug type.

11. The method of claim 10, wherein the neural transformer model with attention includes one or more encoder blocks coupled to one or more decoder blocks.

12. The method of claim 11, wherein fine-tuning the neural transformer model with supervised training dataset further comprises: concatenating the bug-fixing edit embedding with output from a last encoder block to input to a first decoder block or to encoder-decoder attention block, and concatenating the bug-fixing edit embedding with output embedding at each temporal step.

13. The method of claim 8, further comprising: identifying the annotated bug type through a static analysis of the code snippet.

14. The method of claim 8, wherein the neural transformer model includes one or more encoder blocks and one or more decoder blocks, wherein an encoder block contains a multi-head attention layer and a feed-forward neural network, wherein a decoder block contains a masked multi-head attention layer, an encoder-decoder multi-head attention layer, and a feed-forward neural network.

15. The method of claim 8, wherein the annotated bug type includes a null pointer dereference, a memory leak, an immutable cast, empty vector access, or thread safety violation.

16. A device, comprising: at least one processor and a memory; wherein the at least one processor is configured to: train a neural transformer model with attention to learn to translate a source code snippet with a bug and bug type into a code snippet with a repair for the bug by transfer learning, wherein the transfer learning pre-trains the neural transformer model from a plurality of unsupervised training data, the plurality of unsupervised training data including code snippets from a plurality of source code programs, wherein the transfer learning fine-tunes the pre-trained neural transformer model using a plurality of translation tasks, a translation task including a code snippet with a bug, a code snippet with a repair for the bug, and a bug type for the bug; and utilize the neural transformer model to predict a code repair for a second code snippet having a bug and a bug type.

17. The device of claim 16, wherein the at least one processor is further configured to: utilize a static code analyzer to identify the bug type of the second code snippet.

18. The device of claim 16, wherein the at least one processor is further configured to: generate a bug edit representation for each translation task; and compute a bug type centroid for each bug type based on each bug edit representation of a bug type.

19. The device of claim 18, wherein the neural transformer model includes one or more encoder blocks coupled to one or more decoder blocks, wherein output of a last encoder block is input into each of the decoder blocks.

20. The device of claim 19, wherein the at least one processor is further configured to: concatenate the output of the last encoder block with a bug type centroid of a bug type of a fine-tuning triplet to a first decoder block.
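
The bug type centroid described in claims 5, 10, 18, and 20 can be sketched as follows. This is a hedged illustration, not the patented implementation: it assumes the centroid is the element-wise mean of the edit embeddings that share a bug type, and it assumes concatenation happens along the feature dimension of each encoder position. The claims specify neither detail. The function names, the bug type labels, and the toy two-dimensional embeddings are all hypothetical.

```python
from typing import Dict, Iterable, List, Sequence, Tuple


def bug_type_centroids(
    examples: Iterable[Tuple[str, Sequence[float]]]
) -> Dict[str, List[float]]:
    """Average the edit embeddings of each bug type (assumed: plain mean)."""
    sums: Dict[str, List[float]] = {}
    counts: Dict[str, int] = {}
    for bug_type, embedding in examples:
        if bug_type not in sums:
            sums[bug_type] = [0.0] * len(embedding)
            counts[bug_type] = 0
        if len(embedding) != len(sums[bug_type]):
            raise ValueError("all embeddings must have the same dimension")
        for i, value in enumerate(embedding):
            sums[bug_type][i] += value
        counts[bug_type] += 1
    return {bt: [s / counts[bt] for s in vec] for bt, vec in sums.items()}


def condition_encoder_output(
    encoder_states: Sequence[Sequence[float]], centroid: Sequence[float]
) -> List[List[float]]:
    """Append the centroid to every encoder position's vector.

    Feature-wise concatenation is an assumption; the claims only say the
    encoder output is concatenated with the centroid.
    """
    return [list(state) + list(centroid) for state in encoder_states]


if __name__ == "__main__":
    # Hypothetical (bug_type, edit_embedding) pairs from fine-tuning data.
    training_edits = [
        ("NULL_POINTER_DEREFERENCE", [1.0, 3.0]),
        ("NULL_POINTER_DEREFERENCE", [3.0, 5.0]),
        ("MEMORY_LEAK", [0.0, 2.0]),
    ]
    centroids = bug_type_centroids(training_edits)
    # At inference time no fix exists yet, so the centroid for the annotated
    # bug type stands in for the per-example edit embedding.
    encoder_out = [[0.1, 0.2], [0.3, 0.4]]
    print(condition_encoder_output(encoder_out, centroids["NULL_POINTER_DEREFERENCE"]))
```

The design point this illustrates is the substitution in claim 5: during training each example has its own edit representation, while at inference the per-type centroid is used in its place because the repair is not yet known.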

Assignees

Microsoft Technology Licensing LLC

Inventors

Classifications

  • Machine learning · CPC title

  • Transfer learning · CPC title

  • Weakly supervised learning, e.g. semi-supervised or self-supervised learning · CPC title

  • Supervised learning · CPC title

  • Auto-encoder networks; Encoder-decoder networks · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2025355786A1 cover?
An automated program repair tool utilizes a neural transformer model with attention to predict the contents of a bug repair in the context of source code having a bug of an identified bug type. The neural transformer model is trained on a large unsupervised corpus of source code using a span-masking denoising optimization objective, and fine-tuned on a large supervised dataset of triplets conta…
Who is the assignee on this patent?
Microsoft Technology Licensing LLC
What technology area does this patent fall under?
Primary CPC classification G06F11/362. Mapped technology areas include Physics.
When was this patent published?
Publication date: Nov 20, 2025 (kind code A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).