What technology area does this patent fall under?

The primary CPC classification is G06F8/35 (model driven). Mapped technology areas include Physics (CPC section G).

When was this patent published?

Publication date Mar 3, 2026 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 7 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Machine learning model based ranking of generated code

US12566593B2 · US · B2

Patent metadata
Field	Value
Publication number	US-12566593-B2
Application number	US-202318464536-A
Country	US
Kind code	B2
Filing date	Sep 11, 2023
Priority date	Jun 28, 2023
Publication date	Mar 3, 2026
Grant date	Mar 3, 2026

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A generative AI based pipeline has been created that ranks generated responses that are candidate software patches. The ranking is based on predicted quality measures of code fragments within a corresponding prompt to a generated AI model. The predicted quality measures are generated by a machine learning model that has been trained based on features that are values/measures of similarity metrics between code fragments, between code fragment changes, between code structures, and/or between changes of code structures.
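The ranking idea in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the two features (a length ratio and a `difflib` text-similarity ratio), the function names, and the fixed heuristic `toy_quality_model` standing in for the trained machine learning model are all assumptions made for illustration.

```python
from difflib import SequenceMatcher
from typing import Callable, List, Tuple


def similarity_features(prompt_code: str, candidate: str) -> List[float]:
    """Return two illustrative similarity features for one candidate patch."""
    # Ratio of lengths between the generated fragment and the prompt fragment.
    length_ratio = len(candidate) / max(len(prompt_code), 1)
    # Text similarity in [0, 1]; difflib stands in for an edit-distance metric.
    text_similarity = SequenceMatcher(None, prompt_code, candidate).ratio()
    return [length_ratio, text_similarity]


def toy_quality_model(features: List[float]) -> float:
    """Hypothetical stand-in for a trained quality predictor (assumption)."""
    length_ratio, text_similarity = features
    # Assumed heuristic: favor candidates close in text and length to the prompt code.
    return text_similarity - abs(1.0 - length_ratio)


def rank_candidates(
    prompt_code: str,
    candidates: List[str],
    model: Callable[[List[float]], float] = toy_quality_model,
) -> List[Tuple[float, str]]:
    """Score every candidate and return (score, candidate) pairs, best first."""
    scored = [(model(similarity_features(prompt_code, c)), c) for c in candidates]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Hypothetical example data: a flawed line and two generated candidate patches.
flawed = "query = 'SELECT * FROM users WHERE id = ' + user_id"
candidates = [
    "query = 'SELECT * FROM users WHERE id = %s'",
    "pass",
]
for score, patch in rank_candidates(flawed, candidates):
    print(f"{score:.3f}  {patch}")
```

In a real pipeline the heuristic scorer would be replaced by a model trained on labeled examples, and the feature vector would include the additional metrics listed in the claims.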

First claim

Opening claim text (preview).

The invention claimed is:

1. A method comprising: obtaining a plurality of code fragments generated from a generative artificial intelligence (AI) model, wherein the plurality of code fragments corresponds to a set of one or more prompts input into the generative AI model; determining values of features for input to a machine learning model trained to predict values of generated code fragments, wherein the features are metrics of similarity among code fragments in the set of one or more prompts and the generated code fragments and the metrics of similarity measure at least one of similarity of code fragments and similarity of changes to code fragments and wherein the metrics of similarity correspond to at least two of ratios of lengths among parts of a prompt, ratio of lengths between a generated code fragment and a part of a corresponding prompt, ratio of edit distances between different parts of a prompt and between a part of a prompt and a corresponding generated code fragment, ratio of edit operations between different parts of a prompt and between a part of a prompt and a corresponding generated code fragment, and measure of correlation of positions of edits between different parts of a prompt; and ranking the generated code fragments based on the predicted values from the machine learning model.

2. The method of claim 1, wherein determining the values of features for input to the machine learning model comprises calculating, for each prompt, at least two of similarity of text of a code fragment in the prompt and text of a corresponding one of the generated code fragments, similarity of code structure between the code fragment in the prompt and the corresponding one of the generated code fragment, and similarity of lengths of the code fragment in the prompt and the corresponding one of the generated code fragment.

3. The method of claim 1, wherein determining the values of features for input to the machine learning model comprises calculating, for each prompt, at least two of similarity of textual changes between a pair of reference code fragments in the prompt and textual changes between a code fragment in the prompt and a corresponding one of the generated code fragments, similarity of structural changes between the pair of reference code fragments and structural changes between the code fragment in the prompt and the corresponding one of the generated code fragments, similarity of text of the code fragment in the prompt and a first of the pair of reference code fragments, similarity of text of a second of the pair of reference code fragments and the generated code fragment, similarity of code structure of the code fragment in the prompt and the first of the pair of reference code fragments, and similarity of code structure of the second of the pair of reference code fragments and the corresponding one of the generated code fragment.

4. The method of claim 1, further comprising, for each prompt, generating a code structure signature for each code fragment in the prompt and for the corresponding one of the generated code fragments, wherein determining the values of features is based, at least in part, on the code structure signatures.

5. The method of claim 4, wherein generating the code structure signature comprises generating a representation of a code fragment without variability of names.

6. The method of claim 4, wherein generating the code structure signature comprises generating a representation of a code fragment that replaces each identifier name with a representative token for identifiers and each variable name with a representative token for variables.

7. The method of claim 4, wherein determining the values of features based, at least in part, on the code structure signatures comprises calculating values for a subset of the similarity metrics of code fragments as represented by the code structure signatures.

8. The method of claim 1, wherein the machine learning model is an ensemble of weak prediction models.

9. The method of claim 1, wherein the machine learning model is one or more regression models.

10. The method of claim 1, wherein the generative AI model is a language model with a transformer architecture.

11. A non-transitory, machine-readable medium having program code stored thereon, the program code comprising instructions to: obtain a plurality of code fragments generated from a generative artificial intelligence (AI) model, wherein the plurality of code fragments corresponds to a set of one or more prompts input into the generative AI model; determine values of features for input to a machine learning model trained to predict values of the generated code fragments, wherein the features are metrics of similarity among the code fragments in the set of one or more prompts and the generated code fragments and the metrics of similarity measure at least one of similarity of code fragments and similarity of changes to code fragments and wherein the metrics of similarity correspond to at least two of ratios of lengths among parts of a prompt, ratio of lengths between a generated code fragment and a part of a corresponding prompt, ratio of edit distances between different parts of a prompt and between a part of a prompt and a corresponding generated code fragment, ratio of edit operations between different parts of a prompt and between a part of a prompt and a corresponding generated code fragment, and measure of correlation of positions of edits between different parts of a prompt; and rank the generated code fragments based on the predicted values output from the machine learning model.

12. The non-transitory, machine-readable medium of claim 11, wherein the instructions to determine the values of features for input to the machine learning model comprise instructions to calculate, for each prompt, at least two of similarity of text of a code fragment in the prompt and text of a corresponding one of the generated code fragments which corresponds to the prompt, similarity of code structure between the code fragment in the prompt and the corresponding one of the generated code fragments, similarity of lengths of the code fragment in the prompt and the corresponding one of the generated code fragment.

13. The non-transitory, machine-readable medium of claim 11, wherein the program code further has stored thereon instructions to, for each prompt, generate a code structure signature for each code fragment in the prompt and for the corresponding one of the generated code fragments, wherein the instructions to determine the values of features is based, at least in part, on the code structure signatures.

14. The non-transitory, machine-readable medium of claim 13, wherein the instructions to generate the code structure signature comprise instructions to generate a representation of a code fragment without variability of names.

15. The non-transitory, machine-readable medium of claim 13, wherein the instructions to generate the code structure signature comprise instructions to generate a representation of a code fragment that replaces each identifier name with a representative token for identifiers and each variable name with a representative token for variables.

16. The non-transitory, machine-readable medium of claim 13, wherein the instructions to determine the values of features based, at least in part, on the code structure signatures comprise instructions to calculate values for a subset of the similarity metrics of code fragments as represented by the code structure signatures.

17. The non-transitory, machine-
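Two of the ideas in the claims above, the name-independent code structure signature (claims 5 and 6) and the ratio of edit distances (claim 1), can be sketched in Python as follows. This is only an illustration under stated assumptions: the regex tokenizer, the use of Python keywords to decide what counts as an identifier, the single `ID` placeholder (claim 6 describes separate tokens for identifiers and variables), and the function names are all hypothetical and not taken from the patent.

```python
import keyword
import re

# Crude tokenizer (assumption): identifiers, integer literals, or any other single symbol.
TOKEN_PATTERN = re.compile(r"[A-Za-z_]\w*|\d+|\S")
IDENTIFIER_PATTERN = re.compile(r"[A-Za-z_]\w*")


def structure_signature(code: str) -> str:
    """Replace every non-keyword name with the placeholder token 'ID'."""
    tokens = []
    for token in TOKEN_PATTERN.findall(code):
        if IDENTIFIER_PATTERN.fullmatch(token) and not keyword.iskeyword(token):
            tokens.append("ID")
        else:
            tokens.append(token)
    return " ".join(tokens)


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a rolling row."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(
                min(previous[j] + 1, current[j - 1] + 1, previous[j - 1] + cost)
            )
        previous = current
    return previous[-1]


def edit_distance_ratio(
    ref_before: str, ref_after: str, target: str, generated: str
) -> float:
    """Compare the size of the reference change with the size of the generated change."""
    reference_change = edit_distance(ref_before, ref_after)
    generated_change = edit_distance(target, generated)
    # Guard against division by zero when the generated fragment is unchanged.
    return reference_change / max(generated_change, 1)


# Two fragments that differ only in names share one signature.
print(structure_signature("total = count + 1"))
print(structure_signature("x = y + 1"))
```

Computing the same similarity metrics on signatures rather than raw text, as claim 7 describes, would make the features insensitive to naming differences between the prompt fragments and the generated fragment.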

Assignees

Veracode Inc

Inventors

Classifications

G06F8/33
Intelligent editors · CPC title
G06F8/70
Software maintenance or management · CPC title
G06F8/30
Creation or generation of source code · CPC title
G06F8/73
Program documentation · CPC title
G06F8/35 · Primary
model driven · CPC title

Patent family

Related publications grouped by family.

View patent family 94125967

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12566593B2 cover?: A generative AI based pipeline has been created that ranks generated responses that are candidate software patches. The ranking is based on predicted quality measures of code fragments within a corresponding prompt to a generated AI model. The predicted quality measures are generated by a machine learning model that has been trained based on features that are values/measures of similarity metri…
Who is the assignee on this patent?: Veracode Inc

Related patents

Systems and methods for generating code using language models trained on computer code

Systems and methods for generating natural language using language models trained on computer code

Syntax subtree code strengthening

Pull request risk prediction for bug-introducing changes

Deep q-network reinforcement learning for testing case selection and prioritization

Systems and methods for detecting and remedying software anomalies

Source code revision control with selectable file portion synchronization
