Fine-tuning large language model(s) using reinforcement learning with search engine feedback
US-2025190506-A1 · Jun 12, 2025 · US
US12530541B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12530541-B2 |
| Application number | US-202418428530-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jan 31, 2024 |
| Priority date | Oct 13, 2023 |
| Publication date | Jan 20, 2026 |
| Grant date | Jan 20, 2026 |
Official abstract text for this publication.
Systems and methods for editing a large language model are provided. The large language model generates a sequence of tokens, a first probability of a pre-edit output based on the sequence of tokens, and a second probability of a target output based on the sequence of tokens. A loss function is provided based on the first probability and the second probability. A plurality of gradients of the large language model with respect to the loss function is computed. An edit location of the large language model is determined based on the plurality of gradients. The large language model is edited by editing weights at the edit location of the large language model, such that the updated large language model generates the target output for an input including the sequence of words.
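The steps in the abstract can be illustrated with a small, self-contained sketch. Everything here is an assumption made for illustration and is not taken from the patent: the "model" is a toy stack of linear residual layers in NumPy rather than a real large language model, the loss is taken to be log p(pre-edit) minus log p(target), the edit location is chosen as the layer with the largest gradient norm, and the edit itself is plain gradient descent restricted to that layer. The names `forward`, `edit_loss`, `layer_gradients`, and `edit_model` are hypothetical.

```python
import numpy as np

def forward(weights, unembed, x):
    """Toy stand-in for an LLM: linear residual layers, then log-softmax."""
    hs = [x]  # hs[l] is the input to layer l
    for W in weights:
        hs.append(hs[-1] + W @ hs[-1])
    logits = unembed @ hs[-1]
    shifted = logits - logits.max()  # numerically stable log-softmax
    log_probs = shifted - np.log(np.exp(shifted).sum())
    return hs, log_probs

def edit_loss(log_probs, pre_edit, target):
    # Assumed loss: log p(pre-edit) - log p(target); lower favours the target.
    return log_probs[pre_edit] - log_probs[target]

def layer_gradients(weights, unembed, x, pre_edit, target):
    """Backpropagate the assumed loss to every layer's weight matrix."""
    hs, log_probs = forward(weights, unembed, x)
    g_logits = np.zeros_like(log_probs)  # d loss / d logits
    g_logits[pre_edit] += 1.0
    g_logits[target] -= 1.0
    g_h = unembed.T @ g_logits  # d loss / d final hidden state
    grads = [None] * len(weights)
    for l in reversed(range(len(weights))):
        grads[l] = np.outer(g_h, hs[l])   # d loss / d W_l
        g_h = g_h + weights[l].T @ g_h    # d loss / d hs[l]
    return grads, log_probs

def edit_model(weights, unembed, x, pre_edit, target, lr=0.01, max_steps=200):
    """Pick one layer by gradient norm (assumed rule) and edit only its weights."""
    grads, _ = layer_gradients(weights, unembed, x, pre_edit, target)
    edit_layer = int(np.argmax([np.linalg.norm(g) for g in grads]))
    edited = [W.copy() for W in weights]
    for _ in range(max_steps):
        grads, log_probs = layer_gradients(edited, unembed, x, pre_edit, target)
        if int(np.argmax(log_probs)) == target:
            break  # the toy model now prefers the target output
        edited[edit_layer] -= lr * grads[edit_layer]
    return edited, edit_layer

# Hypothetical toy setup: random weights and a random input vector.
rng = np.random.default_rng(0)
dim, vocab, n_layers = 8, 5, 3
weights = [0.1 * rng.standard_normal((dim, dim)) for _ in range(n_layers)]
unembed = rng.standard_normal((vocab, dim))
x = rng.standard_normal(dim)

_, lp_before = forward(weights, unembed, x)
pre_edit = int(np.argmax(lp_before))   # what the toy model currently outputs
target = (pre_edit + 1) % vocab        # an arbitrary different output to edit toward
edited, edit_layer = edit_model(weights, unembed, x, pre_edit, target)
_, lp_after = forward(edited, unembed, x)
```

In this sketch only the selected layer's weights change, which mirrors the idea of editing weights at a single located position while leaving the rest of the model untouched. It localizes by layer only; the token dimension of the edit location is not modeled.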
Opening claim text (preview).
What is claimed is:
1. A method of editing a large language model, the method comprising: receiving, via a data interface, a sequence of words, a target output, and a pre-edit output; generating, using the large language model, a sequence of tokens of the sequence of words; generating, using the large language model, a first probability of the pre-edit output based on the sequence of tokens; generating, using the large language model, a second probability of the target output based on the sequence of tokens; providing a loss function based on the first probability of the pre-edit output and the second probability of the target output; computing a plurality of gradients of the large language model with respect to the loss function; determining an edit location of the large language model based on the plurality of gradients; and editing the large language model by editing weights at the edit location of the large language model, such that the updated large language model generates the target output for an input including the sequence of words.
2. The method of claim 1, wherein the sequence of words is associated with a non-binary proposition.
3. The method of claim 1, wherein the sequence of words is associated with a Boolean classification prompt, and wherein the edit location includes an edit token location associated with a subject of the Boolean classification prompt.
4. The method of claim 3, wherein the subject is associated with a plurality of tokens including a last token, and wherein the edit token location is associated with a token of the plurality of tokens before the last token.
5. The method of claim 1, wherein the computing the plurality of gradients of the large language model with respect to the loss function includes: computing the plurality of gradients of the large language model with respect to the loss function over a group of tokens from the sequence of tokens and a first group of layers of the large language model.
6. The method of claim 5, wherein the determining the edit location of the large language model based on the plurality of gradients includes: determining an edit layer location by selecting from a second group of layers of the large language model based on the plurality of gradients associated with the second group of layers.
7. The method of claim 1, wherein the editing the large language includes: editing the weights at the edit location without using a subject label.
8. A system for language model editing, the system comprising: a memory that stores a large language model and a plurality of processor-executable instructions; a communication interface that receives a sequence of words, a target output, and a pre-edit output; one or more hardware processors that read and execute the plurality of processor-executable instructions from the memory to perform operations comprising: generating, using the large language model, a sequence of tokens of the sequence of words; generating, using the large language model, a first probability of the pre-edit output based on the sequence of tokens; generating, using the large language model, a second probability of the target output based on the sequence of tokens; providing a loss function based on the first probability of the pre-edit output and the second probability of the target output; computing a plurality of gradients of the large language model with respect to the loss function; determining an edit location of the large language model based on the plurality of gradients; and editing the large language model by editing weights at the edit location of the large language model, such that the updated large language model generates the target output for an input including the sequence of words.
9. The system of claim 8, wherein the sequence of words is associated with a non-binary proposition.
10. The system of claim 8, wherein the sequence of words is associated with a Boolean classification prompt, and wherein the edit location includes an edit token location associated with a subject of the Boolean classification prompt.
11. The system of claim 10, wherein the subject is associated with a plurality of tokens including a last token, and wherein the edit token location is associated with a token of the plurality of tokens before the last token.
12. The system of claim 10, wherein the computing the plurality of gradients of the large language model with respect to the loss function includes: computing the plurality of gradients of the large language model with respect to the loss function over a group of tokens from the sequence of tokens and a first group of layers of the large language model.
13. The system of claim 12, wherein the determining the edit location of the large language model based on the plurality of gradients includes: determining an edit layer location by selecting from a second group of layers of the large language model based on the plurality of gradients associated with the second group of layers.
14. The system of claim 8, wherein the editing the weights at the edit location of the large language includes: editing the weights at the edit location without using a subject label.
15. A non-transitory machine-readable medium comprising a plurality of machine-executable instructions which, when executed by one or more processors, are adapted to cause the one or more processors to perform operations comprising: receiving, via a data interface, a sequence of words, a target output, and a pre-edit output; generating, using the large language model, a sequence of tokens of the sequence of words; generating, using the large language model, a first probability of the pre-edit output based on the sequence of tokens; generating, using the large language model, a second probability of the target output based on the sequence of tokens; providing a loss function based on the first probability of the pre-edit output and the second probability of the target output; computing a plurality of gradients of the large language model with respect to the loss function; determining an edit location of the large language model based on the plurality of gradients; and editing the large language model by editing weights at the edit location of the large language model, such that the updated large language model generates the target output for an input including the sequence of words.
16. The non-transitory machine-readable medium of claim 15, wherein the sequence of words is associated with a non-binary proposition.
17. The non-transitory machine-readable medium of claim 15, wherein the sequence of words is associated with a Boolean classification prompt, and wherein the edit location includes an edit token location associated with a subject of the Boolean classification prompt.
18. The non-transitory machine-readable medium of claim 17, wherein the subject is associated with a plurality of tokens including a last token, and wherein the edit token location is associated with a token of the plurality of tokens before the last token.
19. The non-transitory machine-readable medium of claim 15, wherein the computing the plurality of gradients of the large language model with respect to the loss function includes: computing the plurality of gradients of the large language model with respect to the loss function over a group of tokens from the sequence of tokens and a first group of layers of the large language model.
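The dependent claims describe restricting the edit location to subject tokens before the last subject token and to a chosen group of layers. A minimal sketch of that selection is shown here, assuming gradient magnitudes have already been collected into a table indexed by token and layer. The function `pick_edit_location`, the argmax selection rule, and the numbers in `grad_norms` are all hypothetical and only illustrate the indexing; the claims do not state how the location is chosen within the allowed tokens and layers.

```python
import numpy as np

def pick_edit_location(grad_norms, subject_tokens, candidate_layers):
    """Return (token, layer) with the largest gradient magnitude in the allowed region.

    grad_norms[t, l] is a hypothetical gradient magnitude at token t and layer l.
    """
    # Use subject tokens before the last one; fall back to the full subject
    # if it has only a single token.
    token_pool = subject_tokens[:-1] or subject_tokens
    sub = grad_norms[np.ix_(token_pool, candidate_layers)]
    t_idx, l_idx = np.unravel_index(np.argmax(sub), sub.shape)
    return token_pool[t_idx], candidate_layers[l_idx]

# Made-up magnitudes for 4 tokens and 3 layers.
grad_norms = np.array([
    [0.1, 0.2, 0.1],
    [0.3, 0.9, 0.4],
    [0.2, 0.5, 1.5],
    [0.1, 0.1, 0.1],
])
subject_tokens = [1, 2]     # token positions covering the subject
candidate_layers = [0, 1]   # the group of layers considered for the edit
location = pick_edit_location(grad_norms, subject_tokens, candidate_layers)
```

Note that the globally largest entry (token 2, layer 2) is deliberately excluded in this example, because token 2 is the last subject token and layer 2 is outside the candidate group.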
Probabilistic graphical models, e.g. probabilistic networks · CPC title
Editing, e.g. inserting or deleting · CPC title
Lexical analysis, e.g. tokenisation or collocates · CPC title
Processing or translation of natural language (natural language analysis G06F40/20; semantic analysis G06F40/30) · CPC title