What technology area does this patent fall under?

Primary CPC classification G06N5/013. Mapped technology areas include Physics.

When was this patent published?

Publication date Thu Jun 19 2025 00:00:00 GMT+0000 (Coordinated Universal Time) (A1). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Large language model verification

US2025200392A1 · US · A1

Patent metadata
Field	Value
Publication number	US-2025200392-A1
Application number	US-202318541988-A
Country	US
Kind code	A1
Filing date	Dec 15, 2023
Priority date	Dec 15, 2023
Publication date	Jun 19, 2025
Grant date	—

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Verifying large language model responses involves obtaining a query and its corresponding answer from a large language model. This conversational text is then fed into a second large language model, which translates the answer into first-order logic. The verification process uses an automated theorem prover. It checks the validity of this logic translation by determining the unsatisfiability of two scenarios: one where the negation of the logic translation and domain-specific logic formulas are combined, and another where the logic translation itself is combined with these formulas. Based on this analysis, the theorem prover ascertains whether the translated answer is valid, invalid, or neither. The final step is communicating this verification status through an appropriate output medium, such as a graphical user interface, a database, or a report, providing a structured and methodical approach to assessing the accuracy and reliability of language model responses.

First claim

Opening claim text (preview).

What is claimed is: 1 . A method comprising: converting a set of natural language text documents into a domain logic model, the domain logic model comprising a set of first-order logic formulas; enumerating a set of models of a first-order logic formula of the set of first-order logic formulas of the domain logic model; minimizing a particular model of the set of models to yield a minimized model that replaces the particular model in the set of models; translating the set of models comprising the minimized model to a set of natural-language texts; providing the set of natural-language texts in a graphical user interface; obtaining a conversational text, the conversational text comprising a query of a first large language model and an answer generated by the first large language model in response to the query; using the conversational text to obtain a first-order logic translation of the answer from a second large language model; determining whether the first-order logic translation of the answer is valid based on using a satisfiability modulo theories solver to determine whether a first query comprising a negation of the first-order logic translation of the answer is unsatisfiable; and determining whether the first-order logic translation of the answer is invalid based on using a satisfiability modulo theories solver to determine whether a second query comprising the first-order logic translation of the answer is unsatisfiable; and outputting an indication of whether the answer is valid, invalid, or neither valid nor invalid to a graphical user interface, a database, or a report. 2 . The method of claim 1 , further comprising: using the conversational text to obtain a plurality of first-order logic translations of the answer from one or more large language models; wherein the plurality of first-order logic translations comprises the first-order logic translation; and wherein the one or more large language models comprises the first large language model; partitioning the plurality of first-order logic translations of the answer based on logical equivalence with respect to the set of first order logical formulas of the domain logic model into one or more non-overlapping sets of the plurality of first-order logic translations; and selecting the first-order logical translation of the answer from the one or more non-overlapping sets of the plurality of first-order logical translations. 3 . The method of claim 1 , translating the set of natural-language statements into a plurality of first-order logic formulas; and using a satisfiability modulo theories solver to extract an unsatisfiable core formula from the plurality of first-order logic formulas; translating the unsatisfiable core formula to a natura-language text; and providing the natural-language text in a graphical user interface. 4 . A method for large language model verification comprising: obtaining a conversational text, the conversational text comprising a query of a first large language model and an answer generated by the first large language model in response to the query; using the conversational text to obtain a first-order logic translation of the answer from a second large language model; determining whether the first-order logic translation of the answer is valid based on using an automated theorem prover to determine whether a first query comprising a negation of the first-order logic translation of the answer is unsatisfiable; and determining whether the first-order logic translation of the answer is invalid based on using an automated theorem prover to determine whether a second query comprising the first-order logic translation of the answer is unsatisfiable; and outputting an indication of whether the answer is valid, invalid, or neither valid nor invalid to a graphical user interface, a database, or a report. 5 . The method of claim 4 , further comprising: using the conversational text to obtain a plurality of first-order logic translations of the answer from one or more large language models; wherein the plurality of first-order logic translations comprises the first-order logic translation; and wherein the one or more large language models comprises the first large language model; partitioning the plurality of first-order logic translations of the answer based on logical equivalence with respect to the set of first order logical formulas of the domain logic model into one or more non-overlapping sets of the plurality of first-order logic translations; and selecting the first-order logical translation of the answer from the one or more non-overlapping sets of the plurality of first-order logical translations. 6 . The method of claim 5 , wherein selecting the first-order logical translation of the answer from the one or more non-overlapping sets of the plurality of first-order logical translations is based on: identifying a non-overlapping set of the one or more non-overlapping sets that comprises all of the plurality of first-order logic translations; and selecting the first-order logic translation from the non-overlapping set. 7 . The method of claim 5 , wherein selecting the first-order logical translation of the answer from the one or more non-overlapping sets of the plurality of first-order logical translations is based on: identifying a non-overlapping set of the one or more non-overlapping sets that comprises more than half of the plurality of first-order logic translations; and selecting the first-order logic translation from the non-overlapping set. 8 . The method of claim 4 , further comprising: converting a set of natural language text documents into the set of first-order logic formulas of the domain logic model. 9 . The method of claim 4 , further comprising: enumerating a set of models of a first-order logic formula of the set of first-order logic formulas of the domain logic model; minimizing a particular model of the set of models to yield a minimized model that replaces the particular model in the set of models; translating the set of models comprising the minimized model to a set of natural-language texts; and providing the set of natural-language texts in a graphical user interface. 10 . The method of claim 4 , further comprising: translating a set of natural-language statements into a plurality of first-order logic formulas; and using an automated theorem prover to extract an unsatisfiable core formula from the plurality of first-order logic formulas; translating the unsatisfiable core formula to a natura-language text; and providing the natural-language text in a graphical user interface. 11 . The method of claim 4 , wherein: the first query comprises is unsatisfiable; the second query is satisfiable; the indication output indicates that the answer is valid. 12 . The method of claim 4 , further comprising: the first query is satisfiable; the second query is unsatisfiable; the indication output indicates that the answer is invalid. 13 . The method of claim 4 , further comprising: the first query is satisfiable; the second query is satisfiable; and the indication output indicates that the answer is neither valid nor invalid. 14 . A system comprising: a first one or more programmable electronic devices to implement an automated theorem prover; and a second one or more programmable electronic devices to implement a large language model verification system, the large language model verification system comprising instructions which when executed cause the large language model verification system to perform: obtaining a conversational text, the con

Assignees

Amazon Tech Inc

Inventors

Classifications

G06F40/20
Natural language analysis (semantic analysis of natural language G06F40/30) · CPC title
G06F40/56
Natural language generation · CPC title
G06F40/35
Discourse or dialogue representation · CPC title
G06F40/30
Semantic analysis · CPC title
G06N3/08
Learning methods · CPC title

Patent family

Related publications grouped by family.

View patent family 94284128

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2025200392A1 cover?: Verifying large language model responses involves obtaining a query and its corresponding answer from a large language model. This conversational text is then fed into a second large language model, which translates the answer into first-order logic. The verification process uses an automated theorem prover. It checks the validity of this logic translation by determining the unsatisfiability of…
Who is the assignee on this patent?: Amazon Tech Inc
What technology area does this patent fall under?: Primary CPC classification G06N5/013. Mapped technology areas include Physics.
When was this patent published?: Publication date Thu Jun 19 2025 00:00:00 GMT+0000 (Coordinated Universal Time) (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).