Shadow satisfiability modulo theories solver systems
US-2024330709-A1 · Oct 3, 2024 · US
US2025200392A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2025200392-A1 |
| Application number | US-202318541988-A |
| Country | US |
| Kind code | A1 |
| Filing date | Dec 15, 2023 |
| Priority date | Dec 15, 2023 |
| Publication date | Jun 19, 2025 |
| Grant date | — |
A practical reading order for non-experts. Skip the full description unless you need deep technical detail.
What the patent document calls the invention.
A short plain-language summary of the technical disclosure.
Who owns or filed the patent and who is credited as inventor.
Filing, priority, publication, and grant dates set the timeline.
The legal scope of protection — read this for what is actually claimed.
Technology tags used to group this patent with similar filings.
Prior art links and similar publications in this corpus.
Official abstract text for this publication.
Verifying large language model responses involves obtaining a query and its corresponding answer from a large language model. This conversational text is then fed into a second large language model, which translates the answer into first-order logic. The verification process uses an automated theorem prover. It checks the validity of this logic translation by determining the unsatisfiability of two scenarios: one where the negation of the logic translation and domain-specific logic formulas are combined, and another where the logic translation itself is combined with these formulas. Based on this analysis, the theorem prover ascertains whether the translated answer is valid, invalid, or neither. The final step is communicating this verification status through an appropriate output medium, such as a graphical user interface, a database, or a report, providing a structured and methodical approach to assessing the accuracy and reliability of language model responses.
Opening claim text (preview).
What is claimed is: 1 . A method comprising: converting a set of natural language text documents into a domain logic model, the domain logic model comprising a set of first-order logic formulas; enumerating a set of models of a first-order logic formula of the set of first-order logic formulas of the domain logic model; minimizing a particular model of the set of models to yield a minimized model that replaces the particular model in the set of models; translating the set of models comprising the minimized model to a set of natural-language texts; providing the set of natural-language texts in a graphical user interface; obtaining a conversational text, the conversational text comprising a query of a first large language model and an answer generated by the first large language model in response to the query; using the conversational text to obtain a first-order logic translation of the answer from a second large language model; determining whether the first-order logic translation of the answer is valid based on using a satisfiability modulo theories solver to determine whether a first query comprising a negation of the first-order logic translation of the answer is unsatisfiable; and determining whether the first-order logic translation of the answer is invalid based on using a satisfiability modulo theories solver to determine whether a second query comprising the first-order logic translation of the answer is unsatisfiable; and outputting an indication of whether the answer is valid, invalid, or neither valid nor invalid to a graphical user interface, a database, or a report. 2 . The method of claim 1 , further comprising: using the conversational text to obtain a plurality of first-order logic translations of the answer from one or more large language models; wherein the plurality of first-order logic translations comprises the first-order logic translation; and wherein the one or more large language models comprises the first large language model; partitioning the plurality of first-order logic translations of the answer based on logical equivalence with respect to the set of first order logical formulas of the domain logic model into one or more non-overlapping sets of the plurality of first-order logic translations; and selecting the first-order logical translation of the answer from the one or more non-overlapping sets of the plurality of first-order logical translations. 3 . The method of claim 1 , translating the set of natural-language statements into a plurality of first-order logic formulas; and using a satisfiability modulo theories solver to extract an unsatisfiable core formula from the plurality of first-order logic formulas; translating the unsatisfiable core formula to a natura-language text; and providing the natural-language text in a graphical user interface. 4 . A method for large language model verification comprising: obtaining a conversational text, the conversational text comprising a query of a first large language model and an answer generated by the first large language model in response to the query; using the conversational text to obtain a first-order logic translation of the answer from a second large language model; determining whether the first-order logic translation of the answer is valid based on using an automated theorem prover to determine whether a first query comprising a negation of the first-order logic translation of the answer is unsatisfiable; and determining whether the first-order logic translation of the answer is invalid based on using an automated theorem prover to determine whether a second query comprising the first-order logic translation of the answer is unsatisfiable; and outputting an indication of whether the answer is valid, invalid, or neither valid nor invalid to a graphical user interface, a database, or a report. 5 . The method of claim 4 , further comprising: using the conversational text to obtain a plurality of first-order logic translations of the answer from one or more large language models; wherein the plurality of first-order logic translations comprises the first-order logic translation; and wherein the one or more large language models comprises the first large language model; partitioning the plurality of first-order logic translations of the answer based on logical equivalence with respect to the set of first order logical formulas of the domain logic model into one or more non-overlapping sets of the plurality of first-order logic translations; and selecting the first-order logical translation of the answer from the one or more non-overlapping sets of the plurality of first-order logical translations. 6 . The method of claim 5 , wherein selecting the first-order logical translation of the answer from the one or more non-overlapping sets of the plurality of first-order logical translations is based on: identifying a non-overlapping set of the one or more non-overlapping sets that comprises all of the plurality of first-order logic translations; and selecting the first-order logic translation from the non-overlapping set. 7 . The method of claim 5 , wherein selecting the first-order logical translation of the answer from the one or more non-overlapping sets of the plurality of first-order logical translations is based on: identifying a non-overlapping set of the one or more non-overlapping sets that comprises more than half of the plurality of first-order logic translations; and selecting the first-order logic translation from the non-overlapping set. 8 . The method of claim 4 , further comprising: converting a set of natural language text documents into the set of first-order logic formulas of the domain logic model. 9 . The method of claim 4 , further comprising: enumerating a set of models of a first-order logic formula of the set of first-order logic formulas of the domain logic model; minimizing a particular model of the set of models to yield a minimized model that replaces the particular model in the set of models; translating the set of models comprising the minimized model to a set of natural-language texts; and providing the set of natural-language texts in a graphical user interface. 10 . The method of claim 4 , further comprising: translating a set of natural-language statements into a plurality of first-order logic formulas; and using an automated theorem prover to extract an unsatisfiable core formula from the plurality of first-order logic formulas; translating the unsatisfiable core formula to a natura-language text; and providing the natural-language text in a graphical user interface. 11 . The method of claim 4 , wherein: the first query comprises is unsatisfiable; the second query is satisfiable; the indication output indicates that the answer is valid. 12 . The method of claim 4 , further comprising: the first query is satisfiable; the second query is unsatisfiable; the indication output indicates that the answer is invalid. 13 . The method of claim 4 , further comprising: the first query is satisfiable; the second query is satisfiable; and the indication output indicates that the answer is neither valid nor invalid. 14 . A system comprising: a first one or more programmable electronic devices to implement an automated theorem prover; and a second one or more programmable electronic devices to implement a large language model verification system, the large language model verification system comprising instructions which when executed cause the large language model verification system to perform: obtaining a conversational text, the con
Related publications grouped by family.
Answers are generated from the same data shown on this page.