In-context text-to-SQL with reduced labeled data

US12554709B2 · US · B2

Patent metadata
Publication number: US-12554709-B2
Application number: US-202318225277-A
Country: US
Kind code: B2
Filing date: Jul 24, 2023
Priority date: Apr 28, 2023
Publication date: Feb 17, 2026
Grant date: Feb 17, 2026

Abstract

Official abstract text for this publication.

Aspects of the disclosure are directed to methods, systems, and non-transitory computer readable media for automatically generating queries on a database from natural language text using in-context learning to leverage zero-shot and few-shot adaptation capabilities of large language models (LLMs). The methods, systems, and non-transitory computer readable media can consider database information, employ execution based consistency decoding, and employ a mixture of prompts and/or LLMs.
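
The execution-based consistency decoding named in the abstract can be illustrated with a minimal sketch. It assumes an in-memory SQLite database and a hand-written list of candidate queries standing in for LLM samples; the function name `select_consistent_query`, the `singer` table, and the strict-majority threshold are illustrative assumptions, not details taken from the patent.

```python
import sqlite3
from collections import Counter


def select_consistent_query(conn, candidates, threshold=0.5):
    """Return the first candidate whose execution result is shared by more
    than `threshold` of the candidates that executed without error.
    Returns None when no result reaches that share."""
    executed = []  # (sql, hashable result) pairs
    for sql in candidates:
        try:
            rows = conn.execute(sql).fetchall()
        except sqlite3.Error:
            continue  # drop candidates that fail to execute
        # Sort rows so that row order does not affect the comparison.
        executed.append((sql, tuple(sorted(rows, key=repr))))
    if not executed:
        return None
    counts = Counter(result for _, result in executed)
    best_result, votes = counts.most_common(1)[0]
    if votes / len(executed) <= threshold:
        return None
    return next(sql for sql, result in executed if result == best_result)


# Hypothetical toy database used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE singer (id INTEGER PRIMARY KEY, name TEXT, age INTEGER);
    INSERT INTO singer VALUES (1, 'Ann', 30), (2, 'Bo', 41), (3, 'Cy', 25);
    """
)

# Stand-ins for queries sampled from an LLM.
candidates = [
    "SELECT COUNT(*) FROM singer WHERE age > 28",
    "SELECT COUNT(id) FROM singer WHERE age > 28",
    "SELECT COUNT(*) FROM singer WHERE age >= 41",
    "SELECT COUNT(*) FROM singers",  # fails: no such table
]
chosen = select_consistent_query(conn, candidates)
```

In this sketch two of the three executable candidates return the same count, so the first of them is selected. A real system would likely run candidates against a read-only connection, since sampled queries are untrusted.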

First claim

Opening claim text (preview).

The invention claimed is:
1. A method for processing queries, the method comprising: receiving, by one or more processors, a natural language query and database information; converting, by the one or more processors, the natural language query into a database language query using the database information, wherein converting the natural language query into the database language query comprises: generating various database description prompts from the natural language query and the database information; sampling a large language model (LLM) multiple times with the various database description prompts using a mixing coefficient for weighting the various description prompts to generate a plurality of potential database language queries; executing the plurality of potential database language queries to generate a plurality of potential results; and selecting the database language query that provides a result consistent with a threshold amount of the plurality of potential results; and executing, by the one or more processors, the database language query to generate a result for the natural language query.
2. The method of claim 1, wherein the database information comprises database schema, database content, primary keys that uniquely identify rows of each table of the database schema, and foreign keys that join one or more tables of the data schema.
3. The method of claim 1, wherein the various database description prompts comprise a concise prompt and a verbose prompt.
4. The method of claim 1, wherein converting the natural language query into a database language query comprises removing errors from the plurality of potential results.
5. The method of claim 1, wherein converting the natural language query into a database language query comprises concatenating the plurality of potential results.
6. The method of claim 1, wherein the threshold amount comprises a majority of the plurality of potential results.
7. The method of claim 1, wherein the database language query comprises at least one of standard query language (SQL) or graph query language (GraphQL).
8. A system comprising: one or more processors; and one or more storage devices coupled to the one or more processors and storing instructions that, when executed by the one or more processors, cause the one or more processors to perform operations for processing queries, the operations comprising: receiving a natural language query and database information; converting the natural language query into a database language query using the database information, wherein converting the natural language query into the database language query comprises: generating various database description prompts from the natural language query and the database information; sampling a large language model (LLM) multiple times with the various database description prompts using a mixing coefficient for weighting the various description prompts to generate a plurality of potential database language queries; executing the plurality of potential database language queries to generate a plurality of potential results; and selecting the database language query that provides a result consistent with a threshold amount of the plurality of potential results; and executing the database language query to generate a result for the natural language query.
9. The system of claim 8, wherein the database information comprises database schema, database content, primary keys that uniquely identify rows of each table of the database schema, and foreign keys that join one or more tables of the data schema.
10. The system of claim 8, wherein the various database description prompts comprise a concise prompt and a verbose prompt.
11. The system of claim 8, wherein converting the natural language query into a database language query comprises removing errors from the plurality of potential results.
12. The system of claim 8, wherein converting the natural language query into a database language query comprises concatenating the plurality of potential results.
13. The system of claim 8, wherein the threshold amount comprises a majority of the plurality of potential results.
14. The system of claim 8, wherein the database language query comprises at least one of standard query language (SQL) or graph query language (GraphQL).
15. A non-transitory computer readable medium for storing instructions that, when executed by one or more processors, cause the one or more processors to perform operations for processing queries, the operations comprising: receiving a natural language query and database information; converting the natural language query into a database language query using the database information, wherein converting the natural language query into the database language query comprises: generating various database description prompts from the natural language query and the database information; sampling a large language model (LLM) multiple times with the various database description prompts using a mixing coefficient for weighting the various description prompts to generate a plurality of potential database language queries; executing the plurality of potential database language queries to generate a plurality of potential results; and selecting the database language query that provides a result consistent with a threshold amount of the plurality of potential results; and executing the database language query to generate a result for the natural language query.
16. The non-transitory computer readable medium of claim 15, wherein converting the natural language query into a database language query comprises removing errors from the plurality of potential results.
17. The non-transitory computer readable medium of claim 15, wherein converting the natural language query into a database language query comprises concatenating the plurality of potential results.
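
The prompt-generation and sampling steps recited in the claims can be sketched roughly as follows. This is one possible reading, not the patented implementation: it assumes the mixing coefficient acts as a sampling weight that decides how often each prompt design is used, and it replaces the LLM with a stub. The helper names (`concise_prompt`, `verbose_prompt`, `sample_candidates`, `fake_llm`), the schema dictionary layout, and the example weights are all hypothetical.

```python
import random


def concise_prompt(schema, question):
    """Compact description: one `table: col1, col2` line per table."""
    lines = [f"{t}: {', '.join(info['columns'])}" for t, info in schema.items()]
    return "\n".join(lines) + f"\nQuestion: {question}\nSQL:"


def verbose_prompt(schema, question):
    """Sentence-style description that spells out primary and foreign keys."""
    lines = []
    for table, info in schema.items():
        lines.append(f"Table {table} has columns {', '.join(info['columns'])}.")
        lines.append(f"The primary key of {table} is {info['primary_key']}.")
        for col, ref in info.get("foreign_keys", {}).items():
            lines.append(f"{table}.{col} references {ref}.")
    return "\n".join(lines) + f"\nQuestion: {question}\nSQL:"


def sample_candidates(llm, prompts, weights, n_samples, seed=0):
    """Draw `n_samples` candidate queries, choosing the prompt for each draw
    in proportion to `weights` (the assumed role of the mixing coefficient)."""
    rng = random.Random(seed)
    picks = rng.choices(prompts, weights=weights, k=n_samples)
    return [llm(p) for p in picks]


# Hypothetical database information for illustration only.
schema = {
    "singer": {"columns": ["id", "name", "age"], "primary_key": "id"},
    "concert": {
        "columns": ["id", "singer_id", "year"],
        "primary_key": "id",
        "foreign_keys": {"singer_id": "singer.id"},
    },
}
question = "How many singers are older than 28?"
prompts = [concise_prompt(schema, question), verbose_prompt(schema, question)]


def fake_llm(prompt):
    """Stub standing in for a real LLM call; always returns the same query."""
    return "SELECT COUNT(*) FROM singer WHERE age > 28"


candidates = sample_candidates(fake_llm, prompts, weights=[0.4, 0.6], n_samples=5)
```

The resulting `candidates` list is what the claimed method would then execute and filter by result consistency. With a real model, sampling at a non-zero temperature would be needed for repeated draws from the same prompt to differ.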

Assignees

Inventors

Classifications

  • Query languages

  • Natural language query formulation

  • Translation of natural language queries to structured queries

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12554709B2 cover?
Aspects of the disclosure are directed to methods, systems, and non-transitory computer readable media for automatically generating queries on a database from natural language text using in-context learning to leverage zero-shot and few-shot adaptation capabilities of large language models (LLMs). The methods, systems, and non-transitory computer readable media can consider database information…
Who is the assignee on this patent?
Google LLC
What technology area does this patent fall under?
Primary CPC classification G06F16/2433. Mapped technology areas include Physics.
When was this patent published?
Publication date Feb 17, 2026 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).