Embedding-free retriever-augmented generation (RAG) architectures

US12585896B1 · US · B1

Patent metadata
Field               Value
Publication number  US-12585896-B1
Application number  US-202519268792-A
Country             US
Kind code           B1
Filing date         Jul 14, 2025
Priority date       Nov 6, 2024
Publication date    Mar 24, 2026
Grant date          Mar 24, 2026

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method includes obtaining an input query associated with a document and using a first generative AI model to identify whether different passages of the document are or are not relevant to the input query. The method also includes identifying at least one specific passage in the document and extracting text from the document. The extracted text includes each specific passage and portions of text preceding and following that specific passage. The method further includes generating a prompt requesting that the first generative AI model or a second generative AI model generate a response to the input query using the extracted text. Using the first generative AI model includes generating initial prompts requesting that the first generative AI model indicate whether different chunks of the document are or are not relevant to the input query and identifying relevant chunks based on results generated by the first generative AI model.
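The relevance step described in the abstract, together with the iterative refinement recited in claim 1, can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: `is_relevant` is a hypothetical callable standing in for one prompt to the first generative AI model, chunk size is measured in words, and chunks are split on whitespace. The patent text shown here specifies none of these details.

```python
from typing import Callable, List

# Hypothetical stand-in for one model prompt: (query, chunk) -> True if the
# model reports the chunk as relevant. A real system would call an LLM here.
RelevanceFn = Callable[[str, str], bool]


def split_chunk(text: str, parts: int = 2) -> List[str]:
    """Split text into roughly equal word-based pieces (an assumed strategy)."""
    words = text.split()
    if len(words) <= 1:
        return [text]
    size = -(-len(words) // parts)  # ceiling division
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def narrow_relevant_chunks(
    document: str,
    query: str,
    is_relevant: RelevanceFn,
    threshold_words: int = 50,
    initial_parts: int = 4,
) -> List[str]:
    """Keep only relevant chunks, re-splitting until each fits the threshold."""
    if threshold_words < 1:
        raise ValueError("threshold_words must be at least 1")

    # Initial prompts: one relevance check per top-level chunk.
    relevant = [c for c in split_chunk(document, initial_parts)
                if is_relevant(query, c)]

    # Additional prompts: split oversized relevant chunks and check again.
    while any(len(c.split()) > threshold_words for c in relevant):
        next_round: List[str] = []
        for chunk in relevant:
            if len(chunk.split()) <= threshold_words:
                next_round.append(chunk)
            else:
                next_round.extend(p for p in split_chunk(chunk)
                                  if is_relevant(query, p))
        relevant = next_round
    return relevant
```

With a keyword-matching stub such as `lambda q, c: q.lower() in c.lower()` in place of the model, the function returns only the small chunks that still contain the query term. Note that no embedding model or vector index is involved at any step, which matches the "embedding-free" framing of the title.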

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: obtaining, from a user, an input query associated with a document; using a first generative artificial intelligence (AI) model to identify whether different passages of the document are or are not relevant to the input query; identifying at least one specific passage in the document based on results generated by the first generative AI model; extracting text from the document, the extracted text including, for each specific passage, (i) the specific passage, (ii) a portion of text from the document preceding the specific passage, and (iii) a portion of text from the document following the specific passage; generating a prompt requesting that the first generative AI model or a second generative AI model generate a response to the input query using the extracted text from the document; and providing the response to the user by initiating display of the response; wherein using the first generative AI model to identify whether different passages of the document are or are not relevant to the input query comprises: generating initial prompts requesting that the first generative AI model indicate whether different chunks of the document are or are not relevant to the input query; identifying initial relevant chunks based on results generated by the first generative AI model using the initial prompts; and iteratively generating additional prompts and identifying relevant chunks based on results generated by the first generative AI model using the additional prompts until the identified relevant chunks have a size that is less than or equal to a specified threshold size.

2. The method of claim 1, wherein identifying the at least one specific passage in the document comprises: generating at least one quotation prompt requesting that the first generative AI model generate at least one quotation based on the input query; and identifying the at least one specific passage in the document as being most similar to the at least one generated quotation.

3. The method of claim 2, wherein fuzzy matching based on Levenshtein distance is used to identify the at least one specific passage in the document that is most similar to the at least one generated quotation.

4. The method of claim 1, wherein the extracted text includes, for each specific passage, (i) the specific passage, (ii) a specified number of sentences from the document preceding the specific passage, and (iii) a specified number of sentences from the document following the specific passage.

5. The method of claim 4, wherein the specified number of sentences from the document preceding each specific passage equals the specified number of sentences from the document following that specific passage.

6. The method of claim 1, wherein the document is obtained via a communications network from a user device that also provides the input query.

7. The method of claim 1, wherein extracting the text from the document comprises extracting text from the document without using an embedding model.

8. An apparatus comprising: at least one processing device configured to: obtain, from a user, an input query associated with a document; use a first generative artificial intelligence (AI) model to identify whether different passages of the document are or are not relevant to the input query; identify at least one specific passage in the document based on results generated by the first generative AI model; extract text from the document, the extracted text including, for each specific passage, (i) the specific passage, (ii) a portion of text from the document preceding the specific passage, and (iii) a portion of text from the document following the specific passage; generate a prompt requesting that the first generative AI model or a second generative AI model generate a response to the input query using the extracted text from the document; and provide the response to the user by initiating display of the response; wherein, to use the first generative AI model to identify whether different passages of the document are or are not relevant to the input query, the at least one processing device is configured to: generate initial prompts requesting that the first generative AI model indicate whether different chunks of the document are or are not relevant to the input query; identify initial relevant chunks based on results generated by the first generative AI model using the initial prompts; and iteratively generate additional prompts and identify relevant chunks based on results generated by the first generative AI model using the additional prompts until the identified relevant chunks have a size that is less than or equal to a specified threshold size.

9. The apparatus of claim 8, wherein, to identify the at least one specific passage in the document, the at least one processing device is configured to: generate at least one quotation prompt requesting that the first generative AI model generate at least one quotation based on the input query; and identify the at least one specific passage in the document as being most similar to the at least one generated quotation.

10. The apparatus of claim 9, wherein the at least one processing device is configured to use fuzzy matching based on Levenshtein distance to identify the at least one specific passage in the document that is most similar to the at least one generated quotation.

11. The apparatus of claim 8, wherein the extracted text includes, for each specific passage, (i) the specific passage, (ii) a specified number of sentences from the document preceding the specific passage, and (iii) a specified number of sentences from the document following the specific passage.

12. The apparatus of claim 11, wherein the specified number of sentences from the document preceding each specific passage equals the specified number of sentences from the document following that specific passage.

13. A method comprising: obtaining, from a user, an input query associated with a document; generating multiple prompts requesting that a first generative artificial intelligence (AI) model indicate whether different passages of the document are or are not relevant to the input query; identifying at least one specific passage in the document based on results generated by the first generative AI model using the multiple prompts; extracting text from the document, the extracted text including, for each specific passage, (i) the specific passage, (ii) a portion of text from the document preceding the specific passage, and (iii) a portion of text from the document following the specific passage; generating an additional prompt requesting that the first generative AI model or a second generative AI model generate a response to the input query using the extracted text from the document; and providing the response to the user by initiating display of the response; wherein generating the multiple prompts comprises: generating first prompts; identifying relevant chunks based on results generated by the first generative AI model using the first prompts; and iteratively generating additional prompts and identifying relevant chunks based on results generated by the first generative AI model using the additional prompts until the identified relevant chunks have a size that is less than or equal to a specified threshold size.

14. The method of claim 11, wherein identifying the at least one specific passage in the document comprises: generating at least one quotation prompt requesting that the first generative AI model generate at least one quotation based on t
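The quotation-matching and context-extraction steps in the dependent claims (a model-generated quotation matched to the most similar passage by Levenshtein distance, then returned with an equal number of sentences before and after it) might look roughly like the sketch below. Several details are assumptions made for illustration only: a "passage" is treated as a single sentence, sentences are split with a naive regular expression, comparison is case-insensitive, and the function names are hypothetical. The quotation itself would come from a model prompt, which is not shown.

```python
import re
from typing import List


def levenshtein(a: str, b: str) -> int:
    """Edit distance (insert, delete, substitute) via dynamic programming."""
    if len(a) < len(b):
        a, b = b, a  # keep the row buffer as short as possible
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def split_sentences(text: str) -> List[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def extract_with_context(document: str, quotation: str, n_context: int = 1) -> str:
    """Return the sentence closest to the quotation plus n_context sentences
    on each side (fewer at the document boundaries)."""
    sentences = split_sentences(document)
    if not sentences:
        return ""
    best = min(range(len(sentences)),
               key=lambda i: levenshtein(quotation.lower(), sentences[i].lower()))
    start = max(0, best - n_context)
    end = min(len(sentences), best + n_context + 1)
    return " ".join(sentences[start:end])
```

Fuzzy matching matters here because a generated quotation may not reproduce the document verbatim. Choosing the passage with the smallest edit distance tolerates small wording differences, and the extracted text that goes into the final prompt is always taken from the document itself rather than from the model's paraphrase.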

Assignees

Goldman Sachs & Co Llc

Inventors

Classifications

  • G06F40/58 (Primary)

    CPC title: Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12585896B1 cover?
A method includes obtaining an input query associated with a document and using a first generative AI model to identify whether different passages of the document are or are not relevant to the input query. The method also includes identifying at least one specific passage in the document and extracting text from the document. The extracted text includes each specific passage and portions of te…
Who is the assignee on this patent?
Goldman Sachs & Co Llc
What technology area does this patent fall under?
Primary CPC classification G06F40/58. Mapped technology areas include Physics.
When was this patent published?
Publication date: Mar 24, 2026 (kind code B1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).