Intent classification for executing a retrieval augmented generation pipeline for natural language tasks using a generative machine learning model

US2025111091A1 · US · A1

Patent metadata
Field               Value
Publication number  US-2025111091-A1
Application number  US-202318478766-A
Country             US
Kind code           A1
Filing date         Sep 29, 2023
Priority date       Sep 29, 2023
Publication date    Apr 3, 2025
Grant date          (not listed)

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Intent classification is performed for executing a retrieval augmented generation pipeline for natural language tasks using a generative machine learning model. A natural language generative application with associated data repositories may submit a natural language task. A classification machine learning model is used to determine an intent for the natural language request. A number of iterations of a retrieval pipeline may be determined to perform the natural language task based on the intent. The natural language request may be processed through a retrieval pipeline according to the determined number of iterations before returning a result to the request.
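The flow the abstract describes (classify the intent, derive an iteration count, run the retrieval pipeline that many times, then prompt the generative model) can be illustrated with a minimal sketch. Everything named here is hypothetical and not taken from the patent: the intent labels, the keyword-based `classify_intent` (a stand-in for the trained classification model), the word-overlap `retrieve`, and the `generate` callable standing in for the generative model.

```python
import re
from typing import Callable, Dict, List

Repository = Dict[str, str]  # hypothetical: document id -> passage text


def classify_intent(request: str) -> str:
    """Keyword stand-in for a trained intent classifier (assumption)."""
    text = request.lower()
    if text.startswith(("rewrite", "translate")):
        return "non_retrieval"   # the request needs no repository lookup
    if " and " in text:
        return "multi_part"      # the request contains several sub-tasks
    return "single_lookup"


def iterations_for(intent: str, request: str) -> int:
    """Map an intent label to a number of retrieval iterations."""
    if intent == "non_retrieval":
        return 0
    if intent == "multi_part":
        return len(request.lower().split(" and "))
    return 1


def rewrite(request: str, iterations: int) -> List[str]:
    """Rewrite the request into one query per iteration."""
    if iterations > 1:
        return [part.strip() for part in request.lower().split(" and ")]
    return [request]


def _keywords(text: str) -> set:
    # Ignore short words so common stop words do not match every passage.
    return {w for w in re.findall(r"\w+", text.lower()) if len(w) > 3}


def retrieve(query: str, repo: Repository) -> List[str]:
    """Naive retrieval: passages that share a keyword with the query."""
    wanted = _keywords(query)
    return [text for text in repo.values() if wanted & _keywords(text)]


def run_pipeline(request: str, repo: Repository,
                 generate: Callable[[str], str]) -> str:
    intent = classify_intent(request)
    n = iterations_for(intent, request)
    if n == 0:
        # Zero iterations: pass the request straight to the model.
        return generate(request)
    passages: List[str] = []
    for query in rewrite(request, n):
        passages.extend(retrieve(query, repo))
    prompt = "Context:\n" + "\n".join(passages) + "\n\nTask: " + request
    return generate(prompt)
```

In this sketch a request such as "vacation policy and expense reports" is labeled multi-part and triggers two retrieval iterations, while a request beginning with "translate" skips retrieval entirely. A real classifier would be a trained model rather than keyword rules.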

First claim

Opening claim text (preview).

What is claimed is:

1. A system, comprising: a plurality of computing devices, respective comprising at least one processor and a memory, configured to implement a natural language generative application service, wherein the natural language generative application service is configured to: receive, via an interface of the natural language generative application service, a natural language request to perform a natural language task for a generative natural language application using one or more data repositories associated with the generative natural language application; cause a classification machine learning model, trained to determine intents of natural language requests, to determine an intent for the natural language request; determine a number of iterations of a retrieval pipeline to perform the natural language task of the natural language request based, at least in part, on the intent for the natural language request; cause processing of the natural language request through the retrieval pipeline according to the determine number of iterations, wherein the retrieval pipeline comprises: rewriting the natural language request to perform the natural language task for at least one of the number of iterations; retrieving data to perform the natural language task from the one or more data repositories for at least one of the number of iterations; generate a prompt for a generative machine learning model based, at least in part, on the retrieved data, to perform the natural language task; instruct the generative machine learning model according to the prompt to generate a result; and return, via the interface of the natural language generative application service, a response to the natural language request based, at least in part, on a result received from the generative machine learning model.

2. The system of claim 1, wherein the intent labels the natural language request as a non-retrieval instruction, wherein the determined number of iterations is zero, and wherein the natural language request is provided to the generative machine learning model to obtain the result.

3. The system of claim 1, wherein the natural language generative application service is configured to: obtain local user information and local group information for data sources with the natural language generative application; create an application principal store that maps one or more local users found in the local user information to a service user; and provide the application principal store for enforcing access control at the one or more data access repositories associated with the natural language generative application.

4. The system of claim 1, wherein the natural language generative application service is configured to: receive request to create the generative natural language application to be hosted by the natural language generative application service; provision one or more computing resources to host the generative natural language application; and provide a network endpoint for accessing the generative natural language application at the one or more computing resources, wherein the natural language request is submitted via an application interface of the generative natural language application.

5. A method, comprising: receiving, via an interface of a generative machine learning service, a natural language request to perform a natural language task for a generative natural language application using one or more data repositories associated with the generative natural language application; causing, by the generative machine learning service, a classification machine learning model, trained to determine intents of natural language requests, to determine an intent for the natural language request; determining, by the generative machine learning service, a number of iterations of a retrieval pipeline to perform the natural language task of the natural language request based, at least in part, on the intent for the natural language request; processing, by the generative machine learning service, the natural language request through the retrieval pipeline according to the determine number of iterations, wherein the retrieval pipeline comprises: retrieving data to perform the natural language task from the one or more data repositories for at least one of the number of iterations; and prompting a generative machine learning model based, at least in part, on the retrieved data, to perform the natural language task; returning, via the interface of the generative machine learning service, a response to the natural language request based, at least in part, on a result received from the generative machine learning model.

6. The method of claim 5, wherein the intent labels the natural language request as a non-retrieval instruction, wherein the determined number of iterations is zero, and wherein the natural language request is provided to the generative machine learning model to obtain the result.

7. The method of claim 5, further comprising obtaining conversation history for the generative natural language application, wherein the natural language request through the retrieval pipeline is processed based, at least in part, on the conversation history.

8. The method of claim 5, further comprising including one or more source attributions based on the one or more data repositories in the result.

9. The method of claim 5, wherein the intent labels the natural language task as including a plurality of subtasks and wherein the determined number of iterations corresponds to two or more of the plurality of sub-tasks.

10. The method of claim 5, further comprising validating, by the generative machine learning service, the result of the generative machine learning model before providing as the result.

11. The method of claim 5, further comprising: receiving, at the generative machine learning service, local user information and local group information for data sources with the natural language generative application; creating, by the generative machine learning service, an application principal store that maps one or more local users found in the local user information to a service user; and providing, by the generative machine learning service, the application principal store for enforcing access control at the one or more data access repositories associated with the natural language generative application.

12. The method of claim 11, wherein retrieving the data to perform the natural language task from the one or more data repositories accesses the service user to obtain a local user to access at least one of the one or more data repositories.

13. The method of claim 5, further comprising: receiving, by the generative machine learning service, request to create the generative natural language application to be hosted by the natural language generative application service; provisioning, by the generative machine learning service, one or more computing resources to host the generative natural language application; and providing, by the generative machine learning service, a network endpoint for accessing the generative natural language application at the one or more computing resources, wherein the natural language request is submitted via an application interface of the generative natural language application.

14. One or more non-transitory, computer-readable storage media, storing program instructions that when executed on or across one or more computing devices cause the one or more computing devices to implement a generative machine learning service: receiving, via an interface of t
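
The application principal store in claims 3, 11, and 12 maps local users of each data source to a service user so that retrieval can enforce each source's access control. The sketch below shows one possible shape for such a mapping. The class name, method names, and document fields (`source`, `allowed_users`, `allowed_groups`) are assumptions for illustration and are not drawn from the patent.

```python
from typing import Dict, Iterable, Optional, Set, Tuple


class PrincipalStore:
    """Hypothetical store mapping a service user to per-source local users."""

    def __init__(self) -> None:
        # (service user, data source) -> local user name in that source
        self._local_user: Dict[Tuple[str, str], str] = {}
        # (data source, local user) -> local groups the user belongs to
        self._groups: Dict[Tuple[str, str], Set[str]] = {}

    def add_mapping(self, service_user: str, source: str, local_user: str,
                    groups: Iterable[str] = ()) -> None:
        self._local_user[(service_user, source)] = local_user
        self._groups[(source, local_user)] = set(groups)

    def resolve(self, service_user: str,
                source: str) -> Tuple[Optional[str], Set[str]]:
        """Return the local user and groups for this source, if mapped."""
        local = self._local_user.get((service_user, source))
        if local is None:
            return None, set()
        return local, self._groups[(source, local)]


def authorized(doc: dict, store: PrincipalStore, service_user: str) -> bool:
    """Allow a document only if the mapped local user or group may read it."""
    local, groups = store.resolve(service_user, doc["source"])
    if local is None:
        return False  # no mapping for this source: deny by default
    return local in doc["allowed_users"] or bool(groups & doc["allowed_groups"])
```

A retrieval step could call `authorized` on each candidate document before adding it to the prompt, so that the generative model only sees data the requesting user could read in the originating source.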

Assignees

Amazon Tech Inc

Inventors

Classifications

  • G06F21/629 (Primary)

    CPC title: to features or functions of an application

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2025111091A1 cover?
Intent classification is performed for executing a retrieval augmented generation pipeline for natural language tasks using a generative machine learning model. A natural language generative application with associated data repositories may submit a natural language task. A classification machine learning model is used to determine an intent for the natural language request. A number of iterati…
Who is the assignee on this patent?
Amazon Tech Inc
What technology area does this patent fall under?
Primary CPC classification G06F21/629. Mapped technology areas include Physics.
When was this patent published?
Publication date: Apr 3, 2025 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).