System and method for implementing an advisory assistant to a generative artificial intelligence tool

US2025053453A1 · US · A1

Patent metadata
Field: Value
Publication number: US-2025053453-A1
Application number: US-202418798221-A
Country: US
Kind code: A1
Filing date: Aug 8, 2024
Priority date: Aug 8, 2023
Publication date: Feb 13, 2025
Grant date:

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The invention relates to computer-implemented systems and methods that implement an innovative generative AI service based on proprietary expertise and industry knowledge. The generative AI service provides unique autonomous features, such as combining separate and distinct LLM responses and prompts to create unique results. Other autonomous features may include an ability to handle scaling and auto deployment of models and rerouting requests autonomously to ensure user load is balanced across the entire globally distributed Generative AI infrastructure estate. The generative AI service may further deploy new Production instances of models on demand by predefined system criteria as well as by explicit user request based on projected demand increase and/or the need for specific instance for further model fine-tuning.
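The two autonomous features named in the abstract, balancing load across deployed model instances and combining separate LLM responses, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `ModelInstance` class, the least-loaded routing rule, and the newline-joined combination are hypothetical and are not taken from the patent, which does not disclose a specific algorithm on this page.

```python
from dataclasses import dataclass


@dataclass
class ModelInstance:
    """A hypothetical deployed model instance in one region."""
    name: str
    region: str
    active_requests: int = 0

    def generate(self, prompt: str) -> str:
        # Stand-in for a real LLM call; returns a labeled echo of the prompt.
        return f"[{self.name}] {prompt}"


def pick_least_loaded(instances):
    """Route to the instance with the fewest in-flight requests (assumed rule)."""
    return min(instances, key=lambda inst: inst.active_requests)


def combined_response(prompt, pools):
    """Ask one instance per model pool and join the answers.

    `pools` maps a model name to the list of its deployed instances.
    """
    parts = []
    for model_name in sorted(pools):
        instance = pick_least_loaded(pools[model_name])
        instance.active_requests += 1
        try:
            parts.append(instance.generate(prompt))
        finally:
            # Release the slot even if the call fails.
            instance.active_requests -= 1
    return "\n".join(parts)


pools = {
    "model-a": [ModelInstance("model-a", "us", 3), ModelInstance("model-a", "eu", 1)],
    "model-b": [ModelInstance("model-b", "us", 0)],
}
print(combined_response("Summarize the filing.", pools))
```

A production system would more likely weigh latency, region, and cost when routing, and would merge responses with something richer than concatenation; the sketch only shows where those decisions would sit.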

First claim

Opening claim text (preview).

What is claimed is:
1. A computer-implemented system for implementing a middleware platform that provides a series of generative AI services, the system comprising: a user interface that is configured to receive one or more requests from a user, via a communication network; a Digital Matrix Application Gateway that is configured to receive the one or more requests and route the one or more requests to a set of API Compute Resources, wherein the set of API Compute Resources comprises one or more API Apps and one or more API App Functions, wherein each of the set of API Compute Resources is configured to make calls to one of a plurality of APIs, wherein the set of API Compute Resources interacts with a plurality of different large language models (LLMs) that collectively generate a response to the one or more requests; a data storage component that reads and writes configuration and operational data associated with the API Compute Resources; an insights analytics processing component that is configured to receive application telemetry data from one or more API Compute Resources; and a log analytics component that stores log data from the insights component.
2. The system of claim 1, wherein the response represents a combination of LLM responses from the plurality of LLMs.
3. The system of claim 1, wherein the plurality of LLMs have access to a proprietary knowledgebase.
4. The system of claim 1, wherein user load balancing is applied across the plurality of different LLMs on an entire globally distributed generative AI infrastructure.
5. The system of claim 1, wherein the user interface comprises a generative AI chat interface.
6. The system of claim 1, wherein the user interface comprises a cognitive search interface.
7. The system of claim 1, wherein each of the plurality of LLMs is independently run and secured in a virtual container.
8. The system of claim 1, wherein a copy of a LLM from the plurality of LLMs is created to respond in a predetermined way through a training process to create a tuned version of the LLM.
9. The system of claim 1, wherein the log data is globally and centrally managed to determine model usage and performance metrics across the plurality of different LLMs.
10. The system of claim 1, wherein the plurality of LLMs are selected based on model optimization.
11. A computer-implemented method for implementing a middleware platform that provides a series of generative AI services, the method comprising the steps of: receiving, via a user interface, one or more requests from a user, via a communication network; receiving, via a Digital Matrix Application Gateway, the one or more requests and routing the one or more requests to a set of API Compute Resources, wherein the set of API Compute Resources comprises one or more API Apps and one or more API App Functions, wherein each of the set of API Compute Resources is configured to make calls to one of a plurality of APIs, wherein the set of API Compute Resources interacts with a plurality of different large language models (LLMs) that collectively generate a response to the one or more requests; reading and writing, via a data storage component, configuration and operational data associated with the API Compute Resources; receiving, via an insights analytics processing component, application telemetry data from one or more API Compute Resources; storing, via a log analytics component, log data from the insights component; and transmitting, via user interface, the response.
12. The method of claim 11, wherein the response represents a combination of LLM responses from the plurality of LLMs.
13. The method of claim 11, wherein the plurality of LLMs have access to a proprietary knowledgebase.
14. The method of claim 11, wherein user load balancing is applied across the plurality of different LLMs on an entire globally distributed generative AI infrastructure.
15. The method of claim 11, wherein the user interface comprises a generative AI chat interface.
16. The method of claim 11, wherein the user interface comprises a cognitive search interface.
17. The method of claim 11, wherein each of the plurality of LLMs is independently run and secured in a virtual container.
18. The method of claim 11, wherein a copy of a LLM from the plurality of LLMs is created to respond in a predetermined way through a training process to create a tuned version of the LLM.
19. The method of claim 11, wherein the log data is globally and centrally managed to determine model usage and performance metrics across the plurality of different LLMs.
20. The method of claim 11, wherein the plurality of LLMs are selected based on model optimization.
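The data flow recited in claim 1 (gateway, API compute resources, multiple LLMs, insights telemetry, log analytics) can be sketched roughly as follows. This is an illustrative assumption, not the claimed implementation: the class names, the path-based routing table, the `" | "` join, and the event names are all hypothetical, the LLMs are plain stub functions, and the data storage component is omitted for brevity.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class LogAnalytics:
    """Stores log records forwarded by the insights component."""
    records: List[dict] = field(default_factory=list)

    def store(self, record: dict) -> None:
        self.records.append(record)


@dataclass
class Insights:
    """Receives telemetry from compute resources and forwards it to the log store."""
    log_store: LogAnalytics

    def track(self, resource: str, event: str) -> None:
        self.log_store.store({"resource": resource, "event": event})


@dataclass
class ApiComputeResource:
    """A hypothetical API app that fans a request out to several LLM stubs."""
    name: str
    llms: List[Callable[[str], str]]
    insights: Insights

    def handle(self, request: str) -> str:
        self.insights.track(self.name, "request_received")
        answers = [llm(request) for llm in self.llms]
        self.insights.track(self.name, "response_generated")
        # The LLMs "collectively generate" the response; joining is an assumed rule.
        return " | ".join(answers)


@dataclass
class Gateway:
    """Routes an incoming request to the compute resource registered for a path."""
    routes: Dict[str, ApiComputeResource]

    def route(self, path: str, request: str) -> str:
        if path not in self.routes:
            raise KeyError(f"no compute resource registered for {path!r}")
        return self.routes[path].handle(request)


logs = LogAnalytics()
chat = ApiComputeResource(
    "chat-app",
    [lambda q: f"A:{q}", lambda q: f"B:{q}"],  # two stand-in LLMs
    Insights(logs),
)
gateway = Gateway({"/chat": chat})
reply = gateway.route("/chat", "hello")
print(reply)
print(logs.records)
```

Keeping telemetry behind the `Insights` object mirrors the claim's separation between the component that receives telemetry and the component that stores logs, so either could be swapped out without touching the compute resources.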

Assignees

Inventors

Classifications

  • G06F9/5027 · Primary

    the resource being a machine, e.g. CPUs, Servers, Terminals · CPC title

  • Machine learning · CPC title

  • G06N5/022 · Primary

    Knowledge engineering; Knowledge acquisition · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2025053453A1 cover?
The invention relates to computer-implemented systems and methods that implement an innovative generative AI service based on proprietary expertise and industry knowledge. The generative AI service provides unique autonomous features, such as combining separate and distinct LLM responses and prompts to create unique results. Other autonomous features may include an ability to handle scaling and…
Who is the assignee on this patent?
KPMG LLP
What technology area does this patent fall under?
Primary CPC classification G06F9/5027. Mapped technology areas include Physics.
When was this patent published?
Publication date Feb 13, 2025 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).