US2026037877A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2026037877-A1 |
| Application number | US-202418789980-A |
| Country | US |
| Kind code | A1 |
| Filing date | Jul 31, 2024 |
| Priority date | Jul 31, 2024 |
| Publication date | Feb 5, 2026 |
| Grant date | — |
Official abstract text for this publication.
Systems and methods of evaluating and fine-tuning a generative AI tool on a communication platform. The communication platform accesses a dataset comprising a user query and a response generated by an AI-based query system. The communication platform evaluates the response with respect to the user query using multiple AI-based scoring models to obtain multiple evaluation results. In response to determining that the multiple evaluation results are inconsistent, the communication platform evaluates the response with respect to the user query using a reference large language model (LLM) to provide a reference evaluation result. In response to determining that the reference evaluation result is decisive, the communication platform classifies, based on the reference evaluation result, the dataset to a data category of one or more data categories. The communication platform fine-tunes the AI-based query system based on a group of datasets in the data category.
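The routing logic in the abstract can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the applicant's implementation. The names `Dataset`, `Route`, `route_dataset`, `threshold`, and `margin` are hypothetical. The sketch also assumes that "consistent" means every scoring model falls on the same side of a threshold and that "decisive" means the reference score lies at least `margin` away from that threshold; the abstract defines neither term. The scoring models and the reference LLM are stubbed as plain callables.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional, Sequence


class Route(Enum):
    POSITIVE = "positive datasets"
    NEGATIVE = "negative datasets"
    HUMAN_REVIEW = "human review"


@dataclass(frozen=True)
class Dataset:
    query: str
    response: str


# Hypothetical interfaces: a scorer returns a score in [0, 1]; the reference
# judge may return None to signal that it could not reach a verdict.
Scorer = Callable[[Dataset], float]
ReferenceJudge = Callable[[Dataset], Optional[float]]


def route_dataset(
    ds: Dataset,
    scorers: Sequence[Scorer],
    reference: ReferenceJudge,
    threshold: float = 0.5,  # assumed cut-off between "good" and "bad"
    margin: float = 0.2,     # assumed distance from the cut-off to be decisive
) -> Route:
    if not scorers:
        raise ValueError("at least one scoring model is required")

    # Step 1: evaluate with the multiple scoring models.
    votes = [scorer(ds) >= threshold for scorer in scorers]

    # Step 2: consistent results classify the dataset directly.
    if all(votes):
        return Route.POSITIVE
    if not any(votes):
        return Route.NEGATIVE

    # Step 3: inconsistent results escalate to the reference LLM.
    ref = reference(ds)

    # Step 4: an indeterminate reference result goes to a human evaluator.
    if ref is None or abs(ref - threshold) < margin:
        return Route.HUMAN_REVIEW

    # Step 5: a decisive reference result classifies the dataset.
    return Route.POSITIVE if ref >= threshold else Route.NEGATIVE
```

Datasets routed to the same category would then be grouped and passed to the fine-tuning step, which is outside the scope of this sketch. Calling the reference model only on disagreement keeps the more expensive evaluator off the common path, which appears to be the point of the two-tier design.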
Opening claim text (preview).
That which is claimed is:

1. A method comprising: accessing a dataset comprising a user query and a response generated by an artificial intelligence (AI)-based query system; evaluating the response with respect to the user query using multiple AI-based scoring models to obtain multiple evaluation results; in response to determining that the multiple evaluation results are inconsistent, evaluating the response with respect to the user query using a reference large language model (LLM) to provide a reference evaluation result; in response to determining that the reference evaluation result is decisive, classifying, based on the reference evaluation result, the dataset to a data category of one or more data categories; and fine-tuning the AI-based query system based on a group of datasets in the data category.

2. The method of claim 1, wherein the multiple AI-based scoring models comprises small language models or a lite version of an LLM.

3. The method of claim 1, wherein the reference LLM comprises a generative pre-trained transform (GPT) model.

4. The method of claim 1, further comprises: accessing multiple intermediate datasets generated by the AI-based query system during a process of generating the response with respect to the user query, the multiple intermediate datasets comprising analytics data of the user query, multiple search results, a ranking of the search results for response generation, and semantic analytics data of the response; and evaluating the response by evaluating the multiple intermediate datasets using the multiple AI-based scoring models to obtain multiple evaluation results.

5. The method of claim 1, further comprising: in response to determining that the multiple evaluation results are consistent, classifying the response and the corresponding user query based on the multiple evaluation results.

6. The method of claim 1, further comprising: in response to determining that the reference evaluation result is indeterminate, providing the response and the corresponding user query to a human evaluator for manual evaluation.

7. The method of claim 1, wherein the reference evaluation result comprises a description or a score.

8. The method of claim 1, wherein the one or more data categories comprise a “positive datasets” category and a “negative datasets” category.

9. The method of claim 1, further comprising: retraining one or more of the multiple AI-based scoring models by using the group of datasets in the data category.

10. A system comprising: a communications interface; a non-transitory computer-readable medium; and one or more processors communicatively coupled to the communications interface and the non-transitory computer-readable medium, the one or more processors configured to execute processor-executable instructions stored in the non-transitory computer-readable medium to: access a dataset comprising a user query and a response generated by an artificial intelligence (AI)-based query system; evaluate the response with respect to the user query using multiple AI-based scoring models to obtain multiple evaluation results; in response to determining that the multiple evaluation results are inconsistent, evaluate the response with respect to the user query using a reference large language model (LLM) to provide a reference evaluation result; in response to determining that the reference evaluation result is decisive, classify, based on the reference evaluation result, the dataset to a data category of one or more data categories; and fine-tune the AI-based query system based on a group of datasets in the data category.

11. The system of claim 10, wherein the multiple AI-based scoring models comprises small language models or a lite version of an LLM, wherein the reference LLM comprises a generative pre-trained transform (GPT) model.

12. The system of claim 10, wherein the one or more processors are configured to execute further processor-executable instructions stored in the non-transitory computer-readable medium to: accesses multiple intermediate datasets generated by the AI-based query system during a process of generating the response with respect to the user query, the multiple intermediate datasets comprising analytics data of the user query, multiple search results, a ranking of the search results for response generation, and semantic analytics data of the response; and evaluate the response by evaluating the multiple intermediate datasets using the multiple AI-based scoring models to obtain the multiple evaluation results.

13. The system of claim 10, wherein the one or more processors are configured to execute further processor-executable instructions stored in the non-transitory computer-readable medium to: in response to determining that the multiple evaluation results are consistent, classify the response and the corresponding user query based on the multiple evaluation results.

14. The system of claim 10, wherein the one or more processors are configured to execute further processor-executable instructions stored in the non-transitory computer-readable medium to: in response to determining that the reference evaluation result is indeterminate, provide the response and the corresponding user query to a human evaluator for manual evaluation.

15. The system of claim 10, wherein the reference evaluation result comprises a description or a score, and wherein the one or more data categories comprise a “positive datasets” category and a “negative datasets” category.

16. The system of claim 10, wherein the one or more processors are configured to execute further processor-executable instructions stored in the non-transitory computer-readable medium to: retrain one or more of the multiple AI-based scoring models by using the group of datasets in the data category.

17. A non-transitory computer-readable medium comprising processor-executable instructions configured to cause one or more processors to: access a dataset comprising a user query and a response generated by an artificial intelligence (AI)-based query system; evaluate the response with respect to the user query using multiple AI-based scoring models to obtain multiple evaluation results; in response to determining that the multiple evaluation results are inconsistent, evaluate the response with respect to the user query using a reference large language model (LLM) to provide a reference evaluation result; in response to determining that the reference evaluation result is decisive, classify, based on the reference evaluation result, the dataset to a data category of one or more data categories; and fine-tune the AI-based query system based on a group of datasets in the data category.

18. The non-transitory computer-readable medium of claim 17, further comprising processor-executable instructions configured to cause one or more processors to: accesses multiple intermediate datasets generated by the AI-based query system during a process of generating the response with respect to the user query, the multiple intermediate datasets comprising analytics data of the user query, multiple search results, a ranking of the search results for response generation, and semantic analytics data of the response; and evaluate the response by evaluating the multiple intermediate datasets using the multiple AI-based scoring models to obtain the multiple evaluation results.

19. The non-transitory computer-readable medium of claim 17, further comprising p
Ensemble learning · CPC title
in dialogue systems · CPC title