LLM as a transcription filter
US-2025157473-A1 · May 15, 2025 · US
US2024412720A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2024412720-A1 |
| Application number | US-202418738363-A |
| Country | US |
| Kind code | A1 |
| Filing date | Jun 10, 2024 |
| Priority date | Jun 11, 2023 |
| Publication date | Dec 12, 2024 |
| Grant date | — |
Official abstract text for this publication.
An artificial intelligence (AI) assistant system and a method for providing a contextualized response to a user using AI are disclosed. The system comprises an audio input device for receiving voice input, an audio output device for providing output, a processor, a wireless communication device, a contextual memory unit for storing conversational context data on a sliding window basis, and a non-volatile system memory unit. The processor executes instructions to receive voice input, determine user identification, update conversational context data with user identification and a tokenized representation of the voice input, process the voice input using a transformer-based language model to generate a response, update the conversational context data with a tokenized representation of the generated response, and output the response via the audio output device. The method comprises receiving voice input, determining user identification, updating conversational context data, processing voice input, and generating and outputting a conversational response.
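The turn-handling sequence in the abstract can be illustrated with a minimal sketch. Everything named here is hypothetical and not taken from the publication: `ContextualMemory`, `handle_turn`, the whitespace `tokenize` helper, and the `generate` callable that stands in for the transformer-based language model. Speech recognition and speaker identification are assumed to have already produced `transcript` and `user_id`, and the sliding window is modeled as a fixed number of entries, which is only one possible reading of "sliding window basis".

```python
from collections import deque
from typing import Callable, List, Tuple

# One context entry: (speaker label, tokenized utterance).
Entry = Tuple[str, List[str]]


def tokenize(text: str) -> List[str]:
    """Toy tokenizer (assumption): lowercase and split on whitespace."""
    return text.lower().split()


class ContextualMemory:
    """Sliding-window store: keeps only the most recent `max_entries` entries."""

    def __init__(self, max_entries: int) -> None:
        # A deque with maxlen silently drops the oldest entry when full.
        self.entries = deque(maxlen=max_entries)

    def update(self, speaker: str, tokens: List[str]) -> None:
        self.entries.append((speaker, tokens))


def handle_turn(
    memory: ContextualMemory,
    user_id: str,
    transcript: str,
    generate: Callable[[List[Entry]], str],
) -> str:
    """Run one turn: record the input, generate a reply, record the reply."""
    # Store the user identification with the tokenized voice input.
    memory.update(user_id, tokenize(transcript))
    # `generate` is a placeholder for the language model; it sees the window.
    response = generate(list(memory.entries))
    # Store the tokenized response so later turns can refer back to it.
    memory.update("assistant", tokenize(response))
    return response
```

With a window of three entries, a second turn pushes the first user utterance out of the window while the assistant's earlier reply remains available as context.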
Opening claim text (preview).
What is claimed is:

1. An artificial intelligence (AI) assistant system, comprising: an audio input device configured to receive voice input from one or more users; an audio output device configured to provide audio output; a processor; a wireless communication device; a contextual memory unit configured to store conversational context data on a sliding window basis; and a non-volatile system memory unit, wherein the processor is configured to execute instructions to: receive the voice input from the audio input device, determine user identification information based on the voice input, update the conversational context data within the contextual memory unit to include the determined user identification information and a tokenized representation of the voice input, process the voice input using a transformer-based language model to generate a conversational response, update the conversational context data within the contextual memory unit to include a tokenized representation of the generated conversational response, and output the generated conversational response to the one or more users via the audio output device.

2. The AI assistant system of claim 1, wherein the contextual memory unit is configured to store the conversational context data for a predetermined time period, and wherein the processor is further configured to execute instructions to dynamically adjust the predetermined time period based on at least one of a user input, a system parameter, and a contextual factor.

3. The AI assistant system of claim 1, wherein the contextual memory unit is configured to store the conversational context data for a predetermined time period, and wherein the processor is further configured to execute instructions to dynamically adjust the predetermined time period based on at least one of user preferences, system performance, and contextual relevance.

4. The AI assistant system of claim 1, wherein the processor is further configured to execute instructions to dynamically adjust a context window size based on at least one parameter selected from a group consisting of available memory, processor speed, and estimated latency for processing user commands.

5. The AI assistant system of claim 1, further comprising a display device, wherein the processor is further configured to execute instructions to generate visual content based on the conversational context data and the generated conversational response, and output the generated visual content to the display device as part of the generated conversational response.

6. The AI assistant system of claim 1, further comprising a camera configured to capture visual input, wherein the processor is further configured to execute instructions to analyze the visual input captured by the camera, extract relevant visual information from the visual input, and update the conversational context data based on the extracted relevant visual information.

7. The AI assistant system of claim 1, wherein the processor is further configured to execute instructions to access an external knowledge base via the wireless communication device to retrieve relevant information based on the conversational context data, and utilize the retrieved relevant information in conjunction with the conversational context data stored in the contextual memory unit to generate the conversational response.

8. The AI assistant system of claim 1, wherein the processor is further configured to execute instructions to store user-specific information in a user profile database within the non-volatile system memory unit, retrieve the user-specific information from the user profile database based on the determined user identification information, and personalize the generated conversational response based on the retrieved user-specific information.

9. The AI assistant system of claim 1, wherein the processor is further configured to execute instructions to perform sentiment analysis on the voice input to determine an emotional state of the one or more users, and adapt the generated conversational response based on the determined emotional state.

10. The AI assistant system of claim 1, wherein the processor is further configured to execute instructions to solicit user feedback on the generated conversational response, process the solicited user feedback to generate processed feedback data, update the transformer-based language model based on the processed feedback data, and utilize an active learning algorithm to select conversational responses for which to solicit user feedback.

11. The AI assistant system of claim 1, wherein the processor is further configured to execute instructions to proactively generate a plurality of candidate conversational responses based on the conversational context data prior to receiving a subsequent user query or command, store the plurality of candidate conversational responses in memory, and select a conversational response from the stored plurality of candidate conversational responses based on the subsequent user query or command and the conversational context data.

12. The AI assistant system of claim 1, wherein the processor is further configured to execute instructions to receive a complex user request from the voice input, break down the complex user request into a plurality of manageable sub-tasks, coordinate the execution of the plurality of manageable sub-tasks; and generate a portion of the conversational response based on the execution of the plurality of manageable sub-tasks.

13. The AI assistant system of claim 1, further comprising at least one agentic task processing unit (ATPU) configured to autonomously perform a task in a background, wherein the processor is further configured to execute instructions to: detect a command based on the conversational context data, in response to detecting the command, cause the at least one ATPU to initiate performance of the task, receive a result of the task from the at least one ATPU, generate a conversational response indicating a result of the task, and output the conversational response indicating the result of the task via the audio output device.

14. The AI assistant system of claim 1, wherein the processor is further configured to execute instructions to: detect a command based on the conversational context data stored in the contextual memory unit; in response to detecting the command, autonomously perform a multi-step task in a background, wherein performing the multi-step task comprises: decomposing the multi-step task into a plurality of subtasks; assigning the plurality of subtasks to a plurality of agentic task processing units; executing the plurality of subtasks across the plurality of agentic task processing units, wherein executing the plurality of subtasks comprises: generating queries to retrieve data from at least one of the non-volatile system memory unit, the contextual memory unit, or an external data source accessed via the wireless communication device; analyzing the retrieved data using at least one of natural language processing or machine learning models; generating a plurality of results based on analyzing the retrieved data; monitoring a progress of executing the plurality of subtasks; aggregating the plurality of results from the plurality of agentic task processing units; and generating a conversational response indicating a result of the multi-step task based on the aggregated plurality of results; and output the generated conversational response via the audio output device.

15. The AI assistant system of claim 1, further compris
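The decompose, assign, execute, monitor, and aggregate flow recited in claim 14 can be sketched as follows. This is an illustration under stated assumptions, not the applicant's implementation: worker threads stand in for the agentic task processing units, `decompose` uses a deliberately naive split on " and ", and the names `run_multi_step_task`, `worker`, and `on_progress` are hypothetical. The data retrieval and analysis steps inside each subtask are left to the caller-supplied `worker`.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, List, Optional


def decompose(task: str) -> List[str]:
    """Naive decomposition (assumption): split the request on ' and '."""
    return [part.strip() for part in task.split(" and ") if part.strip()]


def run_multi_step_task(
    task: str,
    worker: Callable[[str], str],
    max_units: int = 4,
    on_progress: Optional[Callable[[int, int], None]] = None,
) -> str:
    """Run subtasks on a pool of workers and aggregate their results."""
    subtasks = decompose(task)
    results: Dict[str, str] = {}
    # Each pool thread plays the role of one task processing unit.
    with ThreadPoolExecutor(max_workers=max_units) as pool:
        futures = {pool.submit(worker, sub): sub for sub in subtasks}
        # Monitor progress as each subtask finishes, in completion order.
        for done, future in enumerate(as_completed(futures), start=1):
            results[futures[future]] = future.result()
            if on_progress is not None:
                on_progress(done, len(subtasks))
    # Aggregate in the original subtask order to build one response string.
    return "; ".join(results[sub] for sub in subtasks)
```

Keying results by subtask and joining them in the original order keeps the aggregated response stable even when subtasks finish out of order. The sketch assumes subtask strings are distinct.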
Audio in a user interface, e.g. using voice commands for navigating, audio feedback · CPC title
Natural language query formulation or dialogue systems · CPC title
Procedures used during a speech recognition process, e.g. man-machine dialogue · CPC title
in dialogue systems · CPC title
Discourse or dialogue representation · CPC title