Technologies for agent interaction analysis using artificial intelligence
US-2025080654-A1 · Mar 6, 2025 · US
US12499894B1 · US · B1
| Field | Value |
|---|---|
| Publication number | US-12499894-B1 |
| Application number | US-202318383751-A |
| Country | US |
| Kind code | B1 |
| Filing date | Oct 25, 2023 |
| Priority date | Sep 15, 2023 |
| Publication date | Dec 16, 2025 |
| Grant date | Dec 16, 2025 |
Official abstract text for this publication.
Techniques for video conference transcript querying using artificial intelligence are provided. In an example method, a video conference provider joins a first client device of a plurality of client devices to a video conference hosted by a video conference provider. The video conference provider receives an audio stream from the first client device and generates, based on the audio stream, a portion of a transcript of the video conference. The video conference provider processes the portion of the transcript to configure an AI service to respond to queries based on the video conference. The video conference provider receives a query relating to the video conference and causes the AI service to process the query and the portion of the transcript. The video conference provider outputs a response, generated by the AI service, to the query. The video conference provider then deletes the transcript, responsive to the video conference concluding.
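The lifecycle in the abstract (accumulate transcript portions during the conference, answer queries against them while it is live, delete them when it concludes) can be sketched roughly as follows. This is a minimal illustration and not the patented implementation: `ConferenceSession`, `TranscriptPortion`, `echo_service`, and the `window` parameter are hypothetical names introduced here, the AI service is modeled as a plain callable rather than a real model, and deletion only clears an in-memory list, whereas the claims also contemplate removal from disk and caches.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class TranscriptPortion:
    """One transcribed segment of the audio stream (hypothetical shape)."""
    start_s: float
    end_s: float
    speaker: str
    text: str


@dataclass
class ConferenceSession:
    """Holds transcript portions only while the conference is live."""
    # Assumption: the AI service is any callable (context, query) -> response.
    ai_service: Callable[[str, str], str]
    portions: List[TranscriptPortion] = field(default_factory=list)
    concluded: bool = False

    def add_portion(self, portion: TranscriptPortion) -> None:
        """Append a newly generated portion of the transcript."""
        if self.concluded:
            raise RuntimeError("conference has concluded")
        self.portions.append(portion)

    def query(self, question: str,
              window: Optional[Tuple[float, float]] = None) -> str:
        """Answer a query from the portions, optionally limited to a time window."""
        if self.concluded:
            raise RuntimeError("transcript was deleted when the conference concluded")
        selected = [
            p for p in self.portions
            if window is None or (p.start_s < window[1] and p.end_s > window[0])
        ]
        context = "\n".join(f"{p.speaker}: {p.text}" for p in selected)
        return self.ai_service(context, question)

    def conclude(self) -> None:
        """Mark the conference as over and drop the stored portions."""
        self.concluded = True
        self.portions.clear()


def echo_service(context: str, query: str) -> str:
    """Stand-in for an AI service: reports how much context it was given."""
    return f"{query} -> {len(context.splitlines())} line(s) of context"


if __name__ == "__main__":
    session = ConferenceSession(ai_service=echo_service)
    session.add_portion(TranscriptPortion(0.0, 5.0, "alice", "Let's ship on Friday."))
    session.add_portion(TranscriptPortion(5.0, 9.0, "bob", "I will update the docs."))
    print(session.query("Summarize the meeting so far"))
    session.conclude()
```

Keeping the portions inside a session object that refuses queries after `conclude()` mirrors the ordering in the claims: queries are handled prior to the conference concluding, and deletion follows its conclusion. The optional `window` argument loosely corresponds to the period-of-time limitation in claim 13.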
Opening claim text (preview).
What is claimed is:
1. A method, comprising: joining a first client device of a plurality of client devices to a video conference hosted by a video conference provider; receiving an audio stream from the first client device; generating, based on the audio stream, a portion of a transcript of the video conference; prior to the video conference concluding: processing the portion of the transcript to configure an AI service to respond to queries based on the video conference, including the portion of the transcript; receiving a query relating to the video conference; causing the AI service to process the query and the portion of the transcript; and outputting a response, generated by the AI service, to the query; and responsive to the video conference concluding, deleting the portion of the transcript for configuring the AI service.
2. The method of claim 1, further comprising responsive to the video conference concluding, deleting the transcript.
3. The method of claim 1, further comprising receiving a deletion election, wherein deleting the portion of the transcript for configuring the AI service is further responsive to the deletion election.
4. The method of claim 1, wherein generating the portion of the transcript of the video conference comprises: determining, from the audio stream, one or more audio stream portions; responsive to an end of meeting signal, generating the transcript based on the one or more audio stream portions; and designating the transcript as the portion of the transcript.
5. The method of claim 1, wherein generating the portion of the transcript of the video conference comprises: determining, from the audio stream, one or more audio stream portions; and generating, based on the one or more audio stream portions, a first portion of the transcript.
6. The method of claim 5, wherein the one or more audio stream portions comprise one or more utterances, each utterance comprising at least a portion of a word.
7. The method of claim 5, wherein generating the portion of the transcript of the video conference further comprises: determining, from the audio stream, one or more second audio stream portions; generating, based on the one or more second audio stream portions, a second portion of the transcript; generating, based on the first portion of the transcript and the second portion of the transcript, the portion of the transcript; determining, from the audio stream, one or more third audio stream portions; responsive to an end of meeting signal, generating a third portion of the transcript based on the one or more third audio stream portions; generating the transcript based on the first portion of the transcript, the second portion of the transcript, and the third portion of the transcript; and designating the transcript as the portion of the transcript.
8. The method of claim 1, wherein the AI service comprises a large language model.
9. A non-transitory computer-readable medium storing instructions that, when executed by one or more processors, cause the one or more processors to perform operations including: joining a first client device of a plurality of client devices to a video conference hosted by a video conference provider; receiving an audio stream from the first client device; generating, based on the audio stream, a portion of a transcript of the video conference; prior to the video conference concluding: processing the portion of the transcript to configure an AI service to respond to queries based on the video conference, including the portion of the transcript; receiving a query relating to the video conference; causing the AI service to process the query and the portion of the transcript; and outputting a response, generated by the AI service, to the query; and responsive to the video conference concluding, deleting the portion of the transcript for configuring the AI service.
10. The non-transitory computer-readable medium of claim 9, further responsive to the video conference concluding, deleting the transcript.
11. The non-transitory computer-readable medium of claim 9, wherein generating the portion of the transcript of the video conference comprises: determining, from the audio stream, one or more audio stream portions; and generating, based on the one or more audio stream portions, a first portion of the transcript; determining, from the audio stream, one or more second audio stream portions; generating, based on the one or more second audio stream portions, a second portion of the transcript; generating, based on the first portion of the transcript and the second portion of the transcript, the portion of the transcript; determining, from the audio stream, one or more third audio stream portions; responsive to an end of meeting signal, generating a third portion of the transcript based on the one or more third audio stream portions; generating the transcript based on the first portion of the transcript, the second portion of the transcript, and the third portion of the transcript; and designating the transcript as the portion of the transcript.
12. The non-transitory computer-readable medium of claim 9, wherein the portion of the transcript is generated in near-real-time.
13. The non-transitory computer-readable medium of claim 9, further comprising receiving an indication of a period of time, wherein the AI service is configured to base the response on the portion of the transcript included in the period of time.
14. The non-transitory computer-readable medium of claim 9, wherein the query is human-readable and the query is a request to summarize the video conference.
15. A system comprising: one or more processors; and one or more computer-readable storage media storing instructions which, when executed by the one or more processors, cause the one or more processors to perform operations including: joining a first client device of a plurality of client devices to a video conference hosted by a video conference provider; receiving an audio stream from the first client device; generating, based on the audio stream, a portion of a transcript of the video conference; prior to the video conference concluding: processing the portion of the transcript to configure an AI service to respond to queries based on the video conference, including the portion of the transcript; receiving a query relating to the video conference; causing the AI service to process the query and the portion of the transcript; and outputting a response, generated by the AI service, to the query; and responsive to the video conference concluding, deleting the portion of the transcript for configuring the AI service.
16. The system of claim 15, further comprising receiving a deletion election comprising an election to delete the transcript, wherein deleting the portion of the transcript for configuring the AI service is further responsive to the deletion election.
17. The system of claim 15, wherein the query is human-readable and wherein the query comprises query context information and the query is a request to generate one or more tasks based on the video conference.
18. The system of claim 15, wherein deleting the transcript comprises removing information from at least one of a hard disk, a memory device, or a memory cache.
19. The system of claim 15, wherein the portion of the transcript is generated in near-real-time and the query comprises query parameters, wherein the query
Conducting the conference, e.g. admission, detection, selection or grouping of participants, correlating users to one or more conference sessions, prioritising transmission · CPC title
involving storage of or access to video conference sessions (tracking arrangements for later retrieval of a computer conference content or participants activities H04L12/1831) · CPC title
Speech to text systems (G10L15/08 takes precedence) · CPC title
Tracking arrangements for later retrieval, e.g. recording contents, participants activities or behavior, network status · CPC title
Querying · CPC title