US11769520B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11769520-B2 |
| Application number | US-202016995000-A |
| Country | US |
| Kind code | B2 |
| Filing date | Aug 17, 2020 |
| Priority date | Aug 17, 2020 |
| Publication date | Sep 26, 2023 |
| Grant date | Sep 26, 2023 |
Official abstract text for this publication.
Techniques are provided for evaluating multiple machine learning models to identify issues with a communication. One method comprises applying an audio signal associated with a communication to at least two of: (i) a trigger word analysis module that evaluates contextual information to determine if a trigger word is detected in the audio signal; (ii) an audio activity pattern analysis module that determines if a silence pattern anomaly is detected; and (iii) a communication application analysis module that evaluates features provided by a communication application relative to applicable thresholds; and combining results of the at least two of the trigger word analysis module, the audio activity pattern analysis module and the communication application analysis module to identify a communication issue. The combining may evaluate an accuracy of the trigger word analysis module, the audio activity pattern analysis module and/or the communication application analysis module to combine the results.
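The abstract says the combining step may weigh each module by its accuracy. A minimal sketch of one way that could look is given below, assuming each module emits a score in [0, 1] and has a known validation accuracy. The names `ModuleResult` and `combine_results`, the weighted-average rule, and the 0.5 decision threshold are illustrative assumptions, not details taken from the patent.

```python
from dataclasses import dataclass


@dataclass
class ModuleResult:
    name: str        # hypothetical label for the analysis module
    score: float     # assumed: module's estimated likelihood of an issue, in [0, 1]
    accuracy: float  # assumed: module's validation accuracy, in [0, 1]


def combine_results(results, threshold=0.5):
    """Return (combined_score, issue_detected) using an accuracy-weighted average.

    The weighting rule and threshold are assumptions made for illustration.
    """
    if len(results) < 2:
        raise ValueError("expected results from at least two modules")
    total_weight = sum(r.accuracy for r in results)
    if total_weight == 0:
        raise ValueError("at least one module needs a non-zero accuracy")
    combined = sum(r.score * r.accuracy for r in results) / total_weight
    return combined, combined >= threshold


# Example with made-up numbers for two of the three modules.
results = [
    ModuleResult("trigger_word", score=0.9, accuracy=0.8),
    ModuleResult("audio_activity", score=0.4, accuracy=0.6),
]
combined, issue = combine_results(results)
```

With these invented inputs the more accurate trigger word module dominates, so the combined score lands above the threshold. An ensemble model, as recited in claims 5 and 12, could replace the fixed weighted average.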
Opening claim text (preview).
What is claimed is:
1. A method, comprising: applying, by a communication issue detector, a representation of an audio signal associated with a communication to: (i) a trigger word analysis module that determines if one or more trigger words are detected in the audio signal from the audio signal using a trained trigger word detection model that is trained using (a) a set of trigger words indicative of a technical device issue with one or more devices associated with the communication and (b) contextual information comprising one or more of a nearby additional trigger word feature, a time-in-meeting of trigger word feature and a before/after silence pattern feature, relative to a user-provided feedback score; and (ii) an audio activity pattern analysis module that determines if a silence pattern anomaly is detected in the audio signal using a trained audio activity model, wherein the audio activity pattern analysis module evaluates a length of an audio activity portion of the audio signal relative to a length of a silence portion of the audio signal to identify the silence pattern anomaly, wherein the silence pattern anomaly is indicative of a technical device issue with one or more devices associated with the communication; combining results of the trigger word analysis module and the audio activity pattern analysis module to identify a communication issue for the communication; and implementing one or more remedial actions responsive to the identification of the communication issue; wherein the method is performed by at least one processing device comprising a processor coupled to a memory.
2. The method of claim 1, wherein the representation of the audio signal comprises one or more spectrograms associated with the communication.
3. The method of claim 1, wherein the applying the audio signal to the trigger word analysis module further comprises evaluating a relevance score generated by the trained trigger word detection model.
4. The method of claim 1, wherein the trained trigger word detection model is further trained using a plurality of additional words and a plurality of background samples.
5. The method of claim 1, wherein the combining employs an ensemble model that combines the results to identify the communication issue for the communication.
6. The method of claim 1, wherein the combining evaluates an accuracy of the trigger word analysis module and the audio activity pattern analysis module to combine the results.
7. The method of claim 1, further comprising applying the representation of the audio signal associated with a communication to a communication application analysis module that evaluates one or more features provided by a communication application relative to one or more thresholds, wherein the communication application is provided by a different provider than a provider of the communication issue detector.
8. The method of claim 7, wherein the features provided by the communication application comprise one or more of an audio device not found feature, a number of screen share sessions feature, a number of connection attempts feature and a poor connection events feature.
9. An apparatus comprising: at least one processing device comprising a processor coupled to a memory; the at least one processing device being configured to implement the following steps: applying, by a communication issue detector, a representation of an audio signal associated with a communication to: (i) a trigger word analysis module that determines if one or more trigger words are detected in the audio signal from the audio signal using a trained trigger word detection model that is trained using (a) a set of trigger words indicative of a technical device issue with one or more devices associated with the communication and (b) contextual information comprising one or more of a nearby additional trigger word feature, a time-in-meeting of trigger word feature and a before/after silence pattern feature, relative to a user-provided feedback score; and (ii) an audio activity pattern analysis module that determines if a silence pattern anomaly is detected in the audio signal using a trained audio activity model, wherein the audio activity pattern analysis module evaluates a length of an audio activity portion of the audio signal relative to a length of a silence portion of the audio signal to identify the silence pattern anomaly, wherein the silence pattern anomaly is indicative of a technical device issue with one or more devices associated with the communication; combining results of the trigger word analysis module and the audio activity pattern analysis module to identify a communication issue for the communication; and implementing one or more remedial actions responsive to the identification of the communication issue.
10. The apparatus of claim 9, wherein the applying the audio signal to the trigger word analysis module further comprises evaluating a relevance score generated by the trained trigger word detection model.
11. The apparatus of claim 9, wherein the trained trigger word detection model is trained using a set of trigger words, a plurality of additional words and a plurality of background samples.
12. The apparatus of claim 9, wherein the combining employs an ensemble model that combines the results to identify the communication issue for the communication.
13. The apparatus of claim 9, wherein the combining evaluates an accuracy of the trigger word analysis module and the audio activity pattern analysis module to combine the results.
14. The apparatus of claim 9, wherein the representation of the audio signal comprises one or more spectrograms associated with the communication.
15. The apparatus of claim 9, further comprising applying the representation of the audio signal associated with a communication to a communication application analysis module that evaluates one or more features provided by a communication application relative to one or more thresholds, wherein the communication application is provided by a different provider than a provider of the communication issue detector.
16. A non-transitory processor-readable storage medium having stored therein program code of one or more software programs, wherein the program code when executed by at least one processing device causes the at least one processing device to perform the following steps: applying, by a communication issue detector, a representation of an audio signal associated with a communication to: (i) a trigger word analysis module that determines if one or more trigger words are detected in the audio signal from the audio signal using a trained trigger word detection model that is trained using (a) a set of trigger words indicative of a technical device issue with one or more devices associated with the communication and (b) contextual information comprising one or more of a nearby additional trigger word feature, a time-in-meeting of trigger word feature and a before/after silence pattern feature, relative to a user-provided feedback score; and (ii) an audio activity pattern analysis module that determines if a silence pattern anomaly is detected in the audio signal using a trained audio activity model, wherein the audio activity pattern analysis module evaluates a length of an audio activity portion of the audio signal relative to a length of a silence portion of the audio signal to identify the silence pattern anomaly, wherein the silence pattern anomaly is indicative of a technical device issue with one or more devices associated with the communication; combining results of the trigger word analysis m
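The independent claims recite an audio activity pattern analysis module that compares the length of the audio activity portion of a signal with the length of its silence portion. The sketch below is a rough illustration of that comparison only. The claims call for a trained audio activity model, whereas this example swaps in a fixed per-frame energy threshold and a fixed ratio cutoff. The function names, the `energy_threshold` and `min_ratio` parameters, and their default values are all hypothetical.

```python
def activity_and_silence_lengths(frame_energies, energy_threshold=0.01):
    """Count active and silent frames using an assumed fixed energy threshold."""
    active = sum(1 for energy in frame_energies if energy >= energy_threshold)
    silent = len(frame_energies) - active
    return active, silent


def is_silence_anomaly(frame_energies, energy_threshold=0.01, min_ratio=0.25):
    """Flag an anomaly when activity is short relative to silence.

    A stand-in heuristic, not the trained model named in the claims:
    the anomaly is flagged when active/silent falls below min_ratio.
    """
    active, silent = activity_and_silence_lengths(frame_energies, energy_threshold)
    if silent == 0:
        return False  # no silence at all, so no silence pattern to flag
    return active / silent < min_ratio


# Made-up per-frame energies: 2 active frames followed by 18 silent frames.
mostly_silent = [0.5] * 2 + [0.0] * 18
# Made-up per-frame energies: 10 active frames and 10 silent frames.
balanced = [0.5] * 10 + [0.0] * 10

anomaly_mostly_silent = is_silence_anomaly(mostly_silent)
anomaly_balanced = is_silence_anomaly(balanced)
```

In a fuller pipeline the per-frame energies might be derived from the spectrograms mentioned in claims 2 and 14, and the resulting flag would be one of the inputs to the combining step.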
for measuring the quality of voice signals · CPC title
Semantic context, e.g. disambiguation of the recognition hypotheses based on word meaning · CPC title
Procedures used during a speech recognition process, e.g. man-machine dialogue · CPC title
the extracted parameters being spectral information of each sub-band · CPC title
Detection of presence or absence of voice signals (switching of direction of transmission by voice frequency in two-way loud-speaking telephone systems H04M9/10) · CPC title