Textual Echo Cancellation
US-2021390975-A1 · Dec 16, 2021 · US
US12010260B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12010260-B2 |
| Application number | US-202117453431-A |
| Country | US |
| Kind code | B2 |
| Filing date | Nov 3, 2021 |
| Priority date | Nov 3, 2021 |
| Publication date | Jun 11, 2024 |
| Grant date | Jun 11, 2024 |
Official abstract text for this publication.
In some implementations, a system may capture audio from a call between a calling device and a called device. The system may filter the captured audio to generate a background audio layer. The system may generate an audio footprint that is a representation of sound in the background audio layer. The system may determine that the audio footprint includes a triggering sound footprint based on one or more audio characteristics of the audio footprint. The system may detect synthetic sound based on the audio footprint and after determining that the audio footprint includes the triggering sound footprint, wherein the synthetic sound is indicative of a sound recording. The system may transmit a notification to one or more devices associated with the call based on detecting the synthetic sound.
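The flow in the abstract (footprint, triggering sound, synthetic-sound check) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the footprint is modeled as coarse per-frame loudness levels, a triggering sound is a run of loud frames of minimum duration, and "sufficiently matching" is an exact or near-exact comparison of level sequences. The names `make_footprint`, `find_trigger`, `repeats_elsewhere`, and `detect_synthetic` are hypothetical, and the input is assumed to be a background layer with voices already removed.

```python
from typing import List, Optional, Tuple


def make_footprint(samples: List[float], frame_size: int = 4, levels: int = 8) -> List[int]:
    """Quantize mean absolute amplitude per frame into coarse integer levels.

    Assumes samples are normalized to [-1.0, 1.0] (illustrative choice).
    """
    footprint = []
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        energy = sum(abs(s) for s in frame) / frame_size
        footprint.append(min(levels - 1, int(energy * levels)))
    return footprint


def find_trigger(footprint: List[int], min_level: int = 3,
                 min_frames: int = 3) -> Optional[Tuple[int, int]]:
    """Return (start, end) of the first loud run lasting at least min_frames."""
    run_start = None
    # A trailing -1 sentinel closes a run that reaches the end of the footprint.
    for i, level in enumerate(footprint + [-1]):
        if level >= min_level:
            if run_start is None:
                run_start = i
        else:
            if run_start is not None and i - run_start >= min_frames:
                return (run_start, i)
            run_start = None
    return None


def repeats_elsewhere(footprint: List[int], span: Tuple[int, int],
                      tolerance: int = 0) -> bool:
    """True if the trigger span matches another, non-overlapping portion."""
    start, end = span
    length = end - start
    pattern = footprint[start:end]
    for offset in range(0, len(footprint) - length + 1):
        if offset < end and offset + length > start:
            continue  # skip windows that overlap the trigger span itself
        window = footprint[offset:offset + length]
        mismatches = sum(1 for a, b in zip(pattern, window) if a != b)
        if mismatches <= tolerance:
            return True
    return False


def detect_synthetic(background_samples: List[float]) -> bool:
    """Flag a background layer whose triggering sound repeats like a loop."""
    footprint = make_footprint(background_samples)
    span = find_trigger(footprint)
    return span is not None and repeats_elsewhere(footprint, span)
```

In this sketch, a background sound that recurs with an identical footprint is treated as a looped recording, while two similar but non-identical sounds are not flagged. A real system would need a far more robust footprint and matching criterion than frame loudness.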
Opening claim text (preview).
What is claimed is:

1. A system for detecting synthetic sounds in call audio, the system comprising: one or more memories; and one or more processors, communicatively coupled to the one or more memories, configured to: capture audio from a call between a calling device and a called device; filter, based on one or more audio characteristics of the captured audio, the captured audio to remove one or more voices; generate a background audio layer based on filtering the captured audio to remove the one or more voices; generate an audio footprint that is a representation of sound in the background audio layer; determine that the background audio layer includes a triggering sound pattern based on analyzing the audio footprint; detect a synthetic sound pattern in the background audio layer based on determining that the background audio layer includes the triggering sound pattern, wherein the synthetic sound pattern is detected based on at least one of: a first portion of the audio footprint sufficiently matching a second portion of the audio footprint, or comparing a portion of the audio footprint to a plurality of stored audio footprints based on a plurality of priority indicators corresponding to the plurality of stored audio footprints and the portion of the audio footprint sufficiently matching a stored audio footprint, wherein the stored audio footprint is not generated from the captured audio; and transmit a notification to one or more devices associated with the call based on detecting the synthetic sound pattern.

2. The system of claim 1, wherein the one or more processors, to detect the synthetic sound pattern, are configured to: detect the synthetic sound pattern based on the first portion of the audio footprint sufficiently matching the second portion of the audio footprint, wherein one of the first portion or the second portion includes the triggering sound pattern, and wherein a duration of the triggering sound pattern satisfies a threshold.

3. The system of claim 1, wherein the one or more processors, to detect the synthetic sound pattern, are configured to: detect the synthetic sound pattern based on the portion of the audio footprint sufficiently matching the stored audio footprint, wherein the portion of the audio footprint includes the triggering sound pattern, and wherein a duration of the triggering sound pattern satisfies a threshold.

4. The system of claim 1, wherein the one or more processors are further configured to: determine a category of the triggering sound pattern; and identify the plurality of stored audio footprints, including the stored audio footprint, based on the category of the triggering sound pattern.

5. The system of claim 1, wherein the one or more processors are further configured to: determine an order for comparing the portion of the audio footprint to the plurality of stored audio footprints based on the plurality of priority indicators corresponding to the plurality of stored audio footprints, wherein each priority indicator indicates a quantity of times that a corresponding stored audio footprint was sufficiently matched; wherein the one or more processors, to detect the synthetic sound pattern, are configured to: compare the audio footprint to one or more stored audio footprints, of the plurality of stored audio footprints, based on the order; and detect the synthetic sound pattern based on the portion of the audio footprint sufficiently matching the stored audio footprint, wherein the stored audio footprint is included in the plurality of stored audio footprints; and wherein the one or more processors are further configured to: modify a priority indicator associated with the stored audio footprint based on detecting the synthetic sound pattern.

6. The system of claim 1, wherein the one or more processors, to transmit the notification to the one or more devices associated with the call, are configured to: transmit the notification to a device associated with a called party, that is using the called device to participate in the call, based on a determination that the call is ongoing.

7. The system of claim 1, wherein the one or more processors, to transmit the notification to the one or more devices associated with the call, are configured to: transmit the notification to a device associated with a user for which a record was updated in connection with the call.

8. The system of claim 1, wherein the one or more processors are further configured to: determine whether the call is ongoing or has ended; and identify the one or more devices to which the notification is to be transmitted based on determining whether the call is ongoing or has ended.

9. The system of claim 1, wherein the one or more processors are further configured to: identify a change to a record that was made in connection with the call; and reverse the change to the record based on detecting the synthetic sound pattern.

10. A method for detecting synthetic audio in call audio, comprising: capturing, by a system, audio from a call between a calling device and a called device; filtering, by the system and based on one or more audio characteristics of the captured audio, the captured audio to generate a background audio layer; generating, by the system, an audio footprint that is a representation of sound in the background audio layer; determining, by the system, that the audio footprint includes a triggering sound footprint based on one or more audio characteristics of the audio footprint; detecting, by the system, synthetic sound based on the audio footprint and after determining that the audio footprint includes the triggering sound footprint, wherein the synthetic sound is indicative of a sound recording, and wherein the synthetic sound is detected based on at least one of: a first portion of the audio footprint sufficiently matching a second portion of the audio footprint, or comparing a portion of the audio footprint to a plurality of stored audio footprints based on a plurality of priority indicators corresponding to the plurality of stored audio footprints and the portion of the audio footprint sufficiently matching a stored audio footprint, wherein the stored audio footprint is not generated from the captured audio; and transmitting, by the system, a notification to one or more devices associated with the call based on detecting the synthetic sound.

11. The method of claim 10, wherein filtering the captured audio to generate the background audio layer comprises: removing, from the captured audio, all audio originating from the called device; and removing a voice from audio originating from the calling device.

12. The method of claim 10, wherein detecting the synthetic sound includes determining that the audio footprint includes a third portion and a fourth portion that are inconsistent with one another.

13. The method of claim 10, wherein detecting the synthetic sound includes determining that the audio footprint includes a third portion, indicative of a muted sound, and a fourth portion indicative of an unmuted sound other than voice audio.

14. The method of claim 10, wherein detecting the synthetic sound includes determining that the audio footprint includes a characteristic indicative of recorded audio.

15. The method of claim 10, further comprising: determining an order for comparing the portion of the audio footprint to the plurality of stored audio footprints based on the plurality of priority indicators corresponding to the plurality of stored audio footprints, wherein each priority indicator indicates a quant
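
The priority-ordered comparison recited in claims 4 and 5 (select stored footprints by category, compare in order of how often each was previously matched, then update the matched entry's indicator) might look roughly like the sketch below. The `StoredFootprint` type, the `similarity` measure, the `match_stored` function, and the 0.9 threshold are all hypothetical choices for illustration; the claims do not specify a data structure or a definition of "sufficiently matching".

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class StoredFootprint:
    """A reference footprint that was not generated from the captured call."""
    name: str
    category: str          # e.g. "dog", "baby" (illustrative categories)
    levels: List[int]      # same coarse representation as the live footprint
    priority: int = 0      # count of prior sufficient matches


def similarity(a: Sequence[int], b: Sequence[int]) -> float:
    """Fraction of positions that agree; 0.0 for unequal or empty lengths."""
    if len(a) != len(b) or not a:
        return 0.0
    return sum(1 for x, y in zip(a, b) if x == y) / len(a)


def match_stored(portion: Sequence[int], category: str,
                 library: List[StoredFootprint],
                 threshold: float = 0.9) -> Optional[StoredFootprint]:
    """Compare a footprint portion against stored footprints, highest priority first."""
    # Narrow the search to the category of the triggering sound.
    candidates = [f for f in library if f.category == category]
    # Most frequently matched footprints are tried first.
    candidates.sort(key=lambda f: f.priority, reverse=True)
    for candidate in candidates:
        if similarity(portion, candidate.levels) >= threshold:
            candidate.priority += 1  # modify the priority indicator on a match
            return candidate
    return None
```

Trying frequently matched footprints first means that commonly reused recordings tend to be found after fewer comparisons, which is one plausible motivation for ordering by priority indicator.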
for comparison or discrimination · CPC title
using speech recognition · CPC title
Procedures used during a speech recognition process, e.g. man-machine dialogue · CPC title
Management of recordings · CPC title
using properties of sound source · CPC title