Detecting synthetic sounds in call audio

US12413667B2 · US · B2

Patent metadata
Field	Value
Publication number	US-12413667-B2
Application number	US-202418644169-A
Country	US
Kind code	B2
Filing date	Apr 24, 2024
Priority date	Nov 3, 2021
Publication date	Sep 9, 2025
Grant date	Sep 9, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

In some implementations, a system may capture audio from a call between a calling device and a called device. The system may filter the captured audio to generate a background audio layer. The system may generate an audio footprint that is a representation of sound in the background audio layer. The system may determine that the audio footprint includes a triggering sound footprint based on one or more audio characteristics of the audio footprint. The system may detect synthetic sound based on the audio footprint and after determining that the audio footprint includes the triggering sound footprint, wherein the synthetic sound is indicative of a sound recording. The system may transmit a notification to one or more devices associated with the call based on detecting the synthetic sound.
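The sequence the abstract describes (background layer, audio footprint, triggering sound, synthetic-sound check, notification) can be illustrated with a small sketch. This is not the patented implementation. The `Frame` type, the windowed-mean footprint, the energy threshold for a triggering sound, and the "exact repeat implies a recording" heuristic are all assumptions made for illustration, and the voice/background separation is assumed to have happened upstream.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class Frame:
    """One short slice of call audio, already split into two layers (assumption)."""
    voice_energy: float
    background_energy: float


def background_layer(frames: Sequence[Frame]) -> List[float]:
    # "Filtering" here simply discards the voice layer of each frame.
    return [f.background_energy for f in frames]


def make_footprint(layer: Sequence[float], window: int = 2) -> List[float]:
    # Hypothetical footprint: mean energy per non-overlapping window.
    out = []
    for i in range(0, len(layer), window):
        chunk = layer[i:i + window]
        out.append(round(sum(chunk) / len(chunk), 2))
    return out


def find_trigger(footprint: Sequence[float], threshold: float = 0.5) -> Optional[int]:
    # Hypothetical triggering sound: first window at or above an energy threshold.
    for i, value in enumerate(footprint):
        if value >= threshold:
            return i
    return None


def looks_synthetic(footprint: Sequence[float], trigger: int, span: int = 2) -> bool:
    # Assumed heuristic: a recording on a loop repeats exactly, so look for an
    # identical copy of the segment that starts at the trigger.
    segment = list(footprint[trigger:trigger + span])
    if len(segment) < span:
        return False
    for start in range(trigger + span, len(footprint) - span + 1):
        if list(footprint[start:start + span]) == segment:
            return True
    return False


def analyze_call(frames: Sequence[Frame]) -> List[str]:
    """Return notifications to send; an empty list means nothing was flagged."""
    footprint = make_footprint(background_layer(frames))
    trigger = find_trigger(footprint)
    if trigger is None:
        return []  # the synthetic check only runs after a triggering sound
    if looks_synthetic(footprint, trigger):
        return ["possible recorded background sound detected"]
    return []
```

Calling `analyze_call` on frames whose background energy contains the same loud burst twice yields one notification, while a uniformly quiet background yields none. Gating the synthetic check on the trigger mirrors the ordering in the abstract, where detection happens after the triggering sound footprint is found.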

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: generating, by a system, an audio footprint that is associated with a background audio layer generated based on filtering audio from a call to remove a voice audio layer; detecting, by the system, synthetic sound based on detecting, based on comparing a portion of the audio footprint to a plurality of stored audio footprints based on a plurality of priority indicators associated with the plurality of stored audio footprints, that the portion of the audio footprint sufficiently matches a stored audio footprint, of the plurality of stored audio footprints; and performing, by the system, one or more actions based on detecting the synthetic sound.

2. The method of claim 1, wherein generating the audio footprint that is associated with the background audio layer generated based on filtering the audio from the call to remove the voice audio layer comprises: detecting audio signals corresponding to more than one voice originating from a calling device; determining that at least one audio signal of the audio signals corresponds to a voice other than a user's voice; and including the at least one audio signal in the audio footprint.

3. The method of claim 1, wherein: a location associated with a first portion of the audio footprint corresponds to a portion of the audio footprint containing a triggering sound footprint, and a location associated with a second portion of the audio footprint corresponds to a remaining portion of the audio footprint.

4. The method of claim 1, further comprising: detecting the synthetic sound based on determining that the audio footprint satisfies a condition that indicates a likelihood of fraud associated with the call.

5. The method of claim 1, further comprising: detecting that the audio footprint includes a triggering sound footprint; determining a category associated with the triggering sound footprint; and comparing, based on determining the category, the portion of the audio footprint to stored audio footprints, of the plurality of stored audio footprints, that are associated with the category.

6. The method of claim 1, further comprising: detecting the synthetic sound based on one or more rules, wherein the one or more rules define what conditions constitute an inconsistency between a first portion of the audio footprint and a second portion of the audio footprint.

7. The method of claim 1, wherein the plurality of stored audio footprints originate from the call.

8. A system, comprising: one or more memories; and one or more processors, coupled to the one or more memories, configured to: generate an audio footprint that is associated with a background audio layer generated based on filtering audio from a call to remove a voice audio layer; detect synthetic sound based on detecting, based on comparing a portion of the audio footprint to a plurality of stored audio footprints based on a plurality of priority indicators associated with the plurality of stored audio footprints, that the portion of the audio footprint sufficiently matches a stored audio footprint, of the plurality of stored audio footprints; and perform one or more actions based on detecting the synthetic sound.

9. The system of claim 8, wherein the one or more processors, to generate the audio footprint that is associated with the background audio layer generated based on filtering the audio from the call to remove the voice audio layer, are configured to: detect audio signals corresponding to more than one voice originating from a calling device; determine that at least one audio signal of the audio signals corresponds to a voice other than a user's voice; and include the at least one audio signal in the audio footprint.

10. The system of claim 8, wherein the one or more processors are further configured to: detect the synthetic sound based on determining that the audio footprint includes a triggering sound footprint.

11. The system of claim 8, wherein the one or more processors are further configured to: detect the synthetic sound based on determining that the audio footprint satisfies a condition that indicates a likelihood of fraud associated with the call.

12. The system of claim 8, wherein the one or more processors are further configured to: determine that the audio footprint includes a triggering sound footprint; determine a category associated with the triggering sound footprint; and compare, based on determining the category, the portion of the audio footprint to the plurality of stored audio footprints.

13. The system of claim 8, wherein the one or more processors are further configured to: detect the synthetic sound based on one or more rules, wherein the one or more rules define what conditions constitute an inconsistency between a first portion of the audio footprint and a second portion of the audio footprint.

14. The system of claim 8, wherein the plurality of stored audio footprints originate from the call.

15. A non-transitory computer-readable medium storing a set of instructions, the set of instructions comprising: one or more instructions that, when executed by one or more processors of a system, cause the system to: filter captured audio from a call to generate a background audio layer; generate an audio footprint that is associated with the background audio layer; detect synthetic sound based on detecting, based on comparing a portion of the audio footprint to a plurality of stored audio footprints based on a plurality of priority indicators associated with the plurality of stored audio footprints, that the portion of the audio footprint sufficiently matches a stored audio footprint, of the plurality of stored audio footprints; and perform one or more actions based on detecting the synthetic sound.

16. The non-transitory computer-readable medium of claim 15, wherein the one or more instructions, that cause the system to filter the captured audio from the call to generate the background audio layer, cause the system to: detect audio signals corresponding to more than one voice originating from a calling device; determine that at least one audio signal of the audio signals corresponds to a voice other than a user's voice; and include the at least one audio signal in the audio footprint.

17. The non-transitory computer-readable medium of claim 15, wherein the one or more instructions further cause the system to: detect the synthetic sound based on determining that the audio footprint includes a triggering sound footprint.

18. The non-transitory computer-readable medium of claim 15, wherein the one or more instructions further cause the system to: detect the synthetic sound based on determining that the audio footprint satisfies a condition that indicates a likelihood of fraud associated with the call.

19. The non-transitory computer-readable medium of claim 15, wherein the one or more instructions further cause the system to: determine that the audio footprint includes a triggering sound footprint; determine a category associated with the triggering sound footprint; and compare, based on determining the category, stored audio footprints, of the plurality of stored audio footprints, that are associated with the category.

20. The non-transitory computer-readable medium of claim 15, wherein the one or more instructions further cause the system to: detect the synthetic sound based on one or more rules, wherein the one or more rules define what conditions constitute an inconsisten
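The comparison recited in claims 1 and 5 (checking a portion of the footprint against stored footprints in an order driven by priority indicators, optionally narrowed to one category, until one "sufficiently matches") might be sketched as follows. The claims do not define the similarity measure, the meaning of a priority value, or the match threshold, so `similarity`, the "lower number is compared first" convention, and the `0.9` threshold are hypothetical choices, as are all names in the block.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Sequence


@dataclass(frozen=True)
class StoredFootprint:
    name: str
    category: str
    priority: int  # assumption: lower values are compared first
    values: Sequence[float]


def similarity(a: Sequence[float], b: Sequence[float]) -> float:
    # Hypothetical measure: 1 minus the mean absolute difference, floored at 0.
    if len(a) != len(b) or not a:
        return 0.0
    mean_abs_diff = sum(abs(x - y) for x, y in zip(a, b)) / len(a)
    return max(0.0, 1.0 - mean_abs_diff)


def first_sufficient_match(
    portion: Sequence[float],
    stored: Iterable[StoredFootprint],
    category: Optional[str] = None,
    threshold: float = 0.9,
) -> Optional[StoredFootprint]:
    """Return the first stored footprint, in priority order, that matches well enough."""
    candidates = [s for s in stored if category is None or s.category == category]
    # sorted() is stable, so equal priorities keep their original order.
    for candidate in sorted(candidates, key=lambda s: s.priority):
        if similarity(portion, candidate.values) >= threshold:
            return candidate
    return None
```

Stopping at the first sufficient match means the priority ordering decides which stored footprint is reported when several would pass the threshold. Passing a `category` reduces the candidate set before any comparison runs, which is one reading of the category step in claim 5.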

Assignees

Capital One Services LLC

Inventors

Classifications

G10L25/51 · Primary
for comparison or discrimination · CPC title
H04M2201/40
using speech recognition · CPC title
G10L15/22
Procedures used during a speech recognition process, e.g. man-machine dialogue · CPC title
H04M2203/301
Management of recordings · CPC title
G10L21/028
using properties of sound source · CPC title

Patent family

Related publications grouped by family.

View patent family 86145119

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12413667B2 cover?: In some implementations, a system may capture audio from a call between a calling device and a called device. The system may filter the captured audio to generate a background audio layer. The system may generate an audio footprint that is a representation of sound in the background audio layer. The system may determine that the audio footprint includes a triggering sound footprint based on one…
Who is the assignee on this patent?: Capital One Services LLC
What technology area does this patent fall under?: Primary CPC classification G10L25/51. Mapped technology areas include Physics.
When was this patent published?: Publication date Sep 9, 2025 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).