US9978373B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9978373-B2 |
| Application number | US-201615185298-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jun 17, 2016 |
| Priority date | May 27, 1997 |
| Publication date | May 22, 2018 |
| Grant date | May 22, 2018 |
Official abstract text for this publication.
A method of accessing a dial-up service is disclosed. An example method of providing access to a service includes receiving a first speech signal from a user to form a first utterance; recognizing the first utterance using speaker independent speaker recognition; requesting the user to enter a personal identification number; and when the personal identification number is valid, receiving a second speech signal to form a second utterance and providing access to the service.
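The access flow in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patent's implementation: the names `access_service`, `AccessResult`, and the four callables passed in are hypothetical stand-ins for the telephony, recognition, and PIN-checking components, none of which are specified in the text above.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class AccessResult:
    """Outcome of one access attempt (hypothetical structure)."""
    granted: bool
    recognized: Optional[str] = None        # result of recognizing the first utterance
    second_utterance: Optional[str] = None  # only captured when the PIN is valid


def access_service(
    receive_utterance: Callable[[], str],   # assumed: turns a speech signal into an utterance
    recognize: Callable[[str], str],        # assumed: speaker-independent recognizer
    request_pin: Callable[[], str],         # assumed: prompts the user for a PIN
    pin_is_valid: Callable[[str], bool],    # assumed: PIN check
) -> AccessResult:
    # Step 1: receive the first speech signal and form the first utterance.
    first = receive_utterance()
    # Step 2: recognize the first utterance.
    recognized = recognize(first)
    # Step 3: request a personal identification number.
    pin = request_pin()
    # Step 4: only when the PIN is valid, receive the second utterance
    # and provide access.
    if not pin_is_valid(pin):
        return AccessResult(granted=False, recognized=recognized)
    second = receive_utterance()
    return AccessResult(granted=True, recognized=recognized, second_utterance=second)
```

Passing the components in as callables keeps the sketch runnable without any audio hardware; a real system would replace them with actual signal capture and recognition.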
Opening claim text (preview).
What is claimed is:
1. A method comprising: comparing, via a processor, a feature coefficient generated from a speech signal to a user-specific codebook associated with a user who provided the speech signal, to yield a similarity value, wherein the user-specific codebook comprises codes that are formed from cepstrum coefficients generated from utterances spoken by the user; when the similarity value meets a threshold: adding the speech signal to a database of reference speech signals; and adding the feature coefficient to the user-specific codebook; and performing a speaker verification process of the user based on the database of reference speech signals and the user-specific codebook.
2. The method of claim 1, wherein the user-specific codebook utilizes utterances from both the user and a group of non-users.
3. The method of claim 1, further comprising: mixing the speech signal with a second speech signal, to yield a mixed speech signal; and adding the mixed speech signal to the database of reference speech signals.
4. The method of claim 2, wherein the speech signal and the second speech signal are received from the user.
5. The method of claim 1, wherein the threshold is determined using a Chi-squared detector.
6. The method of claim 1, further comprising: when the similarity value does not meet the threshold, requesting the speech signal be repeated.
7. The method of claim 1, further comprising verifying an identity of the user based on the similarity value.
8. A system comprising: a processor; and a computer-readable storage medium having instructions stored which, when executed by the processor, cause the processor to perform operations comprising: comparing a feature coefficient generated from a speech signal to a user-specific codebook associated with a user who provided the speech signal, to yield a similarity value, wherein the user-specific codebook comprises codes that are formed from cepstrum coefficients generated from utterances spoken by the user; when the similarity value meets a threshold: adding the speech signal to a database of reference speech signals; and adding the feature coefficient to the user-specific codebook; and performing a speaker verification process of the user based on the database of reference speech signals and the user-specific codebook.
9. The system of claim 8, wherein the user-specific codebook utilizes utterances from both the user and a group of non-users.
10. The system of claim 8, the computer-readable storage medium having additional instructions stored which, when executed by the processor, cause the processor to perform operations comprising: mixing the speech signal with a second speech signal, to yield a mixed speech signal; and adding the mixed speech signal to the database of reference speech signals.
11. The system of claim 9, wherein the speech signal and the second speech signal are received from the user.
12. The system of claim 8, wherein the threshold is determined using a Chi-squared detector.
13. The system of claim 8, the computer-readable storage medium having additional instructions stored which, when executed by the processor, cause the processor to perform operations comprising: when the similarity value does not meet the threshold, requesting the speech signal be repeated.
14. The system of claim 8, the computer-readable storage medium having additional instructions stored which result in operations comprising verifying an identity of the user based on the similarity value.
15. A non-transitory computer-readable storage device having instructions stored which, when executed by a computing device, cause the computing device to perform operations comprising: comparing a feature coefficient generated from a speech signal to a user-specific codebook associated with a user who provided the speech signal, to yield a similarity value, wherein the user-specific codebook comprises codes that are formed from cepstrum coefficients generated from utterances spoken by the user; when the similarity value meets a threshold: adding the speech signal to a database of reference speech signals; and adding the feature coefficient to the user-specific codebook; and performing a speaker verification process of the user based on the database of reference speech signals and the user-specific codebook.
16. The non-transitory computer-readable storage device of claim 15, wherein the user-specific codebook utilizes utterances from both the user and a group of non-users.
17. The non-transitory computer-readable storage device of claim 15, having additional instructions stored which, when executed by the computing device, cause the computing device to perform operations comprising: mixing the speech signal with a second speech signal, to yield a mixed speech signal; and adding the mixed speech signal to the database of reference speech signals.
18. The non-transitory computer-readable storage device of claim 15, wherein the threshold is determined using a Chi-squared detector.
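The conditional update recited in the independent claims (compare a feature to the user-specific codebook, and on meeting the threshold add the signal to the reference database and the feature to the codebook) might look something like the sketch below. Several things here are assumptions made for illustration and are not taken from the claims: the similarity measure (an inverse of the Euclidean distance to the nearest code), the fixed numeric threshold (the dependent claims mention a Chi-squared detector, which is not modeled here), and the names `UserProfile`, `similarity`, and `update_profile`.

```python
import math
from dataclasses import dataclass, field
from typing import List, Tuple

Vector = Tuple[float, ...]  # stand-in for a vector of cepstrum-derived coefficients


@dataclass
class UserProfile:
    """Hypothetical per-user state: codebook plus reference speech signals."""
    codebook: List[Vector]
    reference_signals: List[List[float]] = field(default_factory=list)


def similarity(feature: Vector, codebook: List[Vector]) -> float:
    # Assumed measure: map the distance to the nearest code into (0, 1],
    # where 1.0 means the feature coincides with an existing code.
    nearest = min(math.dist(feature, code) for code in codebook)
    return 1.0 / (1.0 + nearest)


def update_profile(
    profile: UserProfile,
    signal: List[float],
    feature: Vector,
    threshold: float,
) -> bool:
    """Return True if the profile was updated, False if a repeat is needed."""
    score = similarity(feature, profile.codebook)
    if score >= threshold:
        # Threshold met: grow both the reference database and the codebook.
        profile.reference_signals.append(signal)
        profile.codebook.append(feature)
        return True
    # Threshold not met: the caller could ask the user to repeat the utterance.
    return False
```

A caller would run `update_profile` on each new utterance and then base a later verification step on the accumulated `reference_signals` and `codebook`; that verification step is left out because the claim text shown here does not describe how it works.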
Training, enrolment or model building · CPC title
Interactive information services, e.g. directory enquiries {; Arrangements therefor, e.g. interactive voice response [IVR] systems or voice portals} · CPC title
Arrangements at the exchange for service or number selection by voice (at the terminal H04M1/27) · CPC title
using distance or distortion measures between unknown speech and reference templates · CPC title
Interactive procedures · CPC title