Dynamic Face and Voice Signature Authentication for Enhanced Security
US-2018232591-A1 · Aug 16, 2018 · US
US9626971B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9626971-B2 |
| Application number | US-201314119156-A |
| Country | US |
| Kind code | B2 |
| Filing date | Sep 20, 2013 |
| Priority date | Sep 28, 2012 |
| Publication date | Apr 18, 2017 |
| Grant date | Apr 18, 2017 |
Official abstract text for this publication.
Method for text-dependent Speaker Recognition using a speaker adapted Universal Background Model, wherein the speaker adapted Universal Background Model is a speaker adapted Hidden Markov Model comprising channel correction.
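The claims name Maximum A Posteriori adaptation of the Universal Background Model's mean vectors as one way to obtain the speaker model (claim 15). As a rough illustration only, the sketch below applies the widely used relevance-MAP mean update to a small set of Gaussian components. The function name `map_adapt_means`, the `relevance` parameter and its default value, and the assumption that per-frame component posteriors are already available are all hypothetical choices made for this example and are not taken from the patent. Adaptation of transition probabilities and channel correction are not shown.

```python
from typing import List, Sequence


def map_adapt_means(
    ubm_means: Sequence[Sequence[float]],
    frames: Sequence[Sequence[float]],
    posteriors: Sequence[Sequence[float]],
    relevance: float = 16.0,  # hypothetical default; must be > 0
) -> List[List[float]]:
    """Relevance-MAP update of component means (illustrative sketch).

    ubm_means:  K x D prior means from the background model.
    frames:     T x D enrolment feature vectors.
    posteriors: T x K occupation probabilities of each component per frame.
    """
    dim = len(ubm_means[0])
    adapted: List[List[float]] = []
    for k, prior_mean in enumerate(ubm_means):
        # Zeroth-order statistic: soft count of frames assigned to component k.
        n_k = sum(post[k] for post in posteriors)
        # Interpolation weight: more enrolment data pulls the mean further
        # away from the background model.
        alpha = n_k / (n_k + relevance)
        new_mean: List[float] = []
        for d in range(dim):
            # First-order statistic for dimension d.
            first_order = sum(post[k] * x[d] for post, x in zip(posteriors, frames))
            # With no data for this component, fall back to the prior mean.
            data_mean = first_order / n_k if n_k > 0 else prior_mean[d]
            new_mean.append(alpha * data_mean + (1.0 - alpha) * prior_mean[d])
        adapted.append(new_mean)
    return adapted
```

Components that receive no enrolment frames keep their background-model mean, which is the usual motivation for adapting from a shared model rather than training each speaker from scratch.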
Opening claim text (preview).
What is claimed is:

1. A method for text-dependent Speaker Recognition using a speaker model obtained by adaptation of a Universal Background Model, wherein the speaker model is a speaker adapted Hidden Markov Model, wherein the speaker model uses Bayesian inference to link observed parameters and hidden parameters, wherein the observed parameters are the feature vectors $x_{nmt}$ of utterance m of speaker n and time index t, and wherein the hidden parameters are at least one of a group of: the speaker factor $y_n$ for each speaker n, the channel factors $u_{nm}$ of the utterance m of speaker n, the active state $s_{nmt}$ generating the feature vector $x_{nmt}$, and the active component $z_{nmt}$ generating the feature vector $x_{nmt}$.
2. The method for text-dependent Speaker Recognition according to claim 1, wherein the Universal Background Model is unsupervised adapted based on enrolment utterances of the speaker.
3. The method for text-dependent Speaker Recognition according to claim 1, wherein only mean vectors and transition probabilities are adapted in the speaker model or wherein all parameters are adapted in the speaker model.
4. The method for text-dependent Speaker Recognition according to claim 1, wherein the Universal Background Model of the text-dependent system is trained in an unsupervised training before it is adapted.
5. The method for text-dependent Speaker Recognition according to claim 1, wherein utterances of a plurality of speakers, which may speak more than 5 different languages are used for an unsupervised training of the Universal Background Model of the text dependent system.
6. The method for text-dependent Speaker Recognition according to claim 1, wherein the topology of the Universal Background Model of the text-dependent system is selected to comprise a transition possibility from each possible state to itself and each possible other state.
7. The method for text-dependent Speaker Recognition according to claim 1, wherein the number of states is set to a number estimated by an analysis of the spectral properties of a signal.
8. The method for text-dependent Speaker Recognition according to claim 1, further comprising adapting one or more parameters to a lexical content.
9. The method for text-dependent Speaker Recognition according to claim 1, wherein the eigenvoices matrix and eigenchannel matrix are trained from the generic Universal Background Model in a development session.
10. The method for text-dependent Speaker Recognition according to claim 1, further comprising the step of verifying in an unsupervised way whether a test signal was spoken by a target person.
11. The method for text-dependent Speaker Recognition according to claim 1, wherein the speaker adapted model is used only to determine the most likely path, but not to compute the statistics, which are useable to extract the log likelihood ratios, wherein the channel may be compensated.
12. The method for text-dependent Speaker Recognition according to claim 1, wherein verifying whether the test signal was spoken by the targeted person comprises calculating the difference between the two terms of the log likelihood of the testing audio and the speaker model and the log product of the transition probabilities of the most likely path obtained with the speaker model and the log likelihood of the testing audio and the generic Universal Background Model and the log product of the transition probabilities of the most likely path obtained with the generic Universal Background Model.
13. The method for text-dependent Speaker Recognition according to claim 1, wherein the method further comprises identifying a target person by identifying the speaker adapted model with the highest likelihood score.
14. The method for text-dependent Speaker Recognition according to claim 1, wherein the Universal Background Model is a Hidden Markov Model.
15. The method for text-dependent Speaker Recognition according to claim 1, wherein the mean vectors and the transition probabilities of the Universal Background Model are adapted for the speaker model using a Maximum A Posteriori adaptation.
16. The method for text-dependent Speaker Recognition according to claim 1, wherein the channel factors are compensated in the speaker adapted model.
17. The method for text-dependent Speaker Recognition according to claim 1, wherein the following variables are used in the complete model: a sequence of speaker factors Y, a sequence of channel factors U, a sequence of the feature vectors X, a sequence of Hidden Markov Model states S, a sequence of Gaussian components Z.
18. The method for text-dependent Speaker Recognition according to claim 1, wherein the dependencies of the variables are described by a Bayesian network.
19. The method for text-dependent Speaker Recognition according to claim 1, wherein an iterative Expectation Maximization algorithm is applied for the training of the Universal Background Model given the development data.
20. The method for text-dependent Speaker Recognition according to claim 19, wherein in the iterative algorithm in some of the iterations an additional step is introduced for maintaining boundary conditions or a step is replaced by a step for maintaining boundary conditions.
21. The method for text-dependent Speaker Recognition according to claim 1, wherein a speaker dependent Hidden Markov Model is created by adapting the mean vectors and the eigenvoice matrix of the Universal Background Model according to the enrollment data.
22. The method for text-dependent Speaker Recognition according to claim 1, wherein for the training of the Universal Background Model the model is initialized with values found by training a full covariance Universal Background Model.
23. The method for text-dependent Speaker Recognition according to claim 1, wherein the method is used for speaker verification.
24. A method for text-dependent Speaker Recognition using a text-dependent and a text-independent system, wherein a model for the text-dependent system is adapted in an unsupervised way, and wherein, in addition, a model for the text-independent system for the speaker and the phrase is built, wherein the model uses Bayesian inference to link observed parameters and hidden parameters, wherein the observed parameters are the feature vectors $x_{nmt}$ of utterance m of speaker n and time index t, and wherein the hidden parameters are at least one of a group of: the speaker factor $y_n$ for each speaker n, the channel factors $u_{nm}$ of the utterance m of speaker n, the active state $s_{nmt}$ generating the feature vector $x_{nmt}$, and the active component $z_{nmt}$ generating the feature vector $x_{nmt}$.
25. The method for text-dependent Speaker Recognition according to claim 24, wherein text-dependent speaker recognition according to claim 1 is used.
26. The method for text-dependent Speaker Recognition according to claim 24, further comprising the step of verifying in an unsupervised way whether a test signal was spoken by the target person.
27. The method for text-dependent Speaker Recognition according to claim 24, wherein the method further comprises a step of identifying a target person by identifying the speaker adapted model with the highest likelihood score.
28. The method for text-dependent Speaker Recognition according to claim 24, wherein the scalar weights f
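
Claim 12 describes verification as a difference of two terms, each combining the log likelihood of the test audio with the log product of the transition probabilities along the most likely path, once for the speaker model and once for the generic Universal Background Model. A minimal sketch of that kind of score, under simplifying assumptions, might look as follows. Everything concrete here is hypothetical and not from the patent: the function names, one-dimensional features, a single unit-variance Gaussian per state, and the inclusion of an initial-state log probability in the path score. Channel compensation is omitted.

```python
import math
from typing import List, Sequence, Tuple

# Hypothetical model container: (log initial probs, log transition matrix, state means)
Model = Tuple[Sequence[float], Sequence[Sequence[float]], Sequence[float]]


def log_gauss(x: float, mean: float, var: float = 1.0) -> float:
    """Log density of a 1-D Gaussian (toy emission model for this sketch)."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def viterbi_score(frames: Sequence[float], model: Model) -> Tuple[List[int], float]:
    """Return the most likely state path and its log score.

    The score sums the emission log likelihoods and the transition
    log probabilities along that path (plus the initial-state term).
    """
    log_init, log_trans, means = model
    n_states = len(means)
    delta = [log_init[s] + log_gauss(frames[0], means[s]) for s in range(n_states)]
    backptrs: List[List[int]] = []
    for x in frames[1:]:
        new_delta: List[float] = []
        ptr: List[int] = []
        for s in range(n_states):
            # Best predecessor state for reaching state s at this frame.
            prev = max(range(n_states), key=lambda p: delta[p] + log_trans[p][s])
            ptr.append(prev)
            new_delta.append(delta[prev] + log_trans[prev][s] + log_gauss(x, means[s]))
        delta = new_delta
        backptrs.append(ptr)
    # Backtrack from the best final state to recover the path.
    last = max(range(n_states), key=lambda s: delta[s])
    path = [last]
    for ptr in reversed(backptrs):
        path.append(ptr[path[-1]])
    path.reverse()
    return path, delta[last]


def verification_score(frames: Sequence[float], speaker: Model, ubm: Model) -> float:
    """Best-path log score under the speaker model minus that under the UBM."""
    return viterbi_score(frames, speaker)[1] - viterbi_score(frames, ubm)[1]


# Toy two-state models in which every state can reach itself and the other state.
half = math.log(0.5)
uniform_trans = [[half, half], [half, half]]
speaker_model: Model = ([half, half], uniform_trans, [0.0, 5.0])
ubm_model: Model = ([half, half], uniform_trans, [1.0, 4.0])
test_frames = [0.0, 0.0, 5.0, 5.0]
score = verification_score(test_frames, speaker_model, ubm_model)
```

A positive `score` means the test frames fit the speaker model better than the background model; in practice the accept/reject threshold would be tuned on development data. The same best-path score, computed per enrolled speaker and compared across speakers, would be one way to realise the highest-likelihood identification mentioned in claims 13 and 27.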
the user being prompted to utter a password or a predefined phrase · CPC title
Multimodal systems, i.e. based on the integration of multiple recognition engines or fusion of expert systems · CPC title
Training, enrolment or model building · CPC title
Phonemes, fenemes or fenones being the recognition units · CPC title
Use of phonemic categorisation or speech recognition prior to speaker recognition or verification · CPC title