Reverberation compensation for far-field speaker recognition

US10096321B2 · US · B2

Patent metadata
Field	Value
Publication number	US-10096321-B2
Application number	US-201615242882-A
Country	US
Kind code	B2
Filing date	Aug 22, 2016
Priority date	Aug 22, 2016
Publication date	Oct 9, 2018
Grant date	Oct 9, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection. Read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Techniques are provided for reverberation compensation for far-field speaker recognition. A methodology implementing the techniques according to an embodiment includes receiving an authentication audio signal associated with speech of a user and extracting features from the authentication audio signal. The method also includes scoring results of application of one or more speaker models to the extracted features. Each of the speaker models is trained based on a training audio signal processed by a reverberation simulator to simulate selected far-field environmental effects to be associated with that speaker model. The method further includes selecting one of the speaker models, based on the score, and mapping the selected speaker model to a known speaker identification or label that is associated with the user.
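The scoring and selection steps described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: each speaker model is reduced to a hypothetical mean feature vector, the score is a negative squared Euclidean distance, and the names `SpeakerModel`, `score`, and `recognize` are invented for this example.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class SpeakerModel:
    speaker_id: str        # known speaker ID the model maps to
    environment: str       # hypothetical label for the simulated far-field condition
    mean: Sequence[float]  # toy model: a mean feature vector (assumption)


def score(model: SpeakerModel, features: Sequence[float]) -> float:
    """Higher is better: negative squared distance to the model mean."""
    return -sum((f - m) ** 2 for f, m in zip(features, model.mean))


def recognize(models: Sequence[SpeakerModel],
              features: Sequence[float]) -> Tuple[str, str]:
    """Score every model, select the highest score, return its speaker ID."""
    best = max(models, key=lambda m: score(m, features))
    return best.speaker_id, best.environment


# One model per (speaker, simulated environment) pair, as in the abstract.
models = [
    SpeakerModel("alice", "near-field", [1.0, 0.0]),
    SpeakerModel("alice", "large-room", [0.6, 0.4]),
    SpeakerModel("bob", "near-field", [0.0, 1.0]),
    SpeakerModel("bob", "large-room", [0.3, 0.9]),
]

print(recognize(models, [0.55, 0.45]))
```

Because several models share one speaker ID, the reverberant "large-room" model can win the scoring while still resolving to the same identity as the clean model.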

First claim

Opening claim text (preview).

What is claimed is:

1. A processor-implemented method for speaker recognition, the method comprising: receiving, by a processor, an authentication audio signal associated with speech of a user; extracting, by the processor, features from the authentication audio signal; scoring, by the processor, results of application of one or more speaker models to the extracted features, wherein each of the speaker models is trained based on a training audio signal, the training audio signal processed by a reverberation simulator to simulate selected far-field environmental effects to be associated with the speaker model; selecting, by the processor, one of the speaker models, the selected speaker model associated with the highest of the scores; and recognizing, by the processor, an identity of the user based on a known speaker identification (ID) associated with the selected speaker model, the recognized identity for use to authenticate the user.

2. The method of claim 1, wherein the training of the speaker models further comprises: capturing a plurality of the training audio signals from a plurality of users; receiving a speaker ID for each of the users; and processing each of the plurality of training audio signals by the reverberation simulator to generate a plurality of reverberation processed training audio signals for each of the training audio signals, wherein each of the reverberation processed training audio signals is associated with a unique far-field environmental effect.

3. The method of claim 2, wherein the training of the speaker models further comprises: generating feature sets of extracted features from each of the training audio signals and from each of the reverberation processed training audio signals; generating speaker models based on each feature set; and assigning the speaker ID as the known speaker ID associated with the generated speaker model.

4. The method of claim 1, wherein the authentication audio signal is captured in a far-field of the microphone and the training audio signal is captured in a near-field of the microphone.

5. The method of claim 4, wherein the far-field is a distance greater than three feet from the microphone and the near-field is a distance closer than three feet from the microphone.

6. A processor-implemented method for configuring a reverberation simulator for speaker recognition, the method comprising: receiving, by a processor, a first audio signal associated with speech of a user, the first audio signal captured at a first distance from a microphone; selecting, by the processor, a trial set of parameters for a reverberation simulator; generating, by the processor, a speaker model based on extracted features of an application of the reverberation simulator to the first audio signal; receiving, by the processor, one or more additional audio signals associated with speech of the user, the additional audio signals captured at a second distance from the microphone, the second distance greater than the first distance; scoring, by the processor, results of application of the speaker model to extracted features of each of the additional audio signals; and associating, by the processor, a summation of the scores with the trial set of parameters the summation of the scores to indicate a relative effectiveness of the trial set of parameters for modeling a far-field environment of the microphone at the second distance.

7. The method of claim 6, further comprising selecting the trial set of parameters as an operational set of parameters based on the summation of scores associated with the trial set of parameters.

8. The method of claim 6, further comprising generating an updated trial set of parameters for the reverberation simulator using an optimization algorithm based on the summation of scores.

9. The method of claim 8, wherein the optimization algorithm is one of a genetic algorithm or a gradient descent algorithm.

10. The method of claim 6, wherein the reverberation simulator is a Schroeder reverberator and the reverberation parameters comprise one or more of an effect mix parameter, a room size parameter, a damping parameter, and a stereo width parameter.

11. The method of claim 6, wherein the second distance is in the far-field of the microphone and the first distance is in the near-field of the microphone.

12. At least one non-transitory computer readable storage medium having instructions encoded thereon that, when executed by one or more processors, result in the following operations for speaker recognition, the operations comprising: receiving an authentication audio signal associated with speech of a user; extracting features from the authentication audio signal; scoring results of application of one or more speaker models to the extracted features, wherein each of the speaker models is trained based on a training audio signal, the training audio signal processed by a reverberation simulator to simulate selected far-field environmental effects to be associated with the speaker model; selecting one of the speaker models, the selected speaker model associated with the highest of the scores; and recognizing an identity of the user based on a known speaker identification (ID) associated with the selected speaker model, the recognized identity for use to authenticate the user.

13. The computer readable storage medium of claim 12, wherein the training of the speaker models further comprises the operations: capturing a plurality of the training audio signals from a plurality of users; receiving a speaker ID for each of the users; and processing each of the plurality of training audio signals by the reverberation simulator to generate a plurality of reverberation processed training audio signals for each of the training audio signals, wherein each of the reverberation processed training audio signals is associated with a unique far-field environmental effect.

14. The computer readable storage medium of claim 13, wherein the training of the speaker models further comprises the operations: generating feature sets of extracted features from each of the training audio signals and from each of the reverberation processed training audio signals; generating speaker models based on each feature set; and assigning the speaker ID as the known speaker ID associated with the generated speaker model.

15. The computer readable storage medium of claim 12, wherein the authentication audio signal is captured in a far-field of the microphone and the training audio signal is captured in a near-field of the microphone.

16. The computer readable storage medium of claim 15, wherein the far-field is a distance greater than three feet from the microphone and the near-field is a distance closer than three feet from the microphone.

17. At least one non-transitory computer readable storage medium having instructions encoded thereon that, when executed by one or more processors, result in the following operations for configuring a reverberation simulator for speaker recognition, the operations comprising: receiving a first audio signal associated with speech of a user, the first audio signal captured at a first distance from a microphone; selecting a trial set of parameters for a reverberation simulator; generating a speaker model based on extracted features of an application of the reverberation simulator to the first audio signal; receiving one or more additional audio signals associated with speech of the user, the additional audio signals captured at a second distance from the microphone, the second distance greate
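The simulator-configuration procedure of claims 6 and 7 (apply trial parameters to a near-field signal, build a model, score it against far-field signals, sum the scores, keep the best trial) can be sketched as below. Everything here is an assumption made for illustration: a single feedback comb filter stands in for the Schroeder reverberator of claim 10, frame-energy fractions stand in for real speech features, the far-field "captures" are synthesized rather than recorded, and an exhaustive grid search replaces the genetic or gradient descent optimizers named in claim 9.

```python
from itertools import product


def comb_reverb(signal, delay, feedback, mix):
    """Toy reverberator: one feedback comb filter blended with the dry signal."""
    wet = []
    for n, x in enumerate(signal):
        echo = feedback * wet[n - delay] if n >= delay else 0.0
        wet.append(x + echo)
    return [(1.0 - mix) * x + mix * w for x, w in zip(signal, wet)]


def features(signal, frame=4):
    """Toy features: fraction of total energy in each fixed-size frame."""
    energies = [sum(s * s for s in signal[i:i + frame])
                for i in range(0, len(signal), frame)]
    total = sum(energies) or 1.0
    return [e / total for e in energies]


def score(model, feats):
    """Higher is better: negative squared distance between feature vectors."""
    return -sum((a - b) ** 2 for a, b in zip(model, feats))


def evaluate(params, near_signal, far_signals):
    """Model the reverberated near-field signal, then sum its far-field scores."""
    model = features(comb_reverb(near_signal, *params))
    return sum(score(model, features(far)) for far in far_signals)


# Synthetic stand-ins for recordings (assumption: real use would capture audio).
near = [1.0, -0.5, 0.25] + [0.0] * 29
true_params = (5, 0.6, 0.5)  # delay, feedback, mix of the hypothetical room
far = [[gain * s for s in comb_reverb(near, *true_params)]
       for gain in (1.0, 0.5)]

# Associate a summed score with each trial parameter set, then keep the best.
trials = list(product([3, 5], [0.3, 0.6], [0.2, 0.5]))
summed = {p: evaluate(p, near, far) for p in trials}
best = max(summed, key=summed.get)
print(best)
```

On this synthetic data the search recovers the parameter set used to create the far-field signals, which is the sense in which the summed score indicates how well a trial set models the far-field environment.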

Assignees

Intel Corp

Inventors

Classifications

G10L17/04 · Primary
Training, enrolment or model building · CPC title
G10L17/06
Decision making techniques; Pattern matching strategies · CPC title
G10L21/0208
Noise filtering · CPC title
G10L2021/02082
the noise being echo, reverberation of the speech · CPC title
G10L17/12
Score normalisation · CPC title

Patent family

Related publications grouped by family.

View patent family 61190744

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10096321B2 cover?: Techniques are provided for reverberation compensation for far-field speaker recognition. A methodology implementing the techniques according to an embodiment includes receiving an authentication audio signal associated with speech of a user and extracting features from the authentication audio signal. The method also includes scoring results of application of one or more speaker models to the …
Who is the assignee on this patent?: Intel Corp
What technology area does this patent fall under?: Primary CPC classification G10L17/04. Mapped technology areas include Physics.
When was this patent published?: Publication date Oct 9, 2018 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 6 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Related patents

Channel-Compensated Low-Level Features For Speaker Recognition

Audio processing for an acoustical environment

Sound enhancement through deverberation

Systems and methods for audio command recognition

Speaker verification

Method and apparatus for performing function by speech input
