Sound Enhancement through Deverberation
US-2016232914-A1 · Aug 11, 2016 · US
US10529353B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10529353-B2 |
| Application number | US-201715837223-A |
| Country | US |
| Kind code | B2 |
| Filing date | Dec 11, 2017 |
| Priority date | Dec 11, 2017 |
| Publication date | Jan 7, 2020 |
| Grant date | Jan 7, 2020 |
Official abstract text for this publication.
A mechanism is described for facilitating multi-device reverberation estimation according to one embodiment. An apparatus of embodiments, as described herein, includes detection and capture logic to facilitate a microphone of a first voice-enabled device of multiple voice-enabled devices to detect a command from a user. The apparatus further includes calculation logic to facilitate a second voice-enabled device and a third voice-enabled device to calculate speech to reverberation modulation energy ratio (SRMR) values based on the command, where the calculation logic is further to estimate reverberation times (RTs) based on the SRMR values. The apparatus further includes decision and application logic to perform dereverberation based on the estimated RTs of the reverberations.
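The abstract's step of estimating reverberation times from SRMR values can be illustrated with a small sketch. It assumes the relation is held as a fixed lookup table and read by linear interpolation; the abstract does not say how the relation is represented. The table name `SRMR_TO_RT`, the calibration points, and the function names are hypothetical and chosen only for illustration. The one property relied on is that a higher SRMR indicates less reverberation, so the table maps higher SRMR to a shorter RT.

```python
from bisect import bisect_left

# Hypothetical calibration points (SRMR value, RT in seconds).
# These numbers are illustrative assumptions, not taken from the patent.
SRMR_TO_RT = [(1.0, 1.2), (2.0, 0.8), (4.0, 0.5), (8.0, 0.2)]


def estimate_rt(srmr: float) -> float:
    """Estimate a reverberation time from one SRMR value.

    Uses piecewise-linear interpolation over SRMR_TO_RT and clamps
    to the first or last table entry outside the calibrated range.
    """
    xs = [point[0] for point in SRMR_TO_RT]
    ys = [point[1] for point in SRMR_TO_RT]
    if srmr <= xs[0]:
        return ys[0]
    if srmr >= xs[-1]:
        return ys[-1]
    i = bisect_left(xs, srmr)  # first index with xs[i] >= srmr
    x0, x1 = xs[i - 1], xs[i]
    y0, y1 = ys[i - 1], ys[i]
    return y0 + (y1 - y0) * (srmr - x0) / (x1 - x0)


def estimate_rts(srmr_by_device: dict) -> dict:
    """Apply the same fixed relation to the SRMR value of each device."""
    return {device: estimate_rt(value) for device, value in srmr_by_device.items()}
```

For example, `estimate_rts({"second": 3.0, "third": 10.0})` would give one RT estimate per device, which a dereverberation stage could then use as its tuning parameter.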
Opening claim text (preview).
What is claimed is:
1. An apparatus comprising: one or more processors to: facilitate a microphone of a first voice-enabled device of multiple voice-enabled devices to detect a command from a user; facilitate a second voice-enabled device and a third voice-enabled device in a multi-device environment to calculate speech to reverberation modulation energy ratio (SRMR) values based on the command; estimate reverberation times (RTs) based on the SRMR values; and perform dereverberation based on the estimated RTs of the reverberations, and recognize the command based on the estimated RTs.
2. The apparatus of claim 1, wherein the RTs relate to reverberations associated with one or more of the first, second, and third voice-enabled devices, wherein the first, second, and third voice-enable devices are coupled with each other over a communication medium including one or more of a proximity network, a cloud network, and the Internet.
3. The apparatus of claim 1, wherein the first voice-enabled device is further to convert the command into a text-to-speech (TTS) command, wherein one of the first, second, and third voice-enabled devices serves as a centralized unit positioned locally with the first, second, and third voice-enabled devices or remotely in communication over the communication medium.
4. The apparatus of claim 1, wherein the one or more processors are further to update one or more SRMR tables based on the calculated SRMR values.
5. The apparatus of claim 1, wherein the one or more processors are further to select one of the second and third voice-enabled devices to issue a response to the command.
6. The apparatus of claim 1, wherein a relation between the SRMR values and the RTs is fixed, wherein the first, second, and third voice-enabled devices comprise one or more of smart speakers, laptop computers, mobile devices, smart wearable devices, smart household appliances, and smart locks.
7. The apparatus of claim 1, wherein each of the first, second, and third voice-enabled devices comprise one or more processors including a graphics processor co-located with an application processor on a common semiconductor package.
8. A method comprising: facilitating a microphone of a first voice-enabled device of multiple voice-enabled devices to detect a command from a user; facilitating a second voice-enabled device and a third voice-enabled device in a multi-device environment to calculate speech to reverberation modulation energy ratio (SRMR) values based on the command; estimating reverberation times (RTs) based on the SRMR values; and performing dereverberation based on the estimated RTs of the reverberations, and recognize the command based on the estimated RTs.
9. The method of claim 8, wherein the RTs relate to reverberations associated with one or more of the first, second, and third voice-enabled devices, wherein the first, second, and third voice-enable devices are coupled with each other over a communication medium including one or more of a proximity network, a cloud network, and the Internet.
10. The method of claim 8, wherein the first voice-enabled device is further to convert the command into a text-to-speech (TTS) command, wherein one of the first, second, and third voice-enabled devices serves as a centralized unit positioned locally with the first, second, and third voice-enabled devices or remotely in communication over the communication medium.
11. The method of claim 8, further comprising updating one or more SRMR tables based on the calculated SRMR values.
12. The method of claim 8, further comprising selecting one of the second and third voice-enabled devices to issue a response to the command.
13. The method of claim 8, wherein a relation between the SRMR values and the RTs is fixed, wherein the first, second, and third voice-enabled devices comprise one or more of smart speakers, laptop computers, mobile devices, smart wearable devices, smart household appliances, and smart locks.
14. The method of claim 8, wherein each of the first, second, and third voice-enabled devices comprise one or more processors including a graphics processor co-located with an application processor on a common semiconductor package.
15. At least one non-transitory machine-readable medium comprising instructions which, when executed by a computing device, cause the computing device to perform operations comprising: facilitating a microphone of a first voice-enabled device of multiple voice-enabled devices to detect a command from a user; facilitating a second voice-enabled device and a third voice-enabled device in a multi-device environment to calculate speech to reverberation modulation energy ratio (SRMR) values based on the command; estimating reverberation times (RTs) based on the SRMR values; and performing dereverberation based on the estimated RTs of the reverberations, and recognize the command based on the estimated RTs.
16. The non-transitory machine-readable medium of claim 15, wherein the RTs relate to reverberations associated with one or more of the first, second, and third voice-enabled devices, wherein the first, second, and third voice-enable devices are coupled with each other over a communication medium including one or more of a proximity network, a cloud network, and the Internet.
17. The non-transitory machine-readable medium of claim 15, wherein the first voice-enabled device is further to convert the command into a text-to-speech (TTS) command, wherein one of the first, second, and third voice-enabled devices serves as a centralized unit positioned locally with the first, second, and third voice-enabled devices or remotely in communication over the communication medium.
18. The non-transitory machine-readable medium of claim 15, further comprising updating one or more SRMR tables based on the calculated SRMR values.
19. The non-transitory machine-readable medium of claim 15, further comprising selecting one of the second and third voice-enabled devices to issue a response to the command.
20. The non-transitory machine-readable medium of claim 15, wherein a relation between the SRMR values and the RTs is fixed, wherein the first, second, and third voice-enabled devices comprise one or more of smart speakers, laptop computers, mobile devices, smart wearable devices, smart household appliances, and smart locks, wherein each of the first, second, and third voice-enabled devices comprise one or more processors including a graphics processor co-located with an application processor on a common semiconductor package.
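Two of the dependent claims, updating SRMR tables from calculated values and selecting one of the second and third devices to respond, can be sketched as follows. The claims state neither an update rule nor a selection criterion, so both are assumptions here: the table keeps an exponentially smoothed SRMR per device, and the responder is the candidate with the highest stored SRMR (that is, the least reverberant capture). `SrmrTable`, `alpha`, and `select_responder` are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class SrmrTable:
    """Per-device SRMR table (hypothetical structure, not from the patent)."""

    alpha: float = 0.5  # assumed smoothing factor for new measurements
    values: dict = field(default_factory=dict)

    def update(self, device: str, srmr: float) -> float:
        """Blend a newly calculated SRMR value into the stored entry."""
        previous = self.values.get(device)
        if previous is None:
            blended = srmr  # first measurement is stored as-is
        else:
            blended = self.alpha * srmr + (1.0 - self.alpha) * previous
        self.values[device] = blended
        return blended


def select_responder(table: SrmrTable, candidates: list) -> str:
    """Pick the candidate device with the highest stored SRMR.

    Assumed criterion: a higher SRMR suggests less reverberation at
    that device, so it is preferred for issuing the response.
    """
    known = [device for device in candidates if device in table.values]
    if not known:
        raise ValueError("no SRMR values recorded for any candidate device")
    return max(known, key=lambda device: table.values[device])
```

A centralized unit could call `update` once per device after each detected command and then call `select_responder(table, ["second", "third"])` before issuing the response.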
the noise being echo, reverberation of the speech · CPC title
characterised by the type of extracted parameters · CPC title
Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise, of stress induced speech (G10L21/02 takes precedence) · CPC title
Noise filtering · CPC title
using distance or distortion measures between unknown speech and reference templates · CPC title