Removing recurring environmental sounds

US9799329B1 · US · B1

Patent metadata
Field	Value
Publication number	US-9799329-B1
Application number	US-201414559687-A
Country	US
Kind code	B1
Filing date	Dec 3, 2014
Priority date	Dec 3, 2014
Publication date	Oct 24, 2017
Grant date	Oct 24, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title: What the patent document calls the invention.
Abstract: A short plain-language summary of the technical disclosure.
Assignees and inventors: Who owns or filed the patent and who is credited as inventor.
Key dates: Filing, priority, publication, and grant dates set the timeline.
First independent claim: The legal scope of protection; read this for what is actually claimed.
CPC / IPC classifications: Technology tags used to group this patent with similar filings.
Citations and related patents: Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

This disclosure describes, in part, techniques and devices for identifying recurring environmental sounds in an environment such that these sounds may be canceled out of corresponding audio signals to increase signal-to-noise ratios (SNRs) of the signals and, hence, improve automatic speech recognition (ASR) on the signals. Recurring environmental sounds may include the ringing of a mobile phone, the beeping sound of a microwave, the buzzing of a washing machine, or the like.
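To make the SNR argument in the abstract concrete, here is a minimal Python sketch. It is illustrative only and not taken from the patent: the sample rate, the two sine tones standing in for a voice and a recurring beep, and the 90%-accurate stored estimate of the beep are all assumptions. It subtracts the estimate from the mixed signal and compares SNR before and after.

```python
import math

def snr_db(signal, noise):
    """Signal-to-noise ratio in decibels from clean-signal and noise samples."""
    p_signal = sum(s * s for s in signal) / len(signal)
    p_noise = sum(n * n for n in noise) / len(noise)
    return 10.0 * math.log10(p_signal / p_noise)

RATE = 8000  # samples per second (assumed)
N = 800      # 0.1 s of audio

# Hypothetical stand-ins: a 300 Hz "voice" and a quieter 2 kHz recurring "beep".
voice = [math.sin(2 * math.pi * 300 * t / RATE) for t in range(N)]
beep = [0.5 * math.sin(2 * math.pi * 2000 * t / RATE) for t in range(N)]
mixed = [v + b for v, b in zip(voice, beep)]

# Assumed imperfect stored estimate of the beep: 90% of its true amplitude.
beep_estimate = [0.9 * b for b in beep]

# Cancel the recurring sound by subtracting the estimate from the mixed signal.
cleaned = [m - e for m, e in zip(mixed, beep_estimate)]

# Whatever is left besides the voice is the residual noise.
residual = [c - v for c, v in zip(cleaned, voice)]

before = snr_db(voice, beep)      # SNR of the raw mixed signal
after = snr_db(voice, residual)   # SNR after cancellation
print(f"SNR before: {before:.2f} dB, after: {after:.2f} dB")
```

Because the residual beep has one tenth of the original amplitude, the SNR rises by 20 dB in this toy setup. Real environmental sounds are not pure tones, so the gain in practice depends on how well the stored estimate matches the sound.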

First claim

Opening claim text (preview).

What is claimed is:

1. An electronic device comprising: one or more microphones; one or more processors; and one or more computer-readable media storing computer-executable instructions that, when executed, cause the one or more processors to perform acts comprising: receiving a first audio signal generated by the one or more microphones based on first sound in an environment; determining frequency and amplitude of the first audio signal; determining a direction within the environment from which the first sound originated; creating a signature of the first sound based at least in part on the frequency, the amplitude, and the direction; determining that the signature corresponds to a stored signature associated with an environmental sound; incrementing a number of times that the environmental sound has been captured within the environment; determining that the number of times is greater than a threshold; storing an indication that the environmental sound is to be canceled from a subsequent audio signal that is generated by the microphone and that indicates the environmental sound; receiving a second audio signal generated by the one or more microphones based on second sound in the environment, the second audio signal including a first component corresponding to the environmental sound and a second component corresponding to a voice command uttered by a user; identifying the first component based at least in part on frequency, amplitude, and direction of at least a portion of the second audio signal; removing, from the second audio signal, the first component to generate a modified second audio signal; and performing automatic speech recognition (ASR) on the modified second audio signal to identify the voice command uttered by the user.

2. An electronic device as recited in claim 1, the acts further comprising: performing ASR on the first audio signal; and determining that the first audio signal does not include a voice command from the user.

3. An electronic device as recited in claim 1, the acts further comprising selecting, based at least in part on at least one of frequency or amplitude of the second audio signal, one of multiple methods to implement to remove the first component, wherein the multiple methods include: utilizing a filter to remove at least one specified frequency range from the second audio signal and; subtracting the first component of the second audio signal from the second audio signal.

4. An electronic device as recited in claim 1, the acts further comprising determining that the user uttered a keyword, and wherein the receiving of the first audio signal occurs at least partly in response to determining that the user uttered the keyword.

5. A method comprising: receiving, by a computing device, a first audio signal representative of a first sound in an environment; determining frequency and amplitude of the first audio signal; determining a first signature of the first sound based at least in part on the frequency and the amplitude; determining, using the first signature, a number of times that a second sound has previously been received, the second sound comprising a second signature that matches the first signature; determining that the number of times is greater than a threshold; generating an indication that the first signature corresponds to an environmental sound; storing the indication in a datastore; removing, by a filter of the computing device, the first audio signal corresponding to the environmental sound from subsequent audio signals; and sending the subsequent audio signals for processing.

6. A method as recited in claim 5, wherein the determining the number of times comprises determining the number of times that the second sound has been received without a voice command being detected within a predefined amount of time.

7. A method as recited in claim 5, further comprising determining a direction within the environment from which the first sound originated, and wherein the first signature is further based at least in part on the direction.

8. A method as recited in claim 5, further comprising: receiving a second audio signal having a first component corresponding to the environmental sound and a second component corresponding to a voice command uttered by a user; and determining that the first component corresponds to the environmental sound based at least in part on at least one of frequency or amplitude of the second audio signal, wherein removing the first audio signal corresponding to the environmental sound from subsequent audio signals comprises removing the first component from the second audio signal.

9. A method as recited in claim 5, wherein the filter corresponds to at least one frequency range associated with the environmental sound.

10. A method as recited in claim 8, wherein the removing the first component comprises subtracting an amplitude of the first component from an amplitude of the second audio signal.

11. A method as recited in claim 8, further comprising selecting one of multiple methods to implement to remove the first component from the second audio signal, the selecting based at least in part on at least one of the frequency or the amplitude of the second audio signal.

12. A method as recited in claim 5, further comprising determining that a user in the environment uttered a keyword, and wherein the receiving of the first audio signal occurs at least partly in response to determining that the user uttered the keyword.

13. One or more computing devices comprising: one or more processors; and one or more computer-readable media storing computer-executable instructions that, when executed, cause the one or more processors to perform acts comprising: receiving a first audio signal representative of a first sound in an environment; determining frequency and amplitude of the first audio signal; determining a first signature of the first sound based at least in part on the frequency and the amplitude; determining, using the first signature, a number of times that a second sound has previously been received, the second sound comprising a second signature that matches the first signature; determining that the number of times is greater than a threshold; storing an indication that the first signature corresponds to an environmental sound; removing the environmental sound from a subsequent audio signal; and causing automatic speech recognition to be performed on the subsequent audio signal.

14. One or more computing devices as recited in claim 13, wherein the determining the number of times comprises determining the number of times that the second sound has been received without a voice command being detected within a predefined amount of time.

15. One or more computing devices as recited in claim 13, further comprising determining a direction within the environment from which the first sound originated, and wherein the first signature is further based at least in part on the direction.

16. One or more computing devices as recited in claim 13, the acts further comprising: receiving a second audio signal having a first component corresponding to the environmental sound and a second component corresponding to a voice command uttered by a user; determining that the first component corresponds to the environment sound based at least in part on at least one of frequency or amplitude of the second audio signal; and removing the first component from the second audio signal.

17. One or more computing devices as recited in claim 16, wherein the removing com
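The counting logic recited in claims 1, 5, and 13 (build a signature, match it against stored signatures, count occurrences, flag the sound once the count exceeds a threshold) can be sketched roughly as follows. This is a hedged illustration, not the patented implementation: the `Signature` fields, the `matches` tolerances, the `RecurringSoundTracker` class, and the threshold of 3 are all hypothetical choices, since the claims do not specify how signatures are compared.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Signature:
    """Hypothetical signature: dominant frequency, amplitude, and direction."""
    freq_hz: float
    amplitude: float
    direction_deg: float

def matches(a, b, freq_tol=50.0, amp_tol=0.1, dir_tol=15.0):
    """Return True if two signatures agree within assumed tolerances."""
    # Directions wrap around at 360 degrees, so compare along the shorter arc.
    d = abs(a.direction_deg - b.direction_deg) % 360.0
    d = min(d, 360.0 - d)
    return (abs(a.freq_hz - b.freq_hz) <= freq_tol
            and abs(a.amplitude - b.amplitude) <= amp_tol
            and d <= dir_tol)

class RecurringSoundTracker:
    """Counts matching sounds and flags those heard more than `threshold` times."""

    def __init__(self, threshold=3):
        self.threshold = threshold
        self.entries = []  # each entry: [stored signature, count, cancel flag]

    def observe(self, sig):
        """Record one sound; return True if it is flagged for cancellation."""
        for entry in self.entries:
            if matches(entry[0], sig):
                entry[1] += 1                   # increment the number of times
                if entry[1] > self.threshold:   # "greater than a threshold"
                    entry[2] = True             # store the cancel indication
                return entry[2]
        # First time this signature is seen: store it with a count of one.
        self.entries.append([sig, 1, False])
        return False

tracker = RecurringSoundTracker(threshold=3)
beep = Signature(freq_hz=2000.0, amplitude=0.5, direction_deg=90.0)
flags = [tracker.observe(beep) for _ in range(4)]
print(flags)  # [False, False, False, True]
```

A sound that matches no stored signature simply starts a new entry. Dependent claims 6 and 14 additionally require that no voice command follows within a predefined time before an occurrence is counted, which this sketch leaves out.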

Assignees

Amazon Tech Inc

Inventors

Classifications

G10L25/51 (Primary): for comparison or discrimination
G10L15/065: Adaptation
G10L15/20 (Primary): Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise, of stress induced speech (G10L21/02 takes precedence)
G10L15/063: Training
G10L17/22: Interactive procedures; Man-machine interfaces

Patent family

Related publications grouped by family.

View patent family 60082641

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9799329B1 cover?: This disclosure describes, in part, techniques and devices for identifying recurring environmental sounds in an environment such that these sounds may be canceled out of corresponding audio signals to increase signal-to-noise ratios (SNRs) of the signals and, hence, improve automatic speech recognition (ASR) on the signals. Recurring environmental sounds may include the ringing of a mobile phon…
Who is the assignee on this patent?: Amazon Tech Inc
What technology area does this patent fall under?: Primary CPC classification G10L25/51. Mapped technology areas include Physics.
When was this patent published?: Publication date Oct 24, 2017 (B1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).