US9947333B1 · US · B1
| Field | Value |
|---|---|
| Publication number | US-9947333-B1 |
| Application number | US-201213371294-A |
| Country | US |
| Kind code | B1 |
| Filing date | Feb 10, 2012 |
| Priority date | Feb 10, 2012 |
| Publication date | Apr 17, 2018 |
| Grant date | Apr 17, 2018 |
Official abstract text for this publication.
A voice interaction architecture has a hands-free, electronic voice controlled assistant that permits users to verbally request information from cloud services. The voice controlled assistant may be positioned in a room to receive voice commands from the user. The voice controlled assistant may also pick up background sources of speech, music, or other noise, such as from a television or stereo system, which may adversely impact the user's intended vocal input to the assistant. The assistant transmits the aggregated audio data (user command and background noise) over a network to the cloud services, which implement noise cancellation functionality to remove the background noise while isolating and preserving the user's command. Once isolated, the cloud services can process and interpret the user input to perform some function, and return the response over the network to the voice controlled assistant for audible output to the user.
Opening claim text (preview).
What is claimed is:
1. A system comprising: a voice controlled assistant having a microphone to receive voice input and background noise; the voice controlled assistant further having a network interface to transmit aggregated audio data representing the voice input and the background noise over a network; a command response system remote from the voice controlled assistant and communicatively coupled to the voice controlled assistant to receive the aggregated audio data from the voice controlled assistant via the network, the command response system configured to: identify a source of the background noise at least by: identifying first audio content from the background noise; sending a request to a remote server for second audio content that is associated with the first audio content; and receiving the second audio content from the remote server; remove, using the second audio content, at least a part of the background noise from the aggregated audio data; identify the voice input; produce an audio response for the voice controlled assistant, the audio response representative of a speech; send the audio response over the network to the voice controlled assistant; and the voice controlled assistant being configured to receive the audio response and to audibly emit the audio response representative of the speech through a speaker.
2. The system of claim 1, wherein the background noise includes content from a television.
3. The system of claim 1, wherein the command response system comprises: one or more processors; memory accessible by the one or more processors; one or more computer-executable instructions stored in the memory and executable on the one or more processors to at least partially remove the background noise using an adaptive noise cancellation algorithm.
4. The system of claim 1, wherein the command response system comprises: one or more processors; memory accessible by the one or more processors; and a noise source identifier stored in the memory and executable on the one or more processors to identify a source of the background noise.
5. The system of claim 1, wherein the operation performed by the command response system comprises one or more of: forming a search query to include information from the voice input; performing a look-up for a response associated with the voice input; initiating a transaction using the voice input; conducting online commerce; or requesting delivery of entertainment content.
6. The system of claim 1, wherein the command response system comprises a natural language processing engine to interpret the voice input prior to performing the operation.
7. The system of claim 1, wherein the command response system is implemented as a network accessible platform that is accessible by the voice controlled assistant over the network.
8. The system of claim 1, wherein the identifying the source of the background noise further comprises determining that the first audio content from the background noise corresponds to stored audio associated with a previously identified source of a previous background noise, the stored audio being stored at the remote server.
9. A system comprising: a network accessible infrastructure of one or more processors and memory accessible by the one or more processors, the network accessible infrastructure residing at a data center location and being configured to receive over a network aggregated audio data from a first device that is at a user-based location distant and separate from the data center location; one or more computer-executable instructions stored in the memory and executable on the one or more processors to: receive the aggregated audio data from the first device, the aggregated audio data representing a voice command from a user and background noise from an environment surrounding the user, the background noise comprising audio data representing speech produced from a second device that is at the user-based location; identify content in the background noise contained in the aggregated audio data by accessing content preferences previously associated with a profile of for the user and compare a portion of audio associated with the content preferences to the background noise; at least partially remove the background noise from the aggregated audio data using the content; and process the voice command extracted from the aggregated audio data after the background noise has been at least partially removed; and a response encoder to generate a response for the first device.
10. The system of claim 9, wherein the background noise includes additional content from the second device.
11. The system of claim 9, wherein the one or more computer-executable instructions are further executable on the one or more processors to maintain the content preferences for the user, the content preferences comprising at least one of television viewing patterns of the user, most frequently viewed television programs, most frequently played music, or most frequently played video games.
12. The system of claim 9, wherein the one or more computer-executable instructions are further executable on the one or more processors to analyze the background noise from the aggregated audio data and discern a signature of the background noise to be used to identify the content of the background noise.
13. The system of claim 9, wherein the one or more computer-executable instructions are further executable on the one or more processors to retrieve the content.
14. The system of claim 9, wherein the one or more computer-executable instructions are further executable by the one or more processors to apply an adaptive noise cancellation algorithm to at least partially remove the background noise from the aggregated audio data.
15. The system of claim 9, wherein the one or more computer-executable instructions are further executable by the one or more processors to convert the voice command from audio to text data.
16. The system of claim 9, wherein the one or more computer-executable instructions are further executable by the one or more processors to: form a search query to include information from the voice command; perform a look-up for a response associated with the voice command; initiate a transaction using the voice command; conduct online commerce; or request delivery of entertainment content.
17. The system of claim 9, wherein the response encoder is stored in the memory.
18. One or more non-transitory computer readable media storing instructions that, when executed on one or more processors, performs acts comprising: receiving aggregated audio data from a first device, the aggregated audio data containing an audio command from a user and background noise having content emitted from a second device, the background noise comprising audio data representing speech produced from the second device; analyzing content preferences associated with a user account of the user with the content emitted from the second device, the content preference including at least one of television viewing habits of the user or frequently viewed television programs associated with the user; identifying the content emitted from the second device based at least in part on the content preferences; at least partially removing the content emitted from the second device from the aggregated audio data to capture the audio command; processing the audio command to generate a response representative of speech; and sending the response back to the first device.
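Claims 1, 3, and 14 recite removing background noise with retrieved reference content and an "adaptive noise cancellation algorithm", but the text shown here does not say which algorithm is used. The following is a minimal illustrative sketch, assuming a normalized least-mean-squares (NLMS) filter as a stand-in. The names `nlms_cancel`, `make_demo`, and `mean_square`, the tap count, the step size, and the synthetic signals are all hypothetical and do not come from the patent.

```python
import math
import random


def nlms_cancel(primary, reference, taps=8, mu=0.2, eps=1e-8):
    """Remove the part of `primary` that is predictable from `reference`.

    primary   -- microphone samples (voice plus background noise)
    reference -- clean copy of the identified background content
    Returns the per-sample error signal, which estimates the voice alone.
    NLMS is an assumed choice; the claims only say "adaptive".
    """
    weights = [0.0] * taps
    history = [0.0] * taps  # recent reference samples, newest first
    cleaned = []
    for d, x in zip(primary, reference):
        history = [x] + history[:-1]
        estimate = sum(w * h for w, h in zip(weights, history))
        error = d - estimate  # what the reference cannot explain
        power = sum(h * h for h in history) + eps
        step = mu * error / power  # step size normalized by input power
        weights = [w + step * h for w, h in zip(weights, history)]
        cleaned.append(error)
    return cleaned


def mean_square(samples):
    """Average power of a list of samples."""
    return sum(s * s for s in samples) / len(samples)


def make_demo(n, seed=0):
    """Synthetic test signals (hypothetical, for illustration only)."""
    rng = random.Random(seed)
    reference = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    # Background as heard at the microphone: the reference content
    # passed through a short two-tap "room" path.
    background = [
        0.8 * reference[i] + (0.3 * reference[i - 1] if i else 0.0)
        for i in range(n)
    ]
    voice = [0.2 * math.sin(2 * math.pi * 0.01 * i) for i in range(n)]
    return voice, background, reference


if __name__ == "__main__":
    voice, background, reference = make_demo(4000, seed=1)
    primary = [v + b for v, b in zip(voice, background)]
    cleaned = nlms_cancel(primary, reference)
    tail = slice(2000, None)  # skip the convergence period
    residual = [c - v for c, v in zip(cleaned[tail], voice[tail])]
    print("background power:", mean_square(background[tail]))
    print("residual power:  ", mean_square(residual))
```

In this sketch the retrieved content plays the role of the reference input, so the filter only has to learn the acoustic path from the second device to the microphone. The voice is uncorrelated with the reference and therefore passes through in the error signal.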
the extracted parameters being correlation coefficients · CPC title
the extracted parameters being spectral information of each sub-band · CPC title
Noise filtering · CPC title
for comparison or discrimination · CPC title
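Claim 9 recites comparing a portion of audio associated with the user's content preferences to the background noise, and the classification titles above mention correlation coefficients used for comparison. One way such a comparison could be sketched is with a normalized correlation search over a small range of time offsets. This is an assumption for illustration, not the patented method; the function names, the lag range, the threshold, and the candidate labels are hypothetical.

```python
import math
import random


def normalized_correlation(a, b):
    """Pearson correlation coefficient of two equal-length sample lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    da = [x - mean_a for x in a]
    db = [x - mean_b for x in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(x * x for x in db))
    if denom == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(da, db)) / denom


def best_alignment(noise, clip, max_lag):
    """Peak absolute correlation of `clip` against `noise` over lags 0..max_lag."""
    window = len(clip)
    best = 0.0
    for lag in range(min(max_lag, len(noise) - window) + 1):
        score = abs(normalized_correlation(noise[lag:lag + window], clip))
        best = max(best, score)
    return best


def identify_content(noise, candidates, max_lag=50, threshold=0.5):
    """Return the best-matching candidate name, or None if none beats threshold.

    candidates -- dict mapping a content label (e.g. drawn from a user's
                  content preferences) to a short reference clip
    """
    best_name, best_score = None, threshold
    for name, clip in candidates.items():
        score = best_alignment(noise, clip, max_lag)
        if score > best_score:
            best_name, best_score = name, score
    return best_name


if __name__ == "__main__":
    rng = random.Random(7)
    candidates = {
        "news_show": [rng.uniform(-1, 1) for _ in range(200)],
        "music_album": [rng.uniform(-1, 1) for _ in range(200)],
    }
    # Captured audio: 20 unrelated samples, then an attenuated copy of the
    # news clip with a quiet voice on top, then 40 more unrelated samples.
    lead = [rng.uniform(-1, 1) for _ in range(20)]
    body = [
        0.7 * s + 0.2 * math.sin(2 * math.pi * 0.01 * i)
        for i, s in enumerate(candidates["news_show"])
    ]
    trail = [rng.uniform(-1, 1) for _ in range(40)]
    print(identify_content(lead + body + trail, candidates))
```

A time-domain correlation like this only works when the candidate clip and the captured audio overlap in time. A practical system would more likely compare compact signatures, as claim 12 suggests, but the text shown here does not describe how such signatures are computed.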