Processing spoken commands to control distributed audio outputs

US12094461B2 · US · B2

Patent metadata
Publication number: US-12094461-B2
Application number: US-202017128982-A
Country: US
Kind code: B2
Filing date: Dec 21, 2020
Priority date: Feb 12, 2016
Publication date: Sep 17, 2024
Grant date: Sep 17, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A system that is capable of controlling multiple entertainment systems and/or speakers using voice commands. The system receives voice commands and may determine audio sources and speakers indicated by the voice commands. The system may generate audio data from the audio sources and may send the audio data to the speakers using multiple interfaces. For example, the system may send the audio data directly to the speakers using a network address, may send the audio data to the speakers via a voice-enabled device or may send the audio data to the speakers via a speaker controller. The system may generate output zones including multiple speakers and may associate input devices with speakers within the output zones. For example, the system may receive a voice command from an input device in an output zone and may reduce output audio generated by speakers in the output zone.
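The routing and zone association described in the abstract can be illustrated with a minimal Python sketch. Everything named here (`Interface`, `Speaker`, `OutputZone`, `choose_interface`, `speakers_to_quiet`, and the identifier strings) is hypothetical and does not come from the patent. The order in which the three delivery paths are tried is also an assumption, since the abstract lists them without ranking them.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional, Set


class Interface(Enum):
    DIRECT = "direct"              # send audio straight to the speaker's network address
    VOICE_DEVICE = "voice_device"  # relay audio through a voice-enabled device
    CONTROLLER = "controller"      # relay audio through a speaker controller


@dataclass
class Speaker:
    name: str
    network_address: Optional[str] = None
    voice_device_id: Optional[str] = None
    controller_id: Optional[str] = None


@dataclass
class OutputZone:
    name: str
    speakers: List[Speaker] = field(default_factory=list)
    input_device_ids: Set[str] = field(default_factory=set)


def choose_interface(speaker: Speaker) -> Interface:
    """Pick one of the three delivery paths named in the abstract.

    The preference order (direct, then voice device, then controller)
    is an assumption made for this sketch.
    """
    if speaker.network_address:
        return Interface.DIRECT
    if speaker.voice_device_id:
        return Interface.VOICE_DEVICE
    if speaker.controller_id:
        return Interface.CONTROLLER
    raise ValueError(f"no route to speaker {speaker.name!r}")


def speakers_to_quiet(zones: List[OutputZone], input_device_id: str) -> List[Speaker]:
    """Return the speakers in every zone associated with the input device
    that received the voice command."""
    return [
        speaker
        for zone in zones
        if input_device_id in zone.input_device_ids
        for speaker in zone.speakers
    ]
```

In this sketch a voice command heard by an input device selects only the speakers that share an output zone with that device, which mirrors the abstract's example of reducing output audio within the zone where the command was received.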

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method comprising: outputting, by a first device in a first environment, first audio in the first environment; detecting, by the first device, input audio corresponding to an utterance, wherein the utterance is not requesting a volume of output audio in the first environment be reduced; determining a second device causing second audio to be output in the first environment; based at least in part on detecting the input audio: reducing a volume of the first audio; and sending, to a networking component associated with the second device, a command to reduce a volume of the second audio; determining the utterance has concluded; and after determining the utterance has concluded, using the networking component to cause the second device to increase the volume of the second audio.

2. The computer-implemented method of claim 1, wherein the first device is paired with the second device using a wireless connection.

3. The computer-implemented method of claim 1, further comprising: determining an identifier corresponding to the second device, wherein sending the command is further based at least in part on the identifier.

4. The computer-implemented method of claim 1, further comprising: determining that the second device is causing the second audio to be output at least partially during detection of the input audio, wherein sending the command is further based at least in part on determining that the second device is causing the second audio to be output at least partially during detection of the input audio.

5. The computer-implemented method of claim 1, further comprising: determining that the input audio comprises a wakeword, wherein sending the command is further based at least in part on determining that the input audio comprises the wakeword.

6. The computer-implemented method of claim 1, wherein the command causes the second device to output the second audio at a first volume, the first volume being below a second volume of the utterance.

7. A system comprising: at least one processor; and at least one memory including instructions that, when executed by the at least one processor, cause the system to: output, by a first device in a first environment, first audio in the first environment; detect, by the first device, input audio corresponding to an utterance, wherein the utterance is not requesting a volume of output audio in the first environment be reduced; determine a second device causing second audio to be output in the first environment; based at least in part on detection of the input audio: reduce a volume of the first audio; and send, to a networking component associated with the second device, a command to reduce a volume of the second audio; determine the utterance has concluded; and after determining the utterance has concluded, use the networking component to cause the second device to increase the volume of the second audio.

8. The system of claim 7, wherein the first device is paired with the second device using a wireless connection.

9. The system of claim 7, wherein the at least one memory further comprises instructions that, when executed by the at least one processor, further cause the system to: determine an identifier corresponding to the second device, wherein the instructions that cause the system to send the command further cause the system to send the command based at least in part on the identifier.

10. The system of claim 7, wherein the at least one memory further comprises instructions that, when executed by the at least one processor, further cause the system to: determine that the second device is causing the second audio to be output at least partially during detection of the first audio, wherein the instructions that cause the system to send the command further cause the system to send the command based at least in part on determining that the output device is causing the second audio to be output at least partially during detection of the input audio.

11. The system of claim 7, wherein the at least one memory further comprises instructions that, when executed by the at least one processor, further cause the system to: determine that the input audio comprises a wakeword, wherein the instructions that cause the system to send the command further cause the system to send the command based at least in part on determining that the input audio comprises the wakeword.

12. The system of claim 7, wherein the command causes the second device to output the second audio at a first volume, the first volume being below a second volume of the utterance.

13. A computer-implemented method comprising: outputting, by a first device in a first environment, first audio in the first environment; detecting, by the first device, input audio including an utterance comprising a wakeword; determining a second device causing second audio to be output in the first environment; based at least in part on detecting the input audio including the wakeword: reducing a volume of the first audio; and sending, to a networking component associated with the second device, a command to reduce a volume of the first audio; determining the utterance has concluded; and after determining the utterance has concluded, using the networking component to cause the second device to increase the volume of the second audio.

14. The computer-implemented method of claim 13, wherein the first device is paired with the second device using a wireless connection.

15. The computer-implemented method of claim 13, further comprising: determining that the second device is causing the second audio to be output at least partially during detection of the first audio, wherein sending the command is further based at least in part on determining that the second device is causing the second audio to be output at least partially during detection of the input audio.
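The sequence recited in claim 1 (reduce both audio streams when an utterance is detected, then raise the second device's volume once the utterance has concluded) can be sketched as a small state holder. This is an illustration under stated assumptions, not the patented implementation: the class names, the command strings, and the `ducked_volume` level of 0.2 are all hypothetical. Restoring the first device's own volume is likewise an assumption, because the claim recites only the increase on the second device.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class NetworkingComponent:
    """Hypothetical stand-in for the networking component associated
    with the second device; it records each command it receives."""
    volume: float = 1.0
    log: List[Tuple[str, float]] = field(default_factory=list)

    def send(self, command: str, volume: float) -> None:
        self.log.append((command, volume))
        self.volume = volume


@dataclass
class FirstDevice:
    volume: float = 1.0
    ducked_volume: float = 0.2  # assumed level; the claim does not give one
    _saved: Dict[str, float] = field(default_factory=dict)

    def on_utterance_detected(self, second: NetworkingComponent,
                              second_is_playing: bool) -> None:
        # Reduce the volume of the first audio on this device.
        self._saved["first"] = self.volume
        self.volume = min(self.volume, self.ducked_volume)
        # Command the second device to reduce the second audio, but only
        # if it is actually producing audio in the same environment.
        if second_is_playing:
            self._saved["second"] = second.volume
            second.send("reduce_volume", min(second.volume, self.ducked_volume))

    def on_utterance_concluded(self, second: NetworkingComponent) -> None:
        # After the utterance ends, raise the second device's volume again.
        if "second" in self._saved:
            second.send("increase_volume", self._saved.pop("second"))
        # Assumption: also restore the first device's own volume.
        self.volume = self._saved.pop("first", self.volume)
```

A short usage example: with `net = NetworkingComponent(volume=0.8)` and `dev = FirstDevice(volume=0.6)`, calling `dev.on_utterance_detected(net, second_is_playing=True)` lowers both volumes to 0.2, and `dev.on_utterance_concluded(net)` returns them to 0.8 and 0.6. Saving the prior levels rather than hard-coding a restore value keeps the sketch independent of whatever volume each device was using.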

Assignees

Inventors

Classifications

  • Processing or translation of natural language (natural language analysis G06F40/20; semantic analysis G06F40/30)

  • Execution procedure of a spoken command

  • for improving intelligibility

  • Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually

  • Aspects of volume control, not necessarily automatic, in sound systems

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12094461B2 cover?
A system that is capable of controlling multiple entertainment systems and/or speakers using voice commands. The system receives voice commands and may determine audio sources and speakers indicated by the voice commands. The system may generate audio data from the audio sources and may send the audio data to the speakers using multiple interfaces. For example, the system may send the audio dat…
Who is the assignee on this patent?
Amazon Tech Inc
What technology area does this patent fall under?
Primary CPC classification G10L15/22. Mapped technology areas include Physics.
When was this patent published?
Publication date: Sep 17, 2024 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 6 related publications on this page (citations in our corpus or others sharing the same primary CPC).