Audio Routing System for Routing Audio Data to and from a Mobile Device
US-2015086034-A1 · Mar 26, 2015 · US
US11854547B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11854547-B2 |
| Application number | US-202117549034-A |
| Country | US |
| Kind code | B2 |
| Filing date | Dec 13, 2021 |
| Priority date | Jun 12, 2019 |
| Publication date | Dec 26, 2023 |
| Grant date | Dec 26, 2023 |
Official abstract text for this publication.
In one aspect, a playback device includes a voice assistant service (VAS) wake-word engine and a command keyword engine. The playback device detects, via the command keyword engine, a first command keyword in voice input of sound detected by one or more microphones of the playback device. The playback device determines an intent based on at least one keyword in the voice input via a local natural language unit (NLU). After detecting the first command keyword event and determining the intent, the playback device performs a first playback command corresponding to the first command keyword and according to the determined intent. When the playback device detects, via the wake-word engine, a wake-word in voice input, the playback device streams sound data corresponding to at least a portion of the voice input to one or more remote servers associated with the VAS.
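The two paths described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: it works on already-transcribed text rather than audio, assumes the wake word or command keyword comes first in the input, and every name in it (`WAKE_WORDS`, `COMMAND_KEYWORDS`, `local_nlu`, `route_voice_input`, the example words and rooms) is hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical vocabularies; the abstract does not list specific words.
WAKE_WORDS = {"hey assistant"}
COMMAND_KEYWORDS = {"play": "PLAY", "pause": "PAUSE", "skip": "NEXT_TRACK"}


@dataclass
class Routing:
    path: str                      # "remote_vas", "local", or "ignored"
    command: Optional[str] = None  # playback command for the local path
    intent: Optional[dict] = None  # slots found by the local NLU


def local_nlu(text: str) -> dict:
    """Toy stand-in for the local NLU: pick out a known room keyword."""
    known_rooms = {"kitchen", "office"}  # assumed slot values
    words = text.lower().split()
    return {"room": next((w for w in words if w in known_rooms), None)}


def route_voice_input(text: str) -> Routing:
    lowered = text.lower().strip()
    # Wake-word path: the input would be streamed to the remote VAS.
    if any(lowered.startswith(w) for w in WAKE_WORDS):
        return Routing(path="remote_vas")
    # Command-keyword path: handled on the device, no wake word needed.
    words = lowered.split()
    if words and words[0] in COMMAND_KEYWORDS:
        return Routing("local", COMMAND_KEYWORDS[words[0]], local_nlu(lowered))
    return Routing(path="ignored")
```

Under these assumptions, `route_voice_input("play jazz in the kitchen")` takes the local path with the room slot filled, while an input that starts with the wake word is marked for the remote service.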
Opening claim text (preview).
The invention claimed is:
1. A playback device comprising: a network interface; at least one microphone configured to detect sound; at least one speaker; at least one processor; and a housing carrying the network interface, the at least one microphone, the at least one speaker, the at least one processor, and data storage including instructions that are executable by the at least one processor such that the playback device is configured to: capture, via the at least one microphone, at least one input data stream; detect a wake word in a first portion of the at least one input data stream; based on detection of the wake word, trigger a wake-word event based on a first voice input captured via the at least one microphone, wherein the first voice input comprises the wake word and an utterance, and wherein the wake word does not correspond to a command; stream, via the network interface, sound data representing at least a portion of the first voice input to one or more remote servers of a voice assistant service for remote processing via a voice assistant of the one or more remote servers; after the first voice input is processed, detect a first command keyword in a second portion of the at least one input data stream, wherein the first command keyword is preceded in the at least one input data stream by a period of inactivity that excludes the wake word; based on detection of the first command keyword, trigger a first command keyword event to locally process a second voice input represented in the second portion of the at least one input data stream, wherein the second voice input comprises a first command keyword and at least one keyword from a set of keywords supported by a local voice assistant, wherein the first command keyword is one of a plurality of command keywords supported by the local voice assistant of the playback device, and wherein the second voice input excludes the wake word; determine, via the local voice assistant, (i) a particular command corresponding to the first command keyword and (ii) one or more parameters corresponding to the at least one keyword, the one or more parameters modifying the particular command; and cause at least one local network device to carry out the particular command according to the one or more parameters.
2. The playback device of claim 1, wherein the instructions are executable by the at least one processor such that the playback device is further configured to: detect a second command keyword in a third portion of the at least one input data stream; based on detection of the second command keyword, trigger a second command keyword event to locally process a third voice input represented in the second portion of the at least one input data stream, wherein the third voice input comprises the second command keyword, and wherein the second command keyword is one of the plurality of command keywords supported by the local voice assistant of the playback device; determine that the local voice assistant is unable to process a particular command corresponding to the second command keyword; and after the determination that the local voice assistant is unable to process the particular command corresponding to the second command keyword, stream, via the network interface, sound data representing at least a portion of the third voice input to the one or more remote servers of the voice assistant service for remote processing of the third voice input via the voice assistant of the one or more remote servers.
3. The playback device of claim 2, wherein the instructions that are executable by the at least one processor such that the playback device is configured to determine that the local voice assistant is unable to process the particular command corresponding to the second command keyword comprise instructions that are executable by the at least one processor such that the playback device is configured to: determine that a confidence score produced by the local voice assistant in processing the third voice input is below a threshold.
4. The playback device of claim 1, wherein the instructions that are executable by the at least one processor such that the playback device is configured to determine the one or more parameters corresponding to the at least one keyword comprise instructions that are executable by the at least one processor such that the playback device is configured to: determine that the at least one keyword of the first voice input includes one or more particular keywords representing a room name; and determine that the room name corresponds to a particular room including the at least one local network device; and assign the particular room to a target parameter for the particular command.
5. The playback device of claim 4, wherein the instructions are executable by the at least one processor such that the playback device is further configured to: populate the set of keywords supported by the local voice assistant with keywords corresponding to respective room names of rooms configured according to one or more smart home protocols.
6. The playback device of claim 1, wherein the playback device is connected to a local area network, and wherein the instructions are executable by the at least one processor such that the playback device is further configured to: discover, via the network interface, local network devices connected to the local area network; and populate the set of keywords supported by the local voice assistant with keywords corresponding to respective names of the discovered local network devices.
7. The playback device of claim 1, wherein the instructions that are executable by the at least one processor such that the playback device is configured to detect the first command keyword event comprise instructions that are executable by the at least one processor such that the playback device is configured to: determine that one or more conditions corresponding to the first command keyword are satisfied.
8. The playback device of claim 7, wherein the one or more conditions corresponding to the first command keyword comprise a particular condition representing an absence of background speech, and wherein the instructions that are executable by the at least one processor such that the playback device is configured to determine that the one or more conditions corresponding to the first command keyword are satisfied comprise instructions that are executable by the at least one processor such that the playback device is configured to: determine an absence of background speech in sound detected by the at least one microphone during capture of the second voice input.
9. The playback device of claim 1, wherein the at least one local network device comprises an additional playback device, and wherein the instructions that are executable by the at least one processor such that the playback device is configured to cause the at least one local network device to carry out the particular command according to the one or more parameters comprise instructions that are executable by the at least one processor such that the playback device is configured to: cause, via the network interface, the additional playback device to play back audio content according to the particular command.
10. The playback device of claim 1, wherein the at least one local network device comprises a smart illumination device, and wherein the instructions that are executable by the at least one processor such that the playback device is configured to cause the at least one local network device to carry out the particular command according to the one or more parameters comprise instructions that are executable by the at least one processor such that the playback device is configur
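As a rough illustration of the local processing recited in claims 1 through 6, the sketch below shows a local assistant whose keyword set is populated with room names, which assigns a matched room to a target parameter, and which falls back to remote processing when its confidence is below a threshold. Everything here is an assumption for illustration: the class and function names, the threshold value of 0.6, the single-word room matching, and especially the confidence score, which is a toy word-coverage ratio and not how the patent says confidence is computed.

```python
from dataclasses import dataclass, field

CONFIDENCE_THRESHOLD = 0.6  # assumed value; the claims only say "a threshold"


@dataclass
class LocalAssistant:
    commands: dict                           # command keyword -> command name
    rooms: set = field(default_factory=set)  # supported room keywords

    def add_rooms(self, room_names):
        """Populate the supported keyword set with room names."""
        self.rooms.update(name.lower() for name in room_names)

    def parse(self, text: str):
        """Return (command, params, confidence) for a transcribed input."""
        words = text.lower().split()
        command = next(
            (self.commands[w] for w in words if w in self.commands), None
        )
        if command is None:
            return None, {}, 0.0
        params = {}
        room = next((w for w in words if w in self.rooms), None)
        if room is not None:
            params["target"] = room  # room name becomes the target parameter
        # Toy confidence: fraction of words the assistant recognizes.
        known = sum(w in self.commands or w in self.rooms for w in words)
        return command, params, known / len(words)


def handle(assistant: LocalAssistant, text: str):
    """Process locally when confident, otherwise defer to the remote service."""
    command, params, confidence = assistant.parse(text)
    if command is None or confidence < CONFIDENCE_THRESHOLD:
        return ("remote", text)
    return ("local", command, params)
```

A short usage example under the same assumptions:

```python
assistant = LocalAssistant({"pause": "PAUSE", "play": "PLAY"})
assistant.add_rooms(["Kitchen", "Office"])  # e.g. names found during discovery
print(handle(assistant, "pause kitchen"))
print(handle(assistant, "play that song my friend mentioned yesterday"))
```

The first input is fully covered by the supported keywords and is handled locally with `target` set to the room; the second contains mostly unknown words, so the sketch defers it to the remote path.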
Procedures used during a speech recognition process, e.g. man-machine dialogue · CPC title
Management of the audio stream, e.g. setting of volume, audio stream path · CPC title
Audio in a user interface, e.g. using voice commands for navigating, audio feedback · CPC title
Semantic context, e.g. disambiguation of the recognition hypotheses based on word meaning · CPC title
Distributed recognition, e.g. in client-server systems, for mobile phones or network applications · CPC title