US11869503B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11869503-B2 |
| Application number | US-202117548921-A |
| Country | US |
| Kind code | B2 |
| Filing date | Dec 13, 2021 |
| Priority date | Dec 20, 2019 |
| Publication date | Jan 9, 2024 |
| Grant date | Jan 9, 2024 |
Official abstract text for this publication.
As noted above, example techniques relate to offline voice control. A local voice input engine may process voice inputs locally when processing voice inputs via a cloud-based voice assistant service is not possible. Some techniques involve local (on-device) voice-assisted set-up of a cloud-based voice assistant service. Further example techniques involve local voice-assisted troubleshooting the cloud-based voice assistant service. Other techniques relate to interactions between local and cloud-based processing of voice inputs on a device that supports both local and cloud-based processing.
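The split between local and cloud processing described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the wake words, keyword list, and the `route_voice_input` function are hypothetical, and plain text stands in for the sound data stream. The sketch assumes only what the abstract and claim 1 state, namely that the cloud wake-word engine is inactive while offline and that local keywords are still handled on the device.

```python
# Hypothetical wake words and keywords, chosen only for illustration.
VAS_WAKE_WORDS = ("hey cloud",)                   # cloud-based assistant wake words
LOCAL_KEYWORDS = ("hey device", "play", "pause")  # local keyword library


def route_voice_input(utterance: str, online: bool) -> str:
    """Decide where a voice input would be processed.

    Returns "cloud", "local", or "ignored". Text stands in for sound data.
    """
    text = utterance.lower().strip()

    # The cloud wake-word engine is only active in online mode.
    if online and text.startswith(VAS_WAKE_WORDS):
        return "cloud"

    # The local voice assistant listens for its own keywords in both modes.
    if any(keyword in text for keyword in LOCAL_KEYWORDS):
        return "local"

    return "ignored"


if __name__ == "__main__":
    print(route_voice_input("Hey Cloud, what's the weather", online=True))
    print(route_voice_input("Hey Cloud, what's the weather", online=False))
    print(route_voice_input("hey device pause", online=False))
```

In this sketch a cloud wake word spoken while offline is simply ignored, because nothing on the device is listening for it; a real device might instead answer with a local troubleshooting prompt.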
Opening claim text (preview).
The invention claimed is:
1. A playback device comprising: at least one audio transducer; one or more microphones; a network interface; at least one processor; a housing carrying the one or more microphones, the network interface, the at least one processor, and data storage including instructions that are executable by the at least one processor such that the playback device is configured to: while the playback device is in an offline mode: monitor, via a local voice assistant, a sound data stream from the one or more microphones for local keywords from a local natural language unit library of the local voice assistant, wherein in the offline mode, a voice assistant service (VAS) wake-word engine is inactive; generate a local wake-word event corresponding to a first voice input when the local voice assistant detects sound data matching one or more local keywords in a first portion of the sound data stream, wherein the one or more local keywords comprise a local wake word; determine, via the local voice assistant, an intent of the first voice input; select, from among a plurality of pre-determined responses, a pre-determined response corresponding to the determined intent of the first voice input; carry out the selected pre-determined response; and while the playback device is in an online mode: monitor, via the VAS wake-word engine, the sound data stream from the one or more microphones for one or more VAS wake words of a cloud-based voice assistant service, wherein in the online mode, the VAS wake-word engine is active; and generate a VAS wake-word event corresponding to a second voice input when the VAS wake-word engine detects sound data matching a particular VAS wake word in a second portion of the sound data stream, wherein, when the VAS wake word event is generated, the playback device streams sound data representing the second voice input to one or more servers of the cloud-based voice assistant service.
2. The playback device of claim 1, wherein the instructions that are executable by the at least one processor such that the playback device is configured to select the pre-determined response corresponding to the determined intent of the first voice input comprise instructions that are executable by the at least one processor such that the playback device is configured to select one or more pre-determined audible responses from a plurality of pre-determined audible responses corresponding to respective intents, and wherein the instructions that are executable by the at least one processor such that the playback device is configured to carry out the selected pre-determined response comprise instructions that are executable by the at least one processor such that the playback device is configured to play back the selected one or more pre-determined audible responses via the at least one audio transducer.
3. The playback device of claim 2, wherein the determined intent represents a command to configure a voice assistant service on the playback device, and wherein the instructions that are executable by the at least one processor such that the playback device is configured to play back the selected one or more pre-determined audible responses via the at least one audio transducer comprise instructions that are executable by the at least one processor such that the playback device is configured to play back one or more audible prompts to configure the VAS wake-word engine for the cloud-based VAS.
4. The playback device of claim 1, wherein the instructions that are executable by the at least one processor such that the playback device is configured to select the pre-determined response corresponding to the determined intent of the first voice input comprise instructions that are executable by the at least one processor such that the playback device is configured to select a particular Internet-Of-Things (IoT) device from among a plurality of IoT devices that are connected to a local area network, and wherein the instructions that are executable by the at least one processor such that the playback device is configured to carry out the selected pre-determined response comprise instructions that are executable by the at least one processor such that the playback device is configured to send, via the network interface, instructions to toggle a state of the particular IoT device.
5. The playback device of claim 4, wherein the IoT device comprises a smart illumination device, and wherein the instructions that are executable by the at least one processor such that the playback device is configured to send the instructions to toggle the state of the particular IoT device comprise instructions that are executable by the at least one processor such that the playback device is configured to send, via the network interface, instructions to toggle an illumination state of the smart illumination device.
6. The playback device of claim 1, wherein the instructions that are executable by the at least one processor such that the playback device is configured to select the pre-determined response corresponding to the determined intent of the first voice input comprise instructions that are executable by the at least one processor such that the playback device is configured to select a particular playlist from among a plurality of playlists, and wherein the instructions that are executable by the at least one processor such that the playback device is configured to carry out the selected pre-determined response comprise instructions that are executable by the at least one processor such that the playback device is configured to play back the playlist via the at least one audio transducer.
7. The playback device of claim 6, wherein the instructions that are executable by the at least one processor such that the playback device is configured to select the particular playlist from among the plurality of playlists comprise instructions that are executable by the at least one processor such that the playback device is configured to: select the particular playlist based on the one or more local keywords corresponding to metadata associated with the particular playlist.
8. The playback device of claim 1, wherein the instructions are executable by the at least one processor such that the playback device is further configured to: while in the online mode, monitor, via the local voice assistant, the sound data stream from the one or more microphones for local keywords from the local natural language unit library of the local voice assistant concurrently with monitoring, via the VAS wake-word engine, the sound data stream for one or more VAS wake words; generate an additional local wake-word event corresponding to a third voice input when the local voice assistant detects sound data matching one or more additional local keywords in a third portion of the sound data stream; determine, via the local voice assistant, an intent of the third voice input; select, from among a plurality of pre-determined responses, an additional pre-determined response, the additional pre-determined response corresponding to the determined intent of the third voice input; and carry out the selected additional pre-determined response.
9. The playback device of claim 8, wherein the instructions that are executable by the at least one processor such that the playback device is configured to determine the intent of the third voice input comprise instructions that are executable by the at least one processor such that the playback device is configured to determine that the intent of the third voice input corresponds to a command to disable the VAS wake-word engine, and wherein the instructions that are executable by the at least one processor such that the playback device i
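The step of selecting a pre-determined response from a determined intent (claims 1 through 7) can be pictured with a short sketch. It rests on stated assumptions rather than on anything disclosed in the claims: the playlist names, metadata tags, device name, intent labels, and the functions `determine_intent`, `select_playlist`, and `respond` are all hypothetical, and responses are returned as strings instead of being played or sent over a network.

```python
from typing import Iterable, Optional

# Hypothetical playlists and their metadata tags.
PLAYLISTS = {
    "Morning Jazz": {"jazz", "morning"},
    "Workout Mix": {"workout", "gym"},
}


def determine_intent(keywords: Iterable[str]) -> str:
    """Map detected local keywords to a coarse intent label (illustrative rules)."""
    kw = set(keywords)
    if "play" in kw:
        return "play_playlist"
    if "lights" in kw:
        return "toggle_iot"
    if "setup" in kw:
        return "configure_vas"
    return "unknown"


def select_playlist(keywords: Iterable[str]) -> Optional[str]:
    """Pick the playlist whose metadata tags overlap most with the keywords."""
    kw = set(keywords)
    best, best_score = None, 0
    for name, tags in PLAYLISTS.items():
        score = len(tags & kw)
        if score > best_score:
            best, best_score = name, score
    return best


def respond(keywords: Iterable[str]) -> str:
    """Select a pre-determined response for the intent and describe it as a string."""
    keywords = list(keywords)
    intent = determine_intent(keywords)
    if intent == "play_playlist":
        playlist = select_playlist(keywords)
        return f"play:{playlist}" if playlist else "say:No matching playlist found."
    if intent == "toggle_iot":
        return "toggle:living_room_lamp"  # hypothetical device on the local network
    if intent == "configure_vas":
        return "say:Let's set up your voice assistant."
    return "say:Sorry, I can't help with that offline."


if __name__ == "__main__":
    print(respond(["play", "jazz"]))
    print(respond(["lights"]))
```

Matching keywords against playlist metadata by tag overlap is one simple reading of claim 7; the claim itself does not specify a matching method.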
Procedures used during a speech recognition process, e.g. man-machine dialogue · CPC title
to the speaker · CPC title
Speech classification or search · CPC title
by checking connectivity · CPC title
Word spotting · CPC title