Network microphone device with command keyword eventing

US11200894B2 · US · B2

Patent metadata
Publication number: US-11200894-B2
Application number: US-201916439032-A
Country: US
Kind code: B2
Filing date: Jun 12, 2019
Priority date: Jun 12, 2019
Publication date: Dec 14, 2021
Grant date: Dec 14, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

In one aspect, a playback device includes a voice assistant service (VAS) wake-word engine and a command keyword engine. The playback device detects, via the command keyword engine, a first command keyword in voice input of sound detected by one or more microphones of the playback device. The playback device determines an intent based on at least one keyword in the voice input via a local natural language unit (NLU). After detecting the first command keyword event and determining the intent, the playback device performs a first playback command corresponding to the first command keyword and according to the determined intent. When the playback device detects, via the wake-word engine, a wake-word in voice input, the playback device streams sound data corresponding to at least a portion of the voice input to one or more remote servers associated with the VAS.
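The two paths described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: it works on a text transcript rather than on microphone sound data, and the wake word, command keywords, keyword library, and the names `Action` and `handle_voice_input` are all hypothetical values chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical example values; the patent does not list these.
WAKE_WORDS = {"hey assistant"}
COMMAND_KEYWORDS = {"play", "pause", "skip"}
KEYWORD_LIBRARY = {"kitchen", "living room", "jazz favorites"}


@dataclass
class Action:
    kind: str                      # "stream_to_vas", "local_command", or "ignore"
    command: Optional[str] = None  # playback command for the local path
    intent: dict = field(default_factory=dict)


def handle_voice_input(transcript: str) -> Action:
    """Route a voice input to the remote VAS or to local handling."""
    text = transcript.lower().strip()

    # Wake-word path: the input would be streamed to the remote VAS servers.
    if any(text.startswith(wake) for wake in WAKE_WORDS):
        return Action("stream_to_vas")

    # Command keyword path: look for a supported command keyword.
    command = next((w for w in text.split() if w in COMMAND_KEYWORDS), None)
    if command is None:
        return Action("ignore")

    # Stand-in for the local NLU: match keywords from a fixed library.
    matched = sorted(k for k in KEYWORD_LIBRARY if k in text)
    return Action("local_command", command=command, intent={"keywords": matched})
```

Under these assumptions, `handle_voice_input("play jazz favorites in the kitchen")` takes the local path with the command `play`, while an input that starts with the hypothetical wake word is marked for streaming to the VAS.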

First claim

Opening claim text (preview).

We claim:

1. A playback device comprising: a network interface; one or more processors; at least one microphone configured to detect sound; at least one speaker; data storage having instructions stored thereon that are executable by the one or more processors to cause the playback device to perform functions comprising: monitoring an input sound-data stream representing the sound detected by the at least one microphone for (i) a wake-word event and (ii) a first command keyword event; detecting the wake-word event, wherein detecting the wake-word event comprises after detecting a first sound via the one or more microphones, determining that the detected first sound includes a first voice input comprising a wake word, wherein the first voice input further comprises an utterance, and wherein the wake word does not correspond to a playback command; streaming, via the network interface, sound data corresponding to at least a portion of the first voice input comprising the utterance to one or more remote servers of a voice assistant service for processing of the first voice input via a remote natural language unit (NLU) of the one or more remote servers; detecting the first command keyword event, wherein detecting the first command keyword event comprises after detecting a second sound via the one or more microphones, determining that the detected second sound includes a second voice input comprising a first command keyword and at least one keyword, wherein the first command keyword is one of a plurality of command keywords supported by the playback device, wherein the first command keyword corresponds to a particular playback command, and wherein the second voice input excludes the wake word; determining, via a local natural language unit (NLU), an intent based on the at least one keyword, wherein the NLU includes a pre-determined library of keywords comprising the at least one keyword; and after (a) detecting the first command keyword event and (b) determining the intent, performing the particular playback command according to the determined intent.

2. The playback device of claim 1, wherein the functions further comprise: detecting a second command keyword event, wherein detecting the second command keyword event comprises after detecting a third sound via the at least one microphone, determining that the third sound includes a third voice input comprising the second command keyword; determining that the third voice input comprising the second command keyword does not include at least one other keyword from the pre-determined library of keywords; and after determining that the third voice input comprising the second command keyword does not include the at least one keyword from the pre-determined library of keywords, streaming sound data representing at least a portion of the third voice input comprising the second command keyword to one or more servers of the voice assistant service for processing by the one or more remote servers of the voice assistant service.

3. The playback device of claim 2, wherein the functions further comprise: playing back an audio prompt to request confirmation for invoking the voice assistant service to process the second command keyword; and after playing back the audio prompt, receiving data representing confirmation to invoke the voice assistant service for processing the second command keyword, wherein streaming the sound data representing at least a portion of the third voice input comprising the second command keyword to one or more servers of the voice assistant service occurs only after receiving the data representing confirmation to invoke the voice assistant service.

4. The playback device of claim 1, wherein the functions further comprise: detecting a second command keyword event, wherein detecting the second command keyword event comprises after detecting a third sound via the at least one microphone, determining that the third sound includes a third voice input comprising the second command keyword; determining that the third voice input comprising the second command keyword does not include at least one other keyword from the pre-determined library of keywords; and after determining that the third voice input comprising the second command keyword does not include the at least one keyword from the pre-determined library of keywords, performing the particular playback command according to one or more default parameters.

5. The playback device of claim 1, wherein a first keyword of the at least one keyword in the detected second sound represents a zone name corresponding to a first zone of a media playback system, wherein performing the particular playback command according to the determined intent comprises sending one or more instructions to perform the particular playback command in the first zone, and wherein the media playback system comprises the playback device.

6. The playback device of claim 5, wherein the functions further comprise: populating the pre-determined library of keywords with zone names corresponding to respective zones within the media playback system, wherein each zone comprises one or more respective playback devices, and wherein the pre-determined library of keywords is populated with the zone name corresponding to the first zone of the media playback system.

7. The playback device of claim 1, wherein the functions further comprise: discovering, via the network interface, smart home devices connected to a local area network; and populating the pre-determined library of keywords with names corresponding to respective smart home devices discovered on the local area network.

8. The playback device of claim 1, wherein a media playback system comprises the playback device, wherein the media playback system is registered to one or more user profiles, and wherein the functions further comprise: populating the pre-determined library of keywords with names corresponding to playlists that are designated as favorites by the one or more user profiles.

9. The playback device of claim 8, wherein a first user profile of the one or more user profiles is associated with a user account of a first streaming audio service and a user account of a second streaming audio service, and wherein the playlists comprise a first playlist of a first streaming audio service designated as a favorite by the user account of the first streaming audio service and a second playlist comprising audio tracks from the first streaming audio service and the second streaming audio service.

10. The playback device of claim 1, wherein detecting the first command keyword event further comprises determining that one or more playback conditions corresponding to the first command keyword are satisfied.

11. The playback device of claim 10, wherein the one or more playback conditions corresponding to the first command keyword comprises a first condition of an absence of background speech in the detected first sound.

12. A method to be performed by a playback device comprising at least one microphone configured to detect sound, the method comprising: monitoring an input sound-data stream representing the sound detected by the at least one microphone for (i) a wake-word event and (ii) a first command keyword event; detecting the wake-word event, wherein detecting the wake-word event comprises after detecting a first sound via the one or more microphones, determining that the detected first sound includes a first voice input comprising a wake word, wherein the first voice input further comprises an utterance, and wherein the wake word does not correspond to a playback command; streaming, via a network interface, sound data
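The dependent claims describe how the keyword library is populated (claims 6 to 8) and what happens when a command keyword arrives without any library keyword (claims 2 to 4) or while a playback condition is not met (claims 10 and 11). The Python sketch below shows one way these pieces could fit together. It is an illustration under stated assumptions, not the claimed implementation: the function names, the `fallback` and `confirm` parameters, the `DEFAULT_PARAMETERS` value, and the substring matching are all hypothetical.

```python
from typing import Callable, Dict, Iterable, Optional, Tuple

# Hypothetical default used when no library keyword is present (claim 4).
DEFAULT_PARAMETERS = {"zone": "current zone"}


def build_keyword_library(zones: Iterable[str],
                          devices: Iterable[str],
                          playlists: Iterable[str]) -> Dict[str, str]:
    """Map each known name to its category (claims 6, 7 and 8)."""
    library: Dict[str, str] = {}
    for category, names in (("zone", zones),
                            ("device", devices),
                            ("playlist", playlists)):
        for name in names:
            library[name.lower()] = category
    return library


def resolve_command(command: str,
                    utterance: str,
                    library: Dict[str, str],
                    *,
                    background_speech: bool = False,
                    fallback: str = "defaults",
                    confirm: Optional[Callable[[], bool]] = None
                    ) -> Tuple[str, dict]:
    """Decide how a detected command keyword is handled."""
    # Claims 10-11: require an absence of background speech.
    if background_speech:
        return "ignore", {}

    # Stand-in for the local NLU: substring match against the library.
    text = utterance.lower()
    intent = {category: keyword
              for keyword, category in library.items() if keyword in text}
    if intent:
        return "local", {"command": command, **intent}

    if fallback == "vas":
        # Claim 2 streams to the VAS; claim 3 first asks for confirmation.
        if confirm is None or confirm():
            return "stream_to_vas", {}
        return "ignore", {}

    # Claim 4: perform the command with default parameters.
    return "local", {"command": command, **DEFAULT_PARAMETERS}
```

For example, with a library built from the hypothetical names `["Kitchen"]`, `["Porch Light"]` and `["Jazz Favorites"]`, the utterance "play jazz favorites in the kitchen" resolves locally to the zone `kitchen` and the playlist `jazz favorites`, whereas "play something" falls back to the default parameters unless `fallback="vas"` is selected.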

Assignees

Sonos Inc

Inventors

Classifications

  • Word spotting · CPC title

  • Management of the audio stream, e.g. setting of volume, audio stream path · CPC title

  • Execution procedure of a spoken command · CPC title

  • Audio in a user interface, e.g. using voice commands for navigating, audio feedback · CPC title

  • Distributed recognition, e.g. in client-server systems, for mobile phones or network applications · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11200894B2 cover?
In one aspect, a playback device includes a voice assistant service (VAS) wake-word engine and a command keyword engine. The playback device detects, via the command keyword engine, a first command keyword in voice input of sound detected by one or more microphones of the playback device. The playback device determines an intent based on at least one keyword in the voice input via a local na…
Who is the assignee on this patent?
Sonos Inc
What technology area does this patent fall under?
Primary CPC classification G10L15/22. Mapped technology areas include Physics.
When was this patent published?
Publication date Dec 14, 2021 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).