Wake-word detection suppression

US12340802B2 · US · B2

Patent metadata
Publication number: US-12340802-B2
Application number: US-202318396279-A
Country: US
Kind code: B2
Filing date: Dec 26, 2023
Priority date: Aug 7, 2017
Publication date: Jun 24, 2025
Grant date: Jun 24, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Example techniques involve suppressing a wake word response to a local wake word. An example implementation involves a playback device receiving audio content for playback by the playback device and providing a sound data stream representing the received audio content to a voice assistant service (VAS) wake-word engine and a local keyword engine. The playback device plays back a first portion of the audio content and detects, via the local keyword engine, that a second portion of the received audio content includes sound data matching one or more particular local keywords. Before the second portion of the received audio content is played back, the playback device disables a local keyword response of the local keyword engine to the one or more particular local keywords and then plays back the second portion of the audio content via one or more speakers.
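
The flow described in the abstract can be sketched in Python. This is a minimal illustration under stated assumptions, not the patented implementation: text strings stand in for audio chunks, substring matching stands in for keyword detection, and the names `KeywordEngine` and `play_with_suppression` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class KeywordEngine:
    """Hypothetical stand-in for a local keyword engine.

    A real engine would operate on audio frames; strings are used here
    only to keep the sketch runnable.
    """
    keywords: List[str]
    response_enabled: bool = True
    responses: List[str] = field(default_factory=list)

    def matches(self, chunk: str) -> bool:
        # Simulated detection: does the chunk contain any keyword?
        text = chunk.lower()
        return any(k in text for k in self.keywords)

    def hear(self, chunk: str) -> None:
        # Simulates sound reaching the microphone. A response is
        # recorded only while the keyword response is enabled.
        if self.response_enabled and self.matches(chunk):
            self.responses.append(chunk)


def play_with_suppression(chunks: List[str],
                          engine: KeywordEngine) -> List[Tuple[str, bool]]:
    """Play chunks in order, disabling the keyword response before any
    chunk that the engine detects a keyword in."""
    log = []
    for chunk in chunks:
        # Inspect the content stream before this portion is played back.
        suppress = engine.matches(chunk)
        if suppress:
            engine.response_enabled = False
        log.append((chunk, engine.response_enabled))
        engine.hear(chunk)  # speaker output is picked up by the microphone
        if suppress:
            engine.response_enabled = True  # restore after the portion
    return log


engine = KeywordEngine(keywords=["play music"])
log = play_with_suppression(
    ["music intro", "ad: play music now", "outro"], engine
)
```

In this sketch the keyword in the second chunk produces no response because the response was disabled before that chunk was played, while a later utterance of the same keyword by a user would still trigger one.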

First claim

Opening claim text (preview).

The invention claimed is:

1. A first network microphone device comprising: a network interface; at least one first microphone; at least one processor; and at least one non-transitory computer-readable medium comprising instructions that are executable by the at least one processor such that the first network microphone device is configured to: receive media content comprising audio; provide a sound data stream representing the audio to a first wake-word engine, wherein the first wake-word engine is operable to generate a first wake-word response when the first wake-word engine detects a particular wake word in a first microphone sound data stream representing first sound detected by the at least one first microphone; stream, via the network interface, one or more first audio signals representing a first portion of the audio to one or more playback devices for playback; detect, via the first wake-word engine, that a second portion of the audio includes sound data matching the particular wake word; before the second portion of the audio is played back by the one or more playback devices, cause, via the network interface, a second network microphone device to temporarily disable a wake-word response of a second wake-word engine, wherein the second wake-word engine is operable to (a) generate a second wake-word response when the second wake-word engine detects the particular wake word in a second microphone sound data stream representing second sound detected by at least one second microphone and (b) send sound data representing the second sound detected by the at least one second microphone to a voice assistant when the second wake-word response is generated; and stream, via the network interface, one or more second audio signals representing the second portion of the audio to the one or more playback devices for playback.

2. The first network microphone device of claim 1, wherein at least one playback device of the one or more playback devices comprises the second network microphone device.

3. The first network microphone device of claim 1, wherein the first network microphone device is connected to a local area network, wherein the second network microphone device is connected to the local area network, and wherein the instructions that are executable by the at least one processor such that the first network microphone device is configured to cause the second network microphone device to temporarily disable the wake-word response of the second wake-word engine comprises instructions that are executable by the at least one processor such that the first network microphone device is configured to: send, via the network interface over the local area network to the second network microphone device, instructions to temporarily disable the wake-word response of the second wake-word engine.

4. The first network microphone device of claim 1, wherein the at least one non-transitory computer-readable medium further comprises instructions that are executable by the at least one processor such that the first network microphone device is configured to: before the second portion of the audio is played back by the one or more playback devices, temporarily disable the wake-word response of the first wake-word engine, wherein the first wake-word engine is operable to send sound data representing the first sound detected by the at least one second microphone to the voice assistant when the first wake-word response is generated by the first wake-word engine.

5. The first network microphone device of claim 1, further comprising a digital audio/video interface connected to a television, wherein the media content comprises video, and wherein the instructions that are executable by the at least one processor such that the first network microphone device is configured to stream the one or more first audio signals representing the first portion of the audio to the one or more playback devices for playback comprises instructions that are executable by the at least one processor such that the first network microphone device is configured to: cause, via the network interface, the one or more playback devices to play back the one or more first audio signals representing the first portion of the audio while the television plays a first portion of the video corresponding to the first portion of the audio.

6. The first network microphone device of claim 1, wherein the one or more playback devices comprise multiple playback devices configured in a group, and wherein the instructions that are executable by the at least one processor such that the first network microphone device is configured to stream the one or more first audio signals representing the first portion of the audio to the one or more playback devices for playback comprises instructions that are executable by the at least one processor such that the first network microphone device is configured to: send, via the network interface to the multiple playback devices, (i) respective first audio signals and (ii) timing information, wherein the multiple playback devices play the respective first audio signals in synchrony according to the timing information.

7. The first network microphone device of claim 1, wherein the at least one non-transitory computer-readable medium further comprises instructions that are executable by the at least one processor such that the first network microphone device is configured to: before the second portion of the audio is played back by the one or more playback devices, cause, via the network interface, a third network microphone device to temporarily disable a wake-word response of a third wake-word engine, wherein the third wake-word engine is operable to (a) generate a third wake-word response when the third wake-word engine detects the particular wake word in a third microphone sound data stream representing third sound detected by at least one third microphone and (b) send sound data representing the second sound detected by the at least one third microphone to the voice assistant when the third wake-word response is generated.

8. The first network microphone device of claim 7, wherein the second network microphone device and the third network microphone device are a subset of a plurality of network microphones devices connected to a local area network, and wherein the subset are in audible range of playback by the one or more playback devices.

9. At least one non-transitory computer-readable medium comprising instructions that are executable by at least one processor such that a first network microphone device is configured to: receive media content comprising audio; provide a sound data stream representing the audio to a first wake-word engine, wherein the first wake-word engine is operable to generate a first wake-word response when the first wake-word engine detects a particular wake word in a first microphone sound data stream representing first sound detected by at least one first microphone; stream, via a network interface, one or more first audio signals representing a first portion of the audio to one or more playback devices for playback; detect, via the first wake-word engine, that a second portion of the audio includes sound data matching the particular wake word; before the second portion of the audio is played back by the one or more playback devices, cause, via the network interface, a second network microphone device to temporarily disable a wake-word response of a second wake-word engine, wherein the second wake-word engine is operable to (a) generate a second wake-word response when the second wake-word engine detects the particular wake word in a second microphone sound data stream representing second sound detected by at least one second m
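
Claims 1 and 3 describe a second network microphone device that temporarily disables its wake-word response when instructed over the network. One way the receiving side might model "temporarily" is a time-limited gate, sketched below in Python. The message format (`type`, `duration_s`), the class `WakeWordResponseGate`, and the function `handle_message` are all assumptions made for illustration; the claims do not specify a wire format or a timeout mechanism.

```python
import time
from typing import Callable


class WakeWordResponseGate:
    """Hypothetical gate that suppresses wake-word responses for a
    limited time window and then re-enables them automatically."""

    def __init__(self, clock: Callable[[], float] = time.monotonic):
        self._clock = clock            # injectable for deterministic tests
        self._suppressed_until = 0.0   # timestamp when suppression ends

    def suppress_for(self, seconds: float) -> None:
        # Extend the window if needed; never shorten an active window.
        self._suppressed_until = max(
            self._suppressed_until, self._clock() + seconds
        )

    def should_respond(self, wake_word_detected: bool) -> bool:
        # Respond only to a detection that falls outside the window.
        return wake_word_detected and self._clock() >= self._suppressed_until


def handle_message(gate: WakeWordResponseGate, message: dict) -> None:
    """Apply an (assumed) suppression instruction received over the LAN."""
    if message.get("type") == "suppress_wake_word":
        gate.suppress_for(float(message["duration_s"]))


# Usage with a fake clock so the behavior is deterministic.
now = [100.0]
gate = WakeWordResponseGate(clock=lambda: now[0])
handle_message(gate, {"type": "suppress_wake_word", "duration_s": 2.0})
during = gate.should_respond(True)   # inside the suppression window
now[0] = 102.5
after = gate.should_respond(True)    # window has elapsed
```

A time-limited window is only one design option. An explicit re-enable message from the first device would also fit the claim language.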

Assignees

Sonos Inc

Inventors

Classifications

  • G06F3/165 (Primary)

    Management of the audio stream, e.g. setting of volume, audio stream path · CPC title

  • G06F3/167 (Primary)

    Audio in a user interface, e.g. using voice commands for navigating, audio feedback · CPC title

  • sound input device, e.g. microphone · CPC title

  • Execution procedure of a spoken command · CPC title

  • G10L15/22 (Primary)

    Procedures used during a speech recognition process, e.g. man-machine dialogue · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12340802B2 cover?
It covers techniques for suppressing a wake-word response to keywords that occur in the audio content being played. In the example implementation from the abstract, a playback device provides the received audio content to a voice assistant service (VAS) wake-word engine and a local keyword engine, detects that an upcoming portion contains a local keyword, and disables the local keyword response before playing that portion.
Who is the assignee on this patent?
Sonos Inc
What technology area does this patent fall under?
Primary CPC classification G06F3/165. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jun 24, 2025 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).