Voice detection by multiple devices

US10699711B2 · US · B2

Patent metadata
Publication number: US-10699711-B2
Application number: US-201916416752-A
Country: US
Kind code: B2
Filing date: May 20, 2019
Priority date: Jul 15, 2016
Publication date: Jun 30, 2020
Grant date: Jun 30, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Disclosed herein are example techniques for voice detection by multiple NMDs. An example implementation may involve one or more servers receiving, via a network interface, data representing multiple audio recordings of a voice input spoken by a given user, each audio recording recorded by a respective NMD of the multiple NMDs, wherein the voice input comprises a detected wake-word. Based on respective sound pressure levels of the multiple audio recordings of the voice input, the servers (i) select a particular NMD of the multiple NMDs and (ii) forego selection of other NMDs of the multiple NMDs. The servers send, via the network interface to the particular NMD, data representing a playback command that corresponds to a voice command in the voice input represented in the multiple audio recordings, wherein the data representing the playback command causes the particular NMD to play back audio content according to the playback command.
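The selection step in the abstract can be sketched as below. This is a minimal illustration under stated assumptions, not the patented implementation. It assumes each recording arrives as a sequence of normalized samples in [-1, 1], it approximates sound pressure level by RMS level in dB relative to full scale (a calibrated SPL would need microphone sensitivity data), and it assumes the loudest recording wins, since the abstract says only that selection is "based on" the levels. The names `rms_level_db` and `select_nmd` are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def rms_level_db(samples: Sequence[float]) -> float:
    """RMS level in dB relative to full scale (a stand-in for calibrated SPL)."""
    if not samples:
        return float("-inf")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0.0:
        return float("-inf")  # digital silence
    return 10.0 * math.log10(mean_square)


def select_nmd(recordings: Dict[str, Sequence[float]]) -> Tuple[str, List[str]]:
    """Return (selected NMD id, ids of the NMDs that are not selected).

    Assumption: the NMD whose recording has the highest level is selected.
    Ties are broken by id order so the result is deterministic.
    """
    if not recordings:
        raise ValueError("at least one recording is required")
    levels = {nmd_id: rms_level_db(s) for nmd_id, s in recordings.items()}
    ordered_ids = sorted(levels)
    selected = max(ordered_ids, key=lambda nmd_id: levels[nmd_id])
    not_selected = [nmd_id for nmd_id in ordered_ids if nmd_id != selected]
    return selected, not_selected


# Hypothetical recordings of the same voice input from three rooms.
recordings = {
    "kitchen": [0.5, -0.5, 0.5, -0.5],
    "living_room": [0.1, -0.1, 0.1, -0.1],
    "bedroom": [0.0, 0.0, 0.0, 0.0],
}
selected, not_selected = select_nmd(recordings)
```

In this sketch the playback command would then be sent only to `selected`; the devices in `not_selected` receive nothing.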

First claim

Opening claim text (preview).

The invention claimed is:

1. A first networked microphone device (NMD) comprising: one or more amplifiers configured to drive one or more speakers; a microphone array; a network interface; one or more processors; data storage having stored therein instructions executable by the one or more processors to cause the first NMD to perform functions comprising: recording, via the microphone array, audio into a buffer; monitoring the recorded audio in the buffer for wake words; when a wake-word is detected in the recorded audio, querying, via the network interface, one or more servers of a particular voice assistant service with a voice command following the detected wake-word within the recorded audio; receiving, from one or more servers of the particular voice assistant service via the network interface in response to the query, a voice response corresponding to the voice command; in response to receiving the voice response corresponding to the voice command, sending, via the network interface to one or more second NMDs connected via to the first NMD via a local area network, instructions to cause the one or more second NMDs to stop recording audio via respective microphone arrays of the one or more second NMDs; and playing back the voice response via the one or more amplifiers configured to drive one or more speakers.

2. The first NMD of claim 1, wherein the functions further comprise: selecting the first NMD to handle the voice response; and foregoing selection of the one or more second NMDs to handle the voice response.

3. The first NMD of claim 2, wherein the first NMD is in a synchrony group with the one or more second NMDS, and wherein selecting the first NMD to handle the voice response comprises selecting the first NMD handle the voice response based on state information indicating that the first NMD is group coordinator of the synchrony group, wherein the group coordinator is configured to provide, to group members, at least one of (i) synchrony timing information and (ii) synchrony audio information.

4. The first NMD of claim 1, wherein one or more servers of the particular voice assistant service select the first NMD to handle the voice response and forego selection of the one or more second NMDs to handle the voice response.

5. The first NMD of claim 4, wherein the one or more servers of the particular voice assistant service select the first NMD to handle the voice response and forego selection of the one or more second NMDs to handle the voice response based on respective sound pressure levels of the voice command in multiple recording corresponding to respective NMDS of the first NMD and the one or more second NMDs.

6. The first NMD of claim 1, wherein sending instructions to cause the one or more second NMDs to stop recording audio via respective microphone arrays of the one or more second NMDs comprising sending instructions that cause the one or more second NMDs to stop recording audio via respective microphone arrays of the one or more second NMDs for a pre-defined time period.

7. The first NMD of claim 6, wherein the functions further comprise determining, based on the recorded audio in the buffer, that the first NMD is no longer receiving the voice command; and based on determining that the first NMD is no longer receiving the voice command, sending, via the network interface, instructions to one or more second NMDs connected via to the first NMD via the local area network, the instructions causing the one or more second NMDs to start recording audio via respective microphone arrays of the one or more second NMDs before the pre-defined time period has fully elapsed.

8. The first NMD of claim 1, wherein the functions further comprise: forming a synchrony group including the first NMD and the one or more second NMDS, wherein playing back the voice response comprises playing back the voice response in synchrony with the one or more second NMDS.

9. A method to be performed by first networked microphone device (NMD), the method comprising: recording, via a microphone array of the first NMD, audio into a buffer; monitoring the recorded audio in the buffer for wake words; when a wake-word is detected in the recorded audio, querying, via a network interface of the first NMD, one or more servers of a particular voice assistant service with a voice command following the detected wake-word within the recorded audio; receiving, from one or more servers of the particular voice assistant service via the network interface in response to the query, a voice response corresponding to the voice command; in response to receiving the voice response corresponding to the voice command, sending, via the network interface to one or more second NMDs connected via to the first NMD via a local area network, instructions to cause the one or more second NMDs to stop recording audio via respective microphone arrays of the one or more second NMDs; and playing back the voice response via one or more amplifiers configured to drive one or more speakers, the first NMD comprising the one or more amplifiers and the one or more speakers.

10. The method of claim 9, further comprising: selecting the first NMD to handle the voice response; and foregoing selection of the one or more second NMDs to handle the voice response.

11. The method of claim 10, wherein the first NMD is in a synchrony group with the one or more second NMDS, and wherein selecting the first NMD to handle the voice response comprises selecting the first NMD handle the voice response based on state information indicating that the first NMD is group coordinator of the synchrony group, wherein the group coordinator is configured to provide, to group members, at least one of (i) synchrony timing information and (ii) synchrony audio information.

12. The method of claim 9, wherein one or more servers of the particular voice assistant service select the first NMD to handle the voice response and forego selection of the one or more second NMDs to handle the voice response.

13. The method of claim 12, wherein the one or more servers of the particular voice assistant service select the first NMD to handle the voice response and forego selection of the one or more second NMDs to handle the voice response based on respective sound pressure levels of the voice command in multiple recording corresponding to respective NMDS of the first NMD and the one or more second NMDs.

14. The method of claim 9, wherein sending instructions to cause the one or more second NMDs to stop recording audio via respective microphone arrays of the one or more second NMDs comprising sending instructions that cause the one or more second NMDs to stop recording audio via respective microphone arrays of the one or more second NMDs for a pre-defined time period.

15. The method of claim 14, further comprising: determining, based on the recorded audio in the buffer, that the first NMD is no longer receiving the voice command; and based on determining that the first NMD is no longer receiving the voice command, sending, via the network interface, instructions to one or more second NMDs connected via to the first NMD via the local area network, the instructions causing the one or more second NMDs to start recording audio via respective microphone arrays of the one or more second NMDs before the pre-defined time period has fully elapsed.

16. The method of claim 9, further comprising: forming a synchrony group including the first NMD and the one or more second NMDS, wherein playing back the voice response comprises playing back the voice response in synchro
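The stop-recording and early-resume behavior recited in claims 1, 6 and 7 can be sketched in memory as follows. This is an illustrative model only, not the claimed implementation: text chunks stand in for buffered audio, method calls stand in for instructions sent over the local area network, timestamps are passed in explicitly instead of read from a clock, and the wake word and 10-second period are made-up values. The class and method names (`FirstNMD`, `SecondNMD`, `on_voice_response`, `on_voice_command_ended`) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SecondNMD:
    """A peer device that can be told to pause its microphone array."""
    name: str
    muted_until: Optional[float] = None  # time at which the pause expires

    def stop_recording(self, now: float, period_s: float) -> None:
        # Claim 6: stop recording for a pre-defined time period.
        self.muted_until = now + period_s

    def start_recording(self) -> None:
        # Claim 7: resume before the period has fully elapsed.
        self.muted_until = None

    def is_recording(self, now: float) -> bool:
        return self.muted_until is None or now >= self.muted_until


@dataclass
class FirstNMD:
    peers: List[SecondNMD]
    wake_word: str = "hey assistant"  # hypothetical wake word
    mute_period_s: float = 10.0       # hypothetical pre-defined period
    buffer: List[str] = field(default_factory=list)

    def record(self, chunk: str) -> None:
        # Text stands in for audio recorded into the buffer.
        self.buffer.append(chunk.lower())

    def wake_word_detected(self) -> bool:
        return self.wake_word in " ".join(self.buffer)

    def on_voice_response(self, now: float) -> None:
        # Claim 1: on receiving the voice response, tell peers to stop recording.
        for peer in self.peers:
            peer.stop_recording(now, self.mute_period_s)

    def on_voice_command_ended(self) -> None:
        # Claim 7: the voice command is no longer being received.
        for peer in self.peers:
            peer.start_recording()


peers = [SecondNMD("bedroom"), SecondNMD("office")]
first = FirstNMD(peers=peers)
first.record("Hey assistant")
first.record("play some jazz")
if first.wake_word_detected():
    first.on_voice_response(now=0.0)
recording_while_muted = [p.is_recording(5.0) for p in peers]
first.on_voice_command_ended()
recording_after_resume = [p.is_recording(5.0) for p in peers]
```

Passing `now` explicitly keeps the sketch deterministic; a real device would read a monotonic clock and would also need to handle lost or delayed network messages, which this model ignores.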

Assignees

Sonos Inc

Inventors

Classifications

  • Execution procedure of a spoken command · CPC title

  • Adaptation of a single recogniser for parallel processing, e.g. by use of multiple processors or cloud computing · CPC title

  • G10L15/22 (Primary)

    Procedures used during a speech recognition process, e.g. man-machine dialogue · CPC title

  • Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise, of stress induced speech (G10L21/02 takes precedence) · CPC title

  • Feature extraction for speech recognition; Selection of recognition unit · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

Who is the assignee on this patent?
Sonos Inc
What technology area does this patent fall under?
Primary CPC classification G10L15/22. Mapped technology areas include Physics.
When was this patent published?
Publication date: June 30, 2020 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).