Voice detection by multiple devices

US10152969B2 · US · B2

Patent metadata
Field · Value
Publication number · US-10152969-B2
Application number · US-201615211748-A
Country · US
Kind code · B2
Filing date · Jul 15, 2016
Priority date · Jul 15, 2016
Publication date · Dec 11, 2018
Grant date · Dec 11, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Disclosed herein are example techniques for voice detection by multiple NMDs. An example implementation may involve receiving a set of voice recordings from a set of NMDs, and identifying a subset of voice recordings from which to determine a given voice command. The example implementation may further involve causing the identified subset of voice recordings to be analyzed to determine the given voice command.
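The abstract does not say how the subset of recordings is identified. As a minimal sketch, assuming the selection criterion is simply signal level (RMS) with a noise floor, the receive / select / analyze flow might look like the following. The names `VoiceRecording`, `select_subset`, `determine_command`, and the `recognize` callback are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass
from math import sqrt
from typing import Callable, List, Sequence


@dataclass
class VoiceRecording:
    nmd_id: str               # hypothetical identifier of the NMD that captured it
    samples: Sequence[float]  # mono PCM samples in [-1.0, 1.0]


def rms_level(samples: Sequence[float]) -> float:
    """Root-mean-square level of a recording; 0.0 for an empty one."""
    if not samples:
        return 0.0
    return sqrt(sum(s * s for s in samples) / len(samples))


def select_subset(recordings: List[VoiceRecording],
                  max_count: int = 1,
                  min_level: float = 0.01) -> List[VoiceRecording]:
    """Keep the loudest recordings above a noise floor.

    Assumption: loudness stands in for recording quality. The patent
    abstract leaves the actual criterion open.
    """
    scored = [(rms_level(r.samples), r) for r in recordings]
    audible = [(level, r) for level, r in scored if level >= min_level]
    audible.sort(key=lambda pair: pair[0], reverse=True)
    return [r for _, r in audible[:max_count]]


def determine_command(recordings: List[VoiceRecording],
                      recognize: Callable[[VoiceRecording], str]) -> List[str]:
    """Run the (caller-supplied) recognizer only on the selected subset."""
    return [recognize(r) for r in select_subset(recordings)]


recordings = [
    VoiceRecording("kitchen", [0.5, -0.5, 0.5, -0.5]),
    VoiceRecording("living", [0.1, -0.1]),
    VoiceRecording("hall", [0.0, 0.0]),
]
best_two = [r.nmd_id for r in select_subset(recordings, max_count=2)]
```

Passing the recognizer in as a callback keeps the sketch independent of any particular speech service; only the selection step is illustrated here.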

First claim

Opening claim text (preview).

We claim:

1. A first networked microphone device (NMD) comprising: one or more amplifiers configured to drive one or more speakers; a microphone array; a network interface; one or more processors; tangible, non-transitory computer-readable media having stored therein instructions executable by the one or more processors to cause the first NMD to perform a method comprising: continuously recording, via the microphone array, audio into a buffer; detecting, in the recorded audio, a wake-word; in response to detecting the wake-word, (i) listening, via the microphone array, for a voice command following the wake-word in the recorded audio and (ii) sending, via the network interface, instructions to one or more second NMDs connected via to the first NMD via a local area network, the instructions causing the one or more second NMDs to stop recording audio via respective microphone arrays of the one or more second NMDs for a pre-defined time period; querying, via the network interface, one or more servers of a particular voice assistant service with the voice command following the detected wake-word within the recorded audio; receiving, from one or more servers of the particular voice assistant service via the network interface in response to the query, a playback command corresponding to the voice command; and playing back audio content according to the playback command via the one or more amplifiers configured to drive one or more speakers.

2. The first NMD of claim 1, wherein the voice command includes an indication of a period of time for the first NMD to listen for the voice command, and wherein the method further comprises sending, via the network interface, instructions to one or more second NMDs connected via to the first NMD via the local area network, the instructions causing the one or more second NMDs to stop recording audio via respective microphone arrays of the one or more second NMDs for the period of time indicated by the voice command.

3. The first NMD of claim 2, wherein the method further comprises: determining, based on the recorded audio in the buffer, that the first NMD is no longer receiving the voice command, and based on determining that the first NMD is no longer receiving the voice command, sending, via the network interface, instructions to one or more second NMDs connected via to the first NMD via the local area network, the instructions causing the one or more second NMDs to start recording audio via respective microphone arrays of the one or more second NMDs before the period of time indicated by the voice command has fully elapsed.

4. The first NMD of claim 1, wherein the method further comprises: determining, based on the recorded audio in the buffer, that the first NMD is no longer receiving the voice command, and based on determining that the first NMD is no longer receiving the voice command, sending, via the network interface, instructions to one or more second NMDs connected via to the first NMD via the local area network, the instructions causing the one or more second NMDs to start recording audio via respective microphone arrays of the one or more second NMDs before the pre-defined time period has fully elapsed.

5. The first NMD of claim 1, wherein the playback command comprises a command to play back particular audio content in a first zone that includes the first NMD and a second zone that includes a second NMD, and wherein the method further comprises: instructing, via the network interface, the second NMD of the second zone to play back the audio content according to the playback command in synchrony with playback of the audio content by the first NMD of the first zone.

6. The first NMD of claim 1, wherein a first zone of a media playback system includes the first NMD, and wherein the first zone is configured into a zone group with a second zone that includes one or more playback devices, and wherein playing back the audio content according to the playback command comprises playing back the audio content in synchrony with one or more playback devices of the second zone.

7. The first NMD of claim 1, wherein a first zone of a media playback system includes the first NMD and a second NMD in a bonded zone configuration in which the first NMD and the second NMD play respective channels of the audio content, and wherein playing back the audio content according to the playback command comprises playing back a first channel of the audio content in synchrony the second NMD playing back a second channel of the audio content.

8. Tangible, non-transitory, computer-readable media having instructions encoded therein, wherein the instructions, when executed by one or more processors, cause a first networked microphone device (NMD) to perform a method comprising: continuously recording, via a microphone array of the first NMD, audio into a buffer; detecting, in the recorded audio, a wake-word; in response to detecting the wake-word, (i) listening, via a microphone of the first NMD, for a voice command following the wake-word in the recorded audio and (ii) sending, via a network interface, instructions to one or more second NMDs connected via to the first NMD via a local area network, the instructions causing the one or more second NMDs to stop recording audio via respective microphone arrays of the one or more second NMDs for a pre-defined time period; querying, via a network interface of the first NMD, one or more servers of a particular voice assistant service with the voice command following the detected wake-word within the recorded audio; receiving, from one or more servers of the particular voice assistant service via the network interface in response to the query, a playback command corresponding to the voice command; and playing back audio content according to the playback command via one or more amplifiers configured to drive one or more speakers.

9. The tangible, computer readable media of claim 8, wherein the voice command includes an indication of a period of time for the first NMD to listen for the voice command, and wherein the method further comprises sending, via the network interface, instructions to one or more second NMDs connected via to the first NMD via the local area network, the instructions causing the one or more second NMDs to stop recording audio via respective microphone arrays of the one or more second NMDs for the period of time indicated by the voice command.

10. The tangible, computer readable media of claim 9, wherein the method further comprises: determining, based on the recorded audio in the buffer, that the first NMD is no longer receiving the voice command, and based on determining that the first NMD is no longer receiving the voice command, sending, via the network interface, instructions to one or more second NMDs connected via to the first NMD via the local area network, the instructions causing the one or more second NMDs to start recording audio via respective microphone arrays of the one or more second NMDs before the period of time indicated by the voice command has fully elapsed.

11. The tangible, computer readable media of claim 8, wherein the method further comprises: determining, based on the recorded audio in the buffer, that the first NMD is no longer receiving the voice command, and based on determining that the first NMD is no longer receiving the voice command, sending, via the network interface, instructions to one or more second NMDs connected via to the first NMD via the local area network, the instructions causing the one or more second NMDs to start recording audio via respective microphone arrays of the one or more second NMDs before the pre-defined time period has fully elapsed.
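The coordination recited in claims 1 and 4 (on wake-word detection, tell the other NMDs to stop recording for a pre-defined period, then tell them to resume early once the command has ended) can be sketched as a small in-memory simulation. This is an illustration under stated assumptions, not the patented implementation: audio chunks are represented as already-transcribed text, an empty string stands for silence, time is passed in explicitly, and the class names, wake word, and 5-second period are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PeerNMD:
    """A second NMD that honors stop/start instructions (hypothetical model)."""
    name: str
    muted_until: Optional[float] = None  # when recording resumes on its own

    def stop_recording(self, now: float, duration: float) -> None:
        self.muted_until = now + duration

    def start_recording(self) -> None:
        self.muted_until = None

    def is_recording(self, now: float) -> bool:
        return self.muted_until is None or now >= self.muted_until


@dataclass
class FirstNMD:
    """The NMD that detected the wake word and arbitrates for its peers."""
    peers: List[PeerNMD]
    wake_word: str = "hey device"  # placeholder wake word (assumption)
    mute_period: float = 5.0       # pre-defined period in seconds (assumed value)
    listening: bool = False
    command_words: List[str] = field(default_factory=list)

    def on_audio(self, now: float, text: str) -> Optional[str]:
        """Handle one buffered chunk; return the command once it has ended."""
        if not self.listening:
            if text == self.wake_word:
                # Wake word detected: start listening and mute the peers.
                self.listening = True
                self.command_words = []
                for peer in self.peers:
                    peer.stop_recording(now, self.mute_period)
            return None
        if text:
            # Still receiving the voice command.
            self.command_words.append(text)
            return None
        # Silence: the command has ended, so release the peers early.
        self.listening = False
        for peer in self.peers:
            peer.start_recording()
        return " ".join(self.command_words)


peer = PeerNMD("kitchen")
first = FirstNMD(peers=[peer])
first.on_audio(0.0, "hey device")
muted_during_command = not peer.is_recording(1.0)
first.on_audio(1.0, "play")
first.on_audio(1.5, "jazz")
command = first.on_audio(2.0, "")
```

In this sketch the returned command string would then be sent to the voice assistant service; the query and playback steps of claim 1 are left out. If the early release never arrives, `is_recording` still returns true once `mute_period` has elapsed, which mirrors the timed nature of the stop instruction.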

Assignees

Sonos Inc

Inventors

Classifications

  • G10L15/22 · Primary

    Procedures used during a speech recognition process, e.g. man-machine dialogue · CPC title

  • Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise, of stress induced speech (G10L21/02 takes precedence) · CPC title

  • Execution procedure of a spoken command · CPC title

  • Feature extraction for speech recognition; Selection of recognition unit · CPC title

  • Adaptation of a single recogniser for parallel processing, e.g. by use of multiple processors or cloud computing · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10152969B2 cover?
Disclosed herein are example techniques for voice detection by multiple NMDs. An example implementation may involve receiving a set of voice recordings from a set of NMDs, and identifying a subset of voice recordings from which to determine a given voice command. The example implementation may further involve causing the identified subset of voice recordings to be analyzed to determine the given voice command.
Who is the assignee on this patent?
Sonos Inc
What technology area does this patent fall under?
Primary CPC classification G10L15/22. Mapped technology areas include Physics.
When was this patent published?
Publication date Dec 11, 2018 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).