Voice detection by multiple devices

US10152969B2 · US · B2

Patent metadata
Field	Value
Publication number	US-10152969-B2
Application number	US-201615211748-A
Country	US
Kind code	B2
Filing date	Jul 15, 2016
Priority date	Jul 15, 2016
Publication date	Dec 11, 2018
Grant date	Dec 11, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title: What the patent document calls the invention.
Abstract: A short plain-language summary of the technical disclosure.
Assignees and inventors: Who owns or filed the patent and who is credited as inventor.
Key dates: Filing, priority, publication, and grant dates set the timeline.
First independent claim: The legal scope of protection; read this for what is actually claimed.
CPC / IPC classifications: Technology tags used to group this patent with similar filings.
Citations and related patents: Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Disclosed herein are example techniques for voice detection by multiple NMDs. An example implementation may involve receiving a set of voice recordings from a set of NMDs, and identifying a subset of voice recordings from which to determine a given voice command. The example implementation may further involve causing the identified subset of voice recordings to be analyzed to determine the given voice command.
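The abstract does not say how the subset of recordings is chosen. As a minimal sketch only, the following assumes each NMD's recording is scored by its RMS signal level and the loudest `k` recordings are kept. The `Recording` type, the `rms` score, and the device names are hypothetical illustrations, not taken from the patent.

```python
from dataclasses import dataclass
from math import sqrt
from typing import List, Sequence


@dataclass
class Recording:
    device_id: str            # hypothetical identifier of the NMD that captured the audio
    samples: Sequence[float]  # mono samples normalized to [-1.0, 1.0]


def rms(samples: Sequence[float]) -> float:
    """Root-mean-square level; an assumed stand-in for recording quality."""
    if not samples:
        return 0.0
    return sqrt(sum(s * s for s in samples) / len(samples))


def select_subset(recordings: List[Recording], k: int = 1) -> List[Recording]:
    """Return the k recordings with the highest RMS level, loudest first."""
    return sorted(recordings, key=lambda r: rms(r.samples), reverse=True)[:k]


recordings = [
    Recording("kitchen", [0.02, -0.03, 0.01, -0.02]),
    Recording("living_room", [0.40, -0.35, 0.38, -0.42]),
    Recording("bedroom", [0.10, -0.12, 0.09, -0.11]),
]
chosen = select_subset(recordings, k=2)
print([r.device_id for r in chosen])  # ['living_room', 'bedroom']
```

Only the selected recordings would then be passed on for command recognition; any other quality measure could replace `rms` without changing the structure.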

First claim

Opening claim text (preview).

We claim:

1. A first networked microphone device (NMD) comprising: one or more amplifiers configured to drive one or more speakers; a microphone array; a network interface; one or more processors; tangible, non-transitory computer-readable media having stored therein instructions executable by the one or more processors to cause the first NMD to perform a method comprising: continuously recording, via the microphone array, audio into a buffer; detecting, in the recorded audio, a wake-word; in response to detecting the wake-word, (i) listening, via the microphone array, for a voice command following the wake-word in the recorded audio and (ii) sending, via the network interface, instructions to one or more second NMDs connected via to the first NMD via a local area network, the instructions causing the one or more second NMDs to stop recording audio via respective microphone arrays of the one or more second NMDs for a pre-defined time period; querying, via the network interface, one or more servers of a particular voice assistant service with the voice command following the detected wake-word within the recorded audio; receiving, from one or more servers of the particular voice assistant service via the network interface in response to the query, a playback command corresponding to the voice command; and playing back audio content according to the playback command via the one or more amplifiers configured to drive one or more speakers.

2. The first NMD of claim 1, wherein the voice command includes an indication of a period of time for the first NMD to listen for the voice command, and wherein the method further comprises sending, via the network interface, instructions to one or more second NMDs connected via to the first NMD via the local area network, the instructions causing the one or more second NMDs to stop recording audio via respective microphone arrays of the one or more second NMDs for the period of time indicated by the voice command.

3. The first NMD of claim 2, wherein the method further comprises: determining, based on the recorded audio in the buffer, that the first NMD is no longer receiving the voice command, and based on determining that the first NMD is no longer receiving the voice command, sending, via the network interface, instructions to one or more second NMDs connected via to the first NMD via the local area network, the instructions causing the one or more second NMDs to start recording audio via respective microphone arrays of the one or more second NMDs before the period of time indicated by the voice command has fully elapsed.

4. The first NMD of claim 1, wherein the method further comprises: determining, based on the recorded audio in the buffer, that the first NMD is no longer receiving the voice command, and based on determining that the first NMD is no longer receiving the voice command, sending, via the network interface, instructions to one or more second NMDs connected via to the first NMD via the local area network, the instructions causing the one or more second NMDs to start recording audio via respective microphone arrays of the one or more second NMDs before the pre-defined time period has fully elapsed.

5. The first NMD of claim 1, wherein the playback command comprises a command to play back particular audio content in a first zone that includes the first NMD and a second zone that includes a second NMD, and wherein the method further comprises: instructing, via the network interface, the second NMD of the second zone to play back the audio content according to the playback command in synchrony with playback of the audio content by the first NMD of the first zone.

6. The first NMD of claim 1, wherein a first zone of a media playback system includes the first NMD, and wherein the first zone is configured into a zone group with a second zone that includes one or more playback devices, and wherein playing back the audio content according to the playback command comprises playing back the audio content in synchrony with one or more playback devices of the second zone.

7. The first NMD of claim 1, wherein a first zone of a media playback system includes the first NMD and a second NMD in a bonded zone configuration in which the first NMD and the second NMD play respective channels of the audio content, and wherein playing back the audio content according to the playback command comprises playing back a first channel of the audio content in synchrony the second NMD playing back a second channel of the audio content.

8. Tangible, non-transitory, computer-readable media having instructions encoded therein, wherein the instructions, when executed by one or more processors, cause a first networked microphone device (NMD) to perform a method comprising: continuously recording, via a microphone array of the first NMD, audio into a buffer; detecting, in the recorded audio, a wake-word; in response to detecting the wake-word, (i) listening, via a microphone of the first NMD, for a voice command following the wake-word in the recorded audio and (ii) sending, via a network interface, instructions to one or more second NMDs connected via to the first NMD via a local area network, the instructions causing the one or more second NMDs to stop recording audio via respective microphone arrays of the one or more second NMDs for a pre-defined time period; querying, via a network interface of the first NMD, one or more servers of a particular voice assistant service with the voice command following the detected wake-word within the recorded audio; receiving, from one or more servers of the particular voice assistant service via the network interface in response to the query, a playback command corresponding to the voice command; and playing back audio content according to the playback command via one or more amplifiers configured to drive one or more speakers.

9. The tangible, computer readable media of claim 8, wherein the voice command includes an indication of a period of time for the first NMD to listen for the voice command, and wherein the method further comprises sending, via the network interface, instructions to one or more second NMDs connected via to the first NMD via the local area network, the instructions causing the one or more second NMDs to stop recording audio via respective microphone arrays of the one or more second NMDs for the period of time indicated by the voice command.

10. The tangible, computer readable media of claim 9, wherein the method further comprises: determining, based on the recorded audio in the buffer, that the first NMD is no longer receiving the voice command, and based on determining that the first NMD is no longer receiving the voice command, sending, via the network interface, instructions to one or more second NMDs connected via to the first NMD via the local area network, the instructions causing the one or more second NMDs to start recording audio via respective microphone arrays of the one or more second NMDs before the period of time indicated by the voice command has fully elapsed.

11. The tangible, computer readable media of claim 8, wherein the method further comprises: determining, based on the recorded audio in the buffer, that the first NMD is no longer receiving the voice command, and based on determining that the first NMD is no longer receiving the voice command, sending, via the network interface, instructions to one or more second NMDs connected via to the first NMD via the local area network, the instructions causing the one or more second NMDs to start recording audio via respective microphone arrays of the one or more second NMDs before the pre-defined time period has fully elapsed.
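The control flow recited in claims 1 and 4 can be illustrated with a small simulation. This is a hedged sketch, not the patented implementation: audio is modeled as a stream of text tokens, the clock is simulated, and `WAKE_WORD`, `SUPPRESS_SECONDS`, `FirstNMD`, and `SecondNMD` are hypothetical names introduced here. The voice-assistant query and playback steps are left out.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, List, Optional

WAKE_WORD = "hey_speaker"  # hypothetical wake-word token
SUPPRESS_SECONDS = 5.0     # hypothetical pre-defined time period


@dataclass
class SecondNMD:
    name: str
    muted_until: float = 0.0  # simulated time until which recording is stopped

    def stop_recording(self, now: float, duration: float) -> None:
        self.muted_until = now + duration

    def start_recording(self, now: float) -> None:
        # Early release: resume before the period has fully elapsed.
        self.muted_until = now

    def is_recording(self, now: float) -> bool:
        return now >= self.muted_until


@dataclass
class FirstNMD:
    peers: List[SecondNMD]
    buffer: Deque[str] = field(default_factory=lambda: deque(maxlen=32))
    listening: bool = False
    command: List[str] = field(default_factory=list)

    def on_audio(self, token: Optional[str], now: float) -> Optional[str]:
        """Feed one token (None means silence); return a finished command, if any."""
        if token is not None:
            self.buffer.append(token)      # continuous recording into a buffer
        if not self.listening:
            if token == WAKE_WORD:         # wake-word detected
                self.listening = True
                self.command = []
                for peer in self.peers:    # instruct peers to stop recording
                    peer.stop_recording(now, SUPPRESS_SECONDS)
            return None
        if token is None:                  # command is no longer being received
            self.listening = False
            for peer in self.peers:        # instruct peers to resume early
                peer.start_recording(now)
            return " ".join(self.command)
        self.command.append(token)
        return None


peer = SecondNMD("bedroom")
nmd = FirstNMD(peers=[peer])
result = None
muted_mid_command = False
for t, token in enumerate(["music", WAKE_WORD, "play", "jazz", None]):
    out = nmd.on_audio(token, now=float(t))
    if t == 2:
        muted_mid_command = not peer.is_recording(float(t))
    if out is not None:
        result = out
print(result, muted_mid_command, peer.is_recording(4.0))  # play jazz True True
```

In this sketch the peer would otherwise stay muted until simulated time 6.0; the silence at time 4.0 releases it early, mirroring the "before the pre-defined time period has fully elapsed" language of claim 4.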

Assignees

Sonos Inc

Classifications

CPC code	CPC title
G10L15/22 (primary)	Procedures used during a speech recognition process, e.g. man-machine dialogue
G10L15/20	Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise, of stress induced speech (G10L21/02 takes precedence)
G10L2015/223	Execution procedure of a spoken command
G10L15/02	Feature extraction for speech recognition; Selection of recognition unit
G10L15/34	Adaptation of a single recogniser for parallel processing, e.g. by use of multiple processors or cloud computing
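A CPC symbol such as G10L15/22 encodes a hierarchy: section (G), class (G10), subclass (G10L), main group (15), and subgroup (22). Section G is titled Physics in the CPC scheme, which is where the "Physics" technology area on this page comes from. The sketch below splits a symbol into those parts; the regular expression, the `CpcSymbol` type, and the abbreviated section titles are illustrative assumptions, not part of this page's data.

```python
import re
from typing import NamedTuple

# Abbreviated CPC section titles (sections A to H only, for illustration).
CPC_SECTIONS = {
    "A": "Human Necessities",
    "B": "Performing Operations; Transporting",
    "C": "Chemistry; Metallurgy",
    "D": "Textiles; Paper",
    "E": "Fixed Constructions",
    "F": "Mechanical Engineering; Lighting; Heating; Weapons; Blasting",
    "G": "Physics",
    "H": "Electricity",
}

# section letter, two-digit class, subclass letter, main group "/" subgroup
CPC_PATTERN = re.compile(r"([A-HY])(\d{2})([A-Z])(\d{1,4})/(\d{2,6})")


class CpcSymbol(NamedTuple):
    section: str
    class_digits: str
    subclass: str
    main_group: str
    subgroup: str


def parse_cpc(symbol: str) -> CpcSymbol:
    """Split a CPC symbol like 'G10L15/22' into its hierarchical parts."""
    match = CPC_PATTERN.fullmatch(symbol.strip())
    if match is None:
        raise ValueError(f"not a recognizable CPC symbol: {symbol!r}")
    return CpcSymbol(*match.groups())


primary = parse_cpc("G10L15/22")
print(primary.section, CPC_SECTIONS.get(primary.section, "Unknown"))  # G Physics
print(primary.main_group, primary.subgroup)                           # 15 22
```

The same function handles the indexing code G10L2015/223, whose main group is the four-digit 2015.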

Patent family

Related publications grouped by family.

Patent family ID: 59684011

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10152969B2 cover?: Disclosed herein are example techniques for voice detection by multiple NMDs. An example implementation may involve receiving a set of voice recordings from a set of NMDs, and identifying a subset of voice recordings from which to determine a given voice command. The example implementation may further involve causing the identified subset of voice recordings to be analyzed to determine the give…
Who is the assignee on this patent?: Sonos Inc
What technology area does this patent fall under?: Primary CPC classification G10L15/22. Mapped technology areas include Physics.
When was this patent published?: Publication date Dec 11, 2018 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).