Voice control of a media playback system

US11736860B2 · US · B2

Patent metadata
Field               Value
Publication number  US-11736860-B2
Application number  US-202117562412-A
Country             US
Kind code           B2
Filing date         Dec 27, 2021
Priority date       Feb 22, 2016
Publication date    Aug 22, 2023
Grant date          Aug 22, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Multiple aspects of systems and methods for voice control and related features and functionality for various embodiments of media playback devices, networked microphone devices, microphone-equipped media playback devices, and speaker-equipped networked microphone devices are disclosed and described herein, including but not limited to designating and managing default networked devices, audio response playback, room-corrected voice detection, content mixing, music service selection, metadata exchange between networked playback systems and networked microphone systems, handling loss of pairing between networked devices, actions based on user identification, and other voice control of networked devices.

First claim

Opening claim text (preview).

The invention claimed is:

1. A system comprising: at least one processor; at least one tangible, non-transitory computer-readable medium; and program instructions stored on the at least one tangible, non-transitory computer-readable medium that are executable by the at least one processor such that the system is configured to: obtain metadata from a network computing device relating to a configuration of a media playback system, wherein the metadata indicates that (i) a first playback device is configured to operate in a first playback zone and (ii) the first playback device and a second playback device are configured to operate in a second playback zone; cause the first playback device to operate in the first playback zone in a given playback state comprising play back of one or more media items identified in a playback queue associated with the first playback zone; while the first playback device is operating in the given playback state: receive data corresponding to a detected voice input, wherein the data comprises an indication within the voice input of (i) a command word and (ii) one or more zone variable instances; and determine, based on the command word and the one or more zone variable instances, an intent to transfer the given playback state to the second playback zone; and after determining the intent to transfer the given playback state to the second playback zone, transfer the given playback state to the second playback zone, thereby causing the second playback device in the second playback zone to play back the one or more media items identified in the playback queue.

2. The system of claim 1, wherein: the voice input does not identify any playback device of the media playback system that is to execute a command corresponding to the voice input.

3. The system of claim 1, further comprising: at least one microphone, wherein the program instructions are executable by the at least one processor such that the system is configured to detect the voice input via the at least one microphone.

4. The system of claim 3, wherein one of the first playback device and the second playback device comprise the at least one microphone.

5. The system of claim 3, wherein the program instructions are executable by the at least one processor such that the system is configured to: receive an indication of a direction of the voice input received via the at least one microphone; and direct audio output of at least one of the first playback device and the second playback device based on the indication of the direction of the voice input.

6. The system of claim 1, wherein the program instructions are executable by the at least one processor such that the system is configured to: play back the one or more media items at a first volume level; play back audio content associated with a response to the voice input at a second volume level; and adjust playback of the one or more media items at a third volume level for at least a duration of the playback of the audio content associated with the response to the voice input, wherein the third volume level is lower than each of the first volume level and the second volume level.

7. The system of claim 1, wherein at least a portion of the playback queue is stored on a remote computing device associated with a cloud-based computing system.

8. A system comprising: a first playback device configured to communicate over at least one data network, wherein the first playback device comprises: at least one first processor; at least one first tangible, non-transitory computer-readable medium; and first program instructions stored on the at least one first tangible, non-transitory computer-readable medium that are executable by the at least one first processor such that the first playback device is configured to operate in a given playback state comprising play back of one or more media items identified in a playback queue associated with a first playback zone; and at least one computing device configured to communicate over the at least one data network, wherein the at least one computing device comprises: at least one second processor; at least one second tangible non-transitory computer-readable medium; second program instructions stored on the at least one second tangible, non-transitory computer-readable medium that are executable by the at least one second processor of the at least one computing device such that the at least one computing device is configured to: obtain metadata relating to a configuration of a media playback system, wherein the metadata indicates that (i) the first playback device is configured to operate in the first playback zone and (ii) the first playback device and a second playback device are configured to operate in a second playback zone; while the first playback device is operating in the given playback state: receive data corresponding to a detected voice input, wherein the data comprises an indication within the voice input of (i) a command word and (ii) one or more zone variable instances; and determine, based on the command word and the one or more zone variable instances, an intent to transfer the given playback state to the second playback zone; and after determining the intent to transfer the given playback state to the second playback zone, transfer the given playback state to the second playback zone, thereby causing the second playback device in the second playback zone to play back the one or more media items identified in the playback queue.

9. The system of claim 8, wherein the voice input does not identify any playback device of the media playback system that is to execute a command corresponding to the voice input.

10. The system of claim 8, further comprising: at least one microphone, wherein the second program instructions are executable by the at least one second processor of the at least one computing device such that the system is configured to detect the voice input via the at least one microphone.

11. The system of claim 10, wherein one of the first playback device and the second playback device comprise the at least one microphone.

12. The system of claim 10, wherein the first program instructions are executable by the at least one first processor of the first playback device such that the system is configured to: receive an indication of a direction of the voice input received via the at least one microphone; and direct audio output of at least one of the first playback device and the second playback device based on the indication of the direction of the voice input.

13. The system of claim 10, wherein the first program instructions are executable by the at least one first processor of the first playback device such that the system is configured to: play back the one or more media items at a first volume level; play back audio content associated with a response to the voice input at a second volume level; and adjust playback of the one or more media items at a third volume level for at least a duration of the playback of the audio content associated with the response to the voice input, wherein the third volume level is lower than each of the first volume level and the second volume level.

14. A system comprising a first playback device and a second playback device each configured to communicate over at least one data network, wherein the first playback device comprises: at least one first processor; at least one first tangible, non-transitory computer-readable medium; and first program instructions stored on the at least one first tangible, non-transitory computer-readable medium that are executa
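For readers who find the claim language dense, the flow recited in claim 1 (zone metadata, a voice input carrying a command word and zone names, an inferred transfer intent, then the transfer itself) and the volume relationship in claim 6 can be illustrated with a small sketch. This is not the patented implementation. Every name here (`Zone`, `TRANSFER_WORDS`, `transfer_target`, `transfer_playback_state`, `ducked_volume`), the substring matching, the decision to stop the source zone, and the 0.5 ducking factor are assumptions made purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Zone:
    """Hypothetical stand-in for the zone configuration metadata."""
    name: str
    devices: list
    queue: list = field(default_factory=list)
    playing: bool = False


# Assumed command words that signal a transfer; the patent does not list them.
TRANSFER_WORDS = {"move", "transfer"}


def transfer_target(text, zones, source):
    """Return the zone to transfer to, or None if no transfer intent is found.

    Intent is inferred from (i) a command word and (ii) zone names
    mentioned in the utterance, mirroring the two inputs in claim 1.
    """
    lowered = text.lower()
    if not any(word in TRANSFER_WORDS for word in lowered.split()):
        return None
    mentioned = [name for name in zones if name.lower() in lowered]
    targets = [name for name in mentioned if name != source]
    # Only act when exactly one destination zone is named.
    return targets[0] if len(targets) == 1 else None


def transfer_playback_state(zones, source, target):
    """Copy the queue and play state to the target zone.

    Stopping the source zone afterwards is an assumption of this sketch.
    """
    src, dst = zones[source], zones[target]
    dst.queue = list(src.queue)
    dst.playing = src.playing
    src.playing = False


def ducked_volume(media_volume, response_volume, factor=0.5):
    """Pick a third volume lower than both inputs (claim 6), for positive volumes."""
    return min(media_volume, response_volume) * factor


def example_zones():
    # Device 1 is in both zones, as in the metadata described in claim 1.
    return {
        "Kitchen": Zone("Kitchen", ["device-1"], ["track-a", "track-b"], True),
        "Living Room": Zone("Living Room", ["device-1", "device-2"]),
    }
```

Calling `transfer_target("move the music to the living room", example_zones(), "Kitchen")` yields the destination zone name, which can then be passed to `transfer_playback_state`. An utterance without one of the assumed command words yields `None`, so nothing is transferred.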

Assignees

Sonos Inc

Inventors

Classifications

  • G06F3/165 (Primary)

    Management of the audio stream, e.g. setting of volume, audio stream path · CPC title

  • Audio in a user interface, e.g. using voice commands for navigating, audio feedback · CPC title

  • H04R3/00 (Primary)

    Circuits for transducers (arrangements for producing a reverberation or echo sound G10K15/08; amplifiers H03F) · CPC title

  • Interface to dedicated audio devices, e.g. audio drivers, interface to CODECs · CPC title

  • using statistical models, e.g. Hidden Markov Models [HMMs] (G10L15/18 takes precedence) · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11736860B2 cover?
Multiple aspects of systems and methods for voice control and related features and functionality for various embodiments of media playback devices, networked microphone devices, microphone-equipped media playback devices, and speaker-equipped networked microphone devices are disclosed and described herein, including but not limited to designating and managing default networked devices, audio re…
Who is the assignee on this patent?
Sonos Inc
What technology area does this patent fall under?
Primary CPC classification G06F3/165. Mapped technology areas include Physics.
When was this patent published?
Publication date: Aug 22, 2023 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).