Systems and methods for voice-assisted media content selection

US12360734B2 · US · B2

Patent metadata
Field               Value
Publication number  US-12360734-B2
Application number  US-202318484198-A
Country             US
Kind code           B2
Filing date         Oct 10, 2023
Priority date       May 10, 2018
Publication date    Jul 15, 2025
Grant date          Jul 15, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Systems and methods for media playback via a media playback system include (i) capturing a voice input comprising a request for media content, (ii) receiving information derived at least from the request for media content, (iii) requesting and receiving information from at least one remote computing device associated with a first media content service and at least one remote computing device associated with a second media content service, wherein (a) the information identifies first media content available via the first media content service for playback and identifies second media content available via the second media content service for playback, and (b) the first and second media content are related to the requested media content, and (iv) after receiving at least one of the first information and the second information, (a) selecting the first media content instead of the second media content, and (b) playing back the first media content.
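Step (iv) of the abstract, selecting the first media content instead of the second, can be illustrated with a minimal sketch. The abstract does not state the selection criterion, so the ranked list of preferred services used below is purely an assumption, as are the `Candidate` record, the `select_candidate` function, and the example service names and URIs.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Candidate:
    """Hypothetical record for media content reported by one service."""
    service: str  # illustrative service name, not from the patent
    title: str
    uri: str


def select_candidate(
    candidates: Sequence[Candidate],
    preferred_services: Sequence[str],
) -> Optional[Candidate]:
    """Return the candidate from the highest-ranked service, or None.

    Ranking by a preference list is an assumed criterion; the abstract
    only says that one candidate is selected instead of the other.
    """
    rank = {name: i for i, name in enumerate(preferred_services)}
    ranked = [c for c in candidates if c.service in rank]
    if not ranked:
        return None
    return min(ranked, key=lambda c: rank[c.service])


first = Candidate("service_a", "Example Track", "https://a.example/track/1")
second = Candidate("service_b", "Example Track (Live)", "https://b.example/track/9")

# Both services returned related content; the preference list decides.
chosen = select_candidate([second, first], ["service_a", "service_b"])
print(chosen.service if chosen else "nothing to play")
```

Other criteria (availability, audio quality, listening history) would slot into the same function by changing the sort key.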

First claim

Opening claim text (preview).

The invention claimed is:

1. A media playback system comprising: one or more processors; at least one network microphone device (NMD); and tangible, non-transitory, computer-readable media storing instructions executable by one or more processors to cause the media playback system to perform operations comprising: capturing a first voice input via one or more microphones of the NMD, wherein the first voice input comprises a user request; transmitting the first voice input to one or more first remote computing devices associated with a voice assistant service for deriving intent information regarding the request based at least on the first voice input; receiving a first response from the one or more first remote computing devices, the first response comprising first information associated with audio content; outputting an audio response via one or more audio transducers of the NMD based on the first response; capturing a second voice input via the one or more microphones of the NMD, wherein the second voice input comprises a request for media content; transmitting the second voice input to the one or more first remote computing devices for deriving intent information regarding the request for media content based at least on the second voice input; receiving a second response from the one or more first remote computing devices, wherein the second response comprises the derived intent information and an identified media content service; based at least in part on the derived intent information, requesting, independent of the voice assistant service, media content information directly from one or more second remote computing devices hosting the identified media content service; receiving, independent of the voice assistant service, second information from the one or more second remote computing devices, wherein the second information identifies media content available via the media content service for playback; and independent of the voice assistant service, playing back the media content via the NMD.

2. The media playback system of claim 1, wherein the first information associated with the audio content comprises at least one of: a storage address, a link, a URL, or a file.

3. The media playback system of claim 1, wherein the first information associated with the audio content comprises a voice response from the voice assistant service.

4. The media playback system of claim 1, further comprising one or more third remote computing devices, wherein the receiving, independent of the voice assistant service, the second information from the one or more third remote computing devices comprises receiving the second information via the one or more third remote computing devices.

5. The media playback system of claim 4, wherein the operations further comprise, after receiving the second information, (i) transmitting a uniform resource identifier (URI) or uniform resource locator (URL) associated with the media content from the one or more third remote computing devices of the media playback system to the NMD, and (ii) requesting, via the NMD, the media content, via the URI or URL, from the one or more third remote computing devices of the media content service for playback.

6. The media playback system of claim 4, wherein the requesting, via the media playback system and independent of the voice assistant service, media content information from one or more second remote computing devices hosting the identified media content service comprises transmitting a request from the one or more third remote computing devices of the media playback system to the one or more second remote computing devices hosting the identified media content service.

7. The media playback system of claim 1, wherein the derived intent information comprises a predefined data structure including one or more media content attributes, and wherein requesting media content information from the media content service comprises querying the media content service for media corresponding to the media content attributes.

8. A method performed by a media playback system comprising a network microphone device (NMD), the method comprising: capturing a first voice input via one or more microphones of the NMD, wherein the first voice input comprises a user request; transmitting the first voice input to one or more first remote computing devices associated with a voice assistant service for deriving intent information regarding the request based at least on the first voice input; receiving a first response from the one or more first remote computing devices, the first response comprising first information associated with audio content; outputting an audio response via one or more audio transducers of the NMD based on the first response; capturing a second voice input via the NMD, wherein the second voice input comprises a request for media content; transmitting the second voice input to the one or more first remote computing devices for deriving intent information regarding the request for media content based at least on the second voice input; receiving a second response from the one or more first remote computing devices, wherein the second response comprises the derived intent information and an identified media content service; based at least in part on the derived intent information, requesting, independent of the voice assistant service, media content information directly from one or more second remote computing devices hosting the identified media content service; receiving, independent of the voice assistant service, second information from the one or more second remote computing devices, wherein the second information identifies media content available via the media content service for playback; and independent of the voice assistant service, playing back the media content via the NMD.

9. The method of claim 8, wherein the first information associated with the audio content comprises at least one of: a storage address, a link, a URL, or a file.

10. The method of claim 8, wherein the first information associated with the audio content comprises a voice response from the voice assistant service.

11. The method of claim 8, wherein the media playback system further comprises one or more third remote computing devices, and wherein the receiving, independent of the voice assistant service, the second information from the one or more third remote computing devices comprises receiving the second information via the one or more third remote computing devices.

12. The method of claim 11, further comprising, after receiving the second information, (i) transmitting a uniform resource identifier (URI) or uniform resource locator (URL) associated with the media content from the one or more third remote computing devices of the media playback system to the NMD, and (ii) requesting, via the NMD, the media content, via the URI or URL, from the one or more third remote computing devices of the media content service for playback.

13. The method of claim 11, wherein the requesting, via the media playback system and independent of the voice assistant service, media content information from one or more second remote computing devices hosting the identified media content service comprises transmitting a request from the one or more third remote computing devices of the media playback system to the one or more second remote computing devices hosting the identified media content service.

14. The method of claim 8, wherein the derived intent information comprises a predefined data structure including one or more media content attributes, and wherein requesting media content informatio
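Claims 7 and 14 describe the derived intent information as a predefined data structure of media content attributes that is then used to query the identified media content service. A minimal sketch of that idea follows. The claim text shown here does not define the structure's fields or the query interface, so the `Intent` class, the attribute keys, the `query_service` function, and the in-memory `CATALOGS` standing in for the remote computing devices are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Intent:
    """Hypothetical shape of the derived intent information."""
    service: str  # the identified media content service
    attributes: Dict[str, str] = field(default_factory=dict)


# In-memory stand-in for the remote devices hosting each service.
# A real system would issue a network request here instead.
CATALOGS: Dict[str, List[Dict[str, str]]] = {
    "service_a": [
        {"artist": "Artist X", "title": "Song One", "uri": "https://a.example/1"},
        {"artist": "Artist X", "title": "Song Two", "uri": "https://a.example/2"},
        {"artist": "Artist Y", "title": "Song Three", "uri": "https://a.example/3"},
    ],
}


def query_service(intent: Intent, catalogs: Dict[str, List[Dict[str, str]]]) -> List[Dict[str, str]]:
    """Return items from the identified service matching every attribute.

    The lookup goes straight to the service's catalog and does not pass
    back through the voice assistant, mirroring the "independent of the
    voice assistant service" language of the claims.
    """
    items = catalogs.get(intent.service, [])
    return [
        item
        for item in items
        if all(item.get(key) == value for key, value in intent.attributes.items())
    ]


intent = Intent(service="service_a", attributes={"artist": "Artist X"})
matches = query_service(intent, CATALOGS)
for item in matches:
    print(item["title"], item["uri"])
```

Exact-match filtering is the simplest possible query; a real service would likely support fuzzy matching and richer attribute types.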

Assignees

Sonos Inc

Inventors

Classifications

  • Procedures used during a speech recognition process, e.g. man-machine dialogue · CPC title

  • Announcement of recognition results · CPC title

  • Distributed recognition, e.g. in client-server systems, for mobile phones or network applications · CPC title

  • Execution procedure of a spoken command · CPC title

  • G06F3/167 (Primary)

    Audio in a user interface, e.g. using voice commands for navigating, audio feedback · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12360734B2 cover?
Systems and methods for media playback via a media playback system include (i) capturing a voice input comprising a request for media content, (ii) receiving information derived at least from the request for media content, (iii) requesting and receiving information from at least one remote computing device associated with a first media content service and at least one remote computing device as…
Who is the assignee on this patent?
Sonos Inc
What technology area does this patent fall under?
Primary CPC classification G06F3/167. Mapped technology areas include Physics.
When was this patent published?
Publication date Jul 15, 2025 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).