Media system with multiple digital assistants

US11062702B2 · US · B2

Patent metadata
Field | Value
Publication number | US-11062702-B2
Application number | US-201816032724-A
Country | US
Kind code | B2
Filing date | Jul 11, 2018
Priority date | Aug 28, 2017
Publication date | Jul 13, 2021
Grant date | Jul 13, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Disclosed herein are system, apparatus, article of manufacture, method and/or computer program product embodiments, and/or combinations and sub-combinations thereof, for providing voice control using multiple digital assistants. In some embodiments, a voice platform operates to receive a voice input from a user. The voice platform selects a digital assistant from a plurality of digital assistants based on a trigger word. The voice platform then generates an intent from the voice input using the selected digital assistant. The voice platform then transmits the intent to a media device for processing.
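
The flow in the abstract (select an assistant by trigger word, generate an intent, transmit it) can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the trigger words, assistant names, the `Intent` fields, and the keyword-based intent logic are all hypothetical, and the voice input is assumed to be already transcribed to text.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class Intent:
    kind: str       # hypothetical intent type, e.g. "play_media"
    query: str      # the request with the trigger word removed
    assistant: str  # assistant that produced the intent


# Hypothetical trigger-word table; names are illustrative only.
TRIGGER_WORDS: Dict[str, str] = {
    "hey alpha": "alpha",
    "ok beta": "beta",
}


def select_assistant(text: str) -> Optional[str]:
    """Return the assistant mapped to the trigger word that starts the input."""
    lowered = text.lower().strip()
    for trigger, assistant in TRIGGER_WORDS.items():
        if lowered.startswith(trigger):
            return assistant
    return None


def generate_intent(text: str, assistant: str) -> Intent:
    """Toy stand-in for the selected assistant's language understanding."""
    trigger = next(t for t, a in TRIGGER_WORDS.items() if a == assistant)
    request = text.lower().strip()[len(trigger):].strip(" ,")
    kind = "play_media" if request.startswith("play") else "search"
    return Intent(kind=kind, query=request, assistant=assistant)


def handle_voice_input(
    text: str, transmit: Callable[[Intent], None]
) -> Optional[Intent]:
    """Select an assistant, build an intent, and hand it to the media device."""
    assistant = select_assistant(text)
    if assistant is None:
        return None  # no known trigger word in the input
    intent = generate_intent(text, assistant)
    transmit(intent)  # stands in for sending the intent over the network
    return intent
```

For example, `handle_voice_input("Hey Alpha, play jazz", print)` would select the hypothetical `alpha` assistant and pass a `play_media` intent to `print` in place of a real media device.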

First claim

Opening claim text (preview).

What is claimed is:

1. A computer implemented method for providing voice control using multiple digital assistants, comprising: receiving, over a network by at least one processor in a voice platform, a voice input from a media device; selecting, by the at least one processor, a first digital assistant from a plurality of digital assistants to process the voice input using a trigger word in the voice input, wherein the selected first digital assistant is mapped to the trigger word; generating, by the at least one processor, an intent from the voice input using the selected first digital assistant; determining, by the at least one processor, that a second digital assistant from the plurality of digital assistants is unmapped to the trigger word, and that the second digital assistant processes a type of the intent from the voice input more often than the selected first digital assistant; selecting, by the at least one processor, the second digital assistant from the plurality of digital assistants to process the voice input based on the determining; and transmitting, over the network by the at least one processor, the intent to a voice adaptor at the media device, wherein the voice adaptor routes the intent to an application at the media device for processing.

2. The method of claim 1, wherein the voice adaptor selects the application to process the intent based on a fixed rule, default application setting, a search result, or metadata in the intent.

3. The method of claim 1, further comprising: determining the trigger word in the voice input.

4. The method of claim 1, further comprising: refining the intent based on information in a cloud computing platform.

5. The method of claim 1, wherein the generating the intent further comprises: converting the voice input into a text input using an automated speech recognizer associated with the selected first digital assistant.

6. The method of claim 5, wherein the generating the intent further comprises: generating the intent from the text input using a natural language unit associated with the selected first digital assistant.

7. The method of claim 6, wherein the generating the intent further comprises: converting the intent from a first intent format associated with the selected first digital assistant to a second intent format associated with the media device.

8. A voice platform, comprising: a memory; and a processor coupled to the memory and configured to: receive, over a network, a voice input from a media device; select a first digital assistant from a plurality of digital assistants to process the voice input using a trigger word in the voice input, wherein the selected first digital assistant is mapped to the trigger word; generate an intent from the voice input using the selected first digital assistant; determine that a second digital assistant from the plurality of digital assistants is unmapped to the trigger word, and that the second digital assistant processes a type of the intent from the voice input more often than the selected first digital assistant; select the second digital assistant from the plurality of digital assistants to process the voice input based on the determining; and transmit, over the network, the intent to a voice adaptor at the media device, wherein the voice adaptor routes the intent to an application at the media device for processing.

9. The voice platform of claim 8, wherein the voice adaptor selects the application to process the intent based on a fixed rule, default application setting, a search result, or metadata in the intent.

10. The voice platform of claim 8, wherein the processor is further configured to: determine the trigger word in the voice input.

11. The voice platform of claim 8, wherein the processor is further configured to: refine the intent based on information in a cloud computing platform.

12. The voice platform of claim 8, wherein to generate the intent the processor is further configured to: convert the voice input into a text input using an automated speech recognizer associated with the selected first digital assistant.

13. The voice platform of claim 12, wherein to generate the intent the processor is further configured to: generate the intent from the text input using a natural language unit associated with the selected first digital assistant.

14. The voice platform of claim 13, wherein to generate the intent the processor is further configured to: convert the intent from a first intent format associated with the selected first digital assistant to a second intent format associated with the media device.

15. A non-transitory computer-readable medium having instructions stored thereon that, when executed by at least one computing device, cause the at least one computing device to perform operations comprising: transmitting, over a network, a voice input to a voice platform, wherein the voice platform selects a first digital assistant from a plurality of digital assistants to process the voice input using a trigger word in the voice input, generates an intent from the voice input using the selected first digital assistant, determines that a second digital assistant from the plurality of digital assistants is unmapped to the trigger word, and that the second digital assistant processes a type of the intent from the voice input more often than the selected first digital assistant, and selects the second digital assistant from the plurality of digital assistants to process the voice input based on the determining; and receiving, over the network, the intent at a voice adaptor, wherein the voice adaptor routes the intent to an application for processing.

16. The non-transitory computer-readable medium of claim 15, wherein the voice adaptor selects the application to process the intent based on a fixed rule, default application setting, a search result, or metadata in the intent.

17. The non-transitory computer-readable medium of claim 15, wherein the voice platform refines the intent based on information in a cloud computing platform.

18. The non-transitory computer-readable medium of claim 15, wherein the voice platform converts the voice input into a text input using an automated speech recognizer associated with the selected first digital assistant.

19. The non-transitory computer-readable medium of claim 18, wherein the voice platform generates the intent from the text input using a natural language unit associated with the selected first digital assistant.

20. The non-transitory computer-readable device medium of claim 19, wherein the voice platform converts the intent from a first intent format associated with the selected first digital assistant to a second intent format associated with the at least one computing device.

21. The method of claim 1, further comprising: determining the trigger word in the voice input in response to the media device determining the trigger word is present in the voice input below a confidence threshold value.

22. The method of claim 1, wherein the determining further comprises: determining that the second digital assistant processes the type of the intent from the voice input more often than the selected first digital assistant based on crowdsourced data, wherein the crowdsource data indicates how often each digital assistant from the plurality of digital assistants is used to process the type of the intent.

23. The method of claim 22, further comprising
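
Two steps in the claims lend themselves to a short sketch: switching to a second assistant that handles the intent type more often (claims 1 and 22), and the voice adaptor choosing an application (claim 2). The code below is an illustration under stated assumptions, not the claimed implementation. The usage counts, assistant names, application names, and the intent dictionary layout are hypothetical. The claims list the routing criteria as alternatives without a priority, so the order used here (metadata, then fixed rule, then default) is an assumption, and the search-result criterion is omitted.

```python
from collections import Counter
from typing import Dict, Optional

# Hypothetical crowdsourced usage data: intent type -> assistant -> count.
USAGE: Dict[str, Counter] = {
    "play_media": Counter({"alpha": 120, "beta": 900}),
    "weather": Counter({"alpha": 400, "beta": 50}),
}


def reselect_assistant(
    intent_type: str, first: str, usage: Dict[str, Counter] = USAGE
) -> str:
    """Return another assistant if it handles this intent type more often.

    `first` is the assistant mapped to the trigger word; any other assistant
    is treated as unmapped to that trigger word (an assumption of this sketch).
    """
    counts = usage.get(intent_type)
    if not counts:
        return first  # no usage data: keep the trigger-word choice
    best, best_count = counts.most_common(1)[0]
    if best != first and best_count > counts[first]:
        return best
    return first


def route_intent(
    intent: dict, fixed_rules: Dict[str, str], default_app: Optional[str]
) -> Optional[str]:
    """Pick an application for an intent, as a voice adaptor might."""
    # 1. Metadata in the intent names an application (assumed "app" key).
    app = intent.get("metadata", {}).get("app")
    if app:
        return app
    # 2. A fixed rule maps the intent type to an application.
    if intent["type"] in fixed_rules:
        return fixed_rules[intent["type"]]
    # 3. Fall back to the default application setting.
    return default_app
```

With the sample counts, `reselect_assistant("play_media", "alpha")` switches to `beta`, while `reselect_assistant("weather", "alpha")` keeps `alpha` because no other assistant handles that type more often.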

Assignees

Roku Inc

Inventors

Classifications

  • Word spotting · CPC title

  • G10L15/22 · Primary

    Procedures used during a speech recognition process, e.g. man-machine dialogue · CPC title

  • Execution procedure of a spoken command · CPC title

  • Audio in a user interface, e.g. using voice commands for navigating, audio feedback · CPC title

  • based on the content of a request · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11062702B2 cover?
Disclosed herein are system, apparatus, article of manufacture, method and/or computer program product embodiments, and/or combinations and sub-combinations thereof, for providing voice control using multiple digital assistants. In some embodiments, a voice platform operates to receive a voice input from a user. The voice platform selects a digital assistant from a plurality of digital assistan…
Who is the assignee on this patent?
Roku Inc
What technology area does this patent fall under?
Primary CPC classification G10L15/22. Mapped technology areas include Physics.
When was this patent published?
Publication date Jul 13, 2021 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).