What technology area does this patent fall under?

Primary CPC classification G10L15/22. Mapped technology areas include Physics.

When was this patent published?

Publication date Jul 13, 2021 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Media system with multiple digital assistants

US11062702B2 · US · B2

Patent metadata
Field	Value
Publication number	US-11062702-B2
Application number	US-201816032724-A
Country	US
Kind code	B2
Filing date	Jul 11, 2018
Priority date	Aug 28, 2017
Publication date	Jul 13, 2021
Grant date	Jul 13, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Disclosed herein are system, apparatus, article of manufacture, method and/or computer program product embodiments, and/or combinations and sub-combinations thereof, for providing voice control using multiple digital assistants. In some embodiments, a voice platform operates to receive a voice input from a user. The voice platform selects a digital assistant from a plurality of digital assistants based on a trigger word. The voice platform then generates an intent from the voice input using the selected digital assistant. The voice platform then transmits the intent to a media device for processing.
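The flow in the abstract (pick an assistant by trigger word, then have that assistant produce an intent) can be sketched as follows. This is only an illustrative sketch, not the patented implementation: the trigger phrases, the `Intent` fields, the two stub assistants, and the function `handle_voice_input` are all hypothetical, and the voice input is assumed to have already been transcribed to text.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class Intent:
    type: str       # hypothetical intent category, e.g. "play_media"
    query: str      # utterance text with the trigger word removed
    assistant: str  # name of the assistant that produced the intent


# Hypothetical stand-ins for real digital assistants.
def assistant_a(text: str) -> Intent:
    return Intent(type="play_media", query=text, assistant="assistant_a")


def assistant_b(text: str) -> Intent:
    return Intent(type="search", query=text, assistant="assistant_b")


# Assumed mapping of trigger words to assistants (illustrative names only).
TRIGGER_MAP: Dict[str, Callable[[str], Intent]] = {
    "hey alpha": assistant_a,
    "ok beta": assistant_b,
}


def handle_voice_input(transcript: str) -> Optional[Intent]:
    """Select the assistant mapped to the trigger word and generate an intent."""
    lowered = transcript.lower().strip()
    for trigger, assistant in TRIGGER_MAP.items():
        if lowered.startswith(trigger):
            remainder = lowered[len(trigger):].strip(" ,")
            return assistant(remainder)
    return None  # no known trigger word at the start of the input


intent = handle_voice_input("Hey Alpha, play jazz")
```

In a real system the resulting intent would then be sent to the media device; that transport step is omitted here.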

First claim

Opening claim text (preview).

What is claimed is:
1. A computer implemented method for providing voice control using multiple digital assistants, comprising: receiving, over a network by at least one processor in a voice platform, a voice input from a media device; selecting, by the at least one processor, a first digital assistant from a plurality of digital assistants to process the voice input using a trigger word in the voice input, wherein the selected first digital assistant is mapped to the trigger word; generating, by the at least one processor, an intent from the voice input using the selected first digital assistant; determining, by the at least one processor, that a second digital assistant from the plurality of digital assistants is unmapped to the trigger word, and that the second digital assistant processes a type of the intent from the voice input more often than the selected first digital assistant; selecting, by the at least one processor, the second digital assistant from the plurality of digital assistants to process the voice input based on the determining; and transmitting, over the network by the at least one processor, the intent to a voice adaptor at the media device, wherein the voice adaptor routes the intent to an application at the media device for processing.
2. The method of claim 1, wherein the voice adaptor selects the application to process the intent based on a fixed rule, default application setting, a search result, or metadata in the intent.
3. The method of claim 1, further comprising: determining the trigger word in the voice input.
4. The method of claim 1, further comprising: refining the intent based on information in a cloud computing platform.
5. The method of claim 1, wherein the generating the intent further comprises: converting the voice input into a text input using an automated speech recognizer associated with the selected first digital assistant.
6. The method of claim 5, wherein the generating the intent further comprises: generating the intent from the text input using a natural language unit associated with the selected first digital assistant.
7. The method of claim 6, wherein the generating the intent further comprises: converting the intent from a first intent format associated with the selected first digital assistant to a second intent format associated with the media device.
8. A voice platform, comprising: a memory; and a processor coupled to the memory and configured to: receive, over a network, a voice input from a media device; select a first digital assistant from a plurality of digital assistants to process the voice input using a trigger word in the voice input, wherein the selected first digital assistant is mapped to the trigger word; generate an intent from the voice input using the selected first digital assistant; determine that a second digital assistant from the plurality of digital assistants is unmapped to the trigger word, and that the second digital assistant processes a type of the intent from the voice input more often than the selected first digital assistant; select the second digital assistant from the plurality of digital assistants to process the voice input based on the determining; and transmit, over the network, the intent to a voice adaptor at the media device, wherein the voice adaptor routes the intent to an application at the media device for processing.
9. The voice platform of claim 8, wherein the voice adaptor selects the application to process the intent based on a fixed rule, default application setting, a search result, or metadata in the intent.
10. The voice platform of claim 8, wherein the processor is further configured to: determine the trigger word in the voice input.
11. The voice platform of claim 8, wherein the processor is further configured to: refine the intent based on information in a cloud computing platform.
12. The voice platform of claim 8, wherein to generate the intent the processor is further configured to: convert the voice input into a text input using an automated speech recognizer associated with the selected first digital assistant.
13. The voice platform of claim 12, wherein to generate the intent the processor is further configured to: generate the intent from the text input using a natural language unit associated with the selected first digital assistant.
14. The voice platform of claim 13, wherein to generate the intent the processor is further configured to: convert the intent from a first intent format associated with the selected first digital assistant to a second intent format associated with the media device.
15. A non-transitory computer-readable medium having instructions stored thereon that, when executed by at least one computing device, cause the at least one computing device to perform operations comprising: transmitting, over a network, a voice input to a voice platform, wherein the voice platform selects a first digital assistant from a plurality of digital assistants to process the voice input using a trigger word in the voice input, generates an intent from the voice input using the selected first digital assistant, determines that a second digital assistant from the plurality of digital assistants is unmapped to the trigger word, and that the second digital assistant processes a type of the intent from the voice input more often than the selected first digital assistant, and selects the second digital assistant from the plurality of digital assistants to process the voice input based on the determining; and receiving, over the network, the intent at a voice adaptor, wherein the voice adaptor routes the intent to an application for processing.
16. The non-transitory computer-readable medium of claim 15, wherein the voice adaptor selects the application to process the intent based on a fixed rule, default application setting, a search result, or metadata in the intent.
17. The non-transitory computer-readable medium of claim 15, wherein the voice platform refines the intent based on information in a cloud computing platform.
18. The non-transitory computer-readable medium of claim 15, wherein the voice platform converts the voice input into a text input using an automated speech recognizer associated with the selected first digital assistant.
19. The non-transitory computer-readable medium of claim 18, wherein the voice platform generates the intent from the text input using a natural language unit associated with the selected first digital assistant.
20. The non-transitory computer-readable device medium of claim 19, wherein the voice platform converts the intent from a first intent format associated with the selected first digital assistant to a second intent format associated with the at least one computing device.
21. The method of claim 1, further comprising: determining the trigger word in the voice input in response to the media device determining the trigger word is present in the voice input below a confidence threshold value.
22. The method of claim 1, wherein the determining further comprises: determining that the second digital assistant processes the type of the intent from the voice input more often than the selected first digital assistant based on crowdsourced data, wherein the crowdsource data indicates how often each digital assistant from the plurality of digital assistants is used to process the type of the intent.
23. The method of claim 22, further comprising
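The re-selection step recited in claims 1 and 22 (switching to a second assistant that handles the intent type more often, according to crowdsourced usage data) might look roughly like the sketch below. The counts in `USAGE_COUNTS`, the assistant names, and the function `choose_assistant` are hypothetical illustrations. The claims do not specify a data structure or a tie-breaking rule, so a simple count comparison is assumed.

```python
from typing import Dict

# Hypothetical crowdsourced data: intent type -> assistant -> times used.
USAGE_COUNTS: Dict[str, Dict[str, int]] = {
    "play_media": {"assistant_a": 120, "assistant_b": 450},
    "weather": {"assistant_a": 300, "assistant_b": 80},
}


def choose_assistant(
    first: str,
    intent_type: str,
    usage: Dict[str, Dict[str, int]] = USAGE_COUNTS,
) -> str:
    """Return the assistant that should process the voice input.

    `first` is the assistant mapped to the trigger word. Another assistant
    replaces it only if the usage data shows that assistant handling this
    intent type strictly more often (an assumed tie-breaking rule).
    """
    counts = usage.get(intent_type, {})
    best, best_count = first, counts.get(first, 0)
    for name, count in counts.items():
        if name != first and count > best_count:
            best, best_count = name, count
    return best


selected = choose_assistant("assistant_a", "play_media")
```

Here `selected` is `"assistant_b"`, because the assumed data shows it handling `play_media` intents more often than the trigger-mapped `assistant_a`. For an intent type with no usage data, the trigger-mapped assistant is kept.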

Assignees

Roku Inc

Inventors

Classifications

G10L2015/088
Word spotting · CPC title
G10L15/22 · Primary
Procedures used during a speech recognition process, e.g. man-machine dialogue · CPC title
G10L2015/223
Execution procedure of a spoken command · CPC title
G06F3/167
Audio in a user interface, e.g. using voice commands for navigating, audio feedback · CPC title
H04L67/1014
based on the content of a request · CPC title

Patent family

Related publications grouped by family.

View patent family 65435500

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11062702B2 cover?: Disclosed herein are system, apparatus, article of manufacture, method and/or computer program product embodiments, and/or combinations and sub-combinations thereof, for providing voice control using multiple digital assistants. In some embodiments, a voice platform operates to receive a voice input from a user. The voice platform selects a digital assistant from a plurality of digital assistan…
Who is the assignee on this patent?: Roku Inc