Orchestrating execution of a series of actions requested to be performed via an automated assistant
US-11031007-B2 · Jun 8, 2021 · US
US12437764B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12437764-B2 |
| Application number | US-202418439411-A |
| Country | US |
| Kind code | B2 |
| Filing date | Feb 12, 2024 |
| Priority date | May 6, 2019 |
| Publication date | Oct 7, 2025 |
| Grant date | Oct 7, 2025 |
Official abstract text for this publication.
Implementations set forth herein relate to a system that employs an automated assistant to further interactions between a user and another application, which can provide the automated assistant with permission to initialize relevant application actions simultaneous to the user interacting with the other application. Furthermore, the system can allow the automated assistant to initialize actions of different applications, despite being actively operating a particular application. Available actions can be gleaned by the automated assistant using various application-specific schemas, which can be compared with incoming requests from a user to the automated assistant. Additional data, such as context and historical interactions, can also be used to rank and identify a suitable application action to be initialized via the automated assistant.
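The matching step described in the abstract can be illustrated with a minimal sketch. It assumes each application publishes a list of actions with natural language descriptions, and it scores an incoming request by word overlap plus small bonuses for the foreground application and for past usage. The `Action` type, the `rank_actions` function, the bonus weights, and the sample schemas are all hypothetical; the patent does not prescribe a scoring method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    app: str          # application that exposes the action (hypothetical field)
    name: str         # identifier of the action (hypothetical field)
    description: str  # natural language description taken from the app's schema


def tokens(text):
    """Lowercase a string and split it into a set of words without punctuation."""
    return {w.strip(".,!?").lower() for w in text.split() if w.strip(".,!?")}


def score(utterance, action, foreground_app, history):
    """Jaccard word overlap plus assumed context and history bonuses."""
    u, d = tokens(utterance), tokens(action.description)
    overlap = len(u & d) / len(u | d) if (u | d) else 0.0
    context_bonus = 0.1 if action.app == foreground_app else 0.0
    history_bonus = 0.05 * history.get((action.app, action.name), 0)
    return overlap + context_bonus + history_bonus


def rank_actions(utterance, schemas, foreground_app=None, history=None):
    """Rank every action from every application schema, best match first."""
    history = history or {}
    candidates = [a for actions in schemas.values() for a in actions]
    return sorted(
        candidates,
        key=lambda a: score(utterance, a, foreground_app, history),
        reverse=True,
    )


# Illustrative schemas for two made-up applications.
SCHEMAS = {
    "music": [
        Action("music", "pause", "pause the current song"),
        Action("music", "next", "skip to the next song"),
    ],
    "thermostat": [
        Action("thermostat", "set_temp", "set the temperature of the house"),
    ],
}

if __name__ == "__main__":
    best = rank_actions("set the temperature", SCHEMAS, foreground_app="music")[0]
    print(best.app, best.name)
```

In this toy setup a request such as "set the temperature" still resolves to the thermostat action while the music application is in the foreground, because the description overlap outweighs the small foreground bonus. A real system would likely replace word overlap with a learned language model.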
Opening claim text (preview).
We claim:
1. A system comprising: memory storing instructions; one or more processors operable to execute the instructions to: determine that a user has provided, at a computing device, a spoken utterance that is directed to an automated assistant but does not explicitly identify any application that is accessible via the computing device, wherein the spoken utterance is received at an automated assistant interface of the computing device, and wherein the automated assistant is a separate application from an application; access, based on determining that the user has provided the spoken utterance that is directed to the automated assistant, application data characterizing multiple different actions capable of being performed by the application; determine, based on the application data, a correlation between content of the spoken utterance provided by the user and the application data; in response to determining the correlation between the content of the spoken utterance provided by the user and the application data: select, based on the content of the spoken utterance, an action from the multiple different actions characterized by the application data; and cause the application to perform the selected action.
2. The system of claim 1, wherein the application data includes a schema file that includes action natural language description that characterizes the multiple different actions capable of being performed by the application.
3. The system of claim 2, wherein the schema file further includes features natural language description that characterizes an output modality for the action.
4. The system of claim 2, wherein the schema file further includes features natural language description that characterizes an input modality for the action.
5. The system of claim 2, wherein the schema file further includes features natural language description that characterizes an input modality for the action and an output modality for the action.
6. The system of claim 1, wherein one or more of the processors are further operable to execute the instructions to: access additional application data that characterizes other actions capable of being performed via an additional application that is separately accessible from the automated assistant and the application; wherein in determining the correlation between the content of the spoken utterance provided by the user and the application data one or more of the processors are to determine the correlation further based on the additional application data.
7. The system of claim 1, wherein the application data identifies one or more contextual actions of the multiple different actions based on one or more features of a current application status of the application when the user provided the spoken utterance.
8. The system of claim 7, wherein the one or more contextual actions are identified by the application and the one or more features characterize a graphical user interface of the application rendered when the user provided the spoken utterance.
9. The system of claim 7, wherein the one or more contextual actions are identified by the application based on a status of an ongoing action that is being performed at the computing device when the user provided the spoken utterance.
10. The system of claim 1, wherein in determining that the user has provided the spoken utterance that is directed to the automated assistant but does not explicitly identify any application that is accessible via the computing device one or more of the processors are to: perform speech-to-text processing of audio data, that embodies the spoken utterance provided by the user, to generate the content of the spoken utterance.
11. The system of claim 1, wherein one or more of the processors are of the computing device.
12. A method implemented by one or more processors, the method comprising: determining that a user has provided a spoken utterance, wherein the spoken utterance includes natural language content that does not explicitly identify any application that is accessible via the computing device, and wherein the spoken utterance is received at an automated assistant interface of an automated assistant of a computing device; accessing, based on determining that the user has provided the spoken utterance, application data that includes a schema file that includes action natural language description of multiple different actions capable of being performed by an application, the application being in addition to the automated assistant; determining, based on processing the natural language content and the action natural language description of the schema file: that the spoken utterance is directed to the application, and an action from the multiple different actions capable of being performed by the application; and in response to determining that the spoken utterance is directed to the application and determining the action: causing the application to perform the action.
13. The method of claim 12, wherein the schema file further includes features natural language description that characterizes an output modality for the action.
14. The method of claim 12, wherein the schema file further includes features natural language description that characterizes an input modality for the action.
15. The method of claim 12, wherein the schema file further includes features natural language description that characterizes an input modality for the action and an output modality for the action.
16. The method of claim 12, wherein the application data identifies one or more contextual actions of the multiple different actions based on one or more features of a current application status of the application when the user provided the spoken utterance.
17. The method of claim 16, wherein the one or more contextual actions are identified by the application and the one or more features characterize a graphical user interface of the application rendered when the user provided the spoken utterance.
18. The method of claim 16, wherein the one or more contextual actions are identified by the application based on a status of an ongoing action that is being performed at the computing device when the user provided the spoken utterance.
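As a rough illustration of the claimed flow (access a schema file, correlate the utterance with the action descriptions, select an action, cause the application to perform it), the following sketch uses an invented JSON layout. The field names (`description`, `input_modality`, `output_modality`, `available_when`), the stopword list, the 0.5 threshold, and the handler table are assumptions made for this example only; the claims do not specify a file format or a correlation measure.

```python
import json

# Hypothetical schema file for one application. "available_when" stands in
# for the contextual actions tied to the current application status.
SCHEMA_JSON = """
{
  "application": "media_player",
  "actions": [
    {"name": "pause", "description": "pause the media that is playing",
     "input_modality": "none", "output_modality": "audio",
     "available_when": ["playing"]},
    {"name": "resume", "description": "resume the paused media",
     "input_modality": "none", "output_modality": "audio",
     "available_when": ["paused"]},
    {"name": "search", "description": "search for a song or artist",
     "input_modality": "text", "output_modality": "display",
     "available_when": ["playing", "paused"]}
  ]
}
"""

# Toy stopword list so that filler words do not count as a correlation.
STOPWORDS = {"the", "a", "is", "that", "for", "or"}


def words(text):
    """Lowercased content words of a string."""
    return set(text.lower().split()) - STOPWORDS


def select_action(utterance, schema, app_status, threshold=0.5):
    """Return the best contextual action name, or None if nothing correlates."""
    uttered = words(utterance)
    if not uttered:
        return None
    best_name, best_score = None, 0.0
    for action in schema["actions"]:
        if app_status not in action["available_when"]:
            continue  # not a contextual action for the current status
        shared = uttered & words(action["description"])
        score = len(shared) / len(uttered)
        if score > best_score:
            best_name, best_score = action["name"], score
    return best_name if best_score >= threshold else None


def handle_utterance(utterance, schema, app_status, handlers):
    """Select an action and invoke the application's handler for it."""
    name = select_action(utterance, schema, app_status)
    if name is None:
        return None
    return handlers[name]()


if __name__ == "__main__":
    schema = json.loads(SCHEMA_JSON)
    handlers = {"pause": lambda: "paused", "resume": lambda: "resumed",
                "search": lambda: "searching"}
    print(handle_utterance("pause the music", schema, "playing", handlers))
```

Note that the utterance never names the application; the selection depends only on the schema descriptions and the current status, which is the situation claims 1 and 12 address. Filtering by status before scoring is one possible reading of the contextual actions in claims 7 and 16, not the only one.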
Execution procedure of a spoken command · CPC title
Procedures used during a speech recognition process, e.g. man-machine dialogue · CPC title
Audio in a user interface, e.g. using voice commands for navigating, audio feedback · CPC title
of application context · CPC title
Speech to text systems (G10L15/08 takes precedence) · CPC title