Initializing non-assistant background actions, via an automated assistant, while accessing a non-assistant application

US12437764B2 · US · B2

Patent metadata
Field: Value
Publication number: US-12437764-B2
Application number: US-202418439411-A
Country: US
Kind code: B2
Filing date: Feb 12, 2024
Priority date: May 6, 2019
Publication date: Oct 7, 2025
Grant date: Oct 7, 2025

Abstract

Official abstract text for this publication.

Implementations set forth herein relate to a system that employs an automated assistant to further interactions between a user and another application, which can provide the automated assistant with permission to initialize relevant application actions simultaneous to the user interacting with the other application. Furthermore, the system can allow the automated assistant to initialize actions of different applications, despite being actively operating a particular application. Available actions can be gleaned by the automated assistant using various application-specific schemas, which can be compared with incoming requests from a user to the automated assistant. Additional data, such as context and historical interactions, can also be used to rank and identify a suitable application action to be initialized via the automated assistant.
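The ranking idea in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the patented method: the `Action` record, the word-overlap score, and the bonus weights for the foreground application and for past use are all hypothetical stand-ins for whatever matching and ranking the actual system uses.

```python
from dataclasses import dataclass


@dataclass
class Action:
    app: str          # application that exposes the action (hypothetical field)
    name: str         # action identifier (hypothetical field)
    description: str  # natural language description taken from the app's schema


def tokenize(text):
    """Lowercase the text and split it into a set of words without punctuation."""
    return {w.strip(".,!?").lower() for w in text.split() if w.strip(".,!?")}


def score(utterance, action, foreground_app, history):
    """Word overlap with the schema description, boosted by context and history.

    The weights 0.5 and 0.25 are arbitrary assumptions for this sketch.
    """
    overlap = len(tokenize(utterance) & tokenize(action.description))
    if overlap == 0:
        return 0.0
    total = float(overlap)
    if action.app == foreground_app:
        total += 0.5  # context: the user is currently in this application
    total += 0.25 * history.get((action.app, action.name), 0)  # past invocations
    return total


def rank_actions(utterance, actions, foreground_app=None, history=None):
    """Return the actions that match the utterance, best candidate first."""
    history = history or {}
    scored = [(score(utterance, a, foreground_app, history), a) for a in actions]
    matches = [pair for pair in scored if pair[0] > 0]
    return [a for _, a in sorted(matches, key=lambda pair: pair[0], reverse=True)]


actions = [
    Action("thermostat", "set_temperature", "set the temperature of the home"),
    Action("music", "play_song", "play a song or playlist"),
    Action("alarm", "set_alarm", "set an alarm"),
]

# The thermostat app is in the foreground, yet the request maps to the music app.
best = rank_actions("play my workout playlist", actions, foreground_app="thermostat")
print(best[0].app, best[0].name)
```

The example request names no application, and the top-ranked action belongs to an application other than the one in the foreground, which mirrors the cross-application case the abstract describes.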

First claim

Opening claim text (preview).

We claim:

1. A system comprising: memory storing instructions; one or more processors operable to execute the instructions to: determine that a user has provided, at a computing device, a spoken utterance that is directed to an automated assistant but does not explicitly identify any application that is accessible via the computing device, wherein the spoken utterance is received at an automated assistant interface of the computing device, and wherein the automated assistant is a separate application from an application; access, based on determining that the user has provided the spoken utterance that is directed to the automated assistant, application data characterizing multiple different actions capable of being performed by the application; determine, based on the application data, a correlation between content of the spoken utterance provided by the user and the application data; in response to determining the correlation between the content of the spoken utterance provided by the user and the application data: select, based on the content of the spoken utterance, an action from the multiple different actions characterized by the application data; and cause the application to perform the selected action.

2. The system of claim 1, wherein the application data includes a schema file that includes action natural language description that characterizes the multiple different actions capable of being performed by the application.

3. The system of claim 2, wherein the schema file further includes features natural language description that characterizes an output modality for the action.

4. The system of claim 2, wherein the schema file further includes features natural language description that characterizes an input modality for the action.

5. The system of claim 2, wherein the schema file further includes features natural language description that characterizes an input modality for the action and an output modality for the action.

6. The system of claim 1, wherein one or more of the processors are further operable to execute the instructions to: access additional application data that characterizes other actions capable of being performed via an additional application that is separately accessible from the automated assistant and the application; wherein in determining the correlation between the content of the spoken utterance provided by the user and the application data one or more of the processors are to determine the correlation further based on the additional application data.

7. The system of claim 1, wherein the application data identifies one or more contextual actions of the multiple different actions based on one or more features of a current application status of the application when the user provided the spoken utterance.

8. The system of claim 7, wherein the one or more contextual actions are identified by the application and the one or more features characterize a graphical user interface of the application rendered when the user provided the spoken utterance.

9. The system of claim 7, wherein the one or more contextual actions are identified by the application based on a status of an ongoing action that is being performed at the computing device when the user provided the spoken utterance.

10. The system of claim 1, wherein in determining that the user has provided the spoken utterance that is directed to the automated assistant but does not explicitly identify any application that is accessible via the computing device one or more of the processors are to: perform speech-to-text processing of audio data, that embodies the spoken utterance provided by the user, to generate the content of the spoken utterance.

11. The system of claim 1, wherein one or more of the processors are of the computing device.

12. A method implemented by one or more processors, the method comprising: determining that a user has provided a spoken utterance, wherein the spoken utterance includes natural language content that does not explicitly identify any application that is accessible via the computing device, and wherein the spoken utterance is received at an automated assistant interface of an automated assistant of a computing device; accessing, based on determining that the user has provided the spoken utterance, application data that includes a schema file that includes action natural language description of multiple different actions capable of being performed by an application, the application being in addition to the automated assistant; determining, based on processing the natural language content and the action natural language description of the schema file: that the spoken utterance is directed to the application, and an action from the multiple different actions capable of being performed by the application; and in response to determining that the spoken utterance is directed to the application and determining the action: causing the application to perform the action.

13. The method of claim 12, wherein the schema file further includes features natural language description that characterizes an output modality for the action.

14. The method of claim 12, wherein the schema file further includes features natural language description that characterizes an input modality for the action.

15. The method of claim 12, wherein the schema file further includes features natural language description that characterizes an input modality for the action and an output modality for the action.

16. The method of claim 12, wherein the application data identifies one or more contextual actions of the multiple different actions based on one or more features of a current application status of the application when the user provided the spoken utterance.

17. The method of claim 16, wherein the one or more contextual actions are identified by the application and the one or more features characterize a graphical user interface of the application rendered when the user provided the spoken utterance.

18. The method of claim 16, wherein the one or more contextual actions are identified by the application based on a status of an ongoing action that is being performed at the computing device when the user provided the spoken utterance.
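To make the claimed flow more concrete, here is a minimal Python sketch of a schema file and of the select-then-perform steps. Everything specific in it is an assumption for illustration: the claims do not define a file format, so the JSON layout, the field names (`description`, `input_modality`, `output_modality`, `available_when`), the stopword list, and the word-overlap "correlation" are hypothetical. The `available_when` field stands in for the contextual actions tied to the application's current status.

```python
import json

# Hypothetical schema file; the field names are not taken from the patent.
SCHEMA_FILE = """
{
  "application": "example_music_app",
  "actions": [
    {"name": "pause_playback",
     "description": "pause the song that is playing",
     "input_modality": "spoken or touch input",
     "output_modality": "audio output stops",
     "available_when": ["playing"]},
    {"name": "resume_playback",
     "description": "resume the paused song",
     "input_modality": "spoken or touch input",
     "output_modality": "audio output resumes",
     "available_when": ["paused"]},
    {"name": "next_track",
     "description": "skip to the next song",
     "input_modality": "spoken or touch input",
     "output_modality": "audio output changes",
     "available_when": ["playing", "paused"]}
  ]
}
"""

STOPWORDS = {"a", "is", "of", "that", "the", "to"}


def words(text):
    """Lowercased content words of a phrase, without punctuation or stopwords."""
    return {w.strip(".,!?").lower() for w in text.split()} - STOPWORDS - {""}


def select_action(utterance, schema, app_status):
    """Pick the contextual action whose description best overlaps the utterance."""
    spoken = words(utterance)
    best_name, best_overlap = None, 0
    for action in schema["actions"]:
        if app_status not in action["available_when"]:
            continue  # not a contextual action for the current application status
        overlap = len(spoken & words(action["description"]))
        if overlap > best_overlap:
            best_name, best_overlap = action["name"], overlap
    return best_name  # None means no correlation was found


def handle_utterance(utterance, schema, app_status, perform):
    """Select an action and, if one correlates, ask the application to perform it."""
    name = select_action(utterance, schema, app_status)
    if name is not None:
        perform(schema["application"], name)
    return name


schema = json.loads(SCHEMA_FILE)
handle_utterance("pause the song", schema, "playing",
                 perform=lambda app, action: print(f"{app} -> {action}"))
```

The `perform` callback is a placeholder for whatever mechanism actually causes the application to carry out the action; the sketch only shows that the utterance itself never has to name the application.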

Assignees

Inventors

Classifications

  • Execution procedure of a spoken command · CPC title

  • G10L15/22 · Primary

    Procedures used during a speech recognition process, e.g. man-machine dialogue · CPC title

  • G06F3/167 · Primary

    Audio in a user interface, e.g. using voice commands for navigating, audio feedback · CPC title

  • of application context · CPC title

  • G10L15/26 · Primary

    Speech to text systems (G10L15/08 takes precedence) · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12437764B2 cover?
Implementations set forth herein relate to a system that employs an automated assistant to further interactions between a user and another application, which can provide the automated assistant with permission to initialize relevant application actions simultaneous to the user interacting with the other application. Furthermore, the system can allow the automated assistant to initialize actions…
Who is the assignee on this patent?
Google LLC
What technology area does this patent fall under?
Primary CPC classification G10L15/22. Mapped technology areas include Physics.
When was this patent published?
Publication date Oct 7, 2025 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).