Automated assistant control of non-assistant applications via identification of synonymous term and/or speech processing biasing

US11967321B2 · US · B2

Patent metadata
Field: Value
Publication number: US-11967321-B2
Application number: US-202117538641-A
Country: US
Kind code: B2
Filing date: Nov 30, 2021
Priority date: Oct 6, 2021
Publication date: Apr 23, 2024
Grant date: Apr 23, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Implementations set forth herein relate to an automated assistant that can interact with applications that may not have been pre-configured for interfacing with the automated assistant. The automated assistant can identify content of an application interface of the application to determine synonymous terms that a user may speak when commanding the automated assistant to perform certain tasks. Speech processing operations employed by the automated assistant can be biased towards these synonymous terms when the user is accessing an application interface of the application. In some implementations, the synonymous terms can be identified in a responsive language of the automated assistant when the content of the application interface is being rendered in a different language. This can allow the automated assistant to operate as an interface between the user and certain applications that may not be rendering content in a native language of the user.
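
The biasing step described in the abstract can be illustrated with a minimal sketch. It assumes a recognizer that returns several scored hypotheses and re-ranks them in favor of on-screen labels and their synonyms. The synonym table, the `Hypothesis` type, the `boost` value, and the function names are all hypothetical; the patent does not prescribe this particular scoring scheme.

```python
from dataclasses import dataclass

# Hypothetical synonym table: on-screen label -> terms a user might say instead.
SYNONYMS = {
    "Submit": ["send", "confirm"],
    "Cancel": ["abort", "dismiss"],
}


@dataclass
class Hypothesis:
    text: str
    score: float  # higher is better; stands in for a recognizer's log-probability


def bias_terms(visible_labels):
    """Collect the visible labels plus their synonyms, lowercased."""
    terms = set()
    for label in visible_labels:
        terms.add(label.lower())
        terms.update(s.lower() for s in SYNONYMS.get(label, []))
    return terms


def rescore(hypotheses, terms, boost=1.0):
    """Add `boost` for each biased term in a hypothesis, then sort best-first."""
    rescored = []
    for h in hypotheses:
        hits = sum(1 for word in h.text.lower().split() if word in terms)
        rescored.append(Hypothesis(h.text, h.score + boost * hits))
    return sorted(rescored, key=lambda h: h.score, reverse=True)


# Two competing transcriptions; the acoustically weaker one contains a synonym.
hyps = [Hypothesis("sand the form", -1.0), Hypothesis("send the form", -1.5)]
best = rescore(hyps, bias_terms(["Submit", "Cancel"]))[0]
```

In this sketch the hypothesis containing "send" overtakes the acoustically stronger "sand the form" because "send" is listed as a synonym of the visible "Submit" label, even though "send" itself is not rendered on screen.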

First claim

Opening claim text (preview).

We claim:

1. A method implemented by one or more processors, the method comprising: determining, by an automated assistant, that a user is accessing an application interface of an application that is different from the automated assistant, wherein the application interface is rendered at a display interface of a computing device and the computing device provides access to the automated assistant; identifying, by the automated assistant, one or more terms that are synonymous with a particular feature of the application interface of the application; receiving, from the user, a spoken utterance that includes a particular term of the one or more terms identified by the automated assistant, wherein the particular term is not expressly rendered in the application interface; determining, based on having identified the particular term as being synonymous with the particular feature, to control the application using a type of input that can be received by an operating system of the computing device for interacting with the particular feature of the application interface; and causing, by the automated assistant and in response to the spoken utterance, the operating system to interact with the particular feature of the application interface.

2. The method of claim 1, further comprising: processing, in response to receiving the spoken utterance, audio data that corresponds to the spoken utterance, wherein the processing is performed with a bias towards the one or more terms that are synonymous with the particular feature of the application interface.

3. The method of claim 1, wherein identifying the one or more terms that are synonymous with the particular feature of the application interface of the application includes: determining, based on application data associated with the application interface, that the particular feature of the application interface is selectable via a user input to the operating system of the computing device.

4. The method of claim 1, wherein identifying the one or more terms that are synonymous with the particular feature of the application interface of the application includes: determining that the user has invoked, using a previous spoken utterance, the automated assistant to control another feature of a separate application, wherein the previous spoken utterance identifies the one or more terms, and the other feature of the separate application is associated with the particular feature of the application.

5. The method of claim 4, wherein the particular feature includes a term label that is also included in the other feature, and the term label is different than the one or more terms.

6. The method of claim 1, wherein the spoken utterance is provided by the user in a first natural language, and content is rendered at the application interface in a second natural language that is different than the first natural language.

7. The method of claim 6, wherein the automated assistant is responsive to inputs in the first natural language as indicated by a user-controllable setting of the automated assistant.

8. The method of claim 1, wherein identifying the one or more terms that are synonymous with the particular feature of the application interface of the application includes: processing one or more images that capture natural language content corresponding to the particular feature of the application interface, wherein the one or more terms are synonymous with the natural language content.

9. A method implemented by one or more processors, the method comprising: determining, at a computing device, that an application interface of an application is being rendered at a display interface of the computing device, wherein the application is different from an automated assistant that is accessible via the computing device; processing, based on the application interface being rendered at the display interface, application data associated with content that is being rendered at the application interface of the application; generating, based on the processing of the application data, operation data that indicates a correspondence between a feature the application interface and an operation capable of being performed, by the automated assistant, to effectuate control of the application; receiving, from a user, an input that is directed to the automated assistant, wherein the input is embodied in a spoken utterance that is provided by the user to an audio interface of the computing device; determining, based on receiving the input, that the input refers to the operation for interacting with the feature of the application interface; and causing, in response to receiving the input, the automated assistant to initialize performance of the operation for interacting with the feature of the application interface.

10. The method of claim 9, wherein the spoken utterance is provided by the user in a first natural language, and the content is rendered at the application interface in a second natural language that is different than the first natural language.

11. The method of claim 10, wherein the automated assistant is responsive to inputs in the first natural language as indicated by a user-controllable setting of the automated assistant.

12. The method of claim 9, wherein determining that the input refers to the operation for interacting with the feature of the application interface includes: determining to bias a speech processing operation according to the operation data, and causing the speech processing operation to be performed on audio data that characterizes the spoken utterance provided by the user.

13. The method of claim 9, wherein causing the automated assistant to initialize performance of the operation includes: causing the automated assistant to communicate with an operating system of the computing device to cause the operating system to issue a control command to the application for controlling the feature of the application interface.

14. The method of claim 9, wherein processing content of the application interface includes processing one or more images corresponding to a screenshot of the application interface.

15. A method implemented by one or more processors, the method comprising: determining, at a computing device, that a user is accessing an application interface of an application that is different from an automated assistant, wherein the computing device provides access to the automated assistant and the application; identifying, by the automated assistant, one or more terms rendered at the application interface that correspond to an operation capable of being performed by the application, wherein the one or more terms are rendered in a language that is different from a responsive language of the automated assistant; determining, by the automated assistant, one or more other terms that are not expressly rendered at the application interface of the application, wherein the one or more other terms are in the responsive language of the automated assistant and are synonymous with the one or more terms rendered at the application interface; receiving, by the automated assistant and from the user, a spoken utterance that includes the one or more other terms, wherein the spoken utterance is received in the responsive language of the automated assistant while the computing device is rendering at the application interface of the application; and causing, in response to the spoken utterance and by the automated assistant, the application to initialize performance of the operation corresponding to the one or more terms.
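
As a rough illustration of the flow recited in claims 9 and 15, the sketch below builds "operation data" that maps spoken terms in the assistant's responsive language to on-screen features rendered in another language, then routes a matching utterance to a simulated operating-system tap. The `Feature` and `FakeOS` types, the coordinates, and the German-to-English term table are hypothetical stand-ins; a real system would derive them from application data or screenshots and use an actual OS input channel.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Feature:
    label: str  # text as rendered in the application interface
    x: int      # hypothetical screen coordinates of the control
    y: int


@dataclass
class FakeOS:
    """Stand-in for an operating system input channel; records injected taps."""
    events: list = field(default_factory=list)

    def tap(self, x, y):
        self.events.append(("tap", x, y))


# Hypothetical table: rendered label (German) -> spoken terms (English).
SPOKEN_TERMS = {
    "Senden": ["send", "submit"],
    "Abbrechen": ["cancel", "abort"],
}


def build_operation_data(features):
    """Map each spoken term to the feature it should control."""
    table = {}
    for f in features:
        for term in SPOKEN_TERMS.get(f.label, []):
            table[term] = f
    return table


def handle_utterance(text, operation_data, os_):
    """Tap the first feature whose term occurs in the utterance; return it or None."""
    for word in text.lower().split():
        feature = operation_data.get(word)
        if feature is not None:
            os_.tap(feature.x, feature.y)
            return feature
    return None


features = [Feature("Senden", 120, 640), Feature("Abbrechen", 40, 640)]
ops = build_operation_data(features)
fake_os = FakeOS()
hit = handle_utterance("please submit this", ops, fake_os)
```

Here the word "submit" never appears in the interface, yet it resolves to the "Senden" control, and the tap is issued through the operating-system stand-in rather than through an assistant-specific API of the application.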

Assignees

Inventors

Classifications

  • G10L15/22 (Primary)

    Procedures used during a speech recognition process, e.g. man-machine dialogue · CPC title

  • Audio in a user interface, e.g. using voice commands for navigating, audio feedback · CPC title

  • Thesauruses; Synonyms · CPC title

  • Semantic analysis · CPC title

  • Semantic context, e.g. disambiguation of the recognition hypotheses based on word meaning · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11967321B2 cover?
Implementations set forth herein relate to an automated assistant that can interact with applications that may not have been pre-configured for interfacing with the automated assistant. The automated assistant can identify content of an application interface of the application to determine synonymous terms that a user may speak when commanding the automated assistant to perform certain tasks. …
Who is the assignee on this patent?
Google LLC
What technology area does this patent fall under?
Primary CPC classification G10L15/22. Mapped technology areas include Physics.
When was this patent published?
Publication date: Apr 23, 2024 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 10 related publications on this page (citations in our corpus or others sharing the same primary CPC).