Automated calling system

US12254883B2 · US · B2

Patent metadata
Publication number: US-12254883-B2
Application number: US-202418635974-A
Country: US
Kind code: B2
Filing date: Apr 15, 2024
Priority date: Sep 24, 2019
Publication date: Mar 18, 2025
Grant date: Mar 18, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for an automated calling system are disclosed. In one aspect, a method includes the actions of receiving audio data of an utterance spoken by a user who is having a telephone conversation with a bot. The actions further include determining a context of the telephone conversation. The actions further include determining a user intent of a first previous portion of the telephone conversation spoken by the user and a bot intent of a second previous portion of the telephone conversation outputted by a speech synthesizer of the bot. The actions further include, based on the audio data of the utterance, the context of the telephone conversation, the user intent, and the bot intent, generating synthesized speech of a reply by the bot to the utterance. The actions further include, providing, for output, the synthesized speech.
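
The flow described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: `transcribe`, `infer_intent`, `generate_reply`, and `synthesize` are hypothetical stubs standing in for a speech recognizer, an intent classifier, a reply generator, and a speech synthesizer, and the `Turn` and `Conversation` types and the `party_size` context field are invented for the example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Turn:
    speaker: str  # "user" or "bot"
    text: str
    intent: str


@dataclass
class Conversation:
    context: dict  # assumed fields, e.g. task and party size
    turns: list = field(default_factory=list)


def transcribe(audio: bytes) -> str:
    """Hypothetical recognizer stub: treats the audio bytes as UTF-8 text."""
    return audio.decode("utf-8")


def infer_intent(text: str) -> str:
    """Hypothetical intent stub based on simple keyword matching."""
    if "?" in text:
        return "ask_question"
    if "yes" in text.lower():
        return "confirm"
    return "inform"


def last_intent(conv: Conversation, speaker: str) -> Optional[str]:
    """Intent of the most recent previous portion spoken by `speaker`."""
    for turn in reversed(conv.turns):
        if turn.speaker == speaker:
            return turn.intent
    return None


def generate_reply(utterance, context, user_intent, bot_intent) -> str:
    """Hypothetical rule-based generator; a real system would use a trained model."""
    if bot_intent == "request_booking" and "how many" in utterance.lower():
        return f"A table for {context['party_size']}, please."
    return "Could you repeat that?"


def synthesize(text: str) -> bytes:
    """Hypothetical synthesizer stub: returns the reply text as bytes."""
    return text.encode("utf-8")


def handle_utterance(conv: Conversation, audio: bytes) -> bytes:
    utterance = transcribe(audio)
    # Intents are taken from portions that occurred before this utterance.
    user_intent = last_intent(conv, "user")
    bot_intent = last_intent(conv, "bot")
    reply = generate_reply(utterance, conv.context, user_intent, bot_intent)
    conv.turns.append(Turn("user", utterance, infer_intent(utterance)))
    conv.turns.append(Turn("bot", reply, "inform"))
    return synthesize(reply)


conv = Conversation(
    context={"task": "book_table", "party_size": 4},
    turns=[
        Turn("user", "Hello, how can I help?", "ask_question"),
        Turn("bot", "I'd like to book a table.", "request_booking"),
    ],
)
speech = handle_utterance(conv, b"For how many people?")
```

Here the "user" is the person on the call with the bot, matching the abstract's wording. The stubs only show where each input (utterance, context, user intent, bot intent) enters the reply step.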

First claim

Opening claim text (preview).

The invention claimed is:

1. A method implemented by one or more processors, the method comprising: receiving audio data of an utterance spoken by a user during a portion of an ongoing conversation between the user and a bot, the audio data being captured by one or more microphones of a computing device of the user; determining, based on processing the audio data of the utterance spoken by the user during the portion of the ongoing conversation between the user and the bot, a representation of the utterance received during the portion of the ongoing conversation; determining a context of the ongoing conversation between the user and the bot, the context of the ongoing conversation between the user and the bot being based on one or more previous portions of the ongoing conversation between the user and the bot, and the one or more previous portions of the ongoing conversation between the user and the bot occurring prior to receiving the utterance spoken by the user during the portion of the ongoing conversation between the user and the bot; causing at the least the representation of the utterance received during the ongoing conversation and the context of the ongoing conversation to be processed, using a sequence-to-sequence model, to generate a reply by the bot to the utterance; and causing synthesized speech, that captures the reply by the bot to the utterance, to be provided for audible presentation to the user, the synthesized speech being provided for audible presentation to the user via one or more speakers of a computing device of the user.

2. The method of claim 1, wherein the context of the ongoing conversation comprises one or more of: a task associated with the conversation, a time the conversation is initiated, or a location associated with the user.

3. The method of claim 1, wherein causing the synthesized speech to be generated comprises: processing, using a speech synthesizer, the reply by the bot to generate the synthesized speech.

4. The method of claim 1, wherein the utterance includes a request to perform a task.

5. The method of claim 4, further comprising: based on the ongoing conversation: determining whether the task has been completed; and in response to determining that the task has been completed: causing the bot to terminate the conversation.

6. The method of claim 5, further comprising: in response to determining that the task has not been completed: causing the bot to continue the conversation.

7. The method of claim 1, further comprising: determining one or more corresponding user intents for the ongoing conversation between the user and the bot, wherein the one or more corresponding user intents are processed, using the sequence-to-sequence model and along with the representation of the utterance received during the ongoing conversation and the context of the ongoing conversation, to generate the reply by the bot to the utterance.

8. A system comprising: at least one processor; and memory storing instructions that, when executed by the at least one processor, cause the at least one processor to: receive audio data of an utterance spoken by a user during a portion of an ongoing conversation between the user and a bot, the audio data being captured by one or more microphones of a computing device of the user; determine, based on processing the audio data of the utterance spoken by the user during the portion of the ongoing conversation between the user and the bot, a representation of the utterance received during the portion of the ongoing conversation; determine a context of the ongoing conversation between the user and the bot, the context of the ongoing conversation between the user and the bot being based on one or more previous portions of the ongoing conversation between the user and the bot, and the one or more previous portions of the ongoing conversation between the user and the bot occurring prior to receiving the utterance spoken by the user during the portion of the ongoing conversation between the user and the bot; cause at the least the representation of the utterance received during the ongoing conversation and the context of the ongoing conversation to be processed, using a sequence-to-sequence model, to generate a reply by the bot to the utterance; and cause synthesized speech, that captures the reply by the bot to the utterance, to be provided for audible presentation to the user, the synthesized speech being provided for audible presentation to the user via one or more speakers of a computing device of the user.

9. The system of claim 8, wherein the context of the ongoing conversation comprises one or more of: a task associated with the conversation, a time the conversation is initiated, or a location associated with the user.

10. The system of claim 9, wherein the instructions to cause the synthesized speech to be generated comprise instructions to: process, using a speech synthesizer, the reply by the bot to generate the synthesized speech.

11. The system of claim 8, wherein the utterance includes a request to perform a task.

12. The system of claim 11, wherein the instructions further comprise instructions to: based on the ongoing conversation: determine whether the task has been completed; and in response to determining that the task has been completed: cause the bot to terminate the conversation.

13. The system of claim 12, wherein the instructions further comprise instructions to: in response to determining that the task has not been completed: cause the bot to continue the conversation.

14. The system of claim 8, wherein the instructions further comprise instructions to: determine one or more corresponding user intents for the ongoing conversation between the user and the bot, wherein the one or more corresponding user intents are processed, using the sequence-to-sequence model and along with the representation of the utterance received during the ongoing conversation and the context of the ongoing conversation, to generate the reply by the bot to the utterance.

15. A non-transitory computer-readable storage medium storing instructions that, when executed by at least one processor, cause the at least one processor to perform operations, the operations comprising: receiving audio data of an utterance spoken by a user during a portion of an ongoing conversation between the user and a bot, the audio data being captured by one or more microphones of a computing device of the user; determining, based on processing the audio data of the utterance spoken by the user during the portion of the ongoing conversation between the user and the bot, a representation of the utterance received during the portion of the ongoing conversation; determining a context of the ongoing conversation between the user and the bot, the context of the ongoing conversation between the user and the bot being based on one or more previous portions of the ongoing conversation between the user and the bot, and the one or more previous portions of the ongoing conversation between the user and the bot occurring prior to receiving the utterance spoken by the user during the portion of the ongoing conversation between the user and the bot; causing at the least the representation of the utterance received during the ongoing conversation and the context of the ongoing conversation to be processed, using a sequence-to-sequence model, to generate a reply by the bot to the utterance; and causing synthesized speech, that captures the reply by the bot to the utterance, to be provided for audible presentation to the user, the synthesized speech
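
As a rough illustration of how the claimed steps fit together, the sketch below feeds the utterance representation, the context, and any user intents to a sequence-to-sequence model (claims 1 and 7) and ends the conversation once the task is judged complete (claims 5 and 6). Everything here is an assumption made for the example: the claims do not say how the inputs are encoded, so `build_source` simply joins them into one string, and `toy_model` and `task_completed` are hypothetical placeholders for a trained model and a completion check.

```python
from typing import Callable, Iterable, List, Tuple

Seq2Seq = Callable[[str], str]  # hypothetical: source sequence in, reply text out
History = List[Tuple[str, str]]  # (utterance representation, bot reply) pairs


def build_source(representation: str, context: dict,
                 user_intents: Iterable[str] = ()) -> str:
    """Serialize the model inputs into one source string (an assumed encoding)."""
    parts = [f"{key}={context[key]}" for key in sorted(context)]
    parts.extend(f"intent={intent}" for intent in user_intents)
    parts.append(f"utterance={representation}")
    return " | ".join(parts)


def run_conversation(utterances: Iterable[str], context: dict, model: Seq2Seq,
                     task_completed: Callable[[History], bool],
                     user_intents: Iterable[str] = ()) -> History:
    """Reply to each utterance and stop once the task is judged complete."""
    intents = list(user_intents)
    history: History = []
    for representation in utterances:
        # The context reflects previous portions of the conversation;
        # here that is reduced to a count of prior turns.
        turn_context = {**context, "prior_turns": len(history)}
        reply = model(build_source(representation, turn_context, intents))
        history.append((representation, reply))
        if task_completed(history):
            break  # task done: the bot terminates the conversation
        # task not done: the bot continues with the next utterance
    return history


def toy_model(source: str) -> str:
    """Hypothetical stand-in for a trained sequence-to-sequence model."""
    if "utterance=yes" in source:
        return "Confirmed, goodbye."
    return "Shall I confirm the booking?"


history = run_conversation(
    ["I need a table at 7", "yes", "thanks"],
    {"task": "book_table"},
    toy_model,
    task_completed=lambda h: h[-1][1].startswith("Confirmed"),
)
```

With these toy inputs the loop stops after the second utterance, so the third one ("thanks") is never processed.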

Assignees

Google LLC

Inventors

Classifications

  • Constructional features of telephone sets

  • interacting with the Internet

  • Multiple recognisers used in sequence or in parallel; Score combination systems therefor, e.g. voting systems

  • Preventing unauthorised calls to a telephone set

  • Notifying a held subscriber when his held call is removed from hold

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12254883B2 cover?
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for an automated calling system. In one aspect, a method includes receiving audio data of an utterance spoken by a user who is having a telephone conversation with a bot, determining a context of the telephone conversation, and generating synthesized speech of a reply by the bot to the utterance.
Who is the assignee on this patent?
Google LLC
What technology area does this patent fall under?
Primary CPC classification G10L15/26. Mapped technology areas include Physics.
When was this patent published?
Publication date: Mar 18, 2025 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).