US11741966B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11741966-B2 |
| Application number | US-202217964141-A |
| Country | US |
| Kind code | B2 |
| Filing date | Oct 12, 2022 |
| Priority date | Sep 24, 2019 |
| Publication date | Aug 29, 2023 |
| Grant date | Aug 29, 2023 |
Official abstract text for this publication.
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for an automated calling system are disclosed. In one aspect, a method includes the actions of receiving audio data of an utterance spoken by a user who is having a telephone conversation with a bot. The actions further include determining a context of the telephone conversation. The actions further include determining a user intent of a first previous portion of the telephone conversation spoken by the user and a bot intent of a second previous portion of the telephone conversation outputted by a speech synthesizer of the bot. The actions further include, based on the audio data of the utterance, the context of the telephone conversation, the user intent, and the bot intent, generating synthesized speech of a reply by the bot to the utterance. The actions further include providing, for output, the synthesized speech.
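The flow described in the abstract can be pictured with a minimal Python sketch. It is illustrative only and is not taken from the patent: the names `Turn`, `predict_reply_intent`, `render_transcription`, and `synthesize`, the intent labels, and the context key `task` are all hypothetical. A lookup table stands in for whatever trained model would map the audio, context, and prior intents to a reply, and byte encoding stands in for a real speech synthesizer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Turn:
    """Inputs the abstract lists for producing one bot reply (hypothetical layout)."""
    audio: bytes      # audio data of the user's latest utterance
    context: dict     # context of the conversation; the "task" key is an assumption
    user_intent: str  # intent of an earlier portion spoken by the user
    bot_intent: str   # intent of an earlier portion output by the bot


def predict_reply_intent(turn: Turn) -> str:
    """Placeholder for a trained model: a lookup table keyed on the prior intents.

    The audio is ignored here; a real model would consume it directly.
    """
    table = {
        ("<ask for purpose of call>", "<greet>"): "<state task>",
    }
    return table.get((turn.user_intent, turn.bot_intent), "<ask to repeat>")


def render_transcription(intent: str, context: dict) -> str:
    """Turn the reply intent into reply text using fixed templates."""
    templates = {
        "<state task>": "I'd like to {task}.",
        "<ask to repeat>": "Sorry, could you repeat that?",
    }
    return templates[intent].format(**context)


def synthesize(text: str) -> bytes:
    """Stand-in for a speech synthesizer: returns encoded text, not audio."""
    return text.encode("utf-8")


def reply(turn: Turn) -> bytes:
    """Reply intent, then transcription, then synthesized speech."""
    intent = predict_reply_intent(turn)
    text = render_transcription(intent, turn.context)
    return synthesize(text)
```

The three-stage split inside `reply` (intent, then transcription, then synthesis) follows the ordering recited in claim 4; the abstract itself only requires that synthesized speech be generated from the listed inputs.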
Opening claim text (preview).
The invention claimed is:
1. A method implemented by one or more processors, the method comprising: receiving, by a computing device, audio data of an utterance spoken by a user who is having a conversation with a bot; determining, by the computing device, a user intent of a first previous portion of the conversation by the user and a bot intent of a second previous portion of the conversation by the bot, wherein the first previous portion of the conversation occurred prior to receiving the audio data of the utterance, and wherein the second previous portion of the conversation also occurred prior to receiving the audio data of the utterance; generating, by the computing device, and based on at least the audio data, the user intent; and the bot intent, synthesized speech of a reply by the bot to the utterance; and causing, by the computing device, the synthesized speech to be provided for audible presentation to the user.
2. The method of claim 1, further comprising: determining, by the computing device, a context of the conversation, wherein generating the synthesized speech of the reply by the bot to the utterance is further based on the context of the conversation.
3. The method of claim 2, wherein the context of the conversation comprises at least a task associated with the conversation, a time the conversation is initiated, or a location associated with the user.
4. The method of claim 1, wherein generating the synthesized speech of the reply by the bot to the utterance comprises: determining, by the computing device, an additional bot intent of the reply by the bot to the utterance; generating, by the computing device, and based on the additional bot intent, a transcription of the reply by the bot to the utterance; and generating, by the computing device, and using a speech synthesizer, the synthesized speech of the reply by the bot to the utterance.
5. The method of claim 1, further comprising: bypassing, by the computing device, performance of speech recognition on the utterance spoken by the user.
6. The method of claim 1, further comprising: accessing, by the computing device, historical data for previous conversations, wherein the historical data includes, for each previous conversation, (i) a previous context of the previous conversation, (ii) previous first speaker intents of portions of the previous conversation spoken by a first speaker, (iii) previous second speaker intents of portions of the previous conversation spoken by a second speaker, (iv) previous audio data of a most recent utterance of the first speaker or the second speaker during the previous conversation, and (v) a previous intent of a previous reply to the most recent utterance; and training, by the computing device, and based on the historical data, a machine learning model that is configured to receive (i) audio data of a most recent given utterance of a given conversation, (ii) a given user intent of a first portion of the given conversation spoken by a given user, (iii) a given bot intent of a second portion of the given conversation outputted by the bot, and (iv) a given context of the given conversation and output a given intent for a given reply to the most recent given utterance.
7. The method of claim 6, wherein one or more of the previous conversations comprise previous telephone conversations.
8. The method of claim 6, wherein generating the synthesized speech of the reply by the bot to the utterance comprises: processing, by the computing device, and using the machine learning model, at least the audio data, the user intent; and the bot intent to generate output; determining, by the computing device, and based on the output, an additional bot intent of the reply by the bot to the utterance; generating, by the computing device, and based on the additional bot intent, a transcription of the reply by the bot to the utterance; and generating, by the computing device, and using a speech synthesizer, the synthesized speech of the reply by the bot to the utterance.
9. The method of claim 1, wherein an intent of a portion of the conversation identifies a type of information conveyed in the portion of the conversation.
10. The method of claim 1, wherein the utterance includes a request to perform a task.
11. The method of claim 10, further comprising: based on the conversation: determining, by the computing device, whether the task has been completed; and in response to determining that the task has been completed: causing, by the computing device, the bot to terminate the conversation.
12. The method of claim 11, further comprising: in response to determining that the task has not been completed: causing, by the computing device, the bot to continue the conversation.
13. A system comprising: at least one processor; and memory storing instructions that, when executed, cause the at least one processor to perform operations, the operations comprising: receiving, by a computing device, audio data of an utterance spoken by a user who is having a conversation with a bot; determining, by the computing device, a user intent of a first previous portion of the conversation by the user and a bot intent of a second previous portion of the conversation by the bot, wherein the first previous portion of the conversation occurred prior to receiving the audio data of the utterance, and wherein the second previous portion of the conversation also occurred prior to receiving the audio data of the utterance; generating, by the computing device, and based on at least the audio data, the user intent; and the bot intent, synthesized speech of a reply by the bot to the utterance; and causing, by the computing device, the synthesized speech to be provided for audible presentation to the user.
14. The system of claim 13, the operations further comprising: determining, by the computing device, a context of the conversation, wherein generating the synthesized speech of the reply by the bot to the utterance is further based on the context of the conversation.
15. The system of claim 14, wherein the context of the conversation comprises at least a task associated with the conversation, a time the conversation is initiated, or a location associated with the user.
16. The system of claim 13, wherein generating the synthesized speech of the reply by the bot to the utterance comprises: determining, by the computing device, an additional bot intent of the reply by the bot to the utterance; generating, by the computing device, and based on the additional bot intent, a transcription of the reply by the bot to the utterance; and generating, by the computing device, and using a speech synthesizer, the synthesized speech of the reply by the bot to the utterance.
17. The system of claim 13, the operations further comprising: bypassing, by the computing device, performance of speech recognition on the utterance spoken by the user.
18. The system of claim 13, the operations further comprising: accessing, by the computing device, historical data for previous conversations, wherein the historical data includes, for each previous conversation, (i) a previous context of the previous conversation, (ii) previous first speaker intents of portions of the previous conversation spoken by a first speaker, (iii) previous second speaker intents of portions of the previous conversation spoken by a second speaker, (iv) previous audio data of a most recent utterance of the first speaker or the second speaker during the previous conversation, and (v) a previous
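Claim 6 lists five items of historical data per previous conversation and four model inputs. One way to picture how a historical record could become a training example is the hedged Python sketch below. The class and field names are hypothetical, and two mappings are assumptions rather than anything the claims state: the first speaker is treated as the user and the second as the bot, and the most recent intent of each speaker is used as the model input.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class HistoricalConversation:
    """One previous conversation, mirroring items (i) to (v) of claim 6."""
    context: dict                      # (i) previous context
    first_speaker_intents: List[str]   # (ii) intents of the first speaker's portions
    second_speaker_intents: List[str]  # (iii) intents of the second speaker's portions
    last_utterance_audio: bytes        # (iv) audio of the most recent utterance
    reply_intent: str                  # (v) intent of the reply to that utterance


@dataclass
class ModelInput:
    """The four inputs the claimed model is configured to receive."""
    audio: bytes
    user_intent: str
    bot_intent: str
    context: dict


def to_training_example(conv: HistoricalConversation) -> Tuple[ModelInput, str]:
    """Pair model inputs with the target reply intent.

    Assumption: first speaker maps to the user, second speaker to the bot,
    and only the latest intent of each is kept.
    """
    features = ModelInput(
        audio=conv.last_utterance_audio,
        user_intent=conv.first_speaker_intents[-1],
        bot_intent=conv.second_speaker_intents[-1],
        context=conv.context,
    )
    return features, conv.reply_intent
```

A training loop would then fit a model on the resulting `(ModelInput, reply_intent)` pairs; the claims do not specify the model family, so none is shown here.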
Speech to text systems (G10L15/08 takes precedence) · CPC title
Multiple recognisers used in sequence or in parallel; Score combination systems therefor, e.g. voting systems · CPC title
Constructional features of telephone sets · CPC title
Preventing unauthorised calls to a telephone set · CPC title
Notifying a held subscriber when his held call is removed from hold · CPC title