US11495233B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11495233-B2 |
| Application number | US-202117505913-A |
| Country | US |
| Kind code | B2 |
| Filing date | Oct 20, 2021 |
| Priority date | Sep 24, 2019 |
| Publication date | Nov 8, 2022 |
| Grant date | Nov 8, 2022 |
Official abstract text for this publication.
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for an automated calling system are disclosed. In one aspect, a method includes the actions of receiving audio data of an utterance spoken by a user who is having a telephone conversation with a bot. The actions further include determining a context of the telephone conversation. The actions further include determining a user intent of a first previous portion of the telephone conversation spoken by the user and a bot intent of a second previous portion of the telephone conversation outputted by a speech synthesizer of the bot. The actions further include, based on the audio data of the utterance, the context of the telephone conversation, the user intent, and the bot intent, generating synthesized speech of a reply by the bot to the utterance. The actions further include providing, for output, the synthesized speech.
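The data flow in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: `TurnState`, `generate_reply`, `toy_intent_model`, and `toy_synthesizer` are hypothetical names, the intent labels are invented, and the "model" is a hand-written rule standing in for whatever trained component a real system would use. The sketch only shows which inputs feed the reply: the raw audio of the latest utterance, the call context, and the intents of earlier user and bot portions.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class TurnState:
    """Hypothetical per-call state: context plus intents of earlier portions."""
    context: dict
    user_intents: List[str] = field(default_factory=list)
    bot_intents: List[str] = field(default_factory=list)


def generate_reply(
    audio: bytes,
    state: TurnState,
    intent_model: Callable[[bytes, dict, Optional[str], Optional[str]], str],
    synthesizer: Callable[[str], bytes],
) -> bytes:
    """Map (audio, context, previous user intent, previous bot intent) to speech."""
    prev_user = state.user_intents[-1] if state.user_intents else None
    prev_bot = state.bot_intents[-1] if state.bot_intents else None
    # The model receives the audio itself; this sketch never transcribes it.
    reply_intent = intent_model(audio, state.context, prev_user, prev_bot)
    state.bot_intents.append(reply_intent)
    return synthesizer(reply_intent)


def toy_intent_model(audio, context, user_intent, bot_intent):
    """Placeholder rule (assumption), standing in for a trained model."""
    if not audio:
        return "<ask to repeat>"
    if user_intent == "<ask party size>":
        return "<give party size>"
    return "<state purpose>"


def toy_synthesizer(intent: str) -> bytes:
    """Placeholder for a speech synthesizer: returns bytes instead of audio."""
    return f"speech for {intent}".encode("utf-8")


state = TurnState(
    context={"entity": "restaurant"},
    user_intents=["<ask party size>"],
    bot_intents=["<request reservation>"],
)
reply_audio = generate_reply(b"\x00\x01", state, toy_intent_model, toy_synthesizer)
```

Passing the model and synthesizer as callables keeps the sketch independent of any particular speech library; either stub could be swapped for a real component without changing `generate_reply`.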
Opening claim text (preview).
What is claimed is:
1. A computer-implemented method comprising: receiving, by a server, and from a client device of a user, a request to perform a task, wherein the task includes causing a bot hosted at the server to initiate a telephone call with an entity; initiating, by the server, the telephone call with the entity to perform the task; and based on a telephone conversation conducted during the telephone call: receiving, by the server, audio data that captures a spoken utterance provided by a human representative associated with the entity; determining, by the server, a user intent of a first previous portion of the telephone conversation associated with the human representative and a bot intent of a second previous portion of the telephone conversation associated with the bot, wherein the first previous portion of the telephone conversation occurred prior to receiving the audio data of the utterance, and wherein the second previous portion of the telephone conversation also occurred prior to receiving the audio data of the utterance; generating, by the server, and based on at least the audio data that captures the spoken utterance, the user intent, and the bot intent, synthesized speech capturing a reply to the spoken utterance; and causing, by the server, the synthesized speech to be provided for audible presentation to the human representative.
2. The method of claim 1, further comprising: based on the telephone conversation conducted during the telephone call: determining, by the server, a context of the telephone conversation, wherein generating the synthesized speech capturing the reply to the spoken utterance is further based on the context of the telephone conversation.
3. The method of claim 2, wherein the context of the telephone conversation comprises one or more of: an identity of the entity, a time the telephone call is initiated, an entity location associated with the entity, or a user location associated with the user.
4. The method of claim 1, further comprising: based on the telephone conversation conducted during the telephone call: determining, by the server, whether the task has been completed; and in response to determining that the task has been completed: causing, by the server, the telephone call with the entity to be terminated.
5. The method of claim 4, further comprising: in response to determining that the task has not been completed: continuing, by the server, the telephone call with the entity.
6. The method of claim 1, wherein the server bypasses performance of speech recognition on the spoken utterance.
7. The method of claim 1, wherein generating the synthesized speech capturing the reply to the spoken utterance based on at least the audio data that captures the spoken utterance, the user intent, and the bot intent comprises: processing, using a machine learning model, at least the audio data that captures the spoken utterance, the user intent, and the bot intent to generate output; determining, based on the output, an additional bot intent associated with the reply to the spoken utterance; and generating, based on the additional bot intent associated with the reply associated with the spoken utterance, the synthesized speech capturing the reply to the spoken utterance.
8. The method of claim 7, further comprising: processing, using the machine learning model, and along with the audio data that captures the spoken utterance, the user intent, and the bot intent, a context of the telephone conversation to generate the output.
9. The method of claim 7, wherein the machine learning model is trained based on historical data for previous telephone conversations, and wherein the historical data for the previous telephone conversation comprises, for each previous telephone conversation, at least (i) corresponding previous first speaker intents associated with a first speaker determined based on corresponding first portions of each previous telephone conversation, (ii) corresponding previous second speaker intents associated with a second speaker determined based on corresponding second portions of each previous telephone conversation, (iii) corresponding previous audio data that captures most recent spoken utterance of the first speaker or the second speaker during each previous telephone conversation, and (iv) a corresponding previous intent of a corresponding previous reply to corresponding the most recent spoken utterance.
10. The method of claim 9, wherein the historical data for the previous telephone conversation further comprises, for each previous telephone conversation, (v) a corresponding previous context for each previous telephone conversation.
11. A system comprising: at least one processor; and memory storing instructions that, when executed, cause the at least one processor to perform operations, the operations comprising: receiving, from a client device of a user, a request to perform a task, wherein the task includes causing a bot hosted at the server to initiate a telephone call with an entity; initiating the telephone call with the entity to perform the task; and based on a telephone conversation conducted during the telephone call: receiving audio data that captures a spoken utterance provided by a human representative associated with the entity; determining a context of the telephone conversation; determining a user intent of a first previous portion of the telephone conversation associated with the human representative and a bot intent of a second previous portion of the telephone conversation associated with the bot, wherein the first previous portion of the telephone conversation occurred prior to receiving the audio data of the utterance, and wherein the second previous portion of the telephone conversation also occurred prior to receiving the audio data of the utterance; generating, based on at least the audio data that captures the spoken utterance, the user intent, and the bot intent, synthesized speech capturing a reply to the spoken utterance; and causing the synthesized speech to be provided for audible presentation to the human representative.
12. The system of claim 11, the operations further comprising: based on the telephone conversation conducted during the telephone call: determining a context of the telephone conversation, wherein generating the synthesized speech capturing the reply to the spoken utterance is further based on the context of the telephone conversation.
13. The system of claim 12, wherein the context of the telephone conversation comprises one or more of: an identity of the entity, a time the telephone call is initiated, an entity location associated with the entity, or a user location associated with the user.
14. The system of claim 11, the operations further comprising: based on the telephone conversation conducted during the telephone call: determining whether the task has been completed; and in response to determining that the task has been completed: causing the telephone call with the entity to be terminated.
15. The system of claim 14, the operations further comprising: in response to determining that the task has not been completed: continuing the telephone call with the entity.
16. The system of claim 11, wherein the system bypasses performance of speech recognition on the spoken utterance.
17. The system of claim 11, wherein generating the synthesized speech capturing the reply to the spoken utterance based on at least the audio data that captures the spoken utterance, the user intent, and the bot inte
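Claims 9 and 10 list the fields of the historical data used to train the machine learning model. As a rough illustration, the sketch below maps items (i) through (v) onto a record type and derives one record per reply from a logged conversation. Everything here is an assumption made for illustration: the `Turn` and `TrainingExample` types, the `build_examples` helper, the intent labels, and in particular the choice to exclude the most recent utterance from the intent history (the claims do not spell out how the portions are segmented).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Turn:
    """Hypothetical logged turn from a previous telephone conversation."""
    speaker: str  # "first" or "second"
    intent: str   # intent label determined for this portion
    audio: bytes  # audio data capturing the utterance


@dataclass
class TrainingExample:
    first_speaker_intents: List[str]   # item (i)
    second_speaker_intents: List[str]  # item (ii)
    latest_audio: bytes                # item (iii)
    reply_intent: str                  # item (iv), the training target
    context: Optional[dict] = None     # item (v), added by claim 10


def build_examples(turns: List[Turn], context: Optional[dict] = None) -> List[TrainingExample]:
    """Emit one example per reply: history intents + latest audio -> reply intent."""
    examples = []
    for i in range(1, len(turns)):
        latest, reply = turns[i - 1], turns[i]
        history = turns[: i - 1]  # assumption: portions before the latest utterance
        examples.append(
            TrainingExample(
                first_speaker_intents=[t.intent for t in history if t.speaker == "first"],
                second_speaker_intents=[t.intent for t in history if t.speaker == "second"],
                latest_audio=latest.audio,
                reply_intent=reply.intent,
                context=context,
            )
        )
    return examples


call_log = [
    Turn("first", "<hello>", b"a"),
    Turn("second", "<request reservation>", b"b"),
    Turn("first", "<ask party size>", b"c"),
]
examples = build_examples(call_log, context={"entity": "restaurant"})
```

With this layout the latest utterance enters the example only as audio, never as an intent label, which lines up with claim 6's statement that speech recognition on the spoken utterance is bypassed.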
Speech interaction details (speech recognition per se G10L15/00) · CPC title
Semantic analysis · CPC title
Notifying a held subscriber when his held call is removed from hold · CPC title
Preventing unauthorised calls to a telephone set · CPC title
Multiple recognisers used in sequence or in parallel; Score combination systems therefor, e.g. voting systems · CPC title