Multi-lingual virtual personal assistant
US-2018314689-A1 · Nov 1, 2018 · US
US10923102B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10923102-B2 |
| Application number | US-201815991411-A |
| Country | US |
| Kind code | B2 |
| Filing date | May 29, 2018 |
| Priority date | Jun 22, 2017 |
| Publication date | Feb 16, 2021 |
| Grant date | Feb 16, 2021 |
Official abstract text for this publication.
The present disclosure provides a method and apparatus for broadcasting a response based on artificial intelligence, and a storage medium, wherein the method comprises: obtaining a user-input speech query; generating a response corresponding to the query; obtaining a recorded speech of a mood meaning corresponding to a modal particle in the response and matched with the response; combining the obtained recorded speech with a TTS-generated speech to perform TTS broadcast of the response. The solution of the present disclosure may be applied to enhance an effect of broadcasting the response.
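The combining step in the abstract can be sketched as a simple segment plan: modal particles that have a matching recording are played from the recording, and everything else is sent to TTS. This is only an illustration under stated assumptions. The `RECORDINGS` table, the clip names, the English example particles, and the `mood_of` callback (standing in for whatever determines the mood meaning) are all hypothetical and do not come from the patent text.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical table: (modal particle, mood meaning) -> pre-recorded clip name.
# One particle may have several recordings, one per mood meaning.
RECORDINGS: Dict[Tuple[str, str], str] = {
    ("hmm", "thinking"): "hmm_thinking.wav",
    ("hmm", "doubt"): "hmm_doubt.wav",
    ("wow", "surprise"): "wow_surprise.wav",
}
PARTICLES = {particle for particle, _ in RECORDINGS}

Segment = Tuple[str, str]  # ("recorded", clip name) or ("tts", text)


def plan_broadcast(response: str, mood_of: Callable[[str, str], str]) -> List[Segment]:
    """Split a text response into recorded-clip and TTS segments.

    mood_of(particle, response) is an assumed callback that returns the
    mood meaning the particle expresses in this response.
    """
    segments: List[Segment] = []
    pending: List[str] = []  # words waiting to be synthesized by TTS

    for word in response.split():
        key = word.strip(",.!?").lower()
        clip = None
        if key in PARTICLES:
            clip = RECORDINGS.get((key, mood_of(key, response)))
        if clip is None:
            # Not a particle, or no recording for this mood: leave it to TTS.
            pending.append(word)
            continue
        if pending:
            segments.append(("tts", " ".join(pending)))
            pending = []
        segments.append(("recorded", clip))

    if pending:
        segments.append(("tts", " ".join(pending)))
    return segments
```

A player would then walk the returned list in order, playing each recorded clip and synthesizing each TTS segment. Falling back to TTS when no recording matches the mood is a design choice of this sketch, not something the abstract specifies.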
Opening claim text (preview).
What is claimed is:
1. A method for broadcasting a response based on artificial intelligence, wherein the method comprises: obtaining a user-input speech query; generating a response in a text form corresponding to the query; in response to determining that the response includes a modal particle, determining a mood meaning of the modal particle expressed in the response, and obtaining a pre-recorded speech of the modal particle having the mood meaning; and combining the obtained pre-recorded speech with a Text-To-Speech-generated speech to perform Text-To-Speech broadcast of the response.
2. The method according to claim 1, wherein the generating a response corresponding to the query comprises: determining a demand corresponding to the query; selecting one response-generating algorithm from M response-generating algorithms corresponding to the demand, M being a positive integer larger than 1; and using the selected response-generating algorithm to generate the response.
3. The method according to claim 2, wherein the determining a demand corresponding to the query comprises: performing speech recognition for the query to obtain a speech recognition result; and determining a demand corresponding to the query by performing semantic parsing for the speech recognition result; the selecting one response-generating algorithm from M response-generating algorithms corresponding to the demand comprises: randomly selecting one response-generating algorithm from the M response-generating algorithms corresponding to the demand.
4. The method according to claim 1, wherein at least one pre-recorded speech is pre-generated for the modal particle, and each pre-recorded speech corresponds to a different mood meaning.
5. The method according to claim 1, wherein the combining the obtained pre-recorded speech with a Text-To-Speech-generated speech to perform Text-To-Speech broadcast of the response comprises: when it is needed to broadcast the modal particle corresponding to the obtained pre-recorded speech, broadcasting the obtained pre-recorded speech, otherwise broadcasting the Text-To-Speech-generated speech.
6. A computer device, comprising a memory, a processor and a computer program which is stored on the memory and runs on the processor, wherein the processor, upon executing the program, implements the following operation: obtaining a user-input speech query; generating a response in a text form corresponding to the query; in response to determining that the response includes a modal particle, determining a mood meaning of the modal particle expressed in the response, and obtaining a pre-recorded speech of the mood meaning having the mood meaning; and combining the obtained pre-recorded speech with a Text-To-Speech-generated speech to perform Text-To-Speech broadcast of the response.
7. The computer device according to claim 6, wherein the generating a response corresponding to the query comprises: determining a demand corresponding to the query; selecting one response-generating algorithm from M response-generating algorithms corresponding to the demand, M being a positive integer larger than 1; and using the selected response-generating algorithm to generate the response.
8. The computer device according to claim 7, wherein the determining a demand corresponding to the query comprises: performing speech recognition for the query to obtain a speech recognition result; and determining a demand corresponding to the query by performing semantic parsing for the speech recognition result; the selecting one response-generating algorithm from M response-generating algorithms corresponding to the demand comprises: randomly selecting one response-generating algorithm from the M response-generating algorithms corresponding to the demand.
9. The computer device according to claim 6, wherein pre-recorded speech is pre-generated for the modal particle, and each pre-recorded speech corresponds to a different mood meaning.
10. The computer device according to claim 6, wherein the combining the obtained pre-recorded speech with a Text-To-Speech-generated speech to perform Text-To-Speech broadcast of the response comprises: when it is needed to broadcast the modal particle corresponding to the obtained pre-recorded speech, broadcasting the obtained pre-recorded speech, otherwise broadcasting the Text-To-Speech-generated speech.
11. A non-transitory computer-readable storage medium on which a computer program is stored, wherein the program, when executed by a processor, implements the following operation: obtaining a user-input speech query; generating a response in a text form corresponding to the query; in response to determining that the response includes a modal particle, determining a mood meaning of the modal particle expressed in the response, and obtaining a pre-recorded speech of the modal particle having the mood meaning; and combining the obtained pre-recorded speech with a Text-To-Speech-generated speech to perform Text-To-Speech broadcast of the response.
12. The non-transitory computer-readable storage medium according to claim 11, wherein the generating a response corresponding to the query comprises: determining a demand corresponding to the query; selecting one response-generating algorithm from M response-generating algorithms corresponding to the demand, M being a positive integer larger than 1; and using the selected response-generating algorithm to generate the response.
13. The non-transitory computer-readable storage medium according to claim 12, wherein the determining a demand corresponding to the query comprises: performing speech recognition for the query to obtain a speech recognition result; and determining a demand corresponding to the query by performing semantic parsing for the speech recognition result; the selecting one response-generating algorithm from M response-generating algorithms corresponding to the demand comprises: randomly selecting one response-generating algorithm from the M response-generating algorithms corresponding to the demand.
14. The non-transitory computer-readable storage medium according to claim 11, wherein at least one pre-recorded speech is pre-generated for the modal particle, and each pre-recorded speech corresponds to a different mood meaning.
15. The non-transitory computer-readable storage medium according to claim 11, wherein the combining the obtained pre-recorded speech with a Text-To-Speech-generated speech to perform Text-To-Speech broadcast of the response comprises: when it is needed to broadcast the modal particle corresponding to the obtained pre-recorded speech, broadcasting the obtained pre-recorded speech, otherwise broadcasting the Text-To-Speech-generated speech.
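The response-generation step recited in claims 2 and 3 (determine a demand, then randomly pick one of the M response-generating algorithms registered for that demand) might look like the minimal sketch below. The `GENERATORS` registry, the `"weather"` demand label, and the canned responses are hypothetical placeholders. Speech recognition and semantic parsing are assumed to have already produced the demand string.

```python
import random
from typing import Callable, Dict, List, Optional

Generator = Callable[[str], str]  # takes the recognized query text, returns a response

# Hypothetical registry: each demand maps to M > 1 response-generating algorithms.
GENERATORS: Dict[str, List[Generator]] = {
    "weather": [
        lambda query: "Hmm, it looks sunny today.",
        lambda query: "Wow, clear skies all day!",
    ],
}


def generate_response(
    demand: str, query_text: str, rng: Optional[random.Random] = None
) -> str:
    """Randomly select one of the algorithms for this demand and run it."""
    rng = rng or random.Random()
    candidates = GENERATORS[demand]
    if len(candidates) < 2:
        # The claims require M to be larger than 1.
        raise ValueError("expected more than one algorithm per demand")
    return rng.choice(candidates)(query_text)
```

Passing a seeded `random.Random` makes the selection reproducible in tests, while the default gives a different phrasing from one call to the next.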
Procedures used during a speech recognition process, e.g. man-machine dialogue · CPC title
using natural language analysis · CPC title
Execution procedure of a spoken command · CPC title
for estimating an emotional state · CPC title
Semantic analysis · CPC title