Information processing device, information processing terminal, information processing method, and program
US-2020272407-A1 · Aug 27, 2020 · US
US11568853B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11568853-B2 |
| Application number | US-202016942644-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jul 29, 2020 |
| Priority date | Aug 16, 2019 |
| Publication date | Jan 31, 2023 |
| Grant date | Jan 31, 2023 |
Official abstract text for this publication.
Disclosed is a voice recognition method and apparatus using artificial intelligence. A voice recognition method using artificial intelligence may include: generating an utterance by receiving a voice command of a user; obtaining a user's intention by analyzing the generated utterance; deriving an urgency level of the user on the basis of the generated utterance and prestored user information; generating a first response in association with the user's intention; obtaining main vocabularies included in the first response; generating a second response by using the main vocabularies and the urgency level of the user; determining a speech rate of the second response on the basis of the urgency level of the user; and outputting the second response according to the speech rate by synthesizing the second response to a voice signal.
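The last three steps of the abstract (shortening the first response to its main vocabularies, then choosing a speech rate from the urgency level) can be illustrated with a minimal Python sketch. Everything below is an assumption made for illustration and is not taken from the patent: the names `condense`, `speech_rate_for`, and `respond`, the urgency scale of 0.0 to 1.0, the `threshold` below which the full sentence is kept, and the linear rate rule. The main vocabularies are assumed to arrive already ranked from most to least important.

```python
import math
from dataclasses import dataclass


@dataclass
class SpokenResponse:
    text: str           # the "second response" handed to speech synthesis
    speech_rate: float  # multiplier relative to normal speed (1.0 = normal)


def clamp01(x: float) -> float:
    """Clip a value to the assumed urgency range [0.0, 1.0]."""
    return min(max(x, 0.0), 1.0)


def condense(first_response: str, ranked_vocab: list[str],
             urgency: float, threshold: float = 0.3) -> str:
    """Build a second response from the first response's main vocabularies.

    ranked_vocab is assumed to be ordered from most to least important.
    The keep rule (fewer words as urgency rises) is hypothetical.
    """
    u = clamp01(urgency)
    if u < threshold or not ranked_vocab:
        return first_response  # low urgency: keep the full sentence
    keep = max(1, math.ceil(len(ranked_vocab) * (1.0 - u)))
    kept = set(ranked_vocab[:keep])
    # Preserve the word order of the first response.
    words = [w for w in first_response.split() if w.strip(".,?!") in kept]
    return " ".join(words) or first_response


def speech_rate_for(urgency: float, base: float = 1.0,
                    max_boost: float = 0.5) -> float:
    """Raise the speech rate linearly with urgency (assumed rule)."""
    return base + max_boost * clamp01(urgency)


def respond(first_response: str, ranked_vocab: list[str],
            urgency: float) -> SpokenResponse:
    """Combine the condensed text with its urgency-dependent speech rate."""
    return SpokenResponse(
        text=condense(first_response, ranked_vocab, urgency),
        speech_rate=speech_rate_for(urgency),
    )


if __name__ == "__main__":
    full = "The meeting with the design team starts at 3 pm in room B"
    vocab = ["3", "pm", "room", "B", "meeting", "design", "team"]
    print(respond(full, vocab, urgency=0.5))
```

With an urgency of 0.5, this sketch keeps the four highest-ranked words and yields the text "3 pm room B" at a rate of 1.25. A real system would pass the rate to its speech synthesizer, which this sketch does not model.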
Opening claim text (preview).
What is claimed is:
1. A method of performing voice recognition by using artificial intelligence, the method comprising: generating an utterance by receiving a voice command from a user; obtaining a user's intention by analyzing the generated utterance; deriving an urgency level of the user on the basis of the generated utterance and prestored user information; generating a first response in association with the user's intention; obtaining main vocabularies included in the first response; generating a second response by using the main vocabularies and the urgency level of the user; determining a speech rate of the second response on the basis of the urgency level of the user; outputting the second response on the basis of the speech rate by synthesizing the second response to a voice signal; receiving a voice command of the user as feedback for the second response output in the voice signal; and modifying the obtaining of the main vocabularies and the deriving of the urgency level according to whether or not the voice command input as the feedback relates to the main vocabularies of the first response but excluded in the second response.
2. The method of claim 1, wherein the prestored user information includes information on a user's schedule, and the deriving of the urgency level of the user includes: calculating the urgency level of the user by using at least one of the user's intention, a sound feature of the voice command, and a relation between the information on the user's schedule and current time.
3. The method of claim 2, wherein the deriving of the urgency level of the user includes: calculating the urgency level of the user by using factors and weighting factors respectively assigned to the user's intention, the sound feature of the voice command, and the relation between the information on the user's schedule and the current time.
4. The method of claim 2, wherein the using of the sound feature of the voice command includes: using a result obtained by comparing a prestored speech feature of a general voice command of the user and a speech feature of the voice command.
5. The method of claim 1, wherein the analyzing of the generated utterance includes: determining whether the utterance corresponds to a repeat-response type or a general-response type on the basis of a number of input times of the utterance, and the generating of the first response includes: when the utterance corresponds to the repeat-response type, generating a prestored response in association with the utterance as the first response.
6. The method of claim 5, wherein the generating of the second response includes: when the utterance corresponds to the repeat-response type, generating the second response by selecting one of a plurality of sentences having the same meaning with lengths different with each other on the basis of the urgency level of the user.
7. The method of claim 1, wherein the obtaining of the main vocabularies includes: determining rankings for vocabularies included in the first response according to an importance level, and performing classification on the vocabularies included in the first response.
8. The method of claim 1, wherein the determining of the speech rate of the second response on the basis of the urgency level of the user includes: increasing the speech rate of the second response when the urgency level of the user is high.
9. A non-transitory computer-readable medium storing a program comprising instructions for performing, when executed by at least one processor, the method of claim 1.
10. An apparatus for performing voice recognition by using artificial intelligence, the apparatus comprising: a microphone configured to receive a voice command from a user; a processor configured to generate an utterance by processing the voice command, and generate a response in association with the generated utterance; and a speaker configured to output the response, wherein the processor implements: an intention analysis module configured to obtain a user's intention by analyzing the utterance; an urgency deriving module configured to derive an urgency level of the user on the basis of the generated utterance and prestored user information; a response generation module configured to generate a first response in association with the user's intention, obtain main vocabularies included in the first response, and generate a second response on the basis of the obtained main vocabularies and the urgency level of the user; and a voice synthesis module configured to determine a speech rate of the second response on the basis of the urgency level of the user, and provide a voice signal having the determined speech rate to the audio output unit by synthesizing the second response to the voice signal, wherein the microphone is further configured to receive a voice command from the user as feedback for the second response output in the voice signal, and wherein the urgency deriving module and the response generation module are configured to respectively modify the obtaining of the main vocabularies and the deriving of the urgency level according to whether or not the voice command input as the feedback to the response generation module relates to the main vocabularies of the first response which are excluded when generating the second response.
11. The apparatus of claim 10, wherein the prestored user information includes information on a user's schedule, and the urgency deriving module calculates the urgency level of the user by using at least one of the user's intention, a sound feature of the voice command, and a relation between the information on the user's schedule and current time.
12. The apparatus of claim 11, wherein the urgency deriving module is further configured to calculate the urgency level of the user by using factors and weighting factors respectively assigned to the user's intention, the sound feature of the voice command, and the relation between the information on the user's schedule and the current time.
13. The apparatus of claim 11, wherein the urgency deriving module is further configured to obtain the sound feature of the voice command by using a result obtained by comparing a prestored speech feature of a general voice command of the user and a speech feature of the voice command.
14. The apparatus of claim 10, wherein the intention analysis module is further configured to determine whether the utterance corresponds to a repeat-response type or a general-response type on the basis of a number of input times of the utterance, and wherein the response generation module is further configured to generate the first response by using a prestored response in association with the utterance when the utterance corresponds to the repeat-response type.
15. The apparatus of claim 14, wherein the response generation module is further configured to generate the second response by selecting one of a plurality of sentences having the same meaning with lengths different with each other on the basis of the urgency level of the user when the utterance corresponds to the repeat-response type.
16. The apparatus of claim 10, wherein the response generation module is further configured to determine rankings for vocabularies included in the first response according to an importance level, and perform classification on the vocabularies included in the first response.
17. The apparatus of claim 10, wherein the voice synthesis module is further configured to increase the speech rate of the second response when the urgency level of the user is high.
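Claims 2 to 4 and claim 6 describe an urgency level computed from weighted factors and a choice among same-meaning sentences of different lengths. One possible reading is sketched below in Python. The claims do not give formulas, so every numeric rule here is an assumption: the weighted average, the default weights, the 60-minute schedule horizon, the use of speaking rate as the "sound feature", and the function names `schedule_factor`, `sound_factor`, `urgency_level`, and `pick_variant` are all hypothetical.

```python
from datetime import datetime, timedelta


def clamp01(x: float) -> float:
    """Clip a value to the assumed factor range [0.0, 1.0]."""
    return min(max(x, 0.0), 1.0)


def schedule_factor(next_event: datetime, now: datetime,
                    horizon_min: float = 60.0) -> float:
    """Relation between the user's schedule and the current time.

    Assumed rule: 1.0 when the event is due now, falling linearly
    to 0.0 when it is horizon_min minutes or more away.
    """
    minutes_left = (next_event - now).total_seconds() / 60.0
    return clamp01(1.0 - minutes_left / horizon_min)


def sound_factor(current_rate: float, usual_rate: float) -> float:
    """Compare the command's speech feature with the user's usual one.

    Assumed feature: speaking rate (e.g. syllables per second).
    Speaking faster than usual raises the factor.
    """
    if usual_rate <= 0:
        return 0.0
    return clamp01(current_rate / usual_rate - 1.0)


def urgency_level(intent_f: float, sound_f: float, schedule_f: float,
                  weights: tuple[float, float, float] = (0.2, 0.3, 0.5)) -> float:
    """Weighted average of the three factors (weights are placeholders)."""
    w_i, w_s, w_t = weights
    total = w_i + w_s + w_t
    return (w_i * intent_f + w_s * sound_f + w_t * schedule_f) / total


def pick_variant(variants: list[str], urgency: float) -> str:
    """Pick one of several same-meaning sentences; higher urgency, shorter."""
    ordered = sorted(variants, key=len, reverse=True)  # longest first
    index = min(int(clamp01(urgency) * len(ordered)), len(ordered) - 1)
    return ordered[index]


if __name__ == "__main__":
    now = datetime(2023, 1, 31, 9, 0)
    u = urgency_level(
        intent_f=1.0,
        sound_f=sound_factor(current_rate=6.0, usual_rate=4.0),
        schedule_f=schedule_factor(now + timedelta(minutes=15), now),
    )
    variants = [
        "It is currently three o'clock in the afternoon.",
        "It is 3 pm.",
        "3 pm.",
    ]
    print(u, pick_variant(variants, u))
```

The feedback step of claims 1 and 10 (adjusting the vocabulary selection and the urgency derivation when the user asks about omitted words) is not modeled here. In a sketch like this it could plausibly be approximated by adjusting `weights`, but the claims do not specify how the modification is made.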
Machine learning · CPC title
Inference or reasoning models · CPC title
Voice editing, e.g. manipulating the voice of the synthesiser · CPC title
Semantic context, e.g. disambiguation of the recognition hypotheses based on word meaning · CPC title
Procedures used during a speech recognition process, e.g. man-machine dialogue · CPC title