Artificial intelligence apparatus for recognizing speech of user using personalized language model and method for the same
US-11302311-B2 · Apr 12, 2022 · US
US11580980B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11580980-B2 |
| Application number | US-202117155783-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jan 22, 2021 |
| Priority date | Jun 5, 2020 |
| Publication date | Feb 14, 2023 |
| Grant date | Feb 14, 2023 |
Official abstract text for this publication.
A method and apparatus for generating a user intention understanding satisfaction evaluation model, a method and apparatus for evaluating a user intention understanding satisfaction, an electronic device and a storage medium are provided, relating to intelligent voice recognition and knowledge graphs. The method for generating a user intention understanding satisfaction evaluation model is: acquiring a plurality of sets of intention understanding data, at least one set of which comprises a plurality of sequences corresponding to multi-round behaviors of an intelligent device in multi-round man-machine interactions; and learning the plurality of sets of intention understanding data through a first machine learning model, to obtain the user intention understanding satisfaction evaluation model after the learning, wherein the user intention understanding satisfaction evaluation model is configured to evaluate user intention understanding satisfactions of the intelligent device in the multi-round man-machine interactions according to the plurality of sequences corresponding to the multi-round man-machine interactions.
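As a rough illustration of the data shape the abstract describes, the sketch below turns the behaviors of one multi-round interaction into one sequence per round. The category names are taken from claim 5. The `Behavior` type, the `serialize_round` and `serialize_session` helpers, the grouping of second-level categories under first-level ones, and the integer encoding are all assumptions made for illustration; the patent text shown here does not specify a concrete encoding.

```python
from dataclasses import dataclass
from typing import List, Tuple

# First-level -> second-level behavior categories. The names appear in
# claim 5; which second-level category falls under which first-level
# category, and the integer ids derived below, are assumptions.
CATEGORIES = {
    "control": ["wake up", "volume up", "volume down", "shut down"],
    "audio-visual": ["song playback", "change the song", "play video"],
    "information": ["weather check"],
}

FIRST_LEVEL_ID = {name: i for i, name in enumerate(CATEGORIES)}
SECOND_LEVEL_ID = {
    name: i
    for i, name in enumerate(
        sub for subs in CATEGORIES.values() for sub in subs
    )
}


@dataclass(frozen=True)
class Behavior:
    """One behavior of the device in one round (hypothetical type)."""
    first_level: str
    second_level: str


def serialize_round(behavior: Behavior) -> Tuple[int, int]:
    """Encode one round as a (first-level id, second-level id) pair."""
    if behavior.second_level not in CATEGORIES.get(behavior.first_level, ()):
        raise ValueError(f"unknown category pair: {behavior}")
    return (
        FIRST_LEVEL_ID[behavior.first_level],
        SECOND_LEVEL_ID[behavior.second_level],
    )


def serialize_session(behaviors: List[Behavior]) -> List[Tuple[int, int]]:
    """Encode a multi-round interaction as one sequence per round."""
    return [serialize_round(b) for b in behaviors]


session = [
    Behavior("control", "wake up"),
    Behavior("audio-visual", "song playback"),
    Behavior("audio-visual", "change the song"),
]
print(serialize_session(session))  # [(0, 0), (1, 4), (1, 5)]
```

A collection of such encoded sessions would play the role of the "plurality of sets of intention understanding data" that the first machine learning model learns from. Pairing the two ids in a tuple is only one possible reading of a sequence built from two category levels.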
Opening claim text (preview).
What is claimed is:

1. A method for generating a user intention understanding satisfaction evaluation model, comprising: acquiring a plurality of sets of intention understanding data, at least one set of intention understanding data comprises a plurality of sequences corresponding to multi-round behaviors of an intelligent device in multi-round man-machine interactions; wherein one sequence corresponds to a behavior of the intelligent device in one round of man-machine interaction; and learning the plurality of sets of intention understanding data through a first machine learning model, to obtain the user intention understanding satisfaction evaluation model after the learning, wherein the user intention understanding satisfaction evaluation model is used for evaluating user's satisfactions with intention understanding by the intelligent device in the multi-round man-machine interactions according to the plurality of sequences corresponding to the multi-round man-machine interactions; wherein the multi-round man-machine interactions comprise receiving multi-round voice instructions and feeding back respectively by an intelligent voice device.
2. The method according to claim 1, wherein the multi-round man-machine interactions corresponding to the plurality of sequences are continuous multi-round man-machine interactions.
3. The method according to claim 1, wherein the plurality of sequences comprise a first sequence corresponding to a behavior category, to which a first behavior of the intelligent device belongs.
4. The method according to claim 1, wherein the plurality of sequences comprise a second sequence, which comprises a first subsequence and a second subsequence, wherein the first subsequence corresponds to a first-level behavior category, to which a second behavior of the intelligent device belongs, the second subsequence corresponds to a second-level behavior category, to which the second behavior of the intelligent device belongs, wherein one first-level behavior category comprises one or more second-level behavior categories.
5. The method according to claim 4, wherein the first-level behavior category comprises at least one of the following behavior categories: a control category, an audio-visual category, an information category, an education category, a leisure category, a home control category, and a game category; the second-level behavior category comprises at least one of the following behavior categories: wake up, volume up, volume down, exit application, basic settings, shut down, song playback, video playback, playlist, playback progress adjustment, change the song, song information, singer information, play video, video information, weather check, and play completed.
6. The method according to claim 1, wherein in a case that a first-round voice instruction in the multi-round voice instructions is a second instruction, a second-round voice instruction is received while the intelligent voice device is playing a feedback result, and the second-round voice instruction is still the second instruction, then a user's satisfaction with the intention understanding of a sequence corresponding to the first-round voice instruction is determined to be unsatisfied.
7. The method according to claim 1, wherein the first machine learning model comprises a Hidden Markov Model, the plurality of sets of intention understanding data is unlabeled or labeled data.
8. The method according to claim 1, wherein the first machine learning model comprises a neural network model, and the plurality of sets of intention understanding data is labeled data.
9. The method according to claim 1, wherein in a case that a first-round voice instruction in the multi-round voice instructions is a first instruction, and a second-round voice instruction is still the first instruction, then a user's satisfaction with the intention understanding of a sequence corresponding to the first-round voice instruction is determined to be unsatisfied.
10. A non-transitory computer-readable storage medium storing computer instructions, wherein the computer instructions cause a computer to perform the method according to claim 1.
11. A method for evaluating a user's satisfaction with intention understanding, comprising: acquiring information of multi-round behaviors of an intelligent device in multi-round man-machine interactions to be evaluated; wherein one sequence corresponds to a behavior of the intelligent device in one round of man-machine interaction; serializing the information of the multi-round behaviors, to obtain a plurality of sequences; and inputting the plurality of sequences into a user intention understanding satisfaction evaluation model, to obtain evaluation results of user's satisfactions with intention understanding to be outputted by the model, wherein the user intention understanding satisfaction evaluation model is used for evaluating the user's satisfactions with the intention understanding by the intelligent device in the multi-round human-computer interactions according to the plurality of sequences corresponding to the multi-round human-computer interactions; wherein the multi-round man-machine interactions comprise receiving multi-round voice instructions and feeding back respectively by an intelligent voice device.
12. The method according to claim 11, wherein the serializing the information of the multi-round behaviors comprises: in a case that a first-round behavior in the multi-round behaviors belongs to a first behavior category in a first-level behavior category set and also belongs to a second behavior category in a second-level behavior category set, labeling the first-round behavior as a superimposed sequence of a first sequence and a second sequence, wherein the first sequence and the second sequence correspond to the first behavior category and the second behavior category, respectively; and in a case that a second-round behavior in the multi-round behaviors belongs to a third behavior category in the first-level behavior category set and also belongs to a fourth behavior category in the second-level behavior category set, labeling the second-round behavior as a superimposed sequence of a third sequence and a fourth sequence, wherein the third sequence and the fourth sequence correspond to the third behavior category and the fourth behavior category, respectively, wherein one first-level behavior category comprises one or more second-level behavior categories.
13. The method according to claim 11, wherein the user intention understanding satisfaction evaluation model is generated based on a method for generating the user intention understanding satisfaction evaluation model, comprising: acquiring a plurality of sets of intention understanding data, at least one set of intention understanding data comprises a plurality of sequences corresponding to multi-round behaviors of an intelligent device in multi-round man-machine interactions; and learning the plurality of sets of intention understanding data through a first machine learning model, to obtain the user intention understanding satisfaction evaluation model after the learning, wherein the user intention understanding satisfaction evaluation model is used for evaluating the user's satisfactions with the intention understanding by the intelligent device in the multi-round man-machine interactions according to the plurality of sequences corresponding to the multi-round man-machine interactions; wherein the multi-round man-machine interactions comprise receiving multi-round voice instructions and feeding back respectively by an intelligent voice device.
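Claims 6 and 9 describe two cases in which a repeated voice instruction marks the earlier round as unsatisfied. A minimal sketch of that rule follows, under several assumptions: the `Round` type and `label_rounds` helper are hypothetical, "still the same instruction" is approximated by exact string equality, the rule is applied to every adjacent pair of rounds rather than only the first and second, and the `interrupt_only` set stands in for the instructions that fall under claim 6 (repetition counts only if it arrives while feedback is still playing).

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class Round:
    """One round of voice interaction (hypothetical type)."""
    instruction: str
    # True if this instruction was received while the device was still
    # playing the feedback result for the previous round.
    interrupted_feedback: bool = False


def label_rounds(
    rounds: List[Round],
    interrupt_only: FrozenSet[str] = frozenset(),
) -> List[Optional[str]]:
    """Mark a round "unsatisfied" when the next round repeats it.

    Instructions in `interrupt_only` follow the claim 6 pattern: the
    repeat must interrupt the feedback. All other instructions follow
    the claim 9 pattern: any immediate repeat is enough. Rounds that
    match neither pattern are left as None (no decision).
    """
    labels: List[Optional[str]] = [None] * len(rounds)
    for i in range(len(rounds) - 1):
        current, following = rounds[i], rounds[i + 1]
        if following.instruction != current.instruction:
            continue
        if (current.instruction in interrupt_only
                and not following.interrupted_feedback):
            continue
        labels[i] = "unsatisfied"
    return labels


rounds = [
    Round("play song A"),
    Round("play song A", interrupted_feedback=True),
    Round("volume up"),
]
print(label_rounds(rounds, interrupt_only=frozenset({"play song A"})))
```

Labels produced this way could serve as labeled training data for the models named in claims 7 and 8, though the claims do not state that the labels are derived by exactly this procedure.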
Convolutional networks [CNN, ConvNet] · CPC title
Supervised learning · CPC title
Audio in a user interface, e.g. using voice commands for navigating, audio feedback · CPC title
Probabilistic graphical models, e.g. probabilistic networks · CPC title
Non-supervised learning, e.g. competitive learning · CPC title