Speech recognition system and method
US-2018075844-A1 · Mar 15, 2018 · US
US11520992B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11520992-B2 |
| Application number | US-202016909731-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jun 23, 2020 |
| Priority date | Mar 23, 2018 |
| Publication date | Dec 6, 2022 |
| Grant date | Dec 6, 2022 |
Official abstract text for this publication.
An agent automation system includes a memory configured to store a natural language understanding (NLU) framework and a processor configured to execute instructions of the NLU framework to cause the agent automation system to perform actions. These actions comprise: generating an annotated utterance tree of an utterance using a combination of rules-based and machine-learning (ML)-based components, wherein a structure of the annotated utterance tree represents a syntactic structure of the utterance, and wherein nodes of the annotated utterance tree include word vectors that represent semantic meanings of words of the utterance; and using the annotated utterance tree as a basis for intent/entity extraction of the utterance.
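To make the abstract's "annotated utterance tree" concrete, the following is a minimal sketch of one possible in-memory representation, not the patented implementation. The `Node` class, the `TOY_VECTORS` table, the example utterance, and the hand-written parse are all hypothetical; the `label` field borrows the class annotations named in claim 10, and a real system would obtain the vectors from a trained word vector model and the structure from a parser.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical toy embeddings; real word vectors would come from a trained model.
TOY_VECTORS: Dict[str, List[float]] = {
    "reset": [0.9, 0.1, 0.0],
    "my": [0.0, 0.2, 0.1],
    "password": [0.1, 0.8, 0.3],
}


@dataclass
class Node:
    """One node of the tree: a word or phrase, its annotation, and its vector."""
    text: str
    label: str                      # class annotation, e.g. "verb"
    vector: List[float]             # semantic meaning of the word or phrase
    children: List["Node"] = field(default_factory=list)


def build_tree() -> Node:
    """Hand-build the tree for 'reset my password' (the parse is assumed, not computed)."""
    my = Node("my", "object_modifier", TOY_VECTORS["my"])
    password = Node("password", "direct_object", TOY_VECTORS["password"], [my])
    return Node("reset", "verb", TOY_VECTORS["reset"], [password])


def render(node: Node, depth: int = 0) -> List[str]:
    """Return one indented line per node so the syntactic structure is visible."""
    lines = [f"{'  ' * depth}{node.text} [{node.label}]"]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines


tree = build_tree()
print("\n".join(render(tree)))
```

The tree shape carries the syntactic structure while each node carries its own vector, which is the separation the abstract describes.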
Opening claim text (preview).
What is claimed is:

1. An agent automation system, comprising: a memory configured to store a natural language understanding (NLU) framework and an intent/entity model, wherein the intent/entity model associates defined intents with a plurality of written sample utterances, and wherein the written sample utterances encode defined entities as parameters of the defined intents within the intent/entity model, wherein the NLU framework includes a vocabulary subsystem, a structure subsystem, and a prosody subsystem; and a processor configured to execute instructions of the NLU framework to cause the agent automation system to perform actions comprising: generating an annotated utterance tree for a written sample utterance of the plurality of written sample utterances by: processing, via the prosody subsystem, the written sample utterance based on written prosody cues to divide the written sample utterance into a plurality of nodes that each represents a word or phrase of the written sample utterance, wherein the written prosody cues comprise a rhythm, an emphasis, or a focus of the written sample utterance; processing, via the structure subsystem, the written sample utterance to organize the plurality of nodes into a dependency parse tree structure that encodes a syntactic structure of the written sample utterance; and assigning, via the vocabulary subsystem, a respective word vector to each of the plurality of nodes, wherein each respective word vector encodes a semantic meaning of the word or phrase represented by each of the plurality of nodes.

2. The agent automation system of claim 1, wherein the processor is configured to execute instructions of the NLU framework to cause the agent automation system to perform actions comprising: generating a respective subtree vector for each subtree of the dependency parse tree structure based on the respective word vectors of the nodes of each subtree of the dependency parse tree structure.

3. The agent automation system of claim 2, wherein the processor is configured to execute instructions of the NLU framework to cause the agent automation system to perform actions comprising: receiving a written user utterance; generating an annotated utterance tree for the written user utterance by: processing, via the prosody subsystem, the written user utterance based on the written prosody cues to divide the written user utterance into a second plurality of nodes that each represents a word or phrase of the written user utterance; processing, via the structure subsystem, the written user utterance to organize the second plurality of nodes into a second dependency parse tree structure that encodes a syntactic structure of the written user utterance; and assigning, via the vocabulary subsystem, a respective word vector to each of the second plurality of nodes, wherein each respective word vector encodes a semantic meaning of the word or phrase represented by each of the second plurality of nodes.

4. The agent automation system of claim 3, wherein the processor is configured to execute instructions of the NLU framework to cause the agent automation system to perform actions comprising: performing, via the vocabulary subsystem, rules-based cleansing and augmentation to modify the written user utterance before generating the annotated utterance tree for the written user utterance, wherein the rules-based cleansing and augmentations comprises substituting synonyms, correcting misspellings, removing punctuation, addressing domain-specific syntax and terminology, combining words, and/or separating compounds words and contractions in the written user utterance.

5. The agent automation system of claim 3, wherein the processor is configured to execute instructions of the NLU framework to cause the agent automation system to perform actions comprising: generating a respective subtree vector for each subtree of the second dependency parse tree structure based on the respective word vectors of the nodes of each subtree of the second dependency parse tree structure; and extracting an intent and/or entity from the written user utterance based on a comparison of the subtree vectors of the annotated utterance tree of the written user utterance to the subtree vectors of the annotated utterance tree of the written sample utterance.

6. The agent automation system of claim 5, wherein the extracted intent and/or entity includes a corresponding confidence score that is based on subtree similarity scores determined during the comparison of the subtree vectors of the annotated utterance tree of the written user utterance to the subtree vectors of the annotated utterance tree of the written sample utterance.

7. The agent automation system of claim 3, wherein the vocabulary subsystem includes a word vector distribution model, wherein vocabulary subsystem determines the respective word vector for each of the plurality of nodes of the annotated utterance tree based on the word vector distribution model.

8. The agent automation system of claim 7, wherein the processor is configured to execute instructions of the NLU framework to cause the agent automation system to improve operation of the vocabulary subsystem by performing actions comprising: performing rule-based unsupervised learning of the words or phrases of the written user utterance; and modifying word vectors of the word vector distribution model based on the rule-based unsupervised learning.

9. The agent automation system of claim 3, wherein the structure subsystem includes a plurality of rules-based parsers and a machine-learning (ML)-based parser, and wherein the processor is configured to execute instructions of the NLU framework to cause the agent automation system to improve operation of the ML-based parser by performing actions comprising: parsing the written sample utterance using the plurality of rules-based parsers of the structure subsystem to generate a plurality of dependency parse tree structures; and in response to the processor determining that a majority of the plurality of dependency parse tree structures has a common dependency parse tree structure, adjusting a neural network model associated with the ML-based parser such that the ML-based parser is configured to generate the common dependency parse tree structure from the written user utterance.

10. A method of operating a natural language understanding (NLU) framework, comprising: generating an annotated utterance tree for each written sample utterance of a plurality of written sample utterances of an intent/entity model by: processing the written sample utterance based on written prosody cues to segment the written sample utterance into a plurality of nodes that each represents a word or phrase of the written sample utterance, wherein the written prosody cues comprise a rhythm, an emphasis, or a focus of the written sample utterance; organizing the plurality of nodes into a dependency parse tree structure that encodes a syntactic structure of the written sample utterance, wherein class annotations are assigned to each of the plurality of nodes in the dependency parse tree structure, and wherein the class annotations comprise: a verb annotation, a subject or entity annotation, a direct object annotation, a subject modifier annotation, an object modifier annotation, or a verb modifier annotation; assigning a respective word vector to each of the plurality of nodes that encodes a semantic meaning of the word or phrase represented by each of the plurality of nodes; and generating a respective subtree vector for each subtree of the dependency parse tree structure from the respective word vectors of the nodes of each subtree of the dependency parse tree structure.
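Claims 2, 5, and 6 describe deriving a vector for each subtree and comparing the subtree vectors of a user utterance against those of a sample utterance to obtain a confidence score. The claims do not state how the subtree vector is computed or which similarity measure is used, so the sketch below makes three labeled assumptions: a subtree vector is the mean of its nodes' word vectors, similarity is cosine similarity, and the confidence is the average best match per sample subtree. The tuple-based tree format, the `TOY_VECTORS` table, and all function names are hypothetical.

```python
import math
from typing import Dict, List

Vector = List[float]

# Hypothetical two-dimensional embeddings, for illustration only.
TOY_VECTORS: Dict[str, Vector] = {
    "reset": [1.0, 0.0],
    "change": [0.8, 0.6],
    "password": [0.0, 1.0],
}


def words(tree) -> List[str]:
    """Collect every word in a (word, children) tree, root first."""
    word, children = tree
    out = [word]
    for child in children:
        out.extend(words(child))
    return out


def subtree_vector(tree) -> Vector:
    """ASSUMPTION: a subtree vector is the element-wise mean of its word vectors."""
    vecs = [TOY_VECTORS[w] for w in words(tree)]
    return [sum(col) / len(vecs) for col in zip(*vecs)]


def all_subtree_vectors(tree) -> List[Vector]:
    """One vector per subtree, i.e. one per node taken as a subtree root."""
    out = [subtree_vector(tree)]
    for child in tree[1]:
        out.extend(all_subtree_vectors(child))
    return out


def cosine(a: Vector, b: Vector) -> float:
    """ASSUMPTION: cosine similarity stands in for the subtree similarity score."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def confidence(user_tree, sample_tree) -> float:
    """ASSUMPTION: average, over sample subtrees, of the best user-subtree match."""
    user_vecs = all_subtree_vectors(user_tree)
    best = [max(cosine(s, u) for u in user_vecs)
            for s in all_subtree_vectors(sample_tree)]
    return sum(best) / len(best)


sample = ("reset", [("password", [])])   # written sample utterance
user = ("change", [("password", [])])    # written user utterance
score = confidence(user, sample)
print(round(score, 3))
```

A score near 1 would indicate that the user utterance closely matches the sample utterance tied to a defined intent; an actual system would presumably weight subtrees and apply thresholds in ways this sketch does not attempt.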
Parsing for meaning understanding · CPC title
Feedback of the input speech · CPC title
Procedures used during a speech recognition process, e.g. man-machine dialogue · CPC title
Semantic analysis · CPC title
Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars · CPC title