US11468233B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11468233-B2 |
| Application number | US-202016750182-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jan 23, 2020 |
| Priority date | Jan 29, 2019 |
| Publication date | Oct 11, 2022 |
| Grant date | Oct 11, 2022 |
Official abstract text for this publication.
An intention identification method includes generating a heterogeneous text network based on a language material sample; using a graph embedding algorithm to perform learning with respect to the heterogeneous text network and obtain a vector representation of the language material sample and a word, and determining keywords of the language material sample based on a similarity in terms of a vector between the language material sample and the word in the language material sample; training an intention identification model until a predetermined training termination condition is satisfied, by using the keywords of the language material samples, and obtaining the trained intention identification model; and receiving a language material query, and using the trained intention identification model to identify an intention of the language material query.
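The network construction and keyword selection described in the abstract can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the patented implementation: the function names (`build_text_network`, `cosine`, `top_keywords`) are hypothetical, whitespace tokenization stands in for the preprocessing, cosine similarity is assumed as the vector similarity, and the vectors in the usage note are toy values standing in for the output of a graph embedding algorithm, which the abstract does not name.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt


def build_text_network(samples):
    """Build two weighted edge sets from raw text samples.

    doc_word:  (sample_index, word) -> occurrences of the word in that sample
               (the "word occurs in a sample" relationship).
    word_word: (word_a, word_b), alphabetically ordered -> number of samples
               in which both words appear (the "words appear together" relationship).
    Tokenization by lowercase whitespace split is an assumption of this sketch.
    """
    doc_word = defaultdict(int)
    word_word = defaultdict(int)
    for index, text in enumerate(samples):
        tokens = text.lower().split()
        for token in tokens:
            doc_word[(index, token)] += 1
        for a, b in combinations(sorted(set(tokens)), 2):
            word_word[(a, b)] += 1
    return dict(doc_word), dict(word_word)


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = sqrt(sum(x * x for x in u)) * sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def top_keywords(sample_vec, word_vecs, k):
    """Return the k words whose vectors are most similar to the sample vector."""
    ranked = sorted(
        word_vecs,
        key=lambda word: cosine(sample_vec, word_vecs[word]),
        reverse=True,
    )
    return ranked[:k]
```

For example, `build_text_network(["book a flight to paris", "cancel my flight"])` yields a sample-to-word edge `(0, "flight")` and a word-to-word edge `("cancel", "flight")`. With the toy vectors `sample_vec = [1.0, 0.0]` and `word_vecs = {"flight": [0.9, 0.1], "book": [0.7, 0.7], "a": [0.0, 1.0]}`, `top_keywords(sample_vec, word_vecs, 2)` ranks `flight` ahead of `book` and drops `a`.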
Opening claim text (preview).
What is claimed is:
1. An intention identification method comprising: generating a heterogeneous text network based on a plurality of language material samples that include a plurality of labeled language materials to which an intention has been labeled, and a plurality of unlabeled language materials to which an intention has not been labeled, wherein the heterogeneous text network includes a first co-occurrence relationship indicating that a word occurs in a language material sample from among the plurality of language material samples, and a second co-occurrence relationship indicating that two or more words appear in the language material sample; using a graph embedding algorithm to perform learning with respect to the heterogeneous text network, obtain a vector representation of the language material sample and the word, and determining keywords of the language material sample based on a similarity in terms of a vector between the language material sample and the word in the language material sample; training an intention identification model until a predetermined training termination condition is satisfied, the intention identification model being one or more intention identification classifiers that include a plurality of different language levels, wherein the training of the intention identification model includes matching the keywords of each language material sample of the plurality of the language material samples to a language level of the one or more intention identification classifiers; receiving a language material query; and identifying an intention of the received language material query using the trained intention identification model.
2. The intention identification method according to claim 1, wherein the training of the intention identification model includes: training an intention identification classifier of the one or more intention identification classifiers by using the keywords of each language material sample of the plurality of labeled language materials; terminating the training upon detecting that the predetermined training termination condition is satisfied, or predicting an intention and a prediction reliability of the plurality of unlabeled language materials by using a plurality of the trained intention identification classifiers upon detecting that the predetermined training termination condition is not satisfied; acquiring a probability distribution of vectors of the plurality of unlabeled language materials, selecting, from the plurality of unlabeled language materials, a target language material for which the prediction reliability is greater than a predetermined first threshold and for which a probability corresponding to a feature vector is less than a predetermined second threshold, and labeling an intention to the target language material based on the intention and the prediction reliability that have been predicted; and deleting the target language material from the plurality of unlabeled language materials, adding the target language material to the plurality of labeled language materials, returning to using, a feature vector of the plurality of labeled language materials, and training the intention identification classifier.
3. The intention identification method according to claim 2, wherein the training of the intention identification classifier includes: converting the keywords of the plurality of labeled language materials into an input sequence of the language levels of the intention identification classifier based on the language levels of the intention identification classifier, inputting the input sequence to the intention identification classifier, and training the intention identification classifier, wherein when the language levels are word levels, the input sequence is a sequence of the keywords in the plurality of labeled language materials, when the language levels are character levels, the input sequence is a sequence of characters obtained by dividing the keywords in the plurality of labeled language materials, and when the language levels are phrase levels, the input sequence is an order of phrases in the plurality of labeled language materials, and the phrases are formed by the keywords whose positional relationships in the plurality of labeled language materials satisfy a predetermined condition.
4. The intention identification method according to claim 1, wherein the generating of the heterogeneous text network based on the language material sample includes: performing a character string preprocess with respect to the language material sample and obtaining the language material sample that has undergone the character string preprocess, the character string preprocess including data cleaning, stop word, an error correction process, and a stemming process; extracting a word in a language material text, which is obtained by processing the language material sample, and establishing the first co-occurrence relationship, and extracting two words present in the same language material text and establishing the second co-occurrence relationship; and generating the heterogeneous text network including the first co-occurrence relationship and the second co-occurrence relationship.
5. The intention identification method according to claim 1, wherein the determining of the keywords of the language material sample based on the similarity in terms of the vector between the language material sample and the word in the language material sample includes: calculating, the similarity in terms of the vector between the language material sample and the word in the language material sample; and selecting a predetermined number of words for which the similarity in terms of the vector is maximum, and determining the selected words as the keywords of the language material sample.
6. The intention identification method according to claim 1, wherein the language levels include at least two levels among a character level, a word level, and a phrase level.
7. A non-transitory computer-readable recording medium storing a computer program, wherein the intention identification method according to claim 1 is executed by having a processor execute the computer program.
8. An intention identification apparatus comprising: a processor; and a memory storing program instructions that cause the processor to implement: a text network generator configured to generate a heterogeneous text network based on a plurality of language material samples that include a plurality of labeled language materials to which an intention has already been labeled, and a plurality of unlabeled language materials to which an intention has not been labeled, wherein the heterogeneous text network includes a first co-occurrence relationship indicating that a word occurs in a language material sample from among the plurality of language material samples, and a second co-occurrence relationship indicating that two or more words appear in the language material sample; a vector generator configured to use a graph embedding algorithm to perform learning with respect to the heterogeneous text network, obtain a vector representation of the language material sample and the word, and determine keywords of the language material sample based on a similarity in terms of a vector between the language material sample and the word in the language material sample; a model trainer configured to train an intention identification model until a predetermined training termination condition is satisfied, the intention identification model being one or more intention identification classifiers that include a plurality of different language levels, wherein the training of the intention identification mode
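
Two steps in the claim preview lend themselves to a short illustration: the conversion of keywords into word-level, character-level, or phrase-level input sequences (claim 3), and the selection of unlabeled samples to promote into the labeled set (claim 2). The Python sketch below is a hedged illustration only. The function names, the `max_gap` adjacency rule used as the "predetermined condition" for phrases, and the dictionary keys `reliability` and `density` are all assumptions of this sketch; the claims do not specify them.

```python
def to_input_sequence(keywords, level, max_gap=1):
    """Convert keywords into an input sequence for one language level.

    keywords: list of (position, word) pairs in sample order, where position
              is the word's index in the original sample.
    level:    "word", "character", or "phrase".
    max_gap:  assumed phrase rule: consecutive keywords are merged into one
              phrase when their positions differ by at most max_gap.
    """
    words = [word for _, word in keywords]
    if level == "word":
        return words
    if level == "character":
        return [ch for word in words for ch in word]
    if level == "phrase":
        phrases, current, last_pos = [], [], None
        for pos, word in keywords:
            # Start a new phrase when the gap to the previous keyword is too large.
            if last_pos is not None and pos - last_pos > max_gap:
                phrases.append(" ".join(current))
                current = []
            current.append(word)
            last_pos = pos
        if current:
            phrases.append(" ".join(current))
        return phrases
    raise ValueError(f"unknown language level: {level}")


def select_targets(predictions, first_threshold, second_threshold):
    """Pick unlabeled samples to promote into the labeled set.

    predictions: list of dicts with hypothetical keys
                 "reliability" (prediction reliability) and
                 "density" (probability assigned to the sample's feature vector).
    A sample qualifies when its reliability is above first_threshold and its
    density is below second_threshold, mirroring the two conditions in claim 2.
    """
    return [
        p for p in predictions
        if p["reliability"] > first_threshold and p["density"] < second_threshold
    ]
```

With `keywords = [(0, "book"), (2, "flight"), (3, "paris")]`, the word level keeps the three keywords as they are, the character level splits them into individual letters, and the phrase level (with `max_gap=1`) groups `flight` and `paris` into one phrase while leaving `book` on its own.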
using statistical methods · CPC title
Natural language query formulation · CPC title
Natural language generation · CPC title
Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars · CPC title
Lexical analysis, e.g. tokenisation or collocates · CPC title