Combining unsupervised and semi-supervised deep clustering approaches for mining intentions from texts

US12153897B2 · US · B2

Patent metadata
Publication number: US-12153897-B2
Application number: US-202117605326-A
Country: US
Kind code: B2
Filing date: Sep 17, 2021
Priority date: Sep 17, 2020
Publication date: Nov 26, 2024
Grant date: Nov 26, 2024

Abstract

Official abstract text for this publication.

An analysis platform combines unsupervised and semi-supervised approaches to quickly surface and organize relevant user intentions from conversational text (e.g., from natural language inputs). An unsupervised and semi-supervised pipeline is provided that integrates the fine-tuning of high performing language models via a language models fine-tuning module, a distributed KNN-graph building method via a KNN-graph building module, and community detection techniques for mining the intentions and topics from texts via an intention mining module.
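The KNN-graph stage named in the abstract can be illustrated with a minimal sketch. It assumes each text has already been turned into an embedding vector by a fine-tuned language model, and it uses a brute-force, single-process search with cosine similarity. The patent describes a distributed method whose details are not given on this page, so the function names, the similarity measure, and the toy vectors below are all illustrative assumptions.

```python
import math
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_knn_graph(vectors: Sequence[Vector], k: int) -> Dict[int, List[Tuple[int, float]]]:
    """Map each item index to its k most similar other items (hypothetical helper)."""
    graph: Dict[int, List[Tuple[int, float]]] = {}
    for i, v in enumerate(vectors):
        # Score every other vector; brute force is O(n^2) and not distributed.
        scored = [(cosine_similarity(v, w), j) for j, w in enumerate(vectors) if j != i]
        # Highest similarity first; ties broken by the lower index.
        scored.sort(key=lambda t: (-t[0], t[1]))
        graph[i] = [(j, sim) for sim, j in scored[:k]]
    return graph


# Toy embeddings (made up): items 0 and 1 point one way, items 2 and 3 another.
embeddings = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.9]]
knn_graph = build_knn_graph(embeddings, k=1)
```

In a pipeline like the one described, the resulting graph would be handed to a community detection step; at scale, an approximate or partitioned nearest-neighbor search would likely replace the brute-force loop.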

First claim

Opening claim text (preview).

What is claimed:

1. A system for mining latent intentions from natural language inputs, the system comprising: a computing device that maintains a plurality of natural language inputs; and an analysis platform that uses a plurality of unsupervised and semi-supervised approaches to surface and organize a plurality of relevant user intentions from the plurality of natural language inputs, wherein the analysis platform comprises: a language models fine-tuning module; a K-nearest neighbor (KNN)-graph building module; and a clustering module.

2. The system of claim 1, wherein the language models fine-tuning module is configured to fine-tune a plurality of language models based on the plurality of natural language inputs.

3. The system of claim 1, wherein the language models fine-tuning module is configured to tokenize a plurality of labeled texts and unlabeled texts into a plurality of language models.

4. The system of claim 1, wherein the KNN-graph building module is configured to build a distributed KNN-graph.

5. The system of claim 1, wherein the clustering module comprises a clustering technique that requires a number of clusters to be known ahead of time, and a clustering technique that is graph-based that does not require the number of clusters to be known ahead of time.

6. The system of claim 1, wherein the clustering module is configured to perform clustering based on whether a number of clusters is known or unknown, wherein when the number of clusters is unknown, then a Louvain clustering technique is used, and when the number of clusters is known, then a K-means clustering technique is used.

7. The system of claim 1, wherein the clustering module is configured to perform clustering based on whether a number of clusters is predetermined or detected automatically, wherein when the number of clusters is detected automatically, then a Louvain clustering technique is used, and when the number of clusters is predetermined, then a K-means clustering technique is used.

8. The system of claim 1, further comprising an intention mining module.

9. The system of claim 8, wherein the intention mining module is configured to design and refine a plurality of Intelligent Virtual Assistants (IVAs) for customer service and sales support.

10. The system of claim 1, further comprising an output device that receives an output from the analysis platform and determines a plurality of latent intentions using the output.

11. An analysis platform stored on one or more computer-readable tangible storage media, the platform comprising: a language models fine-tuning module that fine-tunes a plurality of language models; a K-nearest neighbor (KNN)-graph building module that builds a distributed KNN-graph; a clustering module that comprises a K-means clustering technique and a Louvain clustering technique, wherein the clustering module is configured to perform clustering based on whether a number of clusters is known or unknown; and an intention mining module that mines a plurality of latent intentions from a plurality of natural language inputs and an output from the clustering module.

12. The analysis platform of claim 11, wherein the language models fine-tuning module fine-tunes a plurality of language models based on the plurality of natural language inputs.

13. The analysis platform of claim 11, wherein the intention mining module is configured to design and refine a plurality of Intelligent Virtual Assistants (IVAs) for customer service and sales support.

14. The analysis platform of claim 11, wherein when the number of clusters is unknown, then the Louvain clustering technique is used, and when the number of clusters is known, then the K-means clustering technique is used.

15. The analysis platform of claim 11, wherein when the number of clusters is detected automatically, then the Louvain clustering technique is used, and when the number of clusters is predetermined, then the K-means clustering technique is used.

16. A method for mining latent intentions from natural language inputs, the method comprising: receiving a plurality of language models based on a plurality of natural language inputs; fine-tuning the plurality of language models; performing clustering using the plurality of fine-tuned language models; and determining a plurality of latent intentions based on results of the clustering; wherein performing clustering comprises performing clustering based on whether a number of clusters is known or unknown, wherein when the number of clusters is unknown, then a Louvain clustering technique is used, and when the number of clusters is known, then a K-means clustering technique is used.

17. The method of claim 16, wherein fine-tuning the plurality of language models comprises encoding the plurality of language models and using a softmax classifier to fine-tune the plurality of language models.

18. The method of claim 16, further comprising building a K-nearest neighbor (KNN)-graph using the plurality of language models, when a number of clusters for performing the clustering is unknown or detected automatically.

19. A method for mining latent intentions from natural language inputs, the method comprising: receiving a plurality of language models based on a plurality of natural language inputs; fine-tuning the plurality of language models; performing clustering using the plurality of fine-tuned language models; and determining a plurality of latent intentions based on results of the clustering; wherein performing clustering comprises performing clustering based on whether a number of clusters is predetermined or detected automatically, wherein when the number of clusters is detected automatically, then a Louvain clustering technique is used, and when the number of clusters is predetermined, then a K-means clustering technique is used.
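The selection rule recited in the clustering claims (K-means when the number of clusters is known, a graph-based Louvain technique over a KNN graph when it is not) might be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the K-means here is plain Lloyd's iteration seeded with the first k points, the KNN graph is unweighted and uses Euclidean distance, and the graph step runs only the local-moving phase of Louvain, without the aggregation phase of the full algorithm. All function names and the toy points are hypothetical.

```python
from typing import Dict, List, Optional, Sequence

Vector = Sequence[float]


def sq_dist(a: Vector, b: Vector) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points: Sequence[Vector], k: int, max_iter: int = 100) -> List[int]:
    """Lloyd's algorithm, seeded deterministically with the first k points."""
    centroids = [list(p) for p in points[:k]]
    labels = [-1] * len(points)
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points]
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def knn_adjacency(points: Sequence[Vector], k: int) -> Dict[int, Dict[int, float]]:
    """Symmetric, unweighted KNN graph built by brute force."""
    adj: Dict[int, Dict[int, float]] = {i: {} for i in range(len(points))}
    for i, p in enumerate(points):
        others = sorted((j for j in range(len(points)) if j != i),
                        key=lambda j: (sq_dist(p, points[j]), j))
        for j in others[:k]:
            adj[i][j] = adj[j][i] = 1.0
    return adj


def louvain_local_moving(adj: Dict[int, Dict[int, float]]) -> Dict[int, int]:
    """Local-moving phase of Louvain only: greedily move nodes to raise modularity."""
    degree = {u: sum(nbrs.values()) for u, nbrs in adj.items()}
    two_m = sum(degree.values())  # twice the total edge weight
    community = {u: u for u in adj}
    if two_m == 0:
        return community
    tot = dict(degree)  # summed degree per community
    improved = True
    while improved:
        improved = False
        for u in adj:
            old = community[u]
            tot[old] -= degree[u]  # take u out of its community
            links: Dict[int, float] = {}
            for v, w in adj[u].items():
                links[community[v]] = links.get(community[v], 0.0) + w
            best = old
            best_gain = links.get(old, 0.0) - tot[old] * degree[u] / two_m
            for c, w_in in links.items():
                gain = w_in - tot[c] * degree[u] / two_m
                if gain > best_gain + 1e-12:
                    best, best_gain = c, gain
            community[u] = best
            tot[best] += degree[u]
            improved = improved or best != old
    return community


def cluster(points: Sequence[Vector], num_clusters: Optional[int] = None,
            k_neighbors: int = 2) -> List[int]:
    """Known cluster count: K-means. Unknown: KNN graph plus community detection."""
    if num_clusters is not None:
        return kmeans(points, num_clusters)
    community = louvain_local_moving(knn_adjacency(points, k_neighbors))
    return [community[i] for i in range(len(points))]


# Toy 2-D "embeddings" (made up): two well-separated groups of three points.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels_known = cluster(points, num_clusters=2)  # K-means path
labels_unknown = cluster(points)                # graph-based path
```

Both paths return one label per input. The K-means labels are indices in the range 0 to k-1, whereas the graph-based labels are arbitrary community identifiers, so only label equality between items is meaningful in the second case.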

Assignees

Verint Americas Inc

Inventors

Classifications

  • based on graph theory, e.g. minimum spanning trees [MST] or graph cuts · CPC title

  • Phrasal analysis, e.g. finite state techniques or chunking · CPC title

  • Semantic analysis · CPC title

  • G06F40/40 · Primary

    Processing or translation of natural language (natural language analysis G06F40/20; semantic analysis G06F40/30) · CPC title

  • G06F16/355 · Primary

    Creation or modification of classes or clusters · CPC title

Frequently asked questions

What does patent US12153897B2 cover?
An analysis platform combines unsupervised and semi-supervised approaches to quickly surface and organize relevant user intentions from conversational text (e.g., from natural language inputs). An unsupervised and semi-supervised pipeline is provided that integrates the fine-tuning of high performing language models via a language models fine-tuning module, a distributed KNN-graph building meth…
Who is the assignee on this patent?
Verint Americas Inc
What technology area does this patent fall under?
Primary CPC classification G06F40/40. Mapped technology areas include Physics.
When was this patent published?
Publication date: Nov 26, 2024 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).