What technology area does this patent fall under?

Primary CPC classification G06F40/35. Mapped technology areas include Physics.

When was this patent published?

Publication date: Mar 28, 2017 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Unsupervised clustering of dialogs extracted from released application logs

US9606984B2 · US · B2

Patent metadata
Field	Value
Publication number	US-9606984-B2
Application number	US-201313969825-A
Country	US
Kind code	B2
Filing date	Aug 19, 2013
Priority date	Aug 19, 2013
Publication date	Mar 28, 2017
Grant date	Mar 28, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A natural language understanding system performs automatic unsupervised clustering of dialog data from a natural language dialog application. A log parser automatically extracts structured dialog data from application logs. A dialog generalizing module generalizes the extracted dialog data to generalization identifier vectors. A data clustering module automatically clusters the dialog data based on the generalization identifier vectors using an unsupervised density-based clustering algorithm without a predefined number of clusters and without a predefined distance threshold in an iterative approach based on a hierarchical ordering of the generalization.
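The first two stages named in the abstract (log parsing and generalization to identifier vectors) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the pipe-delimited log format, the `Turn` type, and the two toy generalization methods `speaker_category` and `content_category`.

```python
import re
from typing import NamedTuple


class Turn(NamedTuple):
    speaker: str  # "user" or "system" (hypothetical labels)
    text: str


def parse_log(lines):
    """Group log lines into dialogs.

    Assumes a made-up line format: "<dialog_id>|<speaker>|<text>".
    """
    dialogs = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        dialog_id, speaker, text = line.split("|", 2)
        dialogs.setdefault(dialog_id, []).append(Turn(speaker, text))
    return dialogs


# Two independent toy generalization methods (illustrative only).
def speaker_category(turn):
    return "USR" if turn.speaker == "user" else "SYS"


def content_category(turn):
    text = turn.text.lower()
    if re.search(r"\d", text):
        return "HAS_NUMBER"
    if text.endswith("?"):
        return "QUESTION"
    return "STATEMENT"


GENERALIZERS = (speaker_category, content_category)


def generalize(dialog):
    """Map each turn to a tuple holding one category per generalization method."""
    return tuple(tuple(g(turn) for g in GENERALIZERS) for turn in dialog)


log_lines = [
    "d1|system|How can I help?",
    "d1|user|Call 555 1234",
    "d2|system|How may I help you?",
    "d2|user|Dial 555 9876",
]
dialogs = parse_log(log_lines)
vectors = {dialog_id: generalize(turns) for dialog_id, turns in dialogs.items()}
```

In this toy input the two dialogs use different wording but generalize to the same identifier vector, which is the property a later clustering step would rely on.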

First claim

Opening claim text (preview).

What is claimed is:
1. A natural language understanding system using at least one hardware implemented computer processor for automatic unsupervised clustering of dialog data from a natural language dialog application, the arrangement comprising: a log parser configured to extract structured dialog data from application logs, the structured dialog data including a transcription of a dialog between a user and the natural language dialog application, the transcription being generated by the natural language dialog application; a dialog generalizing module configured to automatically generalize the extracted dialog data using different independent generalization methods to produce generalization identifier vectors aggregating the results of the generalization methods used, the generalization identifier vectors indicating descriptive categories of the different independent generalization methods that correspond to statements by the user and statements by the natural language dialog application included in the extracted dialog data; and a data clustering module configured to automatically cluster the dialog data based on the generalization identifier vectors using an unsupervised density-based clustering algorithm without a predefined number of clusters and without a predefined distance threshold.
2. The system according to claim 1, further comprising: a dialog information database configured to store the clustered dialog data.
3. The system according to claim 1, wherein the generalization identifier vectors include sequences of application state identifiers characterizing internal transition of the state of the dialog application.
4. The system according to claim 1, wherein the data clustering module further post-processes the clustered dialog data to add additional cluster characteristic information.
5. The system according to claim 1, wherein the clustering algorithm flattens hierarchic clusters of dialog data.
6. The system according to claim 1, wherein the clustering algorithm is an iterative clustering algorithm.
7. A computer-implemented method using at least one hardware implemented computer processor for automatic unsupervised clustering of dialog data from a natural language dialog application, the method comprising: automatically generalizing structured dialog data extracted from application logs using different independent generalization methods to produce generalization identifier vectors aggregating the results of the generalization methods used, the structured dialog data including a transcription of a dialog between a user and the natural language dialog application, the transcription being generated by the natural language dialog application, the generalization identifier vectors indicating descriptive categories of the different independent generalization methods that correspond to statements by the user and statements by the natural language dialog application included in the extracted dialog data; and automatically clustering the dialog data based on the generalization identifier vectors using an unsupervised density-based clustering algorithm without a predefined number of clusters and without a predefined distance threshold.
8. The method according to claim 7, further comprising: storing the clustered dialog data in a dialog information database.
9. The method according to claim 7, wherein the generalization identifiers include sequences of application state identifiers characterizing internal transition of the state of the dialog application.
10. The method according to claim 7, post-processing the clustered dialog data to add additional cluster characteristic information.
11. The method according to claim 7, wherein the clustering algorithm flattens hierarchic clusters of dialog data.
12. The method according to claim 7, wherein the clustering algorithm is an iterative clustering algorithm.
13. A computer program product encoded in a non-transitory computer-readable medium for automatic unsupervised clustering of dialog data from a natural language dialog application, the product comprising: program code for automatically generalizing structured dialog data extracted from application logs using different independent generalization methods to produce generalization identifier vectors aggregating the results of the methods used, the structured dialog data including a transcription of a dialog between a user and the natural language dialog application, the transcription being generated by the natural language dialog application, the generalization identifier vectors indicating descriptive categories of the different independent generalization methods that correspond to statements by the user and statements by the natural language dialog application included in the extracted dialog data; and program code for automatically clustering the dialog data based on the generalization identifier vectors using an unsupervised density-based clustering algorithm without a predefined number of clusters and without a predefined distance threshold.
14. The product according to claim 13, further comprising: program code for storing the clustered dialog data in a dialog information database.
15. The product according to claim 13, wherein the generalization identifiers include sequences of application state identifiers characterizing internal transition of the state of the dialog application.
16. The product according to claim 13, program code for post-processing the clustered dialog data to add additional cluster characteristic information.
17. The product according to claim 13, wherein the clustering algorithm flattens hierarchic clusters of dialog data.
18. The product according to claim 13, wherein clustering algorithm is an iterative clustering algorithm.
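To give a feel for clustering with neither a preset cluster count nor a distance threshold, here is a rough sketch of an iterative, level-by-level grouping. It is a simplified stand-in and not the claimed algorithm: it treats the number of dialogs sharing an identical key as a crude density proxy, and the three generalization levels (`exact`, `collapse_repeats`, `unordered`), the `min_size` parameter, and the state identifiers in the sample data are all hypothetical.

```python
from collections import defaultdict


def exact(vec):
    """Most specific level: the identifier sequence as-is."""
    return vec


def collapse_repeats(vec):
    """Coarser level: merge consecutive repeats of the same identifier."""
    out = []
    for item in vec:
        if not out or out[-1] != item:
            out.append(item)
    return tuple(out)


def unordered(vec):
    """Coarsest level: ignore order and multiplicity."""
    return frozenset(vec)


def cluster_by_levels(vectors, levels, min_size=2):
    """Iteratively group dialogs, moving from specific to general levels.

    vectors: dict mapping dialog id -> tuple of identifiers.
    levels:  functions ordered from most specific to most general.
    Dialogs that never share a key with at least min_size - 1 others
    are returned as noise.
    """
    clusters = []
    remaining = dict(vectors)
    for depth, project in enumerate(levels):
        groups = defaultdict(list)
        for dialog_id, vec in remaining.items():
            groups[project(vec)].append(dialog_id)
        for key, members in groups.items():
            if len(members) >= min_size:
                clusters.append(
                    {"level": depth, "key": key, "members": sorted(members)}
                )
                for member in members:
                    del remaining[member]
    return clusters, sorted(remaining)


sample = {
    "d1": ("GREET", "ASK_NUM", "CONFIRM"),
    "d2": ("GREET", "ASK_NUM", "CONFIRM"),
    "d3": ("GREET", "ASK_NUM", "ASK_NUM", "CONFIRM"),
    "d4": ("GREET", "ASK_NUM", "ASK_NUM", "ASK_NUM", "CONFIRM"),
    "d5": ("GREET", "HANGUP"),
}
clusters, noise = cluster_by_levels(sample, [exact, collapse_repeats, unordered])
```

On this sample, `d1` and `d2` cluster at the exact level, `d3` and `d4` only cluster once repeated states are collapsed, and `d5` is left as noise. Dialogs clustered at an earlier level are not revisited, which is a simplification of this sketch rather than something the claims specify.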

Assignees

Nuance Communications Inc

Inventors

Lavallée Jean-Francois

Classifications

G06F40/35 · Primary
Discourse or dialogue representation · CPC title
G10L2015/0631
Creating reference templates; Clustering · CPC title
G10L15/063
Training · CPC title
G10L2015/0633
using lexical or orthographic knowledge sources · CPC title
G06F17/279 · Primary
Physics · mapped topic

Patent family

Related publications grouped by family.

View patent family 52467438

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9606984B2 cover?: A natural language understanding system performs automatic unsupervised clustering of dialog data from a natural language dialog application. A log parser automatically extracts structured dialog data from application logs. A dialog generalizing module generalizes the extracted dialog data to generalization identifier vectors. A data clustering module automatically clusters the dialog data base…
Who is the assignee on this patent?: Nuance Communications Inc