System and method for semantically exploring concepts

US2016012818A1 · US · A1

Patent metadata
Field	Value
Publication number	US-2016012818-A1
Application number	US-201414327476-A
Country	US
Kind code	A1
Filing date	Jul 9, 2014
Priority date	Jul 9, 2014
Publication date	Jan 14, 2016
Grant date	—

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title: What the patent document calls the invention.
Abstract: A short plain-language summary of the technical disclosure.
Assignees and inventors: Who owns or filed the patent and who is credited as inventor.
Key dates: Filing, priority, publication, and grant dates set the timeline.
First independent claim: The legal scope of protection. Read this for what is actually claimed.
CPC / IPC classifications: Technology tags used to group this patent with similar filings.
Citations and related patents: Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method for detecting and categorizing topics in a plurality of interactions includes: extracting, by a processor, a plurality of fragments from the plurality of interactions; filtering, by the processor, the plurality of fragments to generate a filtered plurality of fragments; clustering, by the processor, the filtered fragments into a plurality of base clusters; and clustering, by the processor, the plurality of base clusters into a plurality of hyper clusters.
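The four steps named in the abstract (extract, filter, base clusters, hyper clusters) can be sketched as a small pipeline. This is only an illustration under stated assumptions, not the patented method: the abstract does not say how fragments are extracted, filtered, or clustered, so the sketch swaps in punctuation splitting, a stop-word filter, and greedy Jaccard-overlap clustering. All function names, the stop-word list, and the thresholds are hypothetical.

```python
import re

# Hypothetical stop-word list; the abstract does not define the filter.
STOPWORDS = {"i", "to", "my", "the", "is", "a", "an", "of", "and", "it"}


def extract_fragments(interactions):
    """Split each interaction on punctuation (a stand-in for rule-based extraction)."""
    fragments = []
    for text in interactions:
        for piece in re.split(r"[.,;!?]", text.lower()):
            piece = piece.strip()
            if piece:
                fragments.append(piece)
    return fragments


def content_words(fragment):
    """Return the set of non-stop-words in a fragment."""
    return frozenset(w for w in re.findall(r"[a-z']+", fragment) if w not in STOPWORDS)


def filter_fragments(fragments, min_words=2):
    """Keep only fragments with at least min_words content words."""
    return [f for f in fragments if len(content_words(f)) >= min_words]


def jaccard(a, b):
    """Jaccard overlap between two word sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def greedy_cluster(items, key, threshold):
    """Attach each item to the first cluster whose word set overlaps enough,
    otherwise start a new cluster. The result depends on input order."""
    clusters = []
    for item in items:
        words = key(item)
        for cluster in clusters:
            if jaccard(cluster["words"], words) >= threshold:
                cluster["members"].append(item)
                cluster["words"] = cluster["words"] | words
                break
        else:
            clusters.append({"words": words, "members": [item]})
    return clusters


def detect_topics(interactions, base_threshold=0.3, hyper_threshold=0.1):
    """Run extract -> filter -> base clusters -> hyper clusters."""
    filtered = filter_fragments(extract_fragments(interactions))
    base = greedy_cluster(filtered, content_words, base_threshold)
    # Hyper clusters group base clusters, using each base cluster's pooled words.
    hyper = greedy_cluster(base, lambda c: c["words"], hyper_threshold)
    return filtered, base, hyper
```

Calling `detect_topics` on a list of transcript strings returns the filtered fragments, the base clusters, and the hyper clusters, where each hyper cluster's `members` are base clusters. The looser second threshold is what lets related base clusters merge into one broader topic.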

First claim

Opening claim text (preview).

What is claimed is:

1. A method for detecting and categorizing topics in a plurality of interactions, the method comprising: extracting, by a processor, a plurality of fragments from the plurality of interactions; filtering, by the processor, the plurality of fragments to generate a filtered plurality of fragments; clustering, by the processor, the filtered fragments into a plurality of base clusters; and clustering, by the processor, the plurality of base clusters into a plurality of hyper clusters.

2. The method of claim 1, wherein the extracting the plurality of fragments from the plurality of interactions comprises: receiving, by the processor, text corresponding to the plurality of interactions; tagging, by the processor, portions of the text based on parts of speech; and extracting, by the processor, fragments from the text in accordance with one or more extraction rules.

3. The method of claim 2, wherein the text corresponding to the plurality of interactions comprises an output of an automatic speech recognition engine, the output being generated by processing at least one of the plurality of interactions through the automatic speech recognition engine.

4. The method of claim 2, wherein the one or more extraction rules comprise a part of speech sequence.

5. The method of claim 2, wherein the one or more extraction rules are automatically generated by the processor based on a plurality of manually extracted fragments.

6. The method of claim 1, further comprising labeling, by the processor, a base cluster of the plurality of base clusters, the labeling comprising: extracting, by the processor, a plurality of noun phrases from the base cluster; computing, by the processor, a distribution of probabilities of stems of the noun phrases; and identifying, by the processor, a label noun phrase of the noun phrases, the label noun phrase having a highest probability based on the stem distribution.

7. The method of claim 1, wherein the clustering the plurality of base clusters into the plurality of hyper clusters comprises: computing, by the processor, a plurality of semantic distances between pairs of the plurality of base clusters; and clustering, by the processor, the base clusters into the hyper clusters in accordance with the semantic distances.

8. The method of claim 7, wherein the plurality of semantic distances are computed based on semantic similarities of the pairs of base clusters and co-occurrence of fragments in the pairs of base clusters.

9. The method of claim 1, further comprising: generating, by the processor, a visualization of the plurality of topics as organized into a hierarchy based on the plurality of hyper clusters, at least one of the hyper clusters comprising a plurality of corresponding base clusters, each of the base clusters comprising a corresponding plurality of fragments.

10. A system comprising: a processor; and a memory, wherein the memory has stored thereon instructions that, when executed by the processor, cause the processor to: receive a plurality of interactions; extract a plurality of fragments from the plurality of interactions; filter the plurality of fragments to generate a filtered plurality of fragments; cluster the filtered fragments into a plurality of base clusters; and cluster the plurality of base clusters into a plurality of hyper clusters.

11. The system of claim 10, wherein the instructions that cause the processor to extract the plurality of fragments from the plurality of interactions comprise instructions that, when executed by the processor, cause the processor to: receive text corresponding to the plurality of interactions; tag portions of the text based on parts of speech; and extract fragments from the text in accordance with one or more extraction rules.

12. The system of claim 11, wherein the text corresponding to the plurality of interactions comprises an output of an automatic speech recognition engine, the output being generated by processing at least one of the plurality of interactions through the automatic speech recognition engine.

13. The system of claim 11, wherein the one or more extraction rules comprise a part of speech sequence.

14. The system of claim 11, wherein the memory further has stored thereon instructions that, when executed by the processor, cause the processor to generate the one or more extraction rules based on a plurality of manually extracted fragments.

15. The system of claim 10, wherein the memory further has stored thereon instructions that, when executed by the processor, cause the processor to label a base cluster of the plurality of base clusters by: extracting a plurality of noun phrases from the base cluster; computing a distribution of probabilities of stems of the noun phrases; and identifying a label noun phrase of the noun phrases, the label noun phrase having a highest probability based on the stem distribution.

16. The system of claim 10, wherein the instructions that cause the processor to cluster the plurality of base clusters into the plurality of hyper clusters comprise instructions that, when executed by the processor, cause the processor to: compute a plurality of semantic distances between pairs of the plurality of base clusters; and cluster the base clusters in into the hyper clusters in accordance with the semantic distances.

17. The system of claim 16, wherein the instructions that cause the processor to compute the plurality of semantic distances between the pairs of the base clusters comprise instructions to compute a semantic distance of the semantic distances based on semantic similarities between the pairs of the base clusters and co-occurrence of fragments in the pairs of the base clusters.

18. The system of claim 10, wherein the memory further has stored thereon instructions that, when executed by the processor, cause the processor to generate a visualization of a plurality of topics as organized into a hierarchy based on the plurality of hyper clusters, at least one of the hyper clusters comprising a plurality of corresponding base clusters, each of the base clusters comprising a corresponding plurality of fragments.
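The cluster-labeling step recited in the claims (extract noun phrases, compute a probability distribution over their stems, pick the noun phrase with the highest probability) might look roughly like the sketch below. The claim text shown here does not say how a phrase's probability is derived from the stem distribution, so this sketch assumes the mean probability of the phrase's stems. The suffix-stripping `naive_stem` is a hypothetical stand-in for a real stemmer, and the noun phrases are assumed to have been extracted already.

```python
from collections import Counter


def naive_stem(word):
    """Crude suffix stripper; a hypothetical stand-in for a real stemmer."""
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def stem_distribution(noun_phrases):
    """Relative frequency of each stem across all noun phrases in a cluster."""
    stems = [naive_stem(w) for phrase in noun_phrases for w in phrase.lower().split()]
    counts = Counter(stems)
    total = sum(counts.values())
    return {stem: n / total for stem, n in counts.items()}


def label_cluster(noun_phrases):
    """Pick the noun phrase whose stems have the highest mean probability.

    Scoring by the mean is an assumption of this sketch; ties go to the
    phrase that appears first in the input.
    """
    dist = stem_distribution(noun_phrases)

    def score(phrase):
        words = phrase.lower().split()
        return sum(dist[naive_stem(w)] for w in words) / len(words)

    return max(noun_phrases, key=score)
```

For a base cluster whose noun phrases are `["bill amount", "billing error", "amount due", "late fee"]`, the stems `bill` and `amount` each account for 2 of the 8 stem occurrences, so `label_cluster` selects `"bill amount"`.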

Assignees

Genesys Telecomm Lab Inc

Inventors

Classifications

G06F40/30 · Semantic analysis
G06F40/268 (Primary) · Morphological analysis
G06F16/35 · Clustering; Classification
G10L15/063 (Primary) · Training
G06F16/285 · Clustering or classification

Patent family

Related publications grouped by family.

Patent family ID: 55064922

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2016012818A1 cover?: A method for detecting and categorizing topics in a plurality of interactions includes: extracting, by a processor, a plurality of fragments from the plurality of interactions; filtering, by the processor, the plurality of fragments to generate a filtered plurality of fragments; clustering, by the processor, the filtered fragments into a plurality of base clusters; and clustering, by the proces…
Who is the assignee on this patent?: Genesys Telecomm Lab Inc
What technology area does this patent fall under?: Primary CPC classification G06F40/268. Mapped technology areas include Physics.
When was this patent published?: Published Jan 14, 2016 (kind code A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 5 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Related patents

Systems and methods for identifying key phrase clusters within documents

Phrase extraction using subphrase scoring

Keywords extraction and enrichment via categorization systems

Methods and apparatus for performing transformation techniques for data clustering and/or classification

System and Method for Creating Labels for Clusters
