Intelligent online personal assistant with offline visual search database
US-2018107685-A1 · Apr 19, 2018 · US
US11036776B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11036776-B2 |
| Application number | US-201916436882-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jun 10, 2019 |
| Priority date | Nov 8, 2016 |
| Publication date | Jun 15, 2021 |
| Grant date | Jun 15, 2021 |
Official abstract text for this publication.
Clustering a set of natural language queries (NLQs) based on a set of significant events retrieved from a corpus stored in a computer system is described. A set of NLQs is used by a search engine for searching a selected corpus to retrieve respective sets of significant events. The set of NLQs is clustered into a plurality of NLQ clusters according to a number of common significant events being returned by the search engine for respective members of an NLQ cluster.
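As an illustration of the clustering idea in the abstract, the following is a minimal sketch, not the patented method. It assumes each NLQ has already been mapped to a set of event identifiers, links two NLQs when they share at least `min_shared` events, and treats each connected group of linked NLQs as a cluster. The function name, the `min_shared` parameter, the union-find grouping, and the sample data are all hypothetical; the abstract states only that clustering depends on the number of common significant events.

```python
from itertools import combinations


def cluster_by_shared_events(events_by_nlq, min_shared=2):
    """Group NLQs whose retrieved event sets share at least min_shared events.

    Assumption: linked NLQs are merged transitively (connected components).
    """
    nlqs = list(events_by_nlq)
    parent = {q: q for q in nlqs}

    def find(q):
        # Follow parent links to the cluster representative, compressing the path.
        while parent[q] != q:
            parent[q] = parent[parent[q]]
            q = parent[q]
        return q

    for a, b in combinations(nlqs, 2):
        if len(events_by_nlq[a] & events_by_nlq[b]) >= min_shared:
            parent[find(a)] = find(b)

    clusters = {}
    for q in nlqs:
        clusters.setdefault(find(q), []).append(q)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical data: NLQ label -> identifiers of significant events returned for it.
events_by_nlq = {
    "q1": {"e1", "e2", "e3"},
    "q2": {"e2", "e3", "e4"},
    "q3": {"e7", "e8"},
    "q4": {"e7", "e8", "e9"},
    "q5": {"e1"},
}
clusters = cluster_by_shared_events(events_by_nlq, min_shared=2)
print(clusters)
```

With this sample data, `q1` and `q2` share two events and fall into one cluster, `q3` and `q4` form another, and `q5` stays alone because it shares only one event with `q1`.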
Opening claim text (preview).
The invention claimed is:
1. An improved method for searching a selected corpus by clustering a set of natural language queries (NLQ) based on a set of significant events retrieved from a corpus stored in a computer system comprising: using a set of NLQs by a search engine for searching a selected corpus to retrieve respective sets of significant events from the selected corpus; for each NLQ in the set of NLQs, using a first set of entities from the NLQ and using the first set of entities to search for a first set of significant events in the selected corpus; using a second set of entities from the first set of significant events to search for a second set of significant events in the selected corpus; producing a distribution profile for each NLQ based on a number of common significant events retrieved using the first set of entities and a number of common significant events retrieved using the second set of entities; clustering the set of NLQs into NLQ clusters according to the distribution profiles; and using a respective NLQ cluster for a function in the search engine.
2. The method as recited in claim 1, wherein the first set of significant events in the selected corpus are determined in a first search pass and the second set of significant events in the selected corpus in a second search pass.
3. The method as recited in claim 1, wherein the clustering is also based in part on common linguistic and semantic features of respective NLQs.
4. The method as recited in claim 3, further comprising: from user input, receiving a threshold number of common significant events as a clustering criterion; and from user input, receiving a threshold number of common linguistic and semantic features in an NLQ as a clustering criterion.
5. The method as recited in claim 2, further comprising: building a knowledge graph based on a selected corpus stored in the computer system, the knowledge graph having a set of co-occurrence scores on edges of the knowledge graph between respective events in the selected corpus placed at the nodes of the knowledge graph, wherein the co-occurrence scores indicate co-occurrence of entities within respective pairs of events in the selected corpus; and using the knowledge graph to extract the second set of entities.
6. The method as recited in claim 2, further comprising: extracting a third set of entities from the second set of significant events and using the third set of entities to search for a third set of significant events in the selected corpus in a third search pass; and producing a distribution profile for each NLQ based on a number of significant events retrieved in the first search pass, the second search pass and the third search pass.
7. The method as recited in claim 2, further comprising: determining a significance score for respective events retrieved by the search system according to a metric of mutual information (MMI); and filtering the retrieved events according to respective significance scores to produce the first set of significant events.
8. The method as recited in claim 1, wherein the first and second sets of entities have no common members.
9. Apparatus, comprising: a processor; computer memory holding computer program instructions executed by the processor for improved searching of a selected corpus by clustering a set of natural language queries (NLQ), the computer program instructions comprising: program code, operative to use a set of NLQs by for searching a selected corpus to retrieve respective sets of significant events from the selected corpus; program code, operative for each NLQ in the set of NLQs to use a first set of entities from the NLQ and to use the first set of entities to search for a first set of significant events in the selected corpus; program code, operative to use a second set of entities from the first set of significant events to search for a second set of significant events in the selected corpus; program code, operative to produce a distribution profile for each NLQ based on a number of common significant events retrieved using the first set of entities and a number of common significant events retrieved using the second set of entities; program code, operative to cluster the set of NLQs into NLQ clusters according to the distribution profiles; and program code, operative to use a respective NLQ cluster for a function in the search engine.
10. The apparatus as recited in claim 9, wherein the first set of significant events in the selected corpus are determined in a first search pass and the second set of significant events in the selected corpus in a second search pass.
11. The apparatus as recited in claim 9, wherein the clustering is also based in part on common linguistic and semantic features of respective NLQs.
12. The apparatus as recited in claim 11, further comprising: program code, operative to receive a threshold number of common significant events as a clustering criterion; and program code, operative to receive a threshold number of common linguistic and semantic features in an NLQ as a clustering criterion.
13. The apparatus as recited in claim 10, further comprising: program code, operative to build a knowledge graph based on a selected corpus stored in the computer system, the knowledge graph having a set of co-occurrence scores on edges of the knowledge graph between in the selected corpus placed at the nodes of the knowledge graph, wherein the co-occurrence scores indicate co-occurrence of entities within respective pairs of events in the selected corpus; and program code, operative to use the knowledge graph to extract the second set of entities.
14. The apparatus as recited in claim 10, further comprising: program code, operative to extract a third set of entities from the second set of significant events and using the third set of entities to search for a third set of significant events in the selected corpus in a third search pass; and program code, operative to produce a distribution profile for each NLQ based on a number of significant events retrieved in the first search pass, the second search pass and the third search pass.
15. A computer program product in a non-transitory computer readable medium for use in a data processing system, the computer program product holding computer program instructions executed by the data processing system for improved searching of a selected corpus by performing clustering of natural language queries (NLQ), the computer program instructions comprising: program code, operative to use a set of NLQs by for searching a selected corpus to retrieve respective sets of significant events from the selected corpus; program code, operative for each NLQ in the set of NLQs to use a first set of entities from the NLQ and to use the first set of entities to search for a first set of significant events in the selected corpus; program code, operative to use a second set of entities from the first set of significant events to search for a second set of significant events in the selected corpus; program code, operative to produce a distribution profile for each NLQ based on a number of common significant events retrieved using the first set of entities and a number of common significant events retrieved using the second set of entities; program code, operative to cluster the set of NLQs into NLQ clusters according to the distribution profiles; and program code, operative to use a respective NLQ cluster for a function in the search engine.
16. The computer program product as recited in claim 15, wherein the first set of signi
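The two search passes recited in the independent claims can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the claimed implementation: the corpus is a hypothetical dictionary mapping event identifiers to entity sets, a search returns every event that mentions at least one query entity, the second set of entities is drawn from the first-pass events with the query's own entities removed, and the comparison of two NLQs is reduced to a pair of counts (events shared in the first pass, events shared in the second pass). All function names and data are invented for the example.

```python
def search(corpus, entities):
    """Return ids of events mentioning at least one of the given entities (assumed matching rule)."""
    return {event_id for event_id, ents in corpus.items() if ents & entities}


def two_pass(corpus, query_entities):
    """Run a first pass on the NLQ's entities, then a second pass on entities found in pass one."""
    first = search(corpus, query_entities)
    # Second set of entities: taken from first-pass events, excluding the query's own entities.
    second_entities = set().union(*(corpus[e] for e in first)) - query_entities
    # Keep only events not already retrieved in the first pass.
    second = search(corpus, second_entities) - first
    return first, second


def common_counts(passes_a, passes_b):
    """Count events two NLQs share in the first pass and in the second pass."""
    return (len(passes_a[0] & passes_b[0]), len(passes_a[1] & passes_b[1]))


# Hypothetical corpus: event id -> entities mentioned in that event.
corpus = {
    "e1": {"merger", "acme"},
    "e2": {"acme", "lawsuit"},
    "e3": {"lawsuit", "regulator"},
    "e4": {"storm", "florida"},
}
passes_a = two_pass(corpus, {"merger"})
passes_b = two_pass(corpus, {"acme"})
passes_c = two_pass(corpus, {"storm"})
print(common_counts(passes_a, passes_b), common_counts(passes_a, passes_c))
```

Pairs of counts like these could then feed a clustering step; how the claimed distribution profile is actually computed and compared is not specified in the text shown here.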
using natural language analysis · CPC title
Clustering; Classification · CPC title