Dynamic faceted search
US-2018232449-A1 · Aug 16, 2018 · US
US11275796B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11275796-B2 |
| Application number | US-201916399030-A |
| Country | US |
| Kind code | B2 |
| Filing date | Apr 30, 2019 |
| Priority date | Apr 30, 2019 |
| Publication date | Mar 15, 2022 |
| Grant date | Mar 15, 2022 |
Official abstract text for this publication.
A query-focused faceted structure generation method, system, and computer program product for generating a query-focused faceted structure from a taxonomy for searching a document collection, including ingesting a document corpus, generating a vector space representation of a query and instances from a taxonomy of the document corpus, and producing a dynamic structure of relevant facet categories and facet values using a two-vector space representation from the generated vector space representation.
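As a rough illustration of the "vector space representation of a query" step, the sketch below builds a query vector as a weighted combination of per-token vectors, which is how the claims describe it. The toy vectors, the token weights, the choice to normalize by the weight sum, and the names `TOPIC_VECTORS`, `query_vector`, and `cosine` are all assumptions made for this example; the publication does not specify them.

```python
import math

# Hypothetical toy "topic model": term -> embedding (values invented for illustration).
TOPIC_VECTORS = {
    "battery": [0.9, 0.1, 0.0],
    "charging": [0.8, 0.3, 0.1],
    "warranty": [0.1, 0.9, 0.2],
}


def query_vector(tokens, weights, vectors):
    """Return the weighted average of the model vectors of the query tokens."""
    dim = len(next(iter(vectors.values())))
    total = [0.0] * dim
    weight_sum = 0.0
    for token in tokens:
        if token not in vectors:
            continue  # assumption: out-of-vocabulary tokens are ignored
        w = weights.get(token, 1.0)  # assumption: unweighted tokens count as 1.0
        weight_sum += w
        for i, x in enumerate(vectors[token]):
            total[i] += w * x
    if weight_sum == 0.0:
        return total  # no known tokens: zero vector
    return [x / weight_sum for x in total]


def cosine(a, b):
    """Cosine similarity; 0.0 when either vector has zero length."""
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return sum(x * y for x, y in zip(a, b)) / norm if norm else 0.0


# "please" is not in the toy model, so it does not contribute.
q = query_vector(
    ["battery", "warranty", "please"],
    {"battery": 3.0, "warranty": 1.0},
    TOPIC_VECTORS,
)
print(q)
```

Because "battery" carries three times the weight of "warranty" here, the resulting query vector sits closer to the "battery" term under cosine similarity. Any other weighting scheme (for example, inverse document frequency) would slot into the `weights` argument the same way.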
Opening claim text (preview).
What is claimed is:
1. A computer-implemented query-focused faceted structure generation method for generating a query-focused faceted structure from a taxonomy for searching a document collection, the method comprising: ingesting a document corpus including a pre-processing that filters parts of speech; generating a vector space representation of a query and instances from a taxonomy of the document corpus via at least two models, the taxonomy being loaded and including a graph of a type and instance nodes where the instance nodes have a consistent relationship to the type; and producing a dynamic structure of a relevant category and facet using a two-vector space representation from the generated vector space representation based on a separate two-vector space representation of the at least two models, wherein the ingesting ingests the document corpus by: extracting the terminology that includes noun words and phrases from the document corpus to: train a type model that generates a phrase embedding of the terminology in the document corpus; and train a topic model that generates a second phrase embedding of the terminology in the document corpus, wherein the generating generates a vector for a user query as a weighted combination of the vector for each query token in the topic model as a query vector, wherein the generating generates a list of the vectors for instances from the taxonomy in the topic model, and wherein the producing produces the dynamic structure of the relevant category and the facet by: selecting a first parameter of nearest neighbor instances to the query vector from the taxonomy instances using the topic model as query-similar instances; selecting a second parameter of types in the taxonomy with a most number of query-similar instances to use as categories; selecting a third parameter of facets from instances of the types corresponding to each of the categories for the second parameter; and expanding from the third parameter of the facets within each of the second parameter of the categories to obtain more category-similar instances from the document corpus using the type model.
2. The method of claim 1, further comprising returning the dynamic structure as a data file to a user.
3. The method of claim 1, wherein the facets are ranked within each of the first parameter of the categories by distance to both: the query vector in the topic model vector space, and a centroid of the third parameter of instances that correspond to the category.
4. The method of claim 1, embodied in a cloud-computing environment.
5. A computer program product for query-focused faceted structure generation, the computer program product comprising a computer-readable storage medium having program instructions embodied therewith for generating a query-focused faceted structure from a taxonomy for searching a document collection, the program instructions executable by a computer to cause the computer to perform: ingesting a document corpus including a pre-processing that filters parts of speech; generating a vector space representation of a query and instances from a taxonomy of the document corpus via at least two models, the taxonomy being loaded and including a graph of a type and instance nodes where the instance nodes have a consistent relationship to the type; and producing a dynamic structure of a relevant category and facet using a two-vector space representation from the generated vector space representation based on a separate two-vector space representation of the at least two models, wherein the ingesting ingests the document corpus by: extracting the terminology that includes noun words and phrases from the document corpus to: train a type model that generates a phrase embedding of the terminology in the document corpus; and train a topic model that generates a second phrase embedding of the terminology in the document corpus, wherein the generating generates a vector for a user query as a weighted combination of the vector for each query token in the topic model as a query vector, wherein the generating generates a list of the vectors for instances from the taxonomy in the topic model, and wherein the producing produces the dynamic structure of the relevant category and the facet by: selecting a first parameter of nearest neighbor instances to the query vector from the taxonomy instances using the topic model as query-similar instances; selecting a second parameter of types in the taxonomy with a most number of query-similar instances to use as categories; selecting a third parameter of facets from instances of the types corresponding to each of the categories for the second parameter; and expanding from the third parameter of the facets within each of the second parameter of the categories to obtain more category-similar instances from the document corpus using the type model.
6. The computer program product of claim 5, further comprising returning the dynamic structure as a data file to a user.
7. The computer program product of claim 5, wherein the facets are ranked within each of the first parameter of the categories by distance to both: the query vector in the topic model vector space, and a centroid of the third parameter of instances that correspond to the category.
8. A query-focused faceted structure generation system for generating a query-focused faceted structure from a taxonomy for searching a document collection, the system comprising: a processor; and a memory, the memory storing instructions to cause the processor to perform: ingesting a document corpus including a pre-processing that filters parts of speech; generating a vector space representation of a query and instances from a taxonomy of the document corpus via at least two models, the taxonomy being loaded and including a graph of a type and instance nodes where the instance nodes have a consistent relationship to the type; and producing a dynamic structure of a relevant category and facet using a two-vector space representation from the generated vector space representation based on a separate two-vector space representation of the at least two models, wherein the ingesting ingests the document corpus by: extracting the terminology that includes noun words and phrases from the document corpus to: train a type model that generates a phrase embedding of the terminology in the document corpus; and train a topic model that generates a second phrase embedding of the terminology in the document corpus, wherein the generating generates a vector for a user query as a weighted combination of the vector for each query token in the topic model as a query vector, wherein the generating generates a list of the vectors for instances from the taxonomy in the topic model, and wherein the producing produces the dynamic structure of the relevant category and the facet by: selecting a first parameter of nearest neighbor instances to the query vector from the taxonomy instances using the topic model as query-similar instances; selecting a second parameter of types in the taxonomy with a most number of query-similar instances to use as categories; selecting a third parameter of facets from instances of the types corresponding to each of the categories for the second parameter; and expanding from the third parameter of the facets within each of the second parameter of the categories to obtain more category-similar instances from the document corpus using the type model.
9. The system of claim 8, further comprising returning the dynamic structure as a data file to a user.
10. The system of claim 8, embodied in a cloud-computing environment.
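
The four "producing" steps recited in the independent claims can be read as a small selection pipeline. The sketch below is one possible reading under stated assumptions, not the patented implementation: the taxonomy is reduced to a hypothetical instance-to-type dictionary, similarity is cosine, facets are taken as the instances of a category's type closest to the query, and expansion picks the corpus terms nearest to the facet centroid in the type-model space. All names, parameters (`k_instances`, `n_categories`, `m_facets`, `n_expand`), and vectors are invented for illustration.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity; 0.0 when either vector has zero length."""
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return sum(x * y for x, y in zip(a, b)) / norm if norm else 0.0


def nearest(target, vectors, k):
    """Names of the k entries in `vectors` most similar to `target`."""
    ranked = sorted(vectors, key=lambda name: cosine(target, vectors[name]), reverse=True)
    return ranked[:k]


def build_facets(query_vec, taxonomy, topic_vecs, type_vecs,
                 k_instances, n_categories, m_facets, n_expand):
    """Sketch of the claimed selection steps; `taxonomy` maps instance -> type."""
    # Step 1: taxonomy instances nearest to the query in the topic-model space.
    instance_vecs = {i: topic_vecs[i] for i in taxonomy if i in topic_vecs}
    similar = nearest(query_vec, instance_vecs, k_instances)

    # Step 2: types with the most query-similar instances become categories.
    counts = Counter(taxonomy[i] for i in similar)
    categories = [t for t, _ in counts.most_common(n_categories)]

    structure = {}
    for category in categories:
        # Step 3 (assumption): facets are the instances of this type closest to the query.
        of_type = {i: v for i, v in instance_vecs.items() if taxonomy[i] == category}
        facets = nearest(query_vec, of_type, m_facets)

        # Step 4 (assumption): expand with corpus terms nearest to the facet
        # centroid in the separate type-model space.
        facet_vecs = [type_vecs[f] for f in facets if f in type_vecs]
        expanded = []
        if facet_vecs:
            centroid = [sum(col) / len(facet_vecs) for col in zip(*facet_vecs)]
            candidates = {t: v for t, v in type_vecs.items() if t not in facets}
            expanded = nearest(centroid, candidates, n_expand)
        structure[category] = {"facets": facets, "expanded": expanded}
    return structure


# Toy data, invented for this example.
taxonomy = {
    "lithium-ion": "battery type",
    "nickel-metal": "battery type",
    "fast charger": "accessory",
    "usb cable": "accessory",
    "extended warranty": "service plan",
}
topic_vecs = {
    "lithium-ion": [0.9, 0.1],
    "nickel-metal": [0.8, 0.2],
    "fast charger": [0.7, 0.5],
    "usb cable": [0.3, 0.8],
    "extended warranty": [0.0, 1.0],
}
type_vecs = {
    "lithium-ion": [1.0, 0.0],
    "nickel-metal": [0.9, 0.1],
    "solid-state": [0.95, 0.05],  # corpus term that is not in the taxonomy
    "fast charger": [0.1, 0.9],
    "invoice": [0.0, 1.0],
}

result = build_facets([1.0, 0.1], taxonomy, topic_vecs, type_vecs,
                      k_instances=3, n_categories=1, m_facets=2, n_expand=1)
print(result)
```

Keeping `topic_vecs` and `type_vecs` as separate dictionaries mirrors the two-model split in the claims: the topic space decides what is relevant to the query, while the type space decides what resembles the chosen facets. The ranking by distance to both the query vector and a centroid recited in claims 3 and 7 is not covered by this sketch.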
Combinations of networks · CPC title
Feedforward networks · CPC title
Machine learning · CPC title
Knowledge engineering; Knowledge acquisition · CPC title
Document management systems · CPC title