Method and apparatus for processing questions and answers, electronic device and storage medium
US-2021200956-A1 · Jul 1, 2021 · US
US2021319068A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2021319068-A1 |
| Application number | US-202016847622-A |
| Country | US |
| Kind code | A1 |
| Filing date | Apr 13, 2020 |
| Priority date | Apr 13, 2020 |
| Publication date | Oct 14, 2021 |
| Grant date | — |
Official abstract text for this publication.
An in-application search service receives a query from a user attempting to locate information within a document. If a current index and current vector table do not exist, a content indexing service produces an index and a vector table based on the current content of the document. The vector table is based on semantic models that have been pre-trained. The query is tokenized, and one of three different processing paths is taken based on the length of the tokenized query. For a single-term query, a semantic search and a prefix search are performed and the results are combined. If the length of the tokenized query exceeds a threshold, a semantic search produces the results. Otherwise, search results are produced based on both prefix fanout and semantic fanout. Results are deduplicated, snippets are extracted, and the results and/or snippets are presented to the user.
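The three processing paths named in the abstract can be illustrated with a minimal Python sketch. It is not taken from the patent: the function name `route_query`, the whitespace tokenizer, the default `length_threshold` of 4, and the three search callables are all hypothetical stand-ins chosen for illustration.

```python
from typing import Callable, List

# Hypothetical signature: a search backend takes tokens and returns result ids.
SearchFn = Callable[[List[str]], List[str]]


def route_query(
    query: str,
    semantic_search: SearchFn,
    prefix_search: SearchFn,
    fanout_search: SearchFn,
    length_threshold: int = 4,  # assumed value; the abstract does not give one
) -> List[str]:
    """Pick one of three paths based on the tokenized query length."""
    tokens = query.lower().split()  # naive tokenizer, an assumption
    if not tokens:
        return []
    if len(tokens) == 1:
        # Single-term query: semantic and prefix results are combined.
        results = semantic_search(tokens) + prefix_search(tokens)
    elif len(tokens) > length_threshold:
        # Long query: semantic search alone produces the results.
        results = semantic_search(tokens)
    else:
        # Otherwise: results based on prefix fanout and semantic fanout.
        results = fanout_search(tokens)
    # Deduplicate while preserving the order in which results first appeared.
    return list(dict.fromkeys(results))
```

Passing the backends in as callables keeps the routing logic independent of how the index and vector table are implemented, which the abstract leaves open.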
Opening claim text (preview).
What is claimed is:
1. A method for in-application search, comprising: receiving an indication that a user desires to search for content within a document; responsive to the indication, building an index for the document and a vector table for the document, the index comprising information for term searching and the vector table comprising information for semantic searching; performing a term search for a query from the user based upon the index; performing a semantic search for a query from the user based upon the vector table; aggregating search results from both the term search and the semantic search; select a subset of aggregated search results; and present the subset to the user.
2. The method of claim 1 further comprising: identifying a context snippet for a second subset of the aggregated search results; sending the context snippets and the query to a trained machine learning model; receiving from the trained machine learning model a quick answer; presenting the quick answer as part of the subset of results.
3. The method of claim 1 wherein prior to building the document index and vector table: determining whether an existing index for the document and an existing vector table for the document are available; determining whether the document has been modified since the existing index and the existing vector table were created; responsive to determining that an existing index and an existing vector table are available and to determining that the document has not been modified, skipping building the index and vector table and using the existing index for the term search and the existing vector table for the semantic search.
4. The method of claim 1 further comprising: determining a number of results in the subset of search results; responsive to the number falling below a threshold number of results: identifying a secondary search provider; invoking the secondary search provider with the query.
5. The method of claim 4 further comprising: receiving a second set of search results from the secondary search provider; presenting a subset of the second set of search results to the user.
6. The method of claim 4 further comprising: receiving selection of a second document from the secondary search provider; building a second index for the second document and a second vector table for the second document, the second index comprising information for term searching and the second vector table comprising information for semantic searching; performing a second term search for the query based upon the second index; performing a second semantic search for the query based upon the second vector table; aggregating second search results from both the second term search and the second semantic search; select a second subset of the aggregated second search results; and sending the second subset to the secondary search provider.
7. The method of claim 1 wherein performing the semantic search comprises: determining a length of the query; responsive to determining that the length is greater than a threshold, performing a semantic search using the query.
8. The method of claim 7 further comprising: responsive to determining that the length is not greater than the threshold: perform a prefix fanout on a subset of terms in the query; perform a semantic fanout on a subset of the query; generate combinations from the prefix fanout and the semantic fanout; join context locations for the generated combinations; determine whether a number of context locations falls below a second threshold; responsive to determining that the number of context locations does not fall below the second threshold, semantically validate the joined context locations; and responsive to determining that the number of context locations does fall below the second threshold, performing a semantic search using the query.
9. The method of claim 1 further comprising: saving the index and the vector table for later use.
10. The method of claim 1 wherein the index comprises a prefix tree and wherein the vector table is created by vectorizing the document at a plurality of levels.
11. A system comprising a processor and computer executable instructions, that when executed by the processor, cause the system to perform operations comprising: receiving identification of a document; receiving a query to be used to search the document; determining whether an existing index for the document and an existing vector table for the document are available; determining whether the document has been modified since the existing index and the existing vector table were created; responsive to determining that an existing index and an existing vector table are not available or to determining that the document has been modified: building an index for the document and a vector table for the document, the index comprising information for term searching and the vector table comprising information for semantic searching; performing a term search for a query from a user based upon the index; performing a semantic search for a query from the user based upon the vector table; aggregating search results from both the term search and the semantic search; select a subset of aggregated search results; and present the subset to the user.
12. The system of claim 11 further comprising: identifying a context snippet for a second subset of the aggregated search results; sending the context snippets and the query to a trained machine learning model; receiving from the trained machine learning model a quick answer; presenting the quick answer as part of the subset of results.
13. The system of claim 11 further comprising: responsive to determining that an existing index and an existing vector table are available and to determining that the document has not been modified: performing a term search for a query from the user based upon the existing index; performing a semantic search for a query from the user based upon the existing vector table; aggregating search results from both the term search and the semantic search; select a subset of aggregated search results; and present the subset to the user.
14. The system of claim 11 further comprising: determining a number of results in the subset of search results; responsive to the number falling below a threshold number of results: identifying a secondary search provider; invoking the secondary search provider with the query.
15. The system of claim 14 further comprising: receiving a second set of search results from the secondary search provider; presenting a subset of the second set of search results to the user.
16. The system of claim 14 further comprising: receiving selection of a second document from the secondary search provider; building a second index for the second document and a second vector table for the second document, the second index comprising information for term searching and the second vector table comprising information for semantic searching; performing a second term search for the query based upon the second index; performing a second semantic search for the query based upon the second vector table; aggregating second search results from both the second term search and the second semantic search; select a second subset of the aggregated second search results; and sending the second subset to the secondary search provider.
17. The system of claim 11 wherein performing the semantic search comprises: deter
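The reuse-or-rebuild decision recited in claims 3, 9, and 11 (use a saved index and vector table unless they are missing or the document has changed) might be sketched as follows. This is an illustrative assumption, not the patent's implementation: the class `IndexCache`, the `build` callable, and the use of a SHA-256 content fingerprint to detect modification are all hypothetical choices.

```python
import hashlib
from typing import Any, Callable, Dict, Tuple

# Hypothetical builder: takes document text, returns (index, vector_table).
BuildFn = Callable[[str], Tuple[Any, Any]]


class IndexCache:
    """Keeps one (index, vector table) pair per document id."""

    def __init__(self, build: BuildFn) -> None:
        self._build = build
        # doc_id -> (content fingerprint, index, vector table)
        self._store: Dict[str, Tuple[str, Any, Any]] = {}
        self.builds = 0  # counts rebuilds, useful for observing cache hits

    def get(self, doc_id: str, text: str) -> Tuple[Any, Any]:
        # Assumption: a content hash stands in for "has been modified".
        fingerprint = hashlib.sha256(text.encode("utf-8")).hexdigest()
        cached = self._store.get(doc_id)
        if cached is not None and cached[0] == fingerprint:
            # Available and unmodified: skip building, reuse both structures.
            return cached[1], cached[2]
        # Missing or modified: build, then save for later use.
        index, vector_table = self._build(text)
        self.builds += 1
        self._store[doc_id] = (fingerprint, index, vector_table)
        return index, vector_table
```

A timestamp or revision counter supplied by the host application would serve the same purpose as the hash; the claims do not say how modification is detected.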
based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO] · CPC title
Machine learning · CPC title
Natural language query formulation or dialogue systems · CPC title
Creation of semantic tools, e.g. ontology or thesauri · CPC title
by using string matching techniques · CPC title