Smart find for in-application searching

US2021319068A1 · US · A1

Patent metadata
Field: Value
Publication number: US-2021319068-A1
Application number: US-202016847622-A
Country: US
Kind code: A1
Filing date: Apr 13, 2020
Priority date: Apr 13, 2020
Publication date: Oct 14, 2021
Grant date: (none listed)

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An in-application search service receives a query from a user attempting to locate information within a document. If a current index and current vector table do not exist, a content indexing service produces an index and a vector table based on the current content of the document. The vector table is based on semantic models that have been pre-trained. The query is tokenized and one of three different processing paths is taken based on the length of the tokenized query. For a single-term query, a semantic search and a prefix search are performed and the results are combined. If the length of the tokenized query exceeds a threshold, a semantic search produces the results. Otherwise, search results are produced based on both prefix fanout and semantic fanout. Results are deduplicated, snippets are extracted, and the results and/or snippets are presented to the user.
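
The three-way routing described in the abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented implementation: the threshold value, the whitespace tokenizer, the "empty" path, and the stub search functions (`stub_prefix`, `stub_semantic`, `stub_fanout`) are all hypothetical names and choices that do not appear in the source.

```python
from typing import Callable, List, Tuple

# Assumed value; the abstract mentions a threshold but does not state it.
LONG_QUERY_THRESHOLD = 4

Search = Callable[[str], List[str]]


def tokenize(query: str) -> List[str]:
    """Lowercase whitespace tokenizer (an assumption, not from the source)."""
    return query.lower().split()


def dedupe(results: List[str]) -> List[str]:
    """Drop repeated results while keeping first-seen order."""
    return list(dict.fromkeys(results))


def route_query(
    query: str,
    prefix_search: Search,
    semantic_search: Search,
    fanout_search: Callable[[List[str]], List[str]],
    threshold: int = LONG_QUERY_THRESHOLD,
) -> Tuple[str, List[str]]:
    """Pick one of three paths based on the tokenized query length."""
    tokens = tokenize(query)
    if not tokens:
        # Not covered by the abstract; added so the sketch is total.
        return "empty", []
    normalized = " ".join(tokens)
    if len(tokens) == 1:
        # Single term: prefix search and semantic search, results combined.
        path = "single-term"
        results = prefix_search(tokens[0]) + semantic_search(normalized)
    elif len(tokens) > threshold:
        # Long query: semantic search alone produces the results.
        path = "semantic-only"
        results = semantic_search(normalized)
    else:
        # Middle case: prefix fanout plus semantic fanout (stubbed here).
        path = "fanout"
        results = fanout_search(tokens)
    return path, dedupe(results)


# Hypothetical stand-ins for the real search back ends.
def stub_prefix(term: str) -> List[str]:
    return [f"prefix:{term}"]


def stub_semantic(text: str) -> List[str]:
    return [f"semantic:{text}"]


def stub_fanout(tokens: List[str]) -> List[str]:
    return [f"fanout:{' '.join(tokens)}"]
```

Calling `route_query("Budget", stub_prefix, stub_semantic, stub_fanout)` takes the single-term path, while a seven-word question exceeds the assumed threshold and is routed to the semantic-only path. Snippet extraction and presentation, which the abstract also mentions, are left out of this sketch.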

First claim

Opening claim text (preview).

What is claimed is:

1. A method for in-application search, comprising: receiving an indication that a user desires to search for content within a document; responsive to the indication, building an index for the document and a vector table for the document, the index comprising information for term searching and the vector table comprising information for semantic searching; performing a term search for a query from the user based upon the index; performing a semantic search for a query from the user based upon the vector table; aggregating search results from both the term search and the semantic search; select a subset of aggregated search results; and present the subset to the user.

2. The method of claim 1 further comprising: identifying a context snippet for a second subset of the aggregated search results; sending the context snippets and the query to a trained machine learning model; receiving from the trained machine learning model a quick answer; presenting the quick answer as part of the subset of results.

3. The method of claim 1 wherein prior to building the document index and vector table: determining whether an existing index for the document and an existing vector table for the document are available; determining whether the document has been modified since the existing index and the existing vector table were created; responsive to determining that an existing index and an existing vector table are available and to determining that the document has not been modified, skipping building the index and vector table and using the existing index for the term search and the existing vector table for the semantic search.

4. The method of claim 1 further comprising: determining a number of results in the subset of search results; responsive to the number falling below a threshold number of results: identifying a secondary search provider; invoking the secondary search provider with the query.

5. The method of claim 4 further comprising: receiving a second set of search results from the secondary search provider; presenting a subset of the second set of search results to the user.

6. The method of claim 4 further comprising: receiving selection of a second document from the secondary search provider; building a second index for the second document and a second vector table for the second document, the second index comprising information for term searching and the second vector table comprising information for semantic searching; performing a second term search for the query based upon the second index; performing a second semantic search for the query based upon the second vector table; aggregating second search results from both the second term search and the second semantic search; select a second subset of the aggregated second search results; and sending the second subset to the secondary search provider.

7. The method of claim 1 wherein performing the semantic search comprises: determining a length of the query; responsive to determining that the length is greater than a threshold, performing a semantic search using the query.

8. The method of claim 7 further comprising: responsive to determining that the length is not greater than the threshold: perform a prefix fanout on a subset of terms in the query; perform a semantic fanout on a subset of the query; generate combinations from the prefix fanout and the semantic fanout; join context locations for the generated combinations; determine whether a number of context locations falls below a second threshold; responsive to determining that the number of context locations does not fall below the second threshold, semantically validate the joined context locations; and responsive to determining that the number of context locations does fall below the second threshold, performing a semantic search using the query.

9. The method of claim 1 further comprising: saving the index and the vector table for later use.

10. The method of claim 1 wherein the index comprises a prefix tree and wherein the vector table is created by vectorizing the document at a plurality of levels.

11. A system comprising a processor and computer executable instructions, that when executed by the processor, cause the system to perform operations comprising: receiving identification of a document; receiving a query to be used to search the document; determining whether an existing index for the document and an existing vector table for the document are available; determining whether the document has been modified since the existing index and the existing vector table were created; responsive to determining that an existing index and an existing vector table are not available or to determining that the document has been modified: building an index for the document and a vector table for the document, the index comprising information for term searching and the vector table comprising information for semantic searching; performing a term search for a query from a user based upon the index; performing a semantic search for a query from the user based upon the vector table; aggregating search results from both the term search and the semantic search; select a subset of aggregated search results; and present the subset to the user.

12. The system of claim 11 further comprising: identifying a context snippet for a second subset of the aggregated search results; sending the context snippets and the query to a trained machine learning model; receiving from the trained machine learning model a quick answer; presenting the quick answer as part of the subset of results.

13. The system of claim 11 further comprising: responsive to determining that an existing index and an existing vector table are available and to determining that the document has not been modified: performing a term search for a query from the user based upon the existing index; performing a semantic search for a query from the user based upon the existing vector table; aggregating search results from both the term search and the semantic search; select a subset of aggregated search results; and present the subset to the user.

14. The system of claim 11 further comprising: determining a number of results in the subset of search results; responsive to the number falling below a threshold number of results: identifying a secondary search provider; invoking the secondary search provider with the query.

15. The system of claim 14 further comprising: receiving a second set of search results from the secondary search provider; presenting a subset of the second set of search results to the user.

16. The system of claim 14 further comprising: receiving selection of a second document from the secondary search provider; building a second index for the second document and a second vector table for the second document, the second index comprising information for term searching and the second vector table comprising information for semantic searching; performing a second term search for the query based upon the second index; performing a second semantic search for the query based upon the second vector table; aggregating second search results from both the second term search and the second semantic search; select a second subset of the aggregated second search results; and sending the second subset to the secondary search provider.

17. The system of claim 11 wherein performing the semantic search comprises: deter
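
Claims 3, 9, 10 and 11 describe reusing a saved index when the document has not changed, and claim 10 says the index comprises a prefix tree. The sketch below illustrates one way those two ideas could fit together. It is hedged throughout: detecting modification by a SHA-256 content hash, the in-memory `_cache` dictionary, and the names `PrefixIndex`, `CachedIndex` and `get_index` are assumptions made for illustration, and the vector table for semantic search is omitted entirely.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class TrieNode:
    children: Dict[str, "TrieNode"] = field(default_factory=dict)
    # Token offsets of the terms that end exactly at this node.
    positions: List[int] = field(default_factory=list)


class PrefixIndex:
    """Toy prefix tree over whitespace-separated, lowercased terms."""

    def __init__(self, text: str) -> None:
        self.root = TrieNode()
        for pos, term in enumerate(text.lower().split()):
            node = self.root
            for ch in term:
                node = node.children.setdefault(ch, TrieNode())
            node.positions.append(pos)

    def search(self, prefix: str) -> List[int]:
        """Return sorted token offsets of all terms starting with prefix."""
        node = self.root
        for ch in prefix.lower():
            nxt = node.children.get(ch)
            if nxt is None:
                return []
            node = nxt
        found: List[int] = []
        stack = [node]
        while stack:  # walk the subtree below the prefix
            cur = stack.pop()
            found.extend(cur.positions)
            stack.extend(cur.children.values())
        return sorted(found)


@dataclass
class CachedIndex:
    fingerprint: str
    index: PrefixIndex


# Hypothetical in-memory store standing in for "saving for later use".
_cache: Dict[str, CachedIndex] = {}


def fingerprint(text: str) -> str:
    """Content hash used here as the 'has the document been modified' test."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def get_index(doc_id: str, text: str) -> Tuple[PrefixIndex, bool]:
    """Return (index, rebuilt); rebuild only if missing or the text changed."""
    fp = fingerprint(text)
    cached = _cache.get(doc_id)
    if cached is not None and cached.fingerprint == fp:
        return cached.index, False
    index = PrefixIndex(text)
    _cache[doc_id] = CachedIndex(fp, index)
    return index, True
```

A first call to `get_index` for a document builds and stores the index; a second call with identical text returns the stored one, and any change to the text triggers a rebuild. A real implementation might rely on a modification timestamp or version counter instead of hashing the full content.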

Assignees

Inventors

Classifications

  • based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO] · CPC title

  • Machine learning · CPC title

  • Natural language query formulation or dialogue systems · CPC title

  • Creation of semantic tools, e.g. ontology or thesauri · CPC title

  • by using string matching techniques · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2021319068A1 cover?
An in-application search service receives a query from a user attempting to locate information within a document. If a current index and current vector table do not exist, a content indexing service produces an index and a vector table based on the current content of the document. The vector table is based on semantic models that have been pre-trained. The query is tokenized and one of three di…
Who is the assignee on this patent?
Microsoft Technology Licensing LLC
What technology area does this patent fall under?
Primary CPC classification G06F16/90332. Mapped technology areas include Physics.
When was this patent published?
Publication date Oct 14, 2021 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 6 related publications on this page (citations in our corpus or others sharing the same primary CPC).