Relevant passage retrieval system

US12174839B2 · US · B2

Patent metadata
Field: Value
Publication number: US-12174839-B2
Application number: US-201616303274-A
Country: US
Kind code: B2
Filing date: May 23, 2016
Priority date: May 23, 2016
Publication date: Dec 24, 2024
Grant date: Dec 24, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A new architecture is provided to support a precise information retrieval system on a web scale. The architecture provides algorithms to generate candidates and select the top N results via ranking models (e.g., Semantic ranking models, Aggregation ranking models) to capture term relationships between query and result contents at search-time.
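The abstract's two stages, candidate generation followed by top-N selection through a ranking model, can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the function names (`generate_candidates`, `overlap_score`, `top_n`) are hypothetical, and the term-overlap score is only a toy stand-in for the semantic and aggregation ranking models the abstract mentions, which this page does not describe in detail.

```python
import heapq
import re


def tokenize(text):
    """Lowercase a string and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def generate_candidates(query, passages):
    """Keep passages that share at least one term with the query (hypothetical filter)."""
    query_terms = set(tokenize(query))
    return [p for p in passages if query_terms & set(tokenize(p))]


def overlap_score(query, passage):
    """Toy stand-in for a ranking model: fraction of query terms found in the passage."""
    query_terms = set(tokenize(query))
    if not query_terms:
        return 0.0
    return len(query_terms & set(tokenize(passage))) / len(query_terms)


def top_n(query, passages, n):
    """Generate candidates, score each one, and return the n best as (score, passage)."""
    candidates = generate_candidates(query, passages)
    scored = [(overlap_score(query, p), p) for p in candidates]
    return heapq.nlargest(n, scored, key=lambda pair: pair[0])


if __name__ == "__main__":
    passages = [
        "Paris is the capital of France.",
        "The Eiffel Tower is 330 metres tall.",
        "Bananas are rich in potassium.",
    ]
    for score, passage in top_n("how tall is the eiffel tower", passages, 2):
        print(f"{score:.2f}  {passage}")
```

A production system would replace `overlap_score` with a learned model; the sketch only shows where such a model would plug in between candidate generation and the final cut to N results.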

First claim

Opening claim text (preview).

We claim:

1. A computer implemented method comprising: identifying, at a server, a corpus of electronic objects for passage extraction; extracting passages from each identified electronic object; receiving, at the server, a query from a client computing device; ranking extracted passages in each of the electronic objects based on a score that indicates a likelihood that each of the extracted passages directly answers the received query to produce top-ranked passages; aggregating the top-ranked passages to form aggregated passages; generating for each aggregate passage a caption parsed from content of a corresponding top-ranked passage; re-ranking the aggregated passages to form ranked aggregated passages using analysis, by a machine learned model, of the generated captions of the top-ranked passages, wherein the re-ranking is based on a second score that indicates a likelihood that each of the ranked aggregated passages directly answers the received query, the second score based on key entities in the received query and entities in the extracted passages; selecting at least one top-ranked passage from the re-ranked aggregated passages; and providing, by the server, in an electronic format suitable for display on a display of the client computing device, the generated caption corresponding to the at least one top-ranked passage.

2. The computer implemented method of claim 1, wherein the electronic objects are one of emails, web pages, images, audio files, videos, or documents.

3. The computer implemented method of claim 1, wherein ranking the extracted passages in each of the electronic objects comprises: parsing the query to determine query information, wherein the query information comprises at least one of: a query semantic meaning; query keywords; and query entities; parsing each of the extracted passages to determine passage information, wherein the passage information comprises at least one of: a passage semantic meaning; passage keywords; and passage entities; and generating a machine learning passage ranking model with a variety of semantic features, including, a semantic translation model using the query information and the passage information.

4. The computer implemented method of claim 1, further comprising providing a link to the electronic object from where the passage was retrieved.

5. The computer implemented method of claim 4, wherein the passage and the link are provided to a client device.

6. The computer implemented method of claim 4, wherein the query was received from the client device.

7. The computer implemented method of claim 1, wherein the corpus of electronic objects is stored in a distributed network.

8. The computer implemented method of claim 1, wherein the generated the caption is generated using a second machine learned model different from the machine learned model.

9. A computer implemented method comprising: receiving a query; determining a semantic intent of the query; in response to determining the semantic intent, performing a search, using a search engine, to generate search results, wherein the search results include electronic objects which are ordered according to a rank based on the semantic intent of the query; analyzing a plurality of passages from each of at least a subset of the ranked search results to produce a plurality of top-ranked passages from each electronic object; aggregating the top-ranked passages from each electronic object for the subset of the ranked search results; generating for each aggregate passage a caption parsed from content of a corresponding top-ranked passage; ranking the aggregated top-ranked passages to identify a plurality top-ranked aggregated passages using analysis, by a machine learned model, of the generated captions of the top-ranked passages by a machine learned model, wherein the ranking is based on a score that indicates a likelihood that each of the plurality of top-ranked aggregated passages directly answers the received query, the score based on key entities in the received query and entities in the plurality of passages; providing, in an electronic format suitable for display on a display device of a client device, the generated caption corresponding to the at least one top-ranked aggregated passage.

10. The computer implemented method of claim 9, further comprising: analyzing, using a machine learning model, the query to determine query information, wherein the query information includes the semantic intent of the query; analyzing, using the machine learning model, each of the plurality of passages from the subset of the ranked search results to determine passage information; generating a machine learning passage ranking model with a variety of semantic features, including, a semantic translation model using the query information and the passage information; and placing the passage in an order among other passages of the plurality of passages based on a numerical score.

11. The computer implemented method of claim 9, wherein the electronic objects are one of emails, web pages, audio files, images, videos, or documents.

12. The computer implemented method of claim 9, further comprising providing a link to a web page that contains the at least one top-ranked passage.

13. The computer implemented method of claim 12, wherein the link is returned to the client device.

14. The computer implemented method of claim 13, wherein the query was received from the client device.

15. The computer implemented method of claim 9, wherein the electronic objects are stored in a distributed network.

16. A system comprising a computing device, the computing device comprising: at least one processor; and a memory for storing and encoding computer executable instructions that, when executed by the at least one processor is operative to: receive a query; determine a semantic intent of the query based at least upon the query; in response to determining the semantic intent, perform a search to generate search results, wherein the search results include electronic objects which are ordered according to a rank based on the semantic intent of the query; analyze a plurality of passages from each of at least a subset of the ranked search results to produce a plurality of top-ranked passages; aggregate the top-ranked passages for each electronic object for the subset of the ranked search results; generating for each aggregate passage a caption parsed from content of a corresponding top-ranked passage; rank the aggregated top-ranked passages to identify a plurality of top-ranked aggregated passages using analysis, by a machine learned model, of the generated captions of the top-ranked passages by a machine learned model, wherein the ranking is based on a score that indicates a likelihood that each of the plurality of top-ranked aggregated passages directly answers the received query, the score based on key entities in the received query and entities in the plurality of passages; provide, in an electronic format suitable for display on a display device of a client device, the generated caption corresponding to the at least one top-ranked aggregated passage.

17. The system of claim 16, wherein the at least one processor is further operative to: analyze, using a machine learning model, the query to determine query information, wherein the query information includes the semantic intent of the query; analyze, using the machine learning model, each of the plurality of passages from the subset of the ranked search results to determine passage information; g
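The order of steps recited in claim 1 (extract passages, rank within each object, aggregate, caption, re-rank, select) can be traced in a short Python sketch. This is an illustration under stated assumptions, not the patented implementation: the claim requires a machine learned model for re-ranking, whereas the hand-written `first_score` and `second_score` below are toy substitutes, and all function names, the blank-line passage boundary, and the word-truncation caption rule are hypothetical choices.

```python
import re


def tokenize(text):
    """Lowercase a string and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def extract_passages(document):
    """Split a document into passages on blank lines (assumed passage boundary)."""
    return [p.strip() for p in document.split("\n\n") if p.strip()]


def first_score(query, passage):
    """Toy first-stage score: share of query tokens present in the passage."""
    query_terms = set(tokenize(query))
    if not query_terms:
        return 0.0
    return len(query_terms & set(tokenize(passage))) / len(query_terms)


def make_caption(passage, max_words=12):
    """Toy caption: the first max_words words of the passage."""
    return " ".join(passage.split()[:max_words])


def second_score(key_entities, caption):
    """Toy second-stage score: share of query key entities mentioned in the caption."""
    if not key_entities:
        return 0.0
    text = caption.lower()
    return sum(e.lower() in text for e in key_entities) / len(key_entities)


def answer(query, key_entities, corpus, per_object=1, final=1):
    """Run the claim 1 step order on a {doc_id: text} corpus; returns (doc_id, caption) pairs."""
    aggregated = []
    for doc_id, document in corpus.items():
        passages = extract_passages(document)
        # Rank passages within each object, then keep the top-ranked ones.
        ranked = sorted(passages, key=lambda p: first_score(query, p), reverse=True)
        aggregated.extend((doc_id, p) for p in ranked[:per_object])
    # Generate a caption for every aggregated passage.
    captioned = [(doc_id, make_caption(p)) for doc_id, p in aggregated]
    # Re-rank across objects using the entity-based second score on the captions.
    reranked = sorted(
        captioned,
        key=lambda item: second_score(key_entities, item[1]),
        reverse=True,
    )
    return reranked[:final]


if __name__ == "__main__":
    corpus = {
        "b": "K2 is 8,611 metres high.\n\nClimbers attempt K2 in summer.",
        "a": "Mount Everest is 8,849 metres high.\n\nIt lies in the Himalayas.",
    }
    print(answer("how high is Mount Everest", ["Mount Everest"], corpus))
```

The two-stage design means the cheaper first score prunes passages inside each object, so the second, entity-aware score only has to compare a small aggregated set across objects.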

Assignees

Microsoft Technology Licensing LLC

Inventors

Classifications

  • G06F16/345 · Primary

    Summarisation for human users · CPC title

  • using natural language analysis · CPC title

  • Search customisation based on user profiles and personalisation · CPC title

  • Machine learning · CPC title

  • Details of hyperlinks; Management of linked annotations · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12174839B2 cover?
A new architecture is provided to support a precise information retrieval system on a web scale. The architecture provides algorithms to generate candidates and select the top N results via ranking models (e.g., Semantic ranking models, Aggregation ranking models) to capture term relationships between query and result contents at search-time.
Who is the assignee on this patent?
Microsoft Technology Licensing LLC
What technology area does this patent fall under?
Primary CPC classification G06F16/345. Mapped technology areas include Physics.
When was this patent published?
Publication date: Dec 24, 2024 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 10 related publications on this page (citations in our corpus or others sharing the same primary CPC).