Automatic document summarization using search engine intelligence
US-2017277668-A1 · Sep 28, 2017 · US
US11226972B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11226972-B2 |
| Application number | US-201916278856-A |
| Country | US |
| Kind code | B2 |
| Filing date | Feb 19, 2019 |
| Priority date | Feb 19, 2019 |
| Publication date | Jan 18, 2022 |
| Grant date | Jan 18, 2022 |
Official abstract text for this publication.
A query service receives a query comprising at least a name component. The query service searches a document corpus to identify multiple passages, each comprising a mention of the name component within a selection of one or more documents of the document corpus. The query service collects bins, each bin comprising a distinct selection of the passages from the one or more documents, each of the bins identifying a separate relationship the name component participates in within the distinct selection of passages. The query service assesses a separate score of each respective bin reflecting the relevance of each respective bin to the query. The query service returns a response to the query with the bins each ranked according to each separate score.
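The flow described in the abstract (find passages that mention the name, group them into bins by relationship, score each bin, return the bins ranked) can be sketched roughly as follows. This is an illustrative sketch only, not the patented implementation: the `Passage` type, the `relationship` label (assumed to come from some upstream extractor), and the passage-count scoring in `score_bin` are all hypothetical choices made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Passage:
    doc_id: str
    text: str
    relationship: str  # hypothetical label, assumed to be produced upstream


def find_passages(corpus, name):
    """Keep only passages that mention the name component (case-insensitive)."""
    return [p for p in corpus if name.lower() in p.text.lower()]


def collect_bins(passages):
    """Group passages so that each bin corresponds to one relationship."""
    bins = defaultdict(list)
    for p in passages:
        bins[p.relationship].append(p)
    return dict(bins)


def score_bin(passages):
    """Placeholder relevance score: the number of supporting passages (assumption)."""
    return float(len(passages))


def answer_query(corpus, name):
    """Return (relationship, score, passages) tuples, highest score first."""
    bins = collect_bins(find_passages(corpus, name))
    ranked = [(rel, score_bin(ps), ps) for rel, ps in bins.items()]
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked


corpus = [
    Passage("d1", "John Smith was named CEO of Acme.", "ceo_of"),
    Passage("d2", "Acme chief John Smith spoke today.", "ceo_of"),
    Passage("d3", "John Smith wrote a novel.", "author_of"),
    Passage("d4", "Jane Doe joined Acme.", "employee_of"),
]
ranked = answer_query(corpus, "John Smith")
```

In this toy corpus the passage about Jane Doe is dropped because it never mentions the name, and the remaining passages fall into two bins that are returned in score order.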
Opening claim text (preview).
What is claimed is:
1. A method comprising: receiving, by a computer system, a query comprising at least a name component and one or more identifier components identifying one or more particular entities associated with the name component; searching, by the computer system, a plurality of documents in a document corpus to identify a plurality of passages each comprising a mention of the name component within a selection of one or more documents of the plurality of documents; collecting, by the computer system, a plurality of bins each comprising a distinct selection of the plurality of passages from the one or more documents, each of the plurality of bins identifying a separate relationship the name component and the one or more identifier components participate in within the distinct selection of the plurality of passages; extracting, by the computer system, from each of the one or more identifier components in each respective bin, an identifier string and a desired relationship; calculating, by the computer system, one or more match values for each of the one or more identifier components in each respective bin, the one or more match values based, at least in part, on: (i) the desired relationship, and (ii) a closeness of the identifier string to the mention; assessing, by the computer system, a separate score of each respective bin of the plurality of bins reflecting the relevance of each respective bin to the query, the separate score based, at least in part, on the calculated one or more match values for the one or more identifier components in the respective bin; and returning, by the computer system, a response to the query with the plurality of bins each ranked according to each separate score.
2. The method of claim 1, wherein the one or more match values are further based, at least in part, on whether a fuzzy match to the identifier string is acceptable.
3. The method of claim 1, wherein the one or more match values are further based, at least in part, on one or more identifier ranking characteristics.
4. The method of claim 3, further comprising: evaluating, by the computer system, a ranking characteristic of the one or more identifier ranking characteristics such that an extracted relationship of the identifier string to the mention that matches the desired relationship scores higher than an occurrence of the identifier string to the mention that does not match the desired relationship.
5. The method of claim 3, further comprising: evaluating, by the computer system, a ranking characteristic of the one or more identifier ranking characteristics such that an occurrence of the identifier string closer to the mention scores higher than the occurrence of the identifier string farther from the mention.
6. The method of claim 5, wherein evaluating the ranking characteristic of the one or more ranking characteristics comprises: evaluating, by the computer system, a nearness of the occurrence of the identifier string to the mention based on a nearness percentage of the identifier string to the mention in view of a maximum number of words set to a maxwords value.
7. The method of claim 1, wherein: the query further comprises one or more association components each for indicating or counter-indicating a particular entity associated with the name component; and the separate score is further based, at least in part, on one or more additional match values calculated for each of the one or more identifier components in each respective bin based on one or more association ranking characteristics.
8. The method of claim 7, further comprising: calculating, by the computer system, the one or more additional match values for each of the one or more identifier components in each respective bin based on the one or more association ranking characteristics by extracting an association string and a desired relationship from each of the one or more association components; evaluating, by the computer system, a first ranking characteristic of one or more association ranking characteristics such that an extracted relationship of the association string to the mention that matches the desired relationship scores higher than an occurrence of the association string to the mention that does not match the desired relationship; and evaluating, by the computer system, a second ranking characteristic of one or more association ranking characteristics such that a higher relative frequency of the association string in one bin of the plurality of bins scores higher than a lower relative frequency of the association string in another bin of the plurality of bins.
9. A computer system comprising one or more processors, one or more computer-readable memories, one or more computer-readable storage devices, and program instructions, stored on at least one of the one or more storage devices for execution by at least one of the one or more processors via at least one of the one or more memories, the stored program instructions comprising: program instructions to receive a query comprising at least a name component and one or more identifier components identifying one or more particular entities associated with the name component; program instructions to search a plurality of documents in a document corpus to identify a plurality of passages each comprising a mention of the name component within a selection of one or more documents of the plurality of documents; program instructions to collect a plurality of bins each comprising a distinct selection of the plurality of passages from the one or more documents, each of the plurality of bins identifying a separate relationship the name component and the one or more identifier components participate in within the distinct selection of the plurality of passages; program instructions to extract, from each of the one or more identifier components in each respective bin, an identifier string and a desired relationship; program instructions to calculate one or more match values for each of the one or more identifier components in each respective bin, the one or more match values based, at least in part, on: (i) the desired relationship, and (ii) a closeness of the identifier string to the mention; program instructions to assess a separate score of each respective bin of the plurality of bins reflecting the relevance of each respective bin to the query, the separate score based, at least in part, on the calculated one or more match values for the one or more identifier components in the respective bin; and program instructions to return a response to the query with the plurality of bins each ranked according to each separate score.
10. The computer system of claim 9, wherein the one or more match values are further based, at least in part, on whether a fuzzy match to the identifier string is acceptable.
11. The computer system of claim 9, wherein the one or more match values are further based, at least in part, on one or more identifier ranking characteristics.
12. The computer system of claim 11, the stored program instructions further comprising: program instructions to evaluate a ranking characteristic of the one or more identifier ranking characteristics such that an extracted relationship of the identifier string to the mention that matches the desired relationship scores higher than an occurrence of the identifier string to the mention that does not match the desired relationship.
13. The computer system of claim 11, the stored program instructions further comprising: program instructions to evaluate a ranking characteristic of the one or more identifier ranking characteristics such that an occurrence of the iden
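The claims describe match values that reward a matching relationship and an identifier string that sits close to the mention, with nearness expressed as a percentage relative to a maxwords value. The claim preview does not give a formula, so the sketch below assumes a simple linear falloff (`1 - distance / maxwords`, clamped at zero) and a hypothetical equal weighting between relationship match and nearness. The function names, the `rel_weight` parameter, and the single-token identifier matching are illustrative assumptions, not taken from the patent.

```python
def nearness_percentage(tokens, mention_idx, identifier, maxwords=10):
    """Assumed formula: 1 - distance / maxwords, clamped to [0, 1].

    `distance` is the word distance from the mention to the closest
    occurrence of the (single-token) identifier; 0.0 if it never occurs.
    """
    positions = [i for i, tok in enumerate(tokens) if tok.lower() == identifier.lower()]
    if not positions:
        return 0.0
    distance = min(abs(i - mention_idx) for i in positions)
    return max(0.0, 1.0 - distance / maxwords)


def match_value(extracted_relationship, desired_relationship, nearness, rel_weight=0.5):
    """Combine relationship agreement and nearness into one value in [0, 1].

    The weighting is a hypothetical choice: a matching relationship scores
    higher than a non-matching one, and a closer identifier scores higher
    than a farther one.
    """
    rel_score = 1.0 if extracted_relationship == desired_relationship else 0.0
    return rel_weight * rel_score + (1.0 - rel_weight) * nearness


tokens = "Acme hired John Smith as chief executive in 2015".split()
mention_idx = tokens.index("John")

near_acme = nearness_percentage(tokens, mention_idx, "Acme")   # 2 words away
near_year = nearness_percentage(tokens, mention_idx, "2015")   # 6 words away

matching = match_value("employer", "employer", near_acme)
non_matching = match_value("customer", "employer", near_acme)
```

With these assumptions, the identifier two words from the mention receives a higher nearness than the one six words away, and the occurrence whose extracted relationship agrees with the desired relationship receives the higher match value.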
using ranking · CPC title
Query execution (filtering based on additional data G06F16/335) · CPC title
Fuzzy queries · CPC title
Document management systems · CPC title
Clustering or classification · CPC title