Probabalistic generation of diverse summaries

US10628474B2 · US · B2

Patent metadata
Field               Value
Publication number  US-10628474-B2
Application number  US-201615203056-A
Country             US
Kind code           B2
Filing date         Jul 6, 2016
Priority date       Jul 6, 2016
Publication date    Apr 21, 2020
Grant date          Apr 21, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method for generating summaries includes selecting a first subset of text units of a text composition to incorporate into a first summary of the text composition using a weighting of the text units that indicates for each text unit a relative importance of including the text unit in summaries of the text composition. The weighting of the text units is modified to reduce the relative importance of each text unit in the first subset based on the text unit having been selected for the first subset. A second subset of the text units is selected to incorporate into a second summary of the text composition using the modified weighting of the text units. At least one of the first summary and the second summary are provided to a user device.
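
The flow described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the function names (`sample_summary`, `damp_selected`, `two_summaries`), the multiplicative `damping` value of 0.5, and the choice to make each probability proportional to its score are all assumptions made for the example. The sketch also does not force the second subset to differ from the first; it only makes a repeat less likely.

```python
import random


def sample_summary(scores, k, rng):
    """Draw up to k distinct text-unit indices, each with probability
    proportional to its (positive) importance score."""
    remaining = dict(scores)  # copy so the caller's scores are untouched
    chosen = []
    for _ in range(min(k, len(remaining))):
        indices = list(remaining)
        weights = [remaining[i] for i in indices]
        pick = rng.choices(indices, weights=weights, k=1)[0]
        chosen.append(pick)
        del remaining[pick]  # a unit cannot be drawn twice for one summary
    return sorted(chosen)  # keep the original document order


def damp_selected(scores, selected, damping=0.5):
    """Reduce the scores of already-selected units; leave the rest unchanged.
    The damping value is a hypothetical choice."""
    return {i: (s * damping if i in selected else s) for i, s in scores.items()}


def two_summaries(units, scores, k, damping=0.5, seed=0):
    """Build a first summary, damp its units, then build a second summary."""
    rng = random.Random(seed)
    score_map = dict(enumerate(scores))
    first = sample_summary(score_map, k, rng)
    updated = damp_selected(score_map, set(first), damping)
    second = sample_summary(updated, k, rng)
    return [units[i] for i in first], [units[i] for i in second], updated


if __name__ == "__main__":
    sentences = ["Sentence A.", "Sentence B.", "Sentence C.", "Sentence D."]
    first, second, updated = two_summaries(sentences, [4.0, 3.0, 2.0, 1.0], k=2)
    print(first)
    print(second)
    print(updated)
```

Because the selected units keep a reduced but non-zero score, they can still reappear in the second summary; they are simply less likely to be drawn than before.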

First claim

Opening claim text (preview).

What is claimed is:

1. At least one computer storage media storing computer-useable instructions that, when used by the at least one computing device, cause the at least one computing device to perform a method comprising: receiving a request from a user to generate summaries of a text composition; dividing the text composition into text units, each text unit representing a respective sentence of the text composition; generating a set of importance scores of each of the text units, each importance score corresponding to an importance metric that quantifies a relative importance of including an associated text unit of the text units in the summaries of the text composition; selecting a first subset from the text units to include in a first summary of the summaries based on the importance scores; generating a set of updated importance scores of each of the text units by adjusting each importance score of the importance scores that corresponds to the first subset of text units to reduce the relative importance of the first subset of text units from the text units not included in the first subset of text units; converting the set of updated importance scores into a probability distribution that defines for each text unit, a respective probability that a value corresponding to the text unit is selected by sampling of the probability distribution, the respective probability corresponding to the importance score of the text unit in the set of updated importance scores; selecting a second subset from the text units to include in a second summary of the summaries, the second subset comprising a different plurality of the text units than the first subset, wherein a given text unit is selected based on the value selected by the sampling corresponding to the given text unit; incorporating the given text unit into the second summary; and transmitting the first summary and the second summary to the user in response to the request.

2. The at least one computer storage media of claim 1, further comprising probabilistically selecting each text unit in the first subset of text units for the incorporating into the first summary based on the importance scores.

3. The at least one computer storage media of claim 1, wherein the method further comprises: updating the probability distribution to remove the given text unit from potential outcomes of sampling the probability distribution; and after the removing, sampling the probability distribution to probabilistically select an additional text unit of the text units to incorporate into the second summary based on the sampling.

4. The at least one computer storage media of claim 1, wherein the method further comprises: analyzing text of each summary of the summaries; determining evaluation scores of each summary of the summaries based on the analyzing of the text of each summary; ranking the summaries by the evaluation scores; and displaying the first summary and the second summary to the user based on the ranking of the summaries.

5. The at least one computer storage media of claim 1, wherein the importance metric quantifies for each text unit of the text units a level of representativeness of a concept in text of the text unit with respect to the text composition.

6. The at least one computer storage media of claim 1, wherein the importance metric quantifies for each text unit of the text units a level of diversity of a concept in text of the text unit with respect to concepts in text of text units selected to incorporate into the first summary.

7. A computer-implemented method comprising: selecting a first subset of text units of a text composition to incorporate into a first summary of the text composition using a set of importance scores associated with each of the text units, the importance scores indicating, for each text unit of the text units, a relative importance of including the text unit in summaries of the text composition; generating a set of updated importance scores by: reducing the importance scores of each text unit in the first subset based on the text unit having been selected for the first subset; and maintaining the importance scores for the text units not included in the first subset; converting the set of updated importance scores into a probability distribution that defines for each text unit, a respective probability that a value corresponding to the text unit is selected by sampling of the probability distribution, the respective probability corresponding to the importance score of the text unit in the set of updated importance scores; selecting a second subset of text units to incorporate into a second summary of the text composition, the second subset comprising a different plurality of text units than the first subset, the selecting including a selected text unit from the text units based on the value selected by the sampling corresponding to the selected text unit; and providing at least one of the first summary and the second summary to a user device.

8. The computer-implemented method of claim 7, wherein the reducing the importance scores of each text unit in the first subset reduces the relative importance for a given text unit of the first subset with respect to the text units not included in the first subset of text units.

9. The computer-implemented method of claim 7, reducing the importance scores is by a damping factor.

10. The computer-implemented method of claim 7, wherein the selecting the first subset comprises: converting the importance scores that quantify the relative importance for each of the text units into an initial probability distribution; sampling a value from the initial probability distribution; and including a first text unit in the first subset based on the first text unit corresponding to the sampled value.

11. The computer-implemented method of claim 7, further comprising: identifying delimiters between text corresponding to the text units in the text composition; and parsing the text composition into the text units based on the identified delimiters.

12. The computer-implemented method of claim 7, wherein the selecting the first subset is from a group of text units comprising the text units and comprises: probabilistically selecting a first text unit from the group of text units to incorporate into the first summary based on the relative importance of the first text unit; removing the first text unit from the group of text units; and after the removing, probabilistically selecting a second text unit from the group of text units to incorporate into the first summary based on the relative importance of the second text unit.

13. The computer-implemented method of claim 7, wherein the selecting the second subset is from a group of text units comprising the text units and the method further comprises: determining similar text units in the group of text units to the selected text unit; modifying the probability distribution to reduce probabilities of sampling values corresponding to the determined similar text units; and after the modifying, probabilistically selecting an additional text unit to incorporate into the second summary based on the sampling of the probability distribution.

14. The computer-implemented method of claim 7, wherein the providing the at least one of the first summary and the second summary to the user device is for presenting on the user device, and the method further comprises: receiving a selection, by the user, of the first summary from a set of the summaries presented on the user device by the presenting; in response to the selection, assigning the first summary to the tex
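
Two of the dependent claims describe steps that are easy to picture in code: parsing the composition into text units at delimiters (claim 11) and lowering the sampling probability of units similar to one already selected (claim 13). The sketch below is illustrative only. The claims do not specify a delimiter rule, a similarity measure, or a reduction amount, so the punctuation-based split, the Jaccard word overlap, and the `threshold` and `penalty` values are all assumptions introduced for this example.

```python
import random
import re


def split_sentences(text):
    """Parse text into units at sentence-ending punctuation followed by
    whitespace (an assumed, deliberately simple delimiter rule)."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def jaccard(a, b):
    """Word-set overlap between two text units (assumed similarity measure)."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    union = wa | wb
    return len(wa & wb) / len(union) if union else 0.0


def sample_diverse(units, scores, k, threshold=0.5, penalty=0.1, seed=0):
    """Sample up to k units without replacement. After each pick, shrink the
    weights of remaining units that look similar to the picked unit."""
    rng = random.Random(seed)
    weights = dict(enumerate(scores))  # index -> positive sampling weight
    chosen = []
    while weights and len(chosen) < k:
        indices = list(weights)
        pick = rng.choices(indices, weights=[weights[i] for i in indices], k=1)[0]
        chosen.append(pick)
        del weights[pick]  # the picked unit leaves the pool
        for i in weights:
            if jaccard(units[pick], units[i]) >= threshold:
                weights[i] *= penalty  # similar units become less likely
    return [units[i] for i in sorted(chosen)]


if __name__ == "__main__":
    text = "Cats sleep a lot. Cats sleep a lot indeed. Dogs like to run! Birds sing?"
    units = split_sentences(text)
    print(sample_diverse(units, [1.0] * len(units), k=2))
```

`random.Random.choices` renormalizes the weights on every call, so scaling a weight down is enough to reduce that unit's probability relative to the others.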

Assignees

  • Adobe Inc

Inventors

Classifications

  • G06F16/345 · Primary

    Summarisation for human users · CPC title

  • Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars · CPC title

  • Semantic analysis · CPC title

  • Physics · mapped topic

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10628474B2 cover?
A method for generating summaries includes selecting a first subset of text units of a text composition to incorporate into a first summary of the text composition using a weighting of the text units that indicates for each text unit a relative importance of including the text unit in summaries of the text composition. The weighting of the text units is modified to reduce the relative importanc…
Who is the assignee on this patent?
Adobe Inc
What technology area does this patent fall under?
Primary CPC classification G06F16/345. Mapped technology areas include Physics.
When was this patent published?
Publication date Apr 21, 2020 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 9 related publications on this page (citations in our corpus or others sharing the same primary CPC).