Navigating electronic documents using domain discourse trees

US10853574B2 · US · B2

Patent metadata
Publication number: US-10853574-B2
Application number: US-201816145644-A
Country: US
Kind code: B2
Filing date: Sep 28, 2018
Priority date: Sep 28, 2017
Publication date: Dec 1, 2020
Grant date: Dec 1, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Systems, devices, and methods of the present invention relate to extended discourse trees and using extended discourse trees to navigate text. In an example, a discourse navigation application creates a first discourse tree for a first paragraph of a first document and a second discourse tree for a second paragraph of a second document. The application determines an entity and a corresponding first elementary discourse unit from the first discourse tree. The application determines, in the second discourse tree, a second elementary discourse unit that matches the first elementary discourse unit that exists. The application determines a rhetorical relationship between the two elementary discourse units and creates a navigable link between the two discourse trees.
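
The flow in the abstract can be pictured with a small sketch. This is an illustration only, not the patented implementation: the names `Node`, `Link`, `find_matching_edu`, and `link_trees` are hypothetical, the sample sentences are invented, and the matching rule (an EDU "matches" if it mentions the entity, ignoring case) is an assumption made to keep the example short.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """A discourse-tree node: a relation (nonterminal) or an EDU (terminal)."""
    relation: Optional[str] = None   # set on nonterminal nodes
    edu: Optional[str] = None        # set on terminal nodes
    children: List["Node"] = field(default_factory=list)

    def edus(self) -> List[str]:
        """Return the text of every terminal node, left to right."""
        if self.edu is not None:
            return [self.edu]
        return [text for child in self.children for text in child.edus()]


@dataclass
class Link:
    """A navigable link between an EDU in one tree and an EDU in another."""
    source_edu: str
    target_edu: str
    relation: str


def find_matching_edu(entity: str, tree: Node) -> Optional[str]:
    """Assumed rule: the first EDU that mentions the entity, ignoring case."""
    for text in tree.edus():
        if entity.lower() in text.lower():
            return text
    return None


def link_trees(first: Node, second: Node, entity: str, relation: str) -> Optional[Link]:
    """Link the two trees through EDUs that share the entity, if both exist."""
    source = find_matching_edu(entity, first)
    target = find_matching_edu(entity, second)
    if source is None or target is None:
        return None
    return Link(source, target, relation)


# Invented sample paragraphs, already split into EDUs.
first_tree = Node(relation="elaboration", children=[
    Node(edu="Acme released a new router,"),
    Node(edu="which supports mesh networking."),
])
second_tree = Node(relation="cause", children=[
    Node(edu="Because demand was high,"),
    Node(edu="Acme doubled production."),
])

link = link_trees(first_tree, second_tree, "Acme", "elaboration")
```

Here the two trees stay separate objects and the `Link` records which EDU a reader would jump from and to; a real system would obtain the trees and the relation label from a discourse parser rather than by hand.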

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method for determining a rhetorical relationship between one or more documents, the method comprising: creating a first discourse tree for a first paragraph of a first document, the first paragraph comprising a first plurality of elementary discourse units, and the first discourse tree representing a first rhetorical relationship between at least two of the first plurality of elementary discourse units; creating a second discourse tree for a second paragraph of a second document, the second paragraph comprising a second plurality of elementary discourse units, and the second discourse tree representing a second rhetorical relationship between at least two of the second plurality of elementary discourse units; determining that a first elementary discourse unit of the first discourse tree includes an entity by: extracting a noun phrase from the first elementary discourse unit; and classifying the noun phrase as including an entity; determining, in the second discourse tree, a second elementary discourse unit that matches the first elementary discourse unit; responsive to determining a rhetorical relationship between the first elementary discourse unit in the first discourse tree and the second elementary discourse unit in the second discourse tree, creating an extended discourse tree by linking the first discourse tree and the second discourse tree via the rhetorical relationship; presenting text associated with the first elementary discourse unit and the rhetorical relationship to a user device; and responsive to receiving, from a user device, a selection of the rhetorical relationship, presenting text associated with the second elementary discourse unit to the user device.

2. The computer-implemented method of claim 1, wherein the first plurality of elementary discourse units and the second plurality of elementary discourse units each comprise a verb, and wherein each of the first discourse tree and the second discourse tree comprises a plurality of nodes, each nonterminal node representing a rhetorical relationship, and each terminal node of the nodes of the respective discourse tree is associated with an elementary discourse unit.

3. The computer-implemented method of claim 1, wherein the classifying comprises applying a trained machine-learning model to the noun phrase.

4. The computer-implemented method of claim 1, wherein an entity refers to one of (i) a person, (ii) a company, (iii) a location, (iv) a name of a document, or (v) a date or time.

5. The computer-implemented method of claim 1, further comprising responsive to not determining a rhetorical relationship, creating a default rhetorical relationship of type elaboration between the first elementary discourse unit and the second elementary discourse unit and linking the first discourse tree and the second discourse tree with the default rhetorical relationship, thereby creating an extended discourse tree.

6. The computer-implemented method of claim 1, wherein determining the rhetorical relationship further comprises: combining the first elementary discourse unit and the second elementary discourse unit into a temporary paragraph; and determining the rhetorical relationship within the temporary paragraph by applying discourse parsing to the temporary paragraph.

7. The computer-implemented method of claim 1, wherein the entity is represented by either (i) one or more phrases or (ii) one or more elementary discourse units.

8. The computer-implemented method of claim 1, wherein accessing the first document and the second document comprises determining that a difference between (i) a first content score for the first document and (ii) a second content score for the second document are within a threshold.

9. The computer-implemented method of claim 1, wherein the first document and the second document are obtained by executing a user query of one or more documents.

10. The computer-implemented method of claim 1, wherein first document and the second document include text based on a particular topic.

11. The computer-implemented method of claim 1, further comprising determining, in the first document and the second document, a common subject, topic, or entity.

12. The computer-implemented method of claim 1, wherein the classifying comprises using a list of keywords or searching an internet resource.

13. A non-transitory computer-readable medium storing computer-executable program instructions, wherein when executed by a processing device, the computer-executable program instructions cause the processing device to perform operations comprising: creating a first discourse tree for a first paragraph of a first document, the first paragraph comprising a first plurality of elementary discourse units, and the first discourse tree representing a second rhetorical relationship between at least two of the first plurality of elementary discourse units; creating a second discourse tree for a second paragraph of a second document, the second paragraph comprising a second plurality of elementary discourse units, and the second discourse tree representing a second rhetorical relationship between at least two of the second plurality of elementary discourse units; determining that a first elementary discourse unit of the first discourse tree includes an entity by: extracting a noun phrase from the first elementary discourse unit; and classifying the noun phrase as including an entity; determining, in the second discourse tree, a second elementary discourse unit that matches the first elementary discourse unit; responsive to determining a rhetorical relationship between the first elementary discourse unit in the first discourse tree and the second elementary discourse unit in the second discourse tree, creating an extended discourse tree by linking the first discourse tree and the second discourse tree via the rhetorical relationship; presenting text associated with the first elementary discourse unit and the rhetorical relationship to a user device; and responsive to receiving, from a user device, a selection of the rhetorical relationship, presenting text associated with the second elementary discourse unit to the user device.

14. The non-transitory computer-readable medium of claim 13, wherein the first plurality of elementary discourse units and the second plurality of elementary discourse units each comprise a verb, and wherein each of the first discourse tree and the second discourse tree comprises a plurality of nodes, each nonterminal node representing a rhetorical relationship, each terminal node of the nodes of the respective discourse tree is associated with an elementary discourse unit.

15. The non-transitory computer-readable medium of claim 13, wherein an entity refers to one of (i) a person, (ii) a company, (iii) a location, (iv) a name of a document, or (v) a date or time.

16. The non-transitory computer-readable medium of claim 13, further comprising responsive to not determining a rhetorical relationship, creating a default rhetorical relationship of type elaboration between the first elementary discourse unit and the second elementary discourse unit and linking the first discourse tree and the second discourse tree with the default rhetorical relationship, thereby creating an extended discourse tree.

17. The non-transitory computer-readable medium of claim 13, wherein determining the rhetorical relationship further comprises: combining the first elementary discourse unit and the second elementary discourse unit into a temporary paragraph; and
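
As a rough illustration of how the dependent claims fit together, the sketch below combines keyword-list entity classification (claim 12), relation detection over a temporary paragraph (claim 6), and the fallback to a default elaboration relation (claim 5). It is a minimal sketch under stated assumptions: every function name is hypothetical, and `toy_parser` is a cue-word lookup standing in for a real discourse parser, which the claims do not specify.

```python
from typing import Callable, Optional, Set

DEFAULT_RELATION = "elaboration"  # default relation type named in claim 5

# Hypothetical parser interface: paragraph text in, relation label or None out.
RelationParser = Callable[[str], Optional[str]]


def classify_entity(noun_phrase: str, keywords: Set[str]) -> bool:
    """Keyword-list classification, one of the options named in claim 12."""
    return noun_phrase.lower() in keywords


def determine_relation(first_edu: str, second_edu: str, parse: RelationParser) -> str:
    """Join both EDUs into a temporary paragraph and parse it (claim 6);
    fall back to the default relation when the parser finds none (claim 5)."""
    temporary_paragraph = f"{first_edu} {second_edu}"
    relation = parse(temporary_paragraph)
    return relation if relation is not None else DEFAULT_RELATION


def toy_parser(paragraph: str) -> Optional[str]:
    """Stand-in for a discourse parser: cue-word lookup, for illustration only."""
    cues = {"because": "cause", "but": "contrast"}
    for word in paragraph.lower().split():
        token = word.strip(".,;")
        if token in cues:
            return cues[token]
    return None


# Invented sample EDUs.
found = determine_relation(
    "Acme released a router.", "Sales rose because reviews were good.", toy_parser
)
fallback = determine_relation(
    "Acme released a router.", "Acme doubled production.", toy_parser
)
is_entity = classify_entity("Acme", {"acme", "globex"})
```

Passing the parser in as a callable keeps the fallback logic independent of whichever discourse parser is actually used.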

Assignees

Oracle Int Corp

Inventors

Classifications

  • G06F40/211 (Primary)

    Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars · CPC title

  • hash tables · CPC title

  • Grammatical analysis; Style critique · CPC title

  • Information retrieval; Database structures therefor; File system structures therefor · CPC title

  • Machine learning · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10853574B2 cover?
Systems, devices, and methods of the present invention relate to extended discourse trees and using extended discourse trees to navigate text. In an example, a discourse navigation application creates a first discourse tree for a first paragraph of a first document and a second discourse tree for a second paragraph of a second document. The application determines an entity and a corresponding f…
Who is the assignee on this patent?
Oracle Int Corp
What technology area does this patent fall under?
Primary CPC classification G06F40/211. Mapped technology areas include Physics.
When was this patent published?
Publication date: Dec 1, 2020 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).