What technology area does this patent fall under?

Primary CPC classification G06F40/216. Mapped technology areas include Physics.

When was this patent published?

Publication date: Mar 24, 2020 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Utilizing discourse structure of noisy user-generated content for chatbot learning

US10599885B2 · US · B2

Patent metadata
Field	Value
Publication number	US-10599885-B2
Application number	US-201816010156-A
Country	US
Kind code	B2
Filing date	Jun 15, 2018
Priority date	May 10, 2017
Publication date	Mar 24, 2020
Grant date	Mar 24, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title: What the patent document calls the invention.
Abstract: A short plain-language summary of the technical disclosure.
Assignees and inventors: Who owns or filed the patent and who is credited as inventor.
Key dates: Filing, priority, publication, and grant dates set the timeline.
First independent claim: The legal scope of protection; read this for what is actually claimed.
CPC / IPC classifications: Technology tags used to group this patent with similar filings.
Citations and related patents: Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Systems, devices, and methods of the present invention uses noisy-robust discourse trees to determine a rhetorical relationship between one or more sentences. In an example, a rhetoric classification application creates a noisy-robust communicative discourse tree. The application accesses a document that includes a first sentence, a second sentence, a third sentence, and a fourth sentence. The application identifies that syntactic parse trees cannot be generated for the first sentence and the second sentence. The application further creates a first communicative discourse tree from the second, third, and fourth sentences and a second communicative discourse tree from the first, third, and fourth sentences. The application aligns the first communicative discourse tree and the second communicative discourse tree and removes any elementary discourse units not corresponding to a relationship that is in common between the first and second communicative discourse trees.
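The alignment and pruning step in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: it assumes a communicative discourse tree can be flattened into a list of labelled relations between two elementary discourse units (EDUs), and that EDUs are mapped onto each other by normalized text equality. The names `Relation`, `normalize`, `common_relations`, and `prune_to_common` are hypothetical, as are the example sentences.

```python
from typing import List, NamedTuple, Tuple


class Relation(NamedTuple):
    """One rhetorical relation between two EDUs (hypothetical flat encoding)."""
    label: str   # rhetorical relation, e.g. "cause"
    left: str    # text of the first EDU
    right: str   # text of the second EDU


def normalize(edu: str) -> str:
    """Assumed EDU mapping: lowercase and collapse whitespace before comparing."""
    return " ".join(edu.lower().split())


def relation_key(rel: Relation) -> Tuple[str, str, str]:
    return (rel.label, normalize(rel.left), normalize(rel.right))


def common_relations(tree_a: List[Relation], tree_b: List[Relation]) -> List[Relation]:
    """Keep the relations of tree_a that also occur in tree_b."""
    keys_b = {relation_key(rel) for rel in tree_b}
    return [rel for rel in tree_a if relation_key(rel) in keys_b]


def prune_to_common(tree_a: List[Relation], tree_b: List[Relation]):
    """Drop every EDU that does not take part in a relation shared by both trees."""
    shared = common_relations(tree_a, tree_b)
    kept_edus = sorted({normalize(edu) for rel in shared for edu in (rel.left, rel.right)})
    return shared, kept_edus


# First tree built from sentences 2-4, second tree from sentences 1, 3 and 4.
tree_a = [
    Relation("elaboration", "teh updte was instaled", "the app crashed"),
    Relation("cause", "the app crashed", "I lost my work"),
]
tree_b = [
    Relation("contrast", "evrythng fine b4", "the app crashed"),
    Relation("cause", "The app crashed", "I lost my work"),
]

shared, kept_edus = prune_to_common(tree_a, tree_b)
print(shared)
print(kept_edus)
```

In this toy input only the `cause` relation survives, because the relations that involve the two unparseable sentences appear in just one of the trees.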

First claim

Opening claim text (preview).

What is claimed is:
1. A method of creating a noisy-text robust communicative discourse tree, comprising: accessing a document comprising a first sentence, a second sentence, a third sentence, and a fourth sentence; identifying that syntactic parse trees cannot be generated for the first sentence and the second sentence; creating a first communicative discourse tree from the second, third, and fourth sentences; creating a second communicative discourse tree from the first, third, and fourth sentences; and aligning the first communicative discourse tree and the second communicative discourse tree by: determining a mapping between elementary discourse units in the first communicative discourse tree and the second communicative discourse tree; and identifying which rhetorical relationships are common between the first communicative discourse tree and the second communicative discourse tree; and removing, from the first communicative discourse tree and the second communicative discourse tree, any elementary discourse units not corresponding to a relationship that is in common, thereby creating a noisy-text robust communicative discourse tree.
2. The method of claim 1, wherein identifying that a syntactic parse tree cannot be generated for a particular sentence comprises: accessing a first confidence score for the sentence, wherein the first confidence score represents a confidence of a first parse; accessing a second confidence score for the sentence, wherein the second confidence score represents a confidence of a second parse; and determining that a difference between the first confidence score and the second score confidence is above a threshold.
3. The method of claim 1, wherein each sentence comprises a plurality of fragments and a verb, and wherein generating a communicative discourse tree comprises: generating a discourse tree that represents rhetorical relationships between the plurality of fragments, wherein the discourse tree comprises a plurality of nodes, each nonterminal node representing a rhetorical relationship between two of the plurality of fragments, each terminal node of the nodes of the discourse tree is associated with one of the plurality of fragments; and matching each fragment that has a verb to a verb signature, thereby creating a communicative discourse tree, the matching comprising: accessing a plurality of verb signatures, wherein each verb signature comprises the verb of the fragment and a sequence of thematic roles, wherein thematic roles describe the relationship between the verb and related words; determining, for each verb signature of the plurality of verb signatures, a plurality of thematic roles of the respective signature that match a role of a word in the fragment; selecting a particular verb signature from the plurality of verb signatures based on the particular verb signature comprising a highest number of matches; and associating the particular verb signature with the fragment.
4. The method of claim 3, wherein the verb is a communicative verb and each verb signature of the plurality of verb signatures comprises one of (i) an adverb, (ii) a noun phrase, or (iii) a noun.
5. The method of claim 3, wherein the associating further comprises: identifying each of the plurality of thematic roles in the particular verb signature; and matching, for each of the plurality of thematic roles in the particular verb signature, a corresponding word in the fragment to the thematic role.
6. The method of claim 1, wherein the document comprises text received from an Internet-based source.
7. The method of claim 1, wherein the document comprises text that is grammatically incorrect.
8. A computer-implemented method for determining a complementarity of a pair of two documents by analyzing noisy-text robust communicative discourse trees, the method comprising: determining, for a first document, a first noisy-text robust communicative discourse tree comprising a first root node, wherein the noisy-text robust communicative discourse tree is a discourse tree that includes communicative actions and wherein the first document comprises a first sentence for which a parse tree cannot be reliably generated; determining, for a second document, a second noisy-text robust communicative discourse tree comprising a second root node, wherein the second document comprises a second sentence for which a parse tree cannot be reliably generated; merging the noisy-text robust communicative discourse trees by identifying that the first root node and the second root node are identical; computing a level of complementarity between the first noisy-text robust communicative discourse tree and the second noisy-text robust communicative discourse tree by applying a predictive model to the merged communicative discourse tree; and responsive to determining that the level of complementarity is above a threshold, identifying the first and second documents as complementary.
9. The method of claim 8, further comprising: accessing a set of training data comprising a set of training pairs, wherein each training pair comprises a communicative discourse tree that represents text and an expected value; and training the predictive model by iteratively: providing one of the training pairs to the predictive model, receiving, from the predictive model, a predicted value; calculating a loss function by calculating a difference between the predicted value and the respective expected value; and adjusting internal parameters of the predictive model to minimize the loss function, wherein the expected value and the predicted value comprise (i) a particular text style, (ii) a presence of an argument, (iii) a validity of the text, (iv) a truthfulness of the text, or (v) an authenticity of the text.
10. The method of claim 8, further comprising: accessing a set of training data comprising a set of training pairs, wherein each training pair comprises (i) a communicative discourse tree that represents a question and an answer and (ii) an expected level of complementarity; and training the predictive model by iteratively: providing one of the training pairs to the predictive model, receiving, from the predictive model, a determined level of complementarity; calculating a loss function by calculating a difference between the determined level of complementarity and the respective expected level of complementarity; and adjusting internal parameters of the predictive model to minimize the loss function.
11. The method of claim 8, wherein the predictive model is trained to determine a level of complementarity of sub-trees of two noisy-text robust communicative discourse trees.
12. The method of claim 8, wherein the predictive model is a support vector model.
13. The method of claim 10, wherein the communicative discourse tree comprises an answer that is relevant but is rhetorically incorrect when compared to the question.
14. A system comprising: a non-transitory computer-readable medium storing computer-executable program instructions for creating a noisy-text robust communicative discourse tree; and a processing device communicatively coupled to the non-transitory computer-readable medium for executing the computer-executable program instructions, wherein executing the non-transitory computer-executable program instructions configures the processing device to perform operations comprising: accessing a document comprising a first sentence, a second sentence, a third sentence, and a fourth sentence; identifying that syntactic parse trees cannot be generated for the first sentence and the second sentence; creati
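Two of the dependent claims in the preview describe simple decision rules, and a short sketch may make them easier to follow. Claim 2 flags a sentence when the confidence scores of two parses differ by more than a threshold, and claim 3 selects the verb signature whose thematic roles match the most roles found in the fragment. The code below is an illustration under stated assumptions, not the patented implementation: the threshold value `0.3`, the role names, and the identifiers `VerbSignature`, `parse_is_unreliable`, and `select_signature` are all hypothetical.

```python
from typing import List, NamedTuple, Set, Tuple


class VerbSignature(NamedTuple):
    """A verb plus a sequence of thematic roles (hypothetical encoding)."""
    verb: str
    roles: Tuple[str, ...]


def parse_is_unreliable(first_conf: float, second_conf: float,
                        threshold: float = 0.3) -> bool:
    """Claim 2, read literally: the difference between the two parse
    confidence scores is above a threshold (0.3 is an assumed value)."""
    return abs(first_conf - second_conf) > threshold


def select_signature(fragment_roles: Set[str],
                     signatures: List[VerbSignature]) -> VerbSignature:
    """Claim 3: pick the signature with the highest number of thematic roles
    that match a role of a word in the fragment. Ties go to the first one."""
    def match_count(signature: VerbSignature) -> int:
        return sum(1 for role in signature.roles if role in fragment_roles)
    return max(signatures, key=match_count)


# Hypothetical signatures for the communicative verb "deny".
signatures = [
    VerbSignature("deny", ("agent", "topic")),
    VerbSignature("deny", ("agent", "recipient", "topic")),
]

# Roles assumed to have been assigned to the words of one fragment.
fragment_roles = {"agent", "recipient", "topic"}

best = select_signature(fragment_roles, signatures)
print(best.roles)
print(parse_is_unreliable(0.9, 0.4))
```

How the confidence scores and thematic roles are produced is outside this sketch; in practice they would come from a syntactic parser and a verb lexicon.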

Assignees

Oracle Int Corp

Inventors

Galitsky Boris

Classifications

G06N20/10
using kernel methods, e.g. support vector machines [SVM] · CPC title
G06F40/35
Discourse or dialogue representation · CPC title
G06N5/022
Knowledge engineering; Knowledge acquisition · CPC title
G06F40/289
Phrasal analysis, e.g. finite state techniques or chunking · CPC title
G06F40/216 (Primary)
using statistical methods · CPC title

Patent family

Related publications grouped by family.

View patent family 64562215

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10599885B2 cover?: Systems, devices, and methods of the present invention uses noisy-robust discourse trees to determine a rhetorical relationship between one or more sentences. In an example, a rhetoric classification application creates a noisy-robust communicative discourse tree. The application accesses a document that includes a first sentence, a second sentence, a third sentence, and a fourth sentence. The …
Who is the assignee on this patent?: Oracle Int Corp