System and method for domain-independent aspect level sentiment detection

US10628528B2 · US · B2

Patent metadata
Field: Value
Publication number: US-10628528-B2
Application number: US-201816016106-A
Country: US
Kind code: B2
Filing date: Jun 22, 2018
Priority date: Jun 29, 2017
Publication date: Apr 21, 2020
Grant date: Apr 21, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method for automated aspect-based sentiment analysis includes parsing reviews from a first domain to generate rhetorical structure trees and extracting rhetorical rules from the rhetorical structure trees, each rhetorical rule including a path extracted from at least one span in at least one of the rhetorical structure trees associated with a probability that the path corresponds to a positive or negative sentiment based on annotation data. The method further includes parsing reviews from a second domain to generate a second plurality of rhetorical structure trees, generating training data that associates at least one aspect in the review from the second domain with a sentiment associated with a rhetorical rule in the plurality of rhetorical rules, and training a classifier to identify sentiments in reviews from the second domain using the second plurality of reviews and the training data.
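The rule-extraction and training-data steps in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes each path has already been extracted from a rhetorical structure tree and is represented as a tuple of relation labels, and the function names `extract_rules` and `label_target_domain`, the label strings, and the sample data are all hypothetical.

```python
from collections import Counter, defaultdict


def extract_rules(annotated_paths):
    """Build rules from first-domain data.

    annotated_paths: iterable of (path, sentiment) pairs, where path is a
    tuple of relation labels (an assumed representation) and sentiment is
    "positive" or "negative" taken from the annotation data.
    Returns {path: (majority_sentiment, probability)}.
    """
    counts = defaultdict(Counter)
    for path, sentiment in annotated_paths:
        counts[path][sentiment] += 1

    rules = {}
    for path, counter in counts.items():
        total = sum(counter.values())
        sentiment, n = max(counter.items(), key=lambda kv: kv[1])
        rules[path] = (sentiment, n / total)
    return rules


def label_target_domain(rules, target_paths):
    """Generate training data for the unannotated second domain.

    target_paths: iterable of (aspect, path) pairs from second-domain
    reviews. An aspect receives a label only when its path matches a rule.
    """
    training = []
    for aspect, path in target_paths:
        if path in rules:
            sentiment, prob = rules[path]
            training.append((aspect, sentiment, prob))
    return training


# Hypothetical annotated paths from a first domain.
source = [
    (("elaboration", "nucleus"), "positive"),
    (("elaboration", "nucleus"), "positive"),
    (("elaboration", "nucleus"), "negative"),
    (("concession", "satellite"), "negative"),
]
rules = extract_rules(source)

# Hypothetical unannotated aspects and paths from a second domain.
target = [
    ("battery", ("elaboration", "nucleus")),
    ("screen", ("joint", "nucleus")),  # no matching rule, so no label
]
training = label_target_domain(rules, target)
```

The resulting `training` list would then feed whatever classifier is trained for the second domain; the classifier itself is outside the scope of this sketch.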

First claim

Opening claim text (preview).

What is claimed:

1. A method for automated sentiment analysis comprising:
    receiving, with a network interface device in a server, a first plurality of reviews from a first domain, each review in the first plurality of reviews being associated with annotation data that identify a plurality of sentiments and a plurality of aspects included in the first plurality of reviews;
    parsing, with a processor in the server, the first plurality of reviews from the first domain to generate a first plurality of rhetorical structure trees, each rhetorical structure tree in the first plurality of rhetorical structure trees corresponding to one review in the first plurality of reviews and each rhetorical structure tree in the first plurality of rhetorical structure trees including at least one span associated with a predetermined relationship;
    extracting, with the processor in the server, a plurality of rhetorical rules from the first plurality of rhetorical structure trees, each rhetorical rule including a path extracted from at least one span in at least one of the first plurality of rhetorical structure trees associated with a probability that the path corresponds to a positive or negative sentiment based on the annotation data;
    receiving, with the network interface device in the server, a second plurality of reviews from a second domain that is different from the first domain, the second plurality of reviews including no annotation data;
    parsing, with the processor in the server, the second plurality of reviews from the second domain to generate a second plurality of rhetorical structure trees, each rhetorical structure tree in the second plurality of rhetorical structure trees corresponding to one review in the second plurality of reviews, each rhetorical structure tree in the second plurality of rhetorical structure trees including at least one span associated with the predetermined relationship;
    generating, with the processor in the server, training data that associates at least one aspect in the review in the second plurality of reviews with a sentiment associated with a rhetorical rule in the plurality of rhetorical rules in response to a path extracted from the rhetorical structure tree including the at least one aspect in the review in the second plurality of reviews matching the path of the rhetorical rule; and
    training, with the processor in the server, a classifier to identify sentiments in reviews from the second domain using the second plurality of reviews and the training data.

2. The method of claim 1, the extracting a plurality of rhetorical rules further comprising:
    extracting, with the processor in the server, a plurality of paths from the first plurality of rhetorical structure trees, each path in the plurality of paths including at least one span that contains an aspect.

3. The method of claim 1 further comprising:
    receiving, with the network interface device in the server, a third plurality of reviews from the second domain;
    identifying, with the processor in the server, a plurality of sentiments for at least one aspect that is included in the third plurality of reviews based on an output of the classifier; and
    generating, with the processor in the server, an output including an aspect-level sentiment report that identifies an aggregate sentiment level for the at least one aspect in the third plurality of reviews.

4. The method of claim 3 further comprising:
    parsing, with the processor in the server, the third plurality of reviews from the second domain to generate a third plurality of rhetorical structure trees, each rhetorical structure tree in the third plurality of rhetorical structure trees corresponding to one review in the third plurality of reviews, each rhetorical structure tree in the third plurality of rhetorical structure trees including at least one span associated with the predetermined relationship; and
    filtering, with the processor in the server, the output of the classifier to remove a sentiment corresponding to an aspect in one review in the third plurality of reviews with a path in the rhetorical tree that corresponds to a rhetorical rule that has a probability of the sentiment identified for the one review being less than a predetermined threshold.

5. The method of claim 1, the parsing further comprising:
    identifying, with the processor in the server, a predetermined relationship for at least one span in a rhetorical structure tree in the plurality of rhetorical structure trees as a joint relationship, a concession relationship, an elaboration relationship, or an enablement relationship.

6. The method of claim 4, the parsing further comprising:
    identifying the joint relationship in the at least one span that further includes at least two spans in the rhetorical structure tree.

7. The method of claim 1 wherein the classifier for the second domain is a maximum entropy classifier.

8. A system for automated sentiment analysis comprising:
    a network interface device;
    a memory; and
    a processor operatively connected to the network interface device and the memory, the processor being configured to:
    receive a first plurality of reviews from a first domain using the network interface device, each review in the first plurality of reviews being associated with annotation data that identify a plurality of sentiments and a plurality of aspects included in the first plurality of reviews;
    parse the first plurality of reviews from the first domain to generate a first plurality of rhetorical structure trees, each rhetorical structure tree in the first plurality of rhetorical structure trees corresponding to one review in the first plurality of reviews and each rhetorical structure tree in the first plurality of rhetorical structure trees including at least one span associated with a predetermined relationship;
    extract a plurality of rhetorical rules from the first plurality of rhetorical structure trees, each rhetorical rule including a path extracted from at least one span in at least one of the first plurality of rhetorical structure trees associated with a probability that the path corresponds to a positive or negative sentiment based on the annotation data;
    receive a second plurality of reviews from a second domain that is different from the first domain using the network interface device, the second plurality of reviews including no annotation data;
    parse the second plurality of reviews from the second domain to generate a second plurality of rhetorical structure trees, each rhetorical structure tree in the second plurality of rhetorical structure trees corresponding to one review in the second plurality of reviews, each rhetorical structure tree in the second plurality of rhetorical structure trees including at least one span associated with the predetermined relationship;
    generate training data that associates at least one review in the second plurality of reviews with a sentiment associated with a rhetorical rule in the plurality of rhetorical rules in response to a path extracted from the rhetorical structure tree corresponding to the at least one review in the second plurality of reviews matching the path of the rhetorical rule; and
    train a classifier to identify sentiments and aspects in reviews from the second domain using the second plurality of reviews and the training data, the classifier being stored in the memory for use in classifying sentiments and aspects for additional reviews in the second domain.

9. The system of claim 8, the processor being further configured to:
    extract a plurality of paths from the first plurality of rhetorical structure trees, each path in the plurality of paths including at least one span.

10. The system of claim 8, the processor be
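The filtering and reporting steps recited in claims 3 and 4 can be illustrated with a short sketch. It is not the claimed implementation: the data layout, the function names `filter_predictions` and `aspect_report`, the default threshold of 0.5, the choice to keep predictions that match no rule, and the net-score definition of "aggregate sentiment level" are all assumptions made for illustration.

```python
from collections import defaultdict


def filter_predictions(predictions, rule_probs, threshold=0.5):
    """Remove classifier outputs that a matching rule considers unlikely.

    predictions: list of (aspect, path, sentiment) from the classifier.
    rule_probs: {path: {"positive": p, "negative": q}} (assumed layout).
    A prediction is dropped when its path matches a rule and the rule's
    probability for the predicted sentiment is below the threshold.
    Predictions with no matching rule are kept (an assumption).
    """
    kept = []
    for aspect, path, sentiment in predictions:
        probs = rule_probs.get(path)
        if probs is not None and probs.get(sentiment, 0.0) < threshold:
            continue
        kept.append((aspect, path, sentiment))
    return kept


def aspect_report(predictions):
    """Aggregate per-aspect sentiment as a net score in [-1, 1].

    The score (positive - negative) / total is one possible aggregate,
    chosen here for illustration only.
    """
    totals = defaultdict(lambda: {"positive": 0, "negative": 0})
    for aspect, _path, sentiment in predictions:
        totals[aspect][sentiment] += 1
    return {
        aspect: (c["positive"] - c["negative"]) / (c["positive"] + c["negative"])
        for aspect, c in totals.items()
    }


# Hypothetical rule probabilities and classifier outputs.
rule_probs = {
    ("concession",): {"positive": 0.2, "negative": 0.8},
    ("elaboration",): {"positive": 0.9, "negative": 0.1},
}
predictions = [
    ("battery", ("elaboration",), "positive"),
    ("battery", ("concession",), "positive"),  # dropped: 0.2 < 0.5
    ("battery", ("concession",), "negative"),
    ("screen", ("joint",), "negative"),        # no rule: kept
]
kept = filter_predictions(predictions, rule_probs)
report = aspect_report(kept)
```

With the sample data, the second "battery" prediction is removed because its rule gives the predicted sentiment a probability below the threshold, and the report is computed over the three remaining predictions.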

Assignees

Inventors

Classifications

  • Market modelling; Market analysis; Collecting market data · CPC title

  • G06N20/00 · Primary

    Machine learning · CPC title

  • Discourse or dialogue representation · CPC title

  • G06F40/30 · Primary

    Semantic analysis · CPC title

  • Processing or translation of natural language (natural language analysis G06F40/20; semantic analysis G06F40/30) · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10628528B2 cover?
A method for automated aspect-based sentiment analysis includes parsing reviews from a first domain to generate rhetorical structure trees and extracting rhetorical rules from the rhetorical structure trees, each rhetorical rule including a path extracted from at least one span in at least one of the rhetorical structure trees associated with a probability that the path corresponds to a positiv…
Who is the assignee on this patent?
Robert Bosch GmbH
What technology area does this patent fall under?
Primary CPC classification G06N20/00. Mapped technology areas include Physics.
When was this patent published?
Publication date: Apr 21, 2020 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).