System and method for topic extraction and opinion mining

US9514156B2 · US · B2

Patent metadata
Publication number: US-9514156-B2
Application number: US-201314020421-A
Country: US
Kind code: B2
Filing date: Sep 6, 2013
Priority date: Sep 28, 2009
Publication date: Dec 6, 2016
Grant date: Dec 6, 2016

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Techniques for topic extraction and opinion mining are described. For example, a machine selects a document that is pertinent to a topic based on searching a plurality of documents. The machine identifies an identifier of a party to a transaction being referenced in the document, and identifies the transaction conducted by the party to the transaction based on the document. The machine determines a rating of the transaction based on the document. The determining of the rating of the transaction includes identifying, from a plurality of polarity words included in the document, a dominant polarity word based on a syntactic distance between the dominant polarity word and the topic in a syntactic tree. The machine determines a sentiment of the document based on the transaction, and the rating of the transaction.
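The dominant-polarity-word step in the abstract can be illustrated with a minimal sketch. The abstract does not say how syntactic distance is measured or which distance wins, so this example assumes that distance is the number of edges between two nodes of the tree and that the polarity word closest to the topic is the dominant one. The function names, the hand-built tree, and the two-word polarity lexicon are all hypothetical and are not taken from the patent.

```python
from collections import deque


def tree_distances(edges, start):
    """Breadth-first search: edge count from `start` to every reachable node."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, []).append(b)
        neighbors.setdefault(b, []).append(a)
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in neighbors.get(node, []):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def dominant_polarity_word(edges, topic, polarity_lexicon):
    """Return (word, polarity, distance) for the polarity word nearest the topic.

    Assumption: "nearest in the tree" means "dominant". Ties are broken
    alphabetically, which is an arbitrary choice made for this sketch.
    """
    dist = tree_distances(edges, topic)
    candidates = [(d, w) for w, d in dist.items() if w in polarity_lexicon]
    if not candidates:
        return None
    d, word = min(candidates)
    return word, polarity_lexicon[word], d


# Hypothetical syntactic tree for:
# "The seller was great but shipping was slow"
EDGES = [
    ("was_1", "seller"), ("seller", "The"), ("was_1", "great"),
    ("was_1", "but"), ("was_1", "was_2"),
    ("was_2", "shipping"), ("was_2", "slow"),
]
LEXICON = {"great": +1, "slow": -1}  # hypothetical polarity words

print(dominant_polarity_word(EDGES, "seller", LEXICON))
print(dominant_polarity_word(EDGES, "shipping", LEXICON))
```

With "seller" as the topic, "great" is two edges away and "slow" is three, so "great" is chosen under these assumptions; with "shipping" as the topic, the choice flips to "slow". In practice the tree would come from a syntactic parser rather than being written by hand.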

First claim

Opening claim text (preview).

What is claimed is:

1. A system comprising: a topic extractor configured to: select a document that is pertinent to a topic based on searching, using a key phrase, a plurality of documents, identify an identifier of a party to a transaction being referenced in the document, the identifier of the party to the transaction being associated with the topic, and identify the transaction conducted by the party to the transaction based on the document; and a sentiment analyzer configured to: determine a rating of the transaction conducted by the party to the transaction based on the document, a determining of the rating of the transaction conducted by the party to the transaction including identifying, from a plurality of polarity words included in the document, a dominant polarity word based on a syntactic distance between the dominant polarity word and the topic in a syntactic tree, the dominant polarity word having a dominant polarity impact on the rating of the transaction conducted by the party to the transaction, and determine, using at least one processor, a sentiment of the document based on the transaction conducted by the party to the transaction associated with the topic, and the rating of the transaction conducted by the party to the transaction associated with the topic.

2. The system of claim 1, wherein the document is a comment submitted to a community forum for publication on the community forum.

3. The system of claim 1, wherein the topic extractor is further configured to: identify one or more documents that reference a product of a plurality of products based on detecting a topic of a set of documents wherein the topic comprises a product name; and the sentiment analyzer is further configured to: determine a user sentiment for the product based on the sentiment of the one or more documents that reference the product, and generate a product ranking that orders the plurality of products based on the user sentiment for each of the plurality of products.

4. The system of claim 1, wherein the sentiment analyzer is further configured to: determine, for each of a plurality of products, a combined rating of an aspect of the product based on one or more ratings of the aspect detected in one or more documents that reference the product, and generate a product comparison interface that indicates a comparison of the plurality of products based on the aspect and the combined rating of the aspect of each of the plurality of products.

5. The system of claim 1, wherein the key phrase is selected based on identifying a plurality of words that are collocated within the document.

6. The system of claim 1, wherein the sentiment analyzer determines the rating by performing operations including: detecting a polarity word in the document, and determining a polarity impact of the polarity word on the rating of the transaction conducted by the party to the transaction.

7. The system of claim 1, wherein the topic extractor is further configured to: identify a source of the document based on metadata describing the document, and classify the document based on the sentiment of the document and the source of the document.

8. The system of claim 1, wherein the topic extractor is further configured to: identify one or more key phrases associated with the document, wherein the one or more key phrases comprise metadata used to classify or identify a source of the document.

9. The system of claim 1, wherein the topic extractor is further configured to: group the document with a further entry, and extract a relationship between the document and the further entry.

10. The system of claim 1, wherein the determining of the sentiment of the document includes determining that a first number of documents of a plurality of documents pertaining to the topic indicate positive feedback for the transaction conducted by the party to the transaction and a second number of documents of the plurality of documents pertaining to the topic indicate negative feedback for the transaction conducted by the party to the transaction; and the sentiment analyzer is further configured to calculate a ratio of the first number to the second number.

11. The system of claim 1, wherein the topic extractor is further configured to: rank a plurality of transactions conducted by the party to the transaction based on a frequency of the plurality of transactions.

12. The system of claim 11, wherein the topic extractor is further configured to: identify one or more essential transactions from the plurality of transactions based on a determination that a frequency of each of the one or more essential transactions meets or exceeds a threshold value.

13. The system of claim 1, wherein the topic extractor is further configured to: rank a plurality of transactions conducted by the party to the transaction based on a normalized distance between a location of a first appearance of an identifier of each transaction in one of the plurality of documents and the beginning of the one of the plurality of documents.

14. A method comprising: selecting a document that is pertinent to a topic based on searching, using a key phrase, a plurality of documents; identifying an identifier of a party to a transaction being referenced in the document; identifying the transaction conducted by the party to the transaction based on the document, the identifier of the party to the transaction being associated with the topic; determining a rating of the transaction conducted by the party to the transaction based on the document, the determining of the rating of the transaction conducted by the party to the transaction including identifying, from a plurality of polarity words included in the document, a dominant polarity word based on a syntactic distance between the dominant polarity word and the topic in a syntactic tree, the dominant polarity word having a dominant polarity impact on the rating of the transaction conducted by the party to the transaction; and determining, using at least one computer processor, a sentiment of the document based on the transaction conducted by the party to the transaction associated with the topic, and the rating of the transaction conducted by the party to the transaction associated with the topic.

15. The method of claim 14, wherein the method further comprises: identifying one or more documents that reference a product of a plurality of products based on detecting a topic of a set of documents wherein the topic comprises a product name; determining a user sentiment for the product based on the sentiment of the one or more documents that reference the product; and generating a product ranking that orders the plurality of products based on the user sentiment for each of the plurality of products.

16. The method of claim 14, wherein the method further comprises: determining, for each of a plurality of products, a combined rating of an aspect of the product based on one or more ratings of the aspect detected in one or more documents that reference the product; and generating a product comparison of the plurality of products based on the aspect and the combined rating of the aspect of each of the plurality of products.

17. A non-transitory machine-readable medium comprising instructions that, when executed by one or more hardware processors of a machine, cause the one or more hardware processors to perform operations comprising: selecting a document that is pertinent to a topic based on searching, using a key phrase, a plurality of documents; identifying an identifie
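The counting and ranking operations recited in claims 10 through 13 can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: documents are assumed to be pre-labelled as "positive" or "negative", transactions are represented by plain strings, position is measured in characters, and the distance is normalized by document length. All function names are hypothetical.

```python
from collections import Counter


def feedback_ratio(labels):
    """Claim 10: ratio of positive-feedback documents to negative-feedback ones.

    Returns None when there are no negative documents (ratio undefined).
    """
    positive = sum(1 for label in labels if label == "positive")
    negative = sum(1 for label in labels if label == "negative")
    return positive / negative if negative else None


def rank_by_frequency(transactions):
    """Claim 11: order transactions from most to least frequent."""
    return Counter(transactions).most_common()


def essential_transactions(transactions, threshold):
    """Claim 12: keep transactions whose frequency meets or exceeds the threshold."""
    return [name for name, count in rank_by_frequency(transactions)
            if count >= threshold]


def normalized_first_position(document, identifier):
    """Claim 13: offset of the first appearance divided by document length.

    Assumption: character offsets and length-based normalization; the claim
    text does not fix either choice.
    """
    index = document.find(identifier)
    if index < 0 or not document:
        return None
    return index / len(document)


def rank_by_first_position(document, identifiers):
    """Rank identifiers by normalized first position, earliest first (assumed order)."""
    scored = [(name, normalized_first_position(document, name))
              for name in identifiers]
    found = [(name, pos) for name, pos in scored if pos is not None]
    return sorted(found, key=lambda item: item[1])


labels = ["positive", "positive", "negative", "neutral", "positive"]
seen = ["shipping", "refund", "shipping", "payment", "shipping", "refund"]
text = "refund issued after shipping delay"

print(feedback_ratio(labels))                 # 3 positive / 1 negative
print(essential_transactions(seen, 2))
print(rank_by_first_position(text, ["shipping", "refund", "payment"]))
```

Identifiers that never appear in the document are dropped from the positional ranking here; a real system would need to decide how to treat them.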

Assignees

Inventors

Classifications

  • Semantic analysis · CPC title

  • Commerce · CPC title

  • using natural language analysis · CPC title

  • Query execution (filtering based on additional data G06F16/335) · CPC title

  • using extracted text · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9514156B2 cover?
Techniques for topic extraction and opinion mining are described. For example, a machine selects a document that is pertinent to a topic based on searching a plurality of documents. The machine identifies an identifier of a party to a transaction being referenced in the document, and identifies the transaction conducted by the party to the transaction based on the document. The machine determin…
Who is the assignee on this patent?
eBay Inc.
What technology area does this patent fall under?
Primary CPC classification G06Q10/00. Mapped technology areas include Physics.
When was this patent published?
Publication date Dec 6, 2016 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).