System and method for sentiment lexicon expansion

US10089296B2 · US · B2

Patent metadata
Field: Value
Publication number: US-10089296-B2
Application number: US-201514984498-A
Country: US
Kind code: B2
Filing date: Dec 30, 2015
Priority date: Dec 30, 2015
Publication date: Oct 2, 2018
Grant date: Oct 2, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Systems and methods for sentiment lexicon expansion receive at least a domain specific corpus comprising a plurality of words, and a generic sentiment lexicon; parse the plurality of words in the domain specific corpus into a plurality of dependency relations; identify, using one or more syntactic dependency rules and at least one of the plurality of dependency relations, a set of one or more sentiment candidates in the domain specific corpus; filter from the set of one or more sentiment candidates any sentiment candidate having an expected performance below a predefined threshold; sample the filtered set of one or more sentiment candidates to be used in a qualitative evaluation; and, for each sentiment candidate that passes the qualitative evaluation, add the sentiment candidate to the generic sentiment lexicon.
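
The last three stages of the abstract (filter, sample, add) can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `expand_lexicon`, the numeric "expected performance" scores, the fixed sample size, and the `passes_review` callback are all assumptions made for the example. It also assumes that only sampled candidates that pass review are added, which is one possible reading of the abstract.

```python
import random
from typing import Callable, Dict, List, Set


def expand_lexicon(
    lexicon: Set[str],
    candidate_scores: Dict[str, float],
    threshold: float,
    sample_size: int,
    passes_review: Callable[[str], bool],
    seed: int = 0,
) -> Set[str]:
    """Filter, sample, and review candidates; return the expanded lexicon.

    All parameter names are hypothetical; the source does not define them.
    """
    # Filter: drop candidates whose expected-performance score is below the threshold.
    kept: List[str] = sorted(
        word for word, score in candidate_scores.items() if score >= threshold
    )
    # Sample: choose a subset of the survivors for the qualitative evaluation.
    rng = random.Random(seed)
    sampled = rng.sample(kept, min(sample_size, len(kept)))
    # Add: only candidates that pass the evaluation join the lexicon.
    return lexicon | {word for word in sampled if passes_review(word)}
```

For example, calling `expand_lexicon({"good"}, {"promptly": 0.9, "slowly": 0.8, "very": 0.1}, 0.5, 10, lambda w: w != "slowly")` drops "very" at the filter step, rejects "slowly" at review, and returns a lexicon containing "good" and "promptly".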

First claim

Opening claim text (preview).

The invention claimed is:

1. A method for sentiment lexicon expansion, performed on a computing device having a processor, memory, and one or more code sets stored in the memory and executing in the processor, the method comprising: receiving, by the processor, at least a domain specific corpus comprising a plurality of words, and a generic sentiment lexicon; parsing, by the processor, the plurality of words in the domain specific corpus into a plurality of dependency relations; identifying, by the processor, using one or more syntactic dependency rules and at least one of the plurality of dependency relations, a set of one or more sentiment candidates in the domain specific corpus; wherein the one or more syntactic dependency rules comprises at least one rule in which: if a pair of adverbs found in the domain specific corpus have either an ‘and’ conjunction dependency relationship or a ‘but’ conjunction dependency relationship, wherein a first adverb of the pair of adverbs was previously included in the generic sentiment lexicon and a second adverb of the pair of adverbs was not previously included in the generic sentiment lexicon, then the second adverb is identified as a sentiment candidate; filtering, by the processor, from the set of one or more sentiment candidates any sentiment candidate having an expected performance below a predefined threshold; sampling, by the processor, the filtered set of one or more sentiment candidates to be used in a qualitative evaluation; and for each sentiment candidate that passes the qualitative evaluation, adding, by the processor, the sentiment candidate to the generic sentiment lexicon.

2. The method as in claim 1, wherein the second adverb is further identified as having a same polarity as a polarity of the first adverb when the first adverb and the second adverb have an ‘and’ conjunction dependency relationship; and wherein the second adverb is further identified as having an opposite polarity to the polarity of the first adverb when the first adverb and the second adverb have a ‘but’ conjunction dependency relationship.

3. The method as in claim 1, wherein the one or more syntactic dependency rules comprises at least one rule in which: an adverb found in the domain specific corpus which has not been previously included in the generic sentiment lexicon and which modifies a known feature found in the domain specific corpus is identified as a sentiment candidate.

4. The method as in claim 1, wherein the one or more syntactic dependency rules comprises at least one rule in which: an adverb, found in the domain specific corpus that has not been previously included in the generic sentiment lexicon, which modifies a context word found in the domain specific corpus, the context word having as its subject a known feature found in the domain specific corpus, is identified as a sentiment candidate.

5. The method as in claim 1, further comprising: identifying a polarity of one or more of the one or more sentiment candidates, wherein the polarity is one of positive and negative.

6. The method as in claim 5, further comprising: identifying an intensity of the polarity of the one or more of the one or more sentiment candidates.

7. The method as in claim 1, further comprising: for each of the one or more sentiment candidates: if a polarity of the sentiment can be determined by applying the one or more syntactic dependency rules to the domain specific corpus, determining by the processor whether the polarity of the sentiment is positive or negative, and associating the sentiment with the determined polarity; and if a polarity of the sentiment cannot be determined by applying the one or more syntactic dependency rules to the domain specific corpus, identifying the sentiment polarity as undetermined.

8. The method as in claim 1, further comprising: evaluating, by the processor, at least one of a recall value and a precision value for an expanded sentiment lexicon, wherein the expanded sentiment lexicon comprises the generic sentiment lexicon and each sentiment candidate added to the generic sentiment lexicon.

9. A system for sentiment lexicon expansion, comprising: a processor; a memory; and one or more code sets stored in the memory and executing in the processor, which, when executed, configure the processor to: receive at least a domain specific corpus comprising a plurality of words, and a generic sentiment lexicon; parse the plurality of words in the domain specific corpus into a plurality of dependency relations; identify, using one or more syntactic dependency rules and at least one of the plurality of dependency relations, a set of one or more sentiment candidates in the domain specific corpus; wherein the one or more syntactic dependency rules comprises at least one rule in which: if a pair of adverbs found in the domain specific corpus have either an ‘and’ conjunction dependency relationship or a ‘but’ conjunction dependency relationship, wherein a first adverb of the pair of adverbs was previously included in the generic sentiment lexicon and a second adverb of the pair of adverbs was not previously included in the generic sentiment lexicon, then the second adverb is identified as a sentiment candidate; filter from the set of one or more sentiment candidates any sentiment candidate having an expected performance below a predefined threshold; sample the filtered set of one or more sentiment candidates to be used in a qualitative evaluation; and for each sentiment candidate that passes the qualitative evaluation, add the sentiment candidate to the generic sentiment lexicon.

10. The system of claim 9, wherein the second adverb is further identified as having a same polarity as a polarity of the first adverb when the first adverb and the second adverb have an ‘and’ conjunction dependency relationship; and wherein the second adverb is further identified as having an opposite polarity to the polarity of the first adverb when the first adverb and the second adverb have a ‘but’ conjunction dependency relationship.

11. The system of claim 9, wherein the one or more syntactic dependency rules comprises at least one rule in which: an adverb found in the domain specific corpus which has not been previously included in the generic sentiment lexicon and which modifies a known feature found in the domain specific corpus is identified as a sentiment candidate.

12. The system of claim 9, wherein the one or more syntactic dependency rules comprises at least one rule in which: an adverb, found in the domain specific corpus that has not been previously included in the generic sentiment lexicon, which modifies a context word found in the domain specific corpus, the context word having as its subject a known feature found in the domain specific corpus, is identified as a sentiment candidate.

13. The system of claim 9, wherein the one or more code sets further configure the processor to: identify a polarity of one or more of the one or more sentiment candidates, wherein the polarity is one of positive and negative.

14. The system of claim 13, wherein the one or more code sets further configure the processor to: identify an intensity of the polarity of the one or more of the one or more sentiment candidates.

15. The system of claim 9, wherein, for each of the one or more sentiment candidates, the one or more code sets further configure the processor to: if a polarity of the sentiment can be determined by applying the one or more syntactic dependency rules to the domain specific corpus, determine whether the polarity of the sentiment is p
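
The conjunction rule in claims 1 and 2 (mirrored in claims 9 and 10) can be illustrated with a short sketch. It assumes the dependency parse has already been reduced to adverb pairs joined by a conjunction; the `Conj` record, the function name `conjunction_candidates`, and the polarity labels are hypothetical and do not come from the patent. Marking a candidate as undetermined (`None`) when two sentences give conflicting evidence is also an assumption of this example.

```python
from typing import Dict, List, NamedTuple, Optional


class Conj(NamedTuple):
    """One conjunction dependency between two adverbs (hypothetical format)."""
    first: str        # adverb on one side of the conjunction
    second: str       # adverb on the other side
    conjunction: str  # "and" or "but"


def conjunction_candidates(
    relations: List[Conj], lexicon: Dict[str, str]
) -> Dict[str, Optional[str]]:
    """Map each new adverb to an inferred polarity, or None if undetermined.

    `lexicon` maps known words to "positive" or "negative".
    """
    flip = {"positive": "negative", "negative": "positive"}
    found: Dict[str, Optional[str]] = {}
    for rel in relations:
        if rel.conjunction not in ("and", "but"):
            continue
        # The known adverb may sit on either side of the conjunction.
        for known, new in ((rel.first, rel.second), (rel.second, rel.first)):
            if known in lexicon and new not in lexicon:
                # 'and' keeps the known adverb's polarity; 'but' inverts it.
                base = lexicon[known]
                inferred = base if rel.conjunction == "and" else flip[base]
                if new in found and found[new] != inferred:
                    found[new] = None  # conflicting evidence (assumption)
                else:
                    found[new] = inferred
    return found
```

With a lexicon of `{"quickly": "positive", "politely": "positive"}`, the pair "quickly and efficiently" yields "efficiently" as a positive candidate, while "politely but unhelpfully" yields "unhelpfully" as a negative candidate. Pairs in which both adverbs are already in the lexicon, or neither is, produce no candidate.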

Assignees

Nice Systems Ltd, Nice Ltd

Inventors

Classifications

  • Morphological analysis · CPC title

  • Semantic analysis · CPC title

  • Validation · CPC title

  • G06F40/211 · Primary

    Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars · CPC title

  • Dictionaries · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10089296B2 cover?
Systems and methods for sentiment lexicon expansion receive at least a domain specific corpus comprising a plurality of words, and a generic sentiment lexicon; parse the plurality of words in the domain specific corpus into a plurality of dependency relations; identify, using one or more syntactic dependency rules and at least one of the plurality of dependency relations, a set of one or more s…
Who is the assignee on this patent?
Nice Systems Ltd, Nice Ltd
What technology area does this patent fall under?
Primary CPC classification G06F40/211. Mapped technology areas include Physics.
When was this patent published?
Publication date: Oct 2, 2018 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).