US11880652B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11880652-B2 |
| Application number | US-202318151164-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jan 6, 2023 |
| Priority date | Nov 14, 2019 |
| Publication date | Jan 23, 2024 |
| Grant date | Jan 23, 2024 |
Official abstract text for this publication.
Techniques are disclosed for identifying hypocrisy in text. A computer system creates, from fragments of text, a syntactic tree that represents syntactic relationships between words in the fragments. The system identifies, in the syntactic tree, a first entity and a second entity. The system further determines that the first entity is opposite to the second entity. The system further determines a first sentiment score for a first fragment comprising the first entity and a second sentiment score for a second fragment comprising the second entity. The system, responsive to determining that the first sentiment score and the second sentiment score indicate opposite emotions, identifies the text as comprising hypocrisy and provides the text to an external device.
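The decision rule in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the opposition table `OPPOSITE_PAIRS`, the word lists, and the lexicon-based `toy_score` are hypothetical stand-ins for the syntactic-tree entity extraction and the trained sentiment model that the publication describes, and entities are assumed to be already extracted and paired with their fragments.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical opposition table; the publication derives opposition from
# the syntactic tree (optionally with an ontology or an additional model).
OPPOSITE_PAIRS = {frozenset({"saving", "spending"})}

# Hypothetical sentiment lexicon standing in for the trained model.
POSITIVE = {"praise", "love", "proud"}
NEGATIVE = {"condemn", "hate", "ashamed"}


def toy_score(fragment: str) -> int:
    """Count positive words minus negative words in a fragment."""
    words = fragment.lower().split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def are_opposite(first: str, second: str) -> bool:
    """True if the two entities are listed as an opposing pair."""
    return frozenset({first.lower(), second.lower()}) in OPPOSITE_PAIRS


def opposite_emotions(score_a: float, score_b: float) -> bool:
    """Scores indicate opposite emotions when their signs differ."""
    return score_a * score_b < 0


def detect_hypocrisy(
    fragments: List[Tuple[str, str]],
    score: Callable[[str], float],
) -> Optional[Tuple[str, str]]:
    """Return the first opposing entity pair whose fragments carry
    opposite sentiment, or None if no such pair exists.

    Each item in `fragments` is (entity, fragment_text).
    """
    for i, (entity_a, text_a) in enumerate(fragments):
        for entity_b, text_b in fragments[i + 1:]:
            if are_opposite(entity_a, entity_b) and opposite_emotions(
                score(text_a), score(text_b)
            ):
                return (entity_a, entity_b)
    return None


flagged = detect_hypocrisy(
    [("saving", "we praise saving"), ("spending", "we condemn spending")],
    toy_score,
)
```

Passing the scorer as a callable keeps the rule separate from the model, so the toy lexicon could be swapped for any function that returns a signed score.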
Opening claim text (preview).
What is claimed is:
1. A method of detecting hypocrisy in text, the method comprising: accessing text comprising fragments; creating, from the fragments, a syntactic tree that represents syntactic relationships between words in the fragments; identifying, in the syntactic tree, a first entity and a second entity; determining that the first entity is opposite to the second entity; determining a first sentiment score for a first fragment comprising the first entity and a second sentiment score for a second fragment comprising the second entity, wherein each sentiment score indicates an emotion indicated by the respective entity, wherein determining the sentiment score for each fragment comprises: forming a communicative discourse tree from a respective fragment; providing the communicative discourse tree to a machine-learning model, the machine-learning model being trained to identify emotions based on input communicative discourse trees for which emotion associations are known; and receiving the sentiment score from the machine-learning model; and responsive to determining that the first sentiment score and the second sentiment score indicate opposite emotions, identifying the text as comprising hypocrisy and providing the text to an external device.
2. The method of claim 1, wherein the machine-learning model was previously trained by: accessing a set of training data comprising labels and text, wherein each label of the labels indicates that a respective text is associated with hypocrisy or that the respective text is not associated with hypocrisy; and iteratively training the machine-learning model using the set of training data.
3. The method of claim 1, further comprising: receiving, from a user device, a correction indicating that an identification that the text comprises hypocrisy is incorrect; updating the machine-learning model with the correction; and adding the syntactic tree and the correction to a training data set.
4. The method of claim 1, wherein determining that the first entity is opposite to the second entity comprises: accessing an ontology that comprises a plurality of entries, wherein each entry of the plurality of entries comprises a noun and one or more synonyms of the noun; and responsive to identifying, in the ontology, a particular entry that comprises a synonym matching the first entity, substituting the noun of the particular entry as the first entity.
5. The method of claim 1, wherein forming the communicative discourse tree comprises: constructing, from the respective fragment, a discourse tree that comprises a plurality of nodes, each nonterminal node representing a rhetorical relationship within the respective fragment; and forming, from the discourse tree, the communicative discourse tree, wherein forming the communicative discourse tree comprises matching each fragment to a verb signature by: accessing a plurality of verb signatures, wherein each verb signature comprises a verb of a corresponding fragment and a sequence of thematic roles, wherein each thematic role describes a corresponding relationship between the verb and related words; determining, for each verb signature of the plurality of verb signatures, a plurality of thematic roles of a respective verb signature, wherein each of the thematic roles matches a role of a respective word in the corresponding fragment; selecting a particular verb signature from the plurality of verb signatures based on the particular verb signature comprising a highest number of matches of roles of words to the verb; and associating the particular verb signature with the fragment.
6. The method of claim 1, wherein determining that the first entity is opposite to the second entity comprises: providing the syntactic tree, the first entity, and the second entity to an additional machine-learning model; and receiving, from the additional machine-learning model, one or more of (a) an indication that the first entity is opposite to the second entity and (b) a relationship between the first entity and the second entity.
7. The method of claim 1, further comprising providing one or more of the first entity and the second entity to the external device.
8. A non-transitory computer-readable storage medium storing computer-executable program instructions, wherein when executed by a processing device, the program instructions cause the processing device to perform operations comprising: accessing text comprising fragments; creating, from the fragments, a syntactic tree that represents syntactic relationships between words in the fragments; identifying, in the syntactic tree, a first entity and a second entity; determining that the first entity is opposite to the second entity; determining a first sentiment score for a first fragment comprising the first entity and a second sentiment score for a second fragment comprising the second entity, wherein each sentiment score indicates an emotion indicated by the respective entity, wherein determining the sentiment score for each fragment comprises: forming a communicative discourse tree from a respective fragment; providing the communicative discourse tree to a machine-learning model, the machine-learning model being trained to identify emotions based on input communicative discourse trees for which emotion associations are known; and receiving the sentiment score from the machine-learning model; and responsive to determining that the first sentiment score and the second sentiment score indicate opposite emotions, identifying the text as comprising hypocrisy and providing the text to an external device.
9. The non-transitory computer-readable storage medium of claim 8, wherein the machine-learning model was previously trained by: accessing a set of training data comprising labels and text, wherein each label of the labels indicates that a respective text is associated with hypocrisy or that the respective text is not associated with hypocrisy; and iteratively training the machine-learning model using the set of training data.
10. The non-transitory computer-readable storage medium of claim 8, wherein executing the program instructions cause the processing device to perform additional operations comprising: receiving, from a user device, a correction indicating that an identification that the text comprises hypocrisy is incorrect; updating the machine-learning model with the correction; and adding the syntactic tree and the correction to a training data set.
11. The non-transitory computer-readable storage medium of claim 8, wherein determining that the first entity is opposite to the second entity comprises: accessing an ontology that comprises a plurality of entries, wherein each entry of the plurality of entries comprises a noun and one or more synonyms of the noun; and responsive to identifying, in the ontology, a particular entry that comprises a synonym matching the first entity, substituting the noun of the particular entry as the first entity.
12. The non-transitory computer-readable storage medium of claim 8, wherein determining that the first entity is opposite to the second entity comprises: providing the syntactic tree, the first entity, and the second entity to an additional machine-learning model; and receiving, from the additional machine-learning model, one or more of (a) an indication that the first entity is opposite to the second entity and (b) a relationship between the first entity and the second entity.
13. The non-transitory computer-readable storage medium of claim 8, wherein forming the communicative discourse tree comprises: constructing, from the respect
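Two lookup steps recited in the claims lend themselves to a short illustration: the ontology substitution of claims 4 and 11, and the verb-signature selection of claim 5. The Python sketch below is only one possible reading under stated assumptions. The names `ONTOLOGY`, `VerbSignature`, `canonicalize`, and `select_signature`, the example entries, and the role labels are all hypothetical, and thematic roles are assumed to have been assigned to the fragment's words by an earlier step.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical ontology: each entry maps a noun to its synonyms.
ONTOLOGY: Dict[str, Set[str]] = {
    "car": {"automobile", "vehicle"},
    "doctor": {"physician"},
}


def canonicalize(entity: str, ontology: Dict[str, Set[str]]) -> str:
    """Replace an entity with the entry noun when it matches a synonym;
    otherwise return the entity unchanged."""
    needle = entity.lower()
    for noun, synonyms in ontology.items():
        if needle in synonyms:
            return noun
    return entity


@dataclass(frozen=True)
class VerbSignature:
    verb: str
    roles: Tuple[str, ...]  # sequence of thematic roles for this verb


def count_role_matches(signature: VerbSignature, fragment_roles: Set[str]) -> int:
    """Number of the signature's roles that also occur in the fragment."""
    return sum(1 for role in signature.roles if role in fragment_roles)


def select_signature(
    verb: str,
    fragment_roles: Set[str],
    signatures: List[VerbSignature],
) -> Optional[VerbSignature]:
    """Pick the signature for `verb` with the highest number of role matches.

    Ties go to the earliest candidate in `signatures`, which is a choice
    made for this sketch rather than something the claims specify.
    """
    candidates = [s for s in signatures if s.verb == verb]
    if not candidates:
        return None
    return max(candidates, key=lambda s: count_role_matches(s, fragment_roles))


# Hypothetical signatures and roles for illustration only.
SIGNATURES = [
    VerbSignature("deny", ("Agent", "Theme")),
    VerbSignature("deny", ("Agent", "Recipient", "Theme")),
]
best = select_signature("deny", {"Agent", "Recipient", "Theme"}, SIGNATURES)
```

Here the second signature is selected because three of its roles match the fragment, against two for the first.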
Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars · CPC title
Trees · CPC title
Querying, e.g. by the use of web search engines · CPC title
Semantic analysis · CPC title
Machine learning · CPC title