Detecting dangerous expressions based on a theme

US9575959B2 · US · B2

Patent metadata
Publication number: US-9575959-B2
Application number: US-201414460443-A
Country: US
Kind code: B2
Filing date: Aug 15, 2014
Priority date: Oct 3, 2013
Publication date: Feb 21, 2017
Grant date: Feb 21, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Embodiments relate to a dangerous expression based on a particular theme. An aspect includes acquiring, by an electronic apparatus, from text data for learning, a subset of the text data associated with the particular theme and with particular time period information. Another aspect includes extracting text data containing negative information from the acquired subset of the text data. Another aspect includes extracting a word or phrase having a high correlation with the extracted text data or a word or phrase having a high appearance frequency in the extracted text data from the extracted text data. Yet another aspect includes determining that the extracted word or phrase is the dangerous expression based on the particular theme.
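
The first two steps of the abstract (acquiring a theme- and period-specific subset, then extracting the negative text from it) can be sketched as follows. This is a minimal illustration, not the patented implementation: the `Post` record, the `THEME_WORDS` and `NEGATIVE_WORDS` dictionaries, the word tokenizer, and the use of a set intersection to combine theme and period matches are all assumptions made for the example.

```python
import re
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Post:
    """Hypothetical unit of text data for learning: a text and its date."""
    text: str
    posted: date


# Hypothetical dictionaries; a real system would load much larger ones.
THEME_WORDS = frozenset({"election", "candidate"})
NEGATIVE_WORDS = frozenset({"scandal", "outrage"})


def words(text: str) -> set[str]:
    """Lower-case the text and return its distinct words (naive tokenizer)."""
    return set(re.findall(r"[a-z']+", text.lower()))


def acquire_subset(posts: list[Post], theme_words: frozenset[str],
                   start: date, end: date) -> list[Post]:
    """Keep posts that mention the theme AND fall inside the time period."""
    theme_hits = {i for i, p in enumerate(posts) if words(p.text) & theme_words}
    period_hits = {i for i, p in enumerate(posts) if start <= p.posted <= end}
    # Intersect the two index sets, preserving the original order.
    return [posts[i] for i in sorted(theme_hits & period_hits)]


def extract_negative(posts: list[Post],
                     negative_words: frozenset[str]) -> list[Post]:
    """Keep posts containing at least one word from the negative dictionary."""
    return [p for p in posts if words(p.text) & negative_words]
```

Calling `acquire_subset` with a date range and then passing its result to `extract_negative` yields the negative, on-theme text from which candidate dangerous expressions would later be extracted.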

First claim

Opening claim text (preview).

What is claimed is:

1. A method for detecting a dangerous expression based on a particular theme, comprising:
    acquiring, by a first electronic apparatus, from text data for learning, a first subset of the text data associated with the particular theme and with particular time period information;
    extracting second text data containing negative information, based on an appearance of at least one negative word, from the acquired first subset of the text data;
    extracting a first at least one word different from the at least one negative word and having a high correlation with the extracted second text data, wherein the first at least one word is determined to have a high correlation with the extracted second text data based on a first product divided by a second product being greater than one, wherein the first product comprises a first number of portions of the text data for learning multiplied by a second number of appearances of the first at least one word in the first subset of the text data for learning, wherein the second product comprises a third number of portions of the first subset of the text data for learning multiplied by a fourth number of appearances of the first at least one word in the text data for learning;
    determining that the extracted first at least one word different from the at least one negative word is the dangerous expression based on the particular theme and creating a learned learning model associating the dangerous expression to the particular theme;
    acquiring, by a second electronic apparatus, a second subset of text data associated with the particular theme, wherein the first electronic apparatus is connected to the second electronic apparatus by a network, and wherein the second subset of text data comprises text for posting on a social networking site;
    detecting that the dangerous expression determined to be the dangerous expression by the learned learning model exists in the second subset of text data acquired by the second electronic apparatus;
    displaying, on a screen of the second electronic apparatus, an indication that the second subset of text data contains the dangerous expression; and
    posting, from the second electronic apparatus and to the social networking site, a modified version of the second subset of text data in response to the indication that the second subset of text data contains the dangerous expression, wherein the modified version of the second subset of text data is generated based on user input.

2. The method according to claim 1, further comprising performing, by the second electronic apparatus: extracting third text data containing negative information from the second subset of text data acquired from the text data to be analyzed, wherein the detecting that the dangerous expression exists in the second subset of text data acquired from the text data to be analyzed includes detecting that the dangerous expression exists in the third text data extracted from the text data to be analyzed.

3. The method according to claim 1, further comprising performing, based on the dangerous expression existing in the second subset of text data: stopping or suspending transmission or upload of the second subset of text data onto a network; transmitting a message indicating that the second subset of text data contains the dangerous expression to an electronic apparatus of a user that has provided the second subset of text data; and displaying, on the screen, an indication of the particular theme and a number of times of appearance of the dangerous expression.

4. The method according to claim 1, wherein detecting that the dangerous expression exists in the second subset of text data further includes extracting the particular theme.

5. The method according to claim 1, wherein the first at least one word includes a co-occurrence expression comprising a first word and a second word such that a third product divided by a fourth product is greater than one, wherein the third product comprises a fifth number of portions of the text data for learning multiplied by a sixth number of appearances of both the first word and the second word in the first subset of text data for learning, wherein the fourth product comprises a seventh number of portions of the first subset of text data for learning multiplied by an eighth number of appearances of both the first word and the second word in the text data for learning.

6. The method according to claim 1, wherein the extracting the second text data containing the negative information comprises: identifying at least one word that falls under the negative information in the acquired first subset of text data for learning; and extracting the second text data containing the identified at least one word.

7. The method according to claim 6, wherein identifying the at least one word that falls under the negative information is performed using a negative information dictionary including words and phrases determinable as the negative information.

8. The method according to claim 1, wherein acquiring the second subset of text data comprises identifying text data associated with the particular theme using a theme identifying dictionary including words and phrases used for the particular theme.

9. The method according to claim 8, wherein acquiring the second subset of text data comprises: identifying, as text data associated with the particular theme, a range of a predetermined number of characters or a predetermined number of words before and after at least one word that exists in the text data for learning and is included in the theme identifying dictionary.

10. The method according to claim 1, wherein acquiring the second subset of text data comprises acquiring the second subset of text data associated with the particular theme, by performing a set operation on the text data associated with the particular theme and the text data associated with the particular time period information.

11. The method according to claim 8, wherein acquiring the second subset of text data comprises: identifying that the same sentence, paragraph, item, or document including text data contains at least one word included in the theme identifying dictionary.

12. The method according to claim 1, wherein acquiring the second subset of text data comprises acquiring the second subset of text data associated with the particular theme by: identifying text data associated with the particular theme using a theme identifying dictionary including words and phrases used for the particular theme; and identifying text data associated with the particular time period information.

13. A non-transitory computer program product for detecting a dangerous expression based on a particular theme, the computer program product comprising a non-transitory computer readable storage medium, wherein the computer readable storage medium has program instructions embodied therewith, the program instructions executable by a processor to cause the processor to:
    acquire, from text data for learning, a first subset of the text data associated with the particular theme and with particular time period information;
    extract second text data containing negative information from the acquired first subset of the text data based on an appearance of at least one negative word;
    extract a first at least one word different from the at least one negative word and having a high correlation with the extracted second text data, wherein the first at least one word is determined to have a high correlation with the extracted second text data based on a first product divided by a second product being greater than one, wherein the first product compris
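
The correlation test recited in claim 1 (and its co-occurrence variant in claim 5) compares a word's rate in the theme subset with its rate in the whole learning corpus: the ratio (N_all × count_subset) / (N_subset × count_all) must exceed one. The sketch below is one possible reading, not the patentee's implementation. It assumes a "portion" is one text string, that "appearances" of a single word are total token occurrences, that "appearances" of a word pair are portions containing both words, and that the subset is drawn from the corpus; all function names are hypothetical.

```python
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Naive lower-case word tokenizer (an assumption for this sketch)."""
    return re.findall(r"[a-z']+", text.lower())


def appearances(portions: list[str]) -> Counter:
    """Total occurrences of each word across all portions."""
    counts: Counter = Counter()
    for portion in portions:
        counts.update(tokenize(portion))
    return counts


def ratio(n_all: int, in_subset: int, n_subset: int, in_all: int) -> float:
    """First product divided by second product, as worded in claim 1."""
    second_product = n_subset * in_all
    return (n_all * in_subset) / second_product if second_product else 0.0


def correlation_ratio(word: str, corpus: list[str], subset: list[str]) -> float:
    """Ratio for a single word; a value above 1 counts as high correlation."""
    return ratio(len(corpus), appearances(subset)[word],
                 len(subset), appearances(corpus)[word])


def cooccurrence_ratio(first: str, second: str,
                       corpus: list[str], subset: list[str]) -> float:
    """Same ratio for a word pair, counting portions that contain both."""
    def both(portions: list[str]) -> int:
        return sum(1 for p in portions if {first, second} <= set(tokenize(p)))
    return ratio(len(corpus), both(subset), len(subset), both(corpus))


def dangerous_candidates(corpus: list[str], subset: list[str],
                         negative_words: set[str]) -> list[str]:
    """Non-negative words in the subset whose ratio exceeds one."""
    all_counts, sub_counts = appearances(corpus), appearances(subset)
    return sorted(
        w for w in sub_counts
        if w not in negative_words
        and ratio(len(corpus), sub_counts[w], len(subset), all_counts[w]) > 1
    )
```

Under these assumptions the ratio behaves like a lift score: a word that is no more frequent in the theme subset than in the corpus overall scores at or below one and is not flagged, while words concentrated in the subset score above one. Negative dictionary words are excluded because the claim requires the extracted word to differ from the negative word itself.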

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9575959B2 cover?
Embodiments relate to a dangerous expression based on a particular theme. An aspect includes acquiring, by an electronic apparatus, from text data for learning, a subset of the text data associated with the particular theme and with particular time period information. Another aspect includes extracting text data containing negative information from the acquired subset of the text data. Another …
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06F40/279. Mapped technology areas include Physics.
When was this patent published?
Publication date Feb 21, 2017 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).