Method and server for text classification using multi-task learning
US-2020364407-A1 · Nov 19, 2020 · US
US11675981B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11675981-B2 |
| Application number | US-202117235386-A |
| Country | US |
| Kind code | B2 |
| Filing date | Apr 20, 2021 |
| Priority date | Jun 27, 2019 |
| Publication date | Jun 13, 2023 |
| Grant date | Jun 13, 2023 |
Official abstract text for this publication.
Neural network systems are provided that comprise one or more neural networks. The first neural network can comprise a convolutional neural network (CNN) long short-term memory (LSTM) architecture that receives a primary data set comprising text messages and outputs a primary data structure comprising a text pattern-based feature. The second neural network can comprise a CNN architecture that receives secondary data sets derived from the primary data set and outputs a plurality of secondary data structures. The third neural network can combine the data structures to produce a combined data structure, and then process it to produce a categorized data structure comprising the text messages assigned to targets. The primary data set can comprise hate speech and the categorized data structure can comprise target categories, for example, hate targets. Methods of operating neural network systems and computer program products for performing such methods are also provided.
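The combining step attributed to the third network can be illustrated with a minimal sketch. It assumes, without support from the abstract, that "combining" means concatenating feature vectors and that "processing" is a single dense layer followed by a softmax over target categories. The function names `concatenate` and `dense_softmax` and all vector shapes are hypothetical.

```python
import math

def concatenate(primary, secondaries):
    """Assumed combination rule: append each secondary vector to the primary one."""
    combined = list(primary)
    for vector in secondaries:
        combined.extend(vector)
    return combined

def dense_softmax(features, weights, biases):
    """One dense layer plus softmax; returns one probability per target category.

    `weights` holds one row per category, each row as long as `features`.
    """
    logits = [
        sum(w * x for w, x in zip(row, features)) + b
        for row, b in zip(weights, biases)
    ]
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def categorize(primary, secondaries, weights, biases):
    """Combine the feature vectors, then score each hypothetical target category."""
    return dense_softmax(concatenate(primary, secondaries), weights, biases)
```

In a trained system the weights would be learned jointly with the first two networks; here they are plain inputs so that the data flow stays visible.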
Opening claim text (preview).
What is claimed is:

1. A method of operating a target identification system, the method comprising: receiving a primary data set comprising text messages; constructing a graph comprising nodes corresponding to words in the text messages and edges connecting nodes based on occurrence within a predetermined distance; identifying words biased by predetermined keywords in the graph to produce a first graph-based data set of a secondary data set; identifying words having a high load determined by a number of shortest path passes using a node corresponding to a word to produce a second graph-based data set of the secondary data set; identifying words having similarity to the predetermined keywords based on occurrence with the predetermined keywords within the predetermined distance to produce a semantic based data set of the secondary data set; and processing the primary data set and a plurality of secondary data sets, training one or more neural networks and using the one or more trained neural networks to output a categorized data structure comprising the text messages assigned to targets, wherein the plurality of secondary data sets comprises the first graph-based data set, the second graph-based data set, and the semantic based data set.

2. The method of claim 1, wherein the text messages comprise language relating to hate, an event, a product, an individual, a hobby, music, a location, an activity, a health issue, a utility issue, a safety issue, a weather phenomenon, a complaint, or an emotion, or any combination thereof.

3. The method of claim 1, wherein the categorized data structure comprises a plurality of target categories.

4. The method of claim 3, wherein the target categories comprise hate targets, events, products, individuals, hobbies, music genres, songs, locations, activities, health issues, utility issues, safety issues, weather phenomena, complaints, or emotions, or any combination thereof.

5. The method of claim 1, wherein: the text messages comprise language relating to hate, an event, a product, an individual, a hobby, music, a location, an activity, a health issue, a utility issue, a safety issue, a weather phenomenon, a complaint, or an emotion, or any combination thereof; the categorized data structure comprises a plurality of target categories; and the target categories comprise hate targets, events, products, individuals, hobbies, music genres, songs, locations, activities, health issues, utility issues, safety issues, weather phenomena, complaints, or emotions, or any combination thereof.

6. The method of claim 1, further comprising ranking the nodes based on an effect of a bias to produce the first graph-based data set.

7. The method of claim 6, wherein the bias is based on a predetermined lexicon comprising the predetermined keywords.

8. The method of claim 1, further comprising determining loads for the nodes based on the number of shortest path passes through each node.

9. The method of claim 8, further comprising weighing the loads to produce the second graph-based data set.

10. The method of claim 1, wherein the targets comprise hate targets.

11. The method of claim 10, wherein the hate targets comprise race, religion, ethnic origin, national origin, biological sex, disability, sexual orientation, or gender identity, or any combination thereof.

12. The method of claim 1, wherein the one or more neural networks comprise a convolutional neural network (CNN) having a long short-term memory (LSTM) architecture, a CNN, or a deep neural network (DNN), or any combination thereof.

13. The method of claim 1, wherein the one or more neural networks comprise at least three neural networks.

14. The method of claim 1, wherein the one or more neural networks comprise: a first neural network configured to receive the primary data set and output a primary data structure; and a second neural network configured to receive the plurality of secondary data sets and output a plurality of secondary data structures.

15. The method of claim 14, wherein the one or more neural networks further comprise a third data structure configured to: combine the primary data structure and the plurality of second data structures to produce a combined data structure; and process the combined data structure to produce the categorized data structure.

16. A target identification system comprising: a computer readable medium comprising instructions to: receive a primary data set comprising text messages, construct a graph comprising nodes corresponding to words in the text messages and edges connecting nodes based on occurrence within a predetermined distance, identify words biased by predetermined keywords in the graph to produce a first graph-based data set of a secondary data set, identify words having a high load determined by a number of shortest path passes using a node corresponding to a word to produce a second graph-based data set of the secondary data set, identify words having similarity to the predetermined keywords based on occurrence with the predetermined keywords within the predetermined distance to produce a semantic based data set of the secondary data set, and process the primary data set and a plurality of secondary data sets, training one or more neural networks and using the one or more trained neural networks to output a categorized data structure comprising the text messages assigned to targets, wherein the plurality of secondary data sets comprises the first graph-based data set, the second graph-based data set, and the semantic based data set; and a processor configured to perform the instructions.

17. The system of claim 16, wherein the one or more neural networks comprise a convolutional neural network (CNN) having a long short-term memory (LSTM) architecture, a CNN, or a deep neural network (DNN), or any combination thereof.

18. The system of claim 16, wherein the one or more neural networks comprise: a first neural network configured to receive the primary data set and output a primary data structure; and a second neural network configured to receive the plurality of secondary data sets and output a plurality of secondary data structures.

19. The system of claim 18, wherein the one or more neural networks further comprise a third data structure configured to: combine the primary data structure and the plurality of second data structures to produce a combined data structure; and process the combined data structure to produce the categorized data structure.

20. A computer program product comprising a non-transitory computer readable medium, wherein the non-transitory computer readable medium stores a computer program code for operating a neural network system, wherein the computer program code is executable by one or more processors of an application server of the system to: receive a primary data set comprising text messages; construct a graph comprising nodes corresponding to words in the text messages and edges connecting nodes based on occurrence within a predetermined distance; identify words biased by predetermined keywords in the graph to produce a first graph-based data set of a secondary data set; identify words having a high load determined by a number of shortest path passes using a node corresponding to a word to produce a second graph-based data set of the secondary data set; identify words having similarity to the predetermined keywords based on occurrence with the predetermined keywords within the predetermined distance to produce a sema
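The graph-derived secondary data sets recited in claim 1 can be sketched as follows. This is an illustrative reading, not the patented implementation: it assumes whitespace tokenization, uses a personalized PageRank as a stand-in for the keyword "bias" ranking, treats "load" as betweenness centrality (computed with Brandes' algorithm), and treats "similarity" as a raw co-occurrence count. All function names and parameters (`window`, `damping`, `iterations`) are hypothetical.

```python
from collections import defaultdict, deque

def build_cooccurrence_graph(messages, window=2):
    """Nodes are words; an edge links words at most `window` tokens apart."""
    graph = defaultdict(set)
    for message in messages:
        tokens = message.lower().split()  # assumption: whitespace tokenization
        for i, word in enumerate(tokens):
            graph[word]  # touching the key registers isolated words as nodes
            for other in tokens[i + 1 : i + 1 + window]:
                if other != word:
                    graph[word].add(other)
                    graph[other].add(word)
    return dict(graph)

def biased_rank(graph, keywords, damping=0.85, iterations=50):
    """Assumed bias ranking: PageRank whose restarts land only on keywords."""
    seeds = [n for n in graph if n in keywords]
    if not seeds:
        return {n: 0.0 for n in graph}
    restart = {n: (1.0 / len(seeds) if n in keywords else 0.0) for n in graph}
    rank = dict(restart)
    for _ in range(iterations):
        nxt = {n: (1 - damping) * restart[n] for n in graph}
        for node, neighbours in graph.items():
            if not neighbours:  # isolated node: send its mass back to the seeds
                for m in graph:
                    nxt[m] += damping * rank[node] * restart[m]
                continue
            share = damping * rank[node] / len(neighbours)
            for m in neighbours:
                nxt[m] += share
        rank = nxt
    return rank

def shortest_path_load(graph):
    """Assumed load: number of shortest paths between other pairs through a node."""
    load = {n: 0.0 for n in graph}
    for source in graph:
        order, preds = [], {n: [] for n in graph}
        sigma = {n: 0 for n in graph}   # count of shortest paths from source
        dist = {n: -1 for n in graph}   # -1 marks "not reached yet"
        sigma[source], dist[source] = 1, 0
        queue = deque([source])
        while queue:  # breadth-first search from the source
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {n: 0.0 for n in graph}
        while order:  # accumulate dependencies from the farthest node back
            w = order.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != source:
                load[w] += delta[w]
    # the graph is undirected, so every pair was counted from both ends
    return {n: value / 2 for n, value in load.items()}

def keyword_cooccurrence(messages, keywords, window=2):
    """Assumed similarity: how often a word appears within `window` of a keyword."""
    counts = defaultdict(int)
    for message in messages:
        tokens = message.lower().split()
        for i, word in enumerate(tokens):
            if word in keywords:
                continue
            nearby = tokens[max(0, i - window) : i + window + 1]
            if any(t in keywords for t in nearby):
                counts[word] += 1
    return dict(counts)
```

Under these assumptions, the top-ranked words from `biased_rank`, `shortest_path_load`, and `keyword_cooccurrence` would correspond to the first graph-based, second graph-based, and semantic based data sets, respectively. The claims do not specify how many words are kept or how the loads are weighted (claim 9), so those choices are left open here.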