Task-specific language sets for multilingual learning
US-2023410682-A1 · Dec 21, 2023 · US
US12579371B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12579371-B2 |
| Application number | US-202318327780-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jun 1, 2023 |
| Priority date | Jun 1, 2023 |
| Publication date | Mar 17, 2026 |
| Grant date | Mar 17, 2026 |
Official abstract text for this publication.
The present disclosure relates to systems, non-transitory computer-readable media, and methods for segmenting unstructured text into salient portions and identifying sentiments expressed in each segment. In particular, the disclosed systems utilize a segmentation machine learning model to segment unstructured text into salient portions and a sentiment identifying machine learning model to identify sentiments for each segment. Additionally, the disclosed systems determine a topic for each segment and associate it with an emotion label, a sentiment label, or a predicted action label. In one or more embodiments, based on the topic associated with the emotion label, sentiment label, or predicted action label, the disclosed systems determine and perform additional actions.
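As a rough illustration of how a topic, a sentiment label, and an emotion label could be combined into a predicted action label, the sketch below models one labeled segment and a small rule table. The sentiment and emotion values follow the sets enumerated in the claims of this publication. The class names, the `predict_action` function, and every action label are hypothetical: the source does not list action labels or say how they are chosen, and a real system would presumably use a trained model rather than hand-written rules.

```python
from dataclasses import dataclass
from enum import Enum


class Sentiment(Enum):
    # Five-level scale as enumerated in the claims.
    VERY_POSITIVE = "very positive"
    POSITIVE = "positive"
    NEUTRAL = "neutral"
    NEGATIVE = "negative"
    VERY_NEGATIVE = "very negative"


class Emotion(Enum):
    # Emotion set as enumerated in the claims.
    JOY = "joy"
    ANGER = "anger"
    TRUST = "trust"
    FEAR = "fear"
    SADNESS = "sadness"
    DISGUST = "disgust"
    SURPRISE = "surprise"
    ANTICIPATION = "anticipation"
    NO_EMOTION = "no emotion"


@dataclass(frozen=True)
class LabeledSegment:
    """One salient segment with its associated labels (hypothetical container)."""
    text: str
    topic: str
    sentiment: Sentiment
    emotion: Emotion


def predict_action(segment: LabeledSegment) -> str:
    """Map a sentiment/emotion combination to an action label.

    Assumption: these rules and action names are invented for illustration only.
    """
    negative = segment.sentiment in (Sentiment.NEGATIVE, Sentiment.VERY_NEGATIVE)
    positive = segment.sentiment in (Sentiment.POSITIVE, Sentiment.VERY_POSITIVE)
    if negative and segment.emotion is Emotion.ANGER:
        return "escalate_to_support"
    if negative:
        return "follow_up"
    if positive:
        return "send_thanks"
    return "no_action"


if __name__ == "__main__":
    seg = LabeledSegment(
        text="the checkout page kept crashing",
        topic="checkout",
        sentiment=Sentiment.VERY_NEGATIVE,
        emotion=Emotion.ANGER,
    )
    print(seg.topic, "->", predict_action(seg))
```

Keeping the topic on the same record as the labels makes it straightforward to route the resulting action (for example, a response to the respondent) by subject area.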
Opening claim text (preview).
What is claimed is:
1. A computer-implemented method comprising: receiving user feedback data comprising unstructured text; training, using unclean text, a segmentation machine learning model to generate segments of salient text in a plurality of languages; identifying, using the segmentation machine learning model, a plurality of segments from the unstructured text, wherein a given segment of the plurality of segments comprises a salient portion of the unstructured text; generating, using a sentiment identifying machine learning model, a sentiment label, and an emotion label for each segment of the plurality of segments; generating a predicted action label for a given segment based on the sentiment label and the emotion label associated with the given segment; and generating, based on a combination of the sentiment label and the emotion label associated with the predicted action label, a response for a respondent associated with the user feedback data.
2. The computer-implemented method of claim 1, wherein the user feedback data comprises digital text responses to digital survey questions.
3. The computer-implemented method of claim 1, wherein identifying, using the segmentation machine learning model, segments further comprises: associating tokens for each word in the unstructured text; and generating a token label for each token.
4. The computer-implemented method of claim 3, wherein generating the token label for each token comprises: generating a first token label identifying a start of a given segment of the plurality of segments; and generating a second token label identifying a portion of the given segment other than the start of the given segment.
5. The computer-implemented method of claim 1, further comprising: determining a topic for each segment of the plurality of segments; and associating the topic for each segment with the sentiment label for each segment.
6. The computer-implemented method of claim 5, further comprising: generating, using the sentiment identifying machine learning model, the emotion label for each segment of the plurality of segments by selecting the emotion label from a set of emotion labels comprising joy, anger, trust, fear, sadness, disgust, surprise, anticipation, and no emotion; and associating the emotion label for each segment with the topic and the sentiment label.
7. The computer-implemented method of claim 5, further comprising generating the predicted action label for the given segment based on a combination of the topic, the emotion label and the sentiment label associated with the given segment.
8. The computer-implemented method of claim 1, wherein the segmentation machine learning model comprises: a document encoder capable of processing a plurality of languages; and a sequence labeling machine learning model.
9. The computer-implemented method of claim 1, wherein generating the sentiment label comprises one of: very positive, positive, neutral, negative, or very negative.
10. A non-transitory computer-readable medium storing instructions that, when executed by at least one processor, cause a computer system to: receive user feedback data comprising unstructured text; train, using unclean text, a segmentation machine learning model to generate segments of salient text in a plurality of languages; identify, using the segmentation machine learning model, a plurality of segments from the unstructured text, wherein a given segment of the plurality of segments comprises a salient portion of the unstructured text; generate, using a sentiment identifying machine learning model, a sentiment label and an emotion label for each segment of the plurality of segments; generate a predicted action label for a given segment based on the sentiment label and the emotion label associated with the given segment; and generate, based on a combination of the sentiment label and the emotion label associated with the predicted action label, a response for a respondent associated with the user feedback data.
11. The non-transitory computer-readable medium of claim 10, further comprising instruction that, when executed by the at least one processor, cause the computer system to train the segmentation machine learning model by: accessing a training dataset comprising annotated unclean unstructured text, the annotated unclean unstructured text indicating one or more starting points for a corresponding one or more salient portions within unclean unstructured text; modifying parameters of the segmentation machine learning model based on the annotated unclean unstructured text; providing an instance of unstructured text to the segmentation machine learning model; and receiving, from the segmentation machine learning model, segments from the instance of unstructured text, wherein each segment comprises salient portions of the unstructured text.
12. The non-transitory computer-readable medium of claim 11, wherein the annotated unclean unstructured text comprises text from a plurality of languages.
13. The non-transitory computer-readable medium of claim 11, wherein the annotated unclean unstructured text comprises unstructured text comprising misspelled words, sentence fragments, misplaced punctuation, or nonsensical text.
14. The non-transitory computer-readable medium of claim 10, further comprising instructions that, when executed by the at least one processor, cause the computer system to: determine a topic for each segment of the plurality of segments; associating the topic for each segment with the sentiment label for each segment; and generating the predicted action label for the given segment based on the topic, the emotion label, and the sentiment label associated with the given segment.
15. The non-transitory computer-readable medium of claim 10, wherein generating the sentiment label comprises selecting, using the sentiment identifying machine learning model, one of very positive, positive, neutral, negative, or very negative.
16. A system comprising: at least one processor; and at least one non-transitory computer-readable storage medium storing instructions that, when executed by the at least one processor, cause the system to: receive user feedback data comprising unstructured text; train, using unclean text, a segmentation machine learning model to generate segments of salient text in a plurality of languages; identify, using the segmentation machine learning model, a plurality of segments from the unstructured text, wherein a given segment of the plurality of segments comprises a salient portion of the unstructured text; generate, using a sentiment identifying machine learning model, a sentiment label and an emotion label for each segment of the plurality of segments; generate a predicted action label for a given segment based on the sentiment label and the emotion label associated with the given segment; and generate, based on a combination of the sentiment label and the emotion label associated with the predicted action label, a response for a respondent associated with the user feedback data.
17. The system of claim 16, wherein the user feedback data comprises digital text responses to digital survey questions.
18. The system of claim 16, further comprising instructions that, when executed by the at least one processor, cause the system to identify segments by: associating tokens for each word in the unstructured text; generating a first token label identifying a start of a given segment of the plur
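Claims 3 and 4 describe assigning a token to each word and labeling it either as the start of a segment or as a later part of that segment. A minimal sketch of turning such labels back into text segments follows. The label strings `B` and `I`, the extra `O` label for words outside any salient segment, and the function name `decode_segments` are assumptions made for illustration; the claims do not name the labels or mention an outside label.

```python
from typing import List

START = "B"    # hypothetical name for the "start of a segment" token label
INSIDE = "I"   # hypothetical name for the "rest of the segment" token label
OUTSIDE = "O"  # assumption: marks words that belong to no salient segment


def decode_segments(tokens: List[str], labels: List[str]) -> List[str]:
    """Group word tokens into segments according to per-token labels."""
    if len(tokens) != len(labels):
        raise ValueError("tokens and labels must have the same length")

    segments: List[List[str]] = []
    current: List[str] = []
    for token, label in zip(tokens, labels):
        if label == START:
            # A start label closes any open segment and opens a new one.
            if current:
                segments.append(current)
            current = [token]
        elif label == INSIDE and current:
            # Continue the segment that is currently open.
            current.append(token)
        else:
            # OUTSIDE, or a stray INSIDE with no open segment: the token is
            # dropped and any open segment is closed.
            if current:
                segments.append(current)
            current = []
    if current:
        segments.append(current)
    return [" ".join(words) for words in segments]


if __name__ == "__main__":
    words = "great food but slow service".split()
    tags = ["B", "I", "O", "B", "I"]
    print(decode_segments(words, tags))
```

With only a start label and a continuation label, segment boundaries are fully determined by where the start labels fall, which is why a single pass over the tokens is enough.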
Lexical analysis, e.g. tokenisation or collocates · CPC title
Parsing · CPC title
Semantic analysis · CPC title
Discourse or dialogue representation · CPC title