Using machine learning to generate segments from unstructured text and identify sentiments for each segment

US12579371B2 · US · B2

Patent metadata
Publication number: US-12579371-B2
Application number: US-202318327780-A
Country: US
Kind code: B2
Filing date: Jun 1, 2023
Priority date: Jun 1, 2023
Publication date: Mar 17, 2026
Grant date: Mar 17, 2026

Abstract

Official abstract text for this publication.

The present disclosure relates to systems, non-transitory computer-readable media, and methods for segmenting unstructured text into salient portions and identifying sentiments expressed in each segment. In particular, the disclosed systems utilize a segmentation machine learning model to segment unstructured text into salient portions and a sentiment identifying machine learning model to identify sentiments for each segment. Additionally, the disclosed systems determine a topic for each segment and associate it with an emotion label, a sentiment label, or a predicted action label. In one or more embodiments, based on the topic associated with the emotion label, sentiment label, or predicted action label, the disclosed systems determine and perform additional actions.
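As a rough illustration of the data the abstract describes, the sketch below models one labeled segment and a rule that derives a predicted action label from its sentiment and emotion. The sentiment and emotion values are the ones listed in the claims; everything else (the class names, the `predict_action` function, the action names, and the rule table) is a hypothetical assumption made for illustration, since this page does not show how the patent maps labels to actions.

```python
from dataclasses import dataclass
from enum import Enum


class Sentiment(Enum):
    # The five sentiment labels listed in the claims.
    VERY_POSITIVE = "very positive"
    POSITIVE = "positive"
    NEUTRAL = "neutral"
    NEGATIVE = "negative"
    VERY_NEGATIVE = "very negative"


class Emotion(Enum):
    # The nine emotion labels listed in the claims.
    JOY = "joy"
    ANGER = "anger"
    TRUST = "trust"
    FEAR = "fear"
    SADNESS = "sadness"
    DISGUST = "disgust"
    SURPRISE = "surprise"
    ANTICIPATION = "anticipation"
    NO_EMOTION = "no emotion"


@dataclass(frozen=True)
class LabeledSegment:
    """One salient segment with its topic and labels (hypothetical structure)."""
    text: str
    topic: str
    sentiment: Sentiment
    emotion: Emotion


def predict_action(segment: LabeledSegment) -> str:
    """Map a sentiment/emotion combination to an action label.

    The action names and rules here are invented for illustration only.
    """
    negative = segment.sentiment in {Sentiment.NEGATIVE, Sentiment.VERY_NEGATIVE}
    positive = segment.sentiment in {Sentiment.POSITIVE, Sentiment.VERY_POSITIVE}
    if negative and segment.emotion in {Emotion.ANGER, Emotion.DISGUST}:
        return "escalate"
    if negative:
        return "follow_up"
    if positive:
        return "thank"
    return "no_action"


segment = LabeledSegment(
    text="the app keeps crashing",
    topic="app stability",
    sentiment=Sentiment.VERY_NEGATIVE,
    emotion=Emotion.ANGER,
)
print(predict_action(segment))
```

In a real system the labels would come from the sentiment identifying model rather than being set by hand, and the predicted action could then drive the response generated for the respondent.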

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method comprising: receiving user feedback data comprising unstructured text; training, using unclean text, a segmentation machine learning model to generate segments of salient text in a plurality of languages; identifying, using the segmentation machine learning model, a plurality of segments from the unstructured text, wherein a given segment of the plurality of segments comprises a salient portion of the unstructured text; generating, using a sentiment identifying machine learning model, a sentiment label, and an emotion label for each segment of the plurality of segments; generating a predicted action label for a given segment based on the sentiment label and the emotion label associated with the given segment; and generating, based on a combination of the sentiment label and the emotion label associated with the predicted action label, a response for a respondent associated with the user feedback data.

2. The computer-implemented method of claim 1, wherein the user feedback data comprises digital text responses to digital survey questions.

3. The computer-implemented method of claim 1, wherein identifying, using the segmentation machine learning model, segments further comprises: associating tokens for each word in the unstructured text; and generating a token label for each token.

4. The computer-implemented method of claim 3, wherein generating the token label for each token comprises: generating a first token label identifying a start of a given segment of the plurality of segments; and generating a second token label identifying a portion of the given segment other than the start of the given segment.

5. The computer-implemented method of claim 1, further comprising: determining a topic for each segment of the plurality of segments; and associating the topic for each segment with the sentiment label for each segment.

6. The computer-implemented method of claim 5, further comprising: generating, using the sentiment identifying machine learning model, the emotion label for each segment of the plurality of segments by selecting the emotion label from a set of emotion labels comprising joy, anger, trust, fear, sadness, disgust, surprise, anticipation, and no emotion; and associating the emotion label for each segment with the topic and the sentiment label.

7. The computer-implemented method of claim 5, further comprising generating the predicted action label for the given segment based on a combination of the topic, the emotion label and the sentiment label associated with the given segment.

8. The computer-implemented method of claim 1, wherein the segmentation machine learning model comprises: a document encoder capable of processing a plurality of languages; and a sequence labeling machine learning model.

9. The computer-implemented method of claim 1, wherein generating the sentiment label comprises one of: very positive, positive, neutral, negative, or very negative.

10. A non-transitory computer-readable medium storing instructions that, when executed by at least one processor, cause a computer system to: receive user feedback data comprising unstructured text; train, using unclean text, a segmentation machine learning model to generate segments of salient text in a plurality of languages; identify, using the segmentation machine learning model, a plurality of segments from the unstructured text, wherein a given segment of the plurality of segments comprises a salient portion of the unstructured text; generate, using a sentiment identifying machine learning model, a sentiment label and an emotion label for each segment of the plurality of segments; generate a predicted action label for a given segment based on the sentiment label and the emotion label associated with the given segment; and generate, based on a combination of the sentiment label and the emotion label associated with the predicted action label, a response for a respondent associated with the user feedback data.

11. The non-transitory computer-readable medium of claim 10, further comprising instructions that, when executed by the at least one processor, cause the computer system to train the segmentation machine learning model by: accessing a training dataset comprising annotated unclean unstructured text, the annotated unclean unstructured text indicating one or more starting points for a corresponding one or more salient portions within unclean unstructured text; modifying parameters of the segmentation machine learning model based on the annotated unclean unstructured text; providing an instance of unstructured text to the segmentation machine learning model; and receiving, from the segmentation machine learning model, segments from the instance of unstructured text, wherein each segment comprises salient portions of the unstructured text.

12. The non-transitory computer-readable medium of claim 11, wherein the annotated unclean unstructured text comprises text from a plurality of languages.

13. The non-transitory computer-readable medium of claim 11, wherein the annotated unclean unstructured text comprises unstructured text comprising misspelled words, sentence fragments, misplaced punctuation, or nonsensical text.

14. The non-transitory computer-readable medium of claim 10, further comprising instructions that, when executed by the at least one processor, cause the computer system to: determine a topic for each segment of the plurality of segments; associating the topic for each segment with the sentiment label for each segment; and generating the predicted action label for the given segment based on the topic, the emotion label, and the sentiment label associated with the given segment.

15. The non-transitory computer-readable medium of claim 10, wherein generating the sentiment label comprises selecting, using the sentiment identifying machine learning model, one of very positive, positive, neutral, negative, or very negative.

16. A system comprising: at least one processor; and at least one non-transitory computer-readable storage medium storing instructions that, when executed by the at least one processor, cause the system to: receive user feedback data comprising unstructured text; train, using unclean text, a segmentation machine learning model to generate segments of salient text in a plurality of languages; identify, using the segmentation machine learning model, a plurality of segments from the unstructured text, wherein a given segment of the plurality of segments comprises a salient portion of the unstructured text; generate, using a sentiment identifying machine learning model, a sentiment label and an emotion label for each segment of the plurality of segments; generate a predicted action label for a given segment based on the sentiment label and the emotion label associated with the given segment; and generate, based on a combination of the sentiment label and the emotion label associated with the predicted action label, a response for a respondent associated with the user feedback data.

17. The system of claim 16, wherein the user feedback data comprises digital text responses to digital survey questions.

18. The system of claim 16, further comprising instructions that, when executed by the at least one processor, cause the system to identify segments by: associating tokens for each word in the unstructured text; generating a first token label identifying a start of a given segment of the plur…
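The token labeling in claims 3 and 4, and the start-point annotations in claim 11, can be pictured with a small sketch. It assumes whitespace tokenization and uses the hypothetical label names "B" (start of a segment) and "I" (rest of the segment); the claims only say that a first and a second token label exist, so the names, the helper functions, and the example sentence are all illustrative assumptions, and the labels are written by hand rather than produced by a trained model.

```python
START = "B"   # assumed name for the token label marking the start of a segment
INSIDE = "I"  # assumed name for the token label marking the rest of a segment


def labels_from_starts(num_tokens: int, start_indices: list[int]) -> list[str]:
    """Turn annotated segment start positions into one label per token."""
    labels = [INSIDE] * num_tokens
    for index in start_indices:
        labels[index] = START
    return labels


def decode_segments(tokens: list[str], labels: list[str]) -> list[str]:
    """Group tokens into segments, opening a new segment at each START label."""
    if len(tokens) != len(labels):
        raise ValueError("tokens and labels must have the same length")
    segments: list[str] = []
    current: list[str] = []
    for token, label in zip(tokens, labels):
        if label == START:
            if current:
                segments.append(" ".join(current))
            current = [token]
        elif label == INSIDE:
            current.append(token)
        else:
            raise ValueError(f"unknown token label: {label!r}")
    if current:
        segments.append(" ".join(current))
    return segments


# Unclean feedback text with a misspelling, split on whitespace (an assumption).
tokens = "checkout was fast but shiping took 2 weeks".split()
# Hand-written annotation: segments start at token 0 and token 3.
labels = labels_from_starts(len(tokens), [0, 3])
print(labels)
print(decode_segments(tokens, labels))
```

A sequence labeling model such as the one named in claim 8 would predict the label list instead of receiving it from an annotation; the decoding step stays the same either way.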

Assignees

Qualtrics LLC

Inventors

Classifications

  • Lexical analysis, e.g. tokenisation or collocates · CPC title

  • Parsing · CPC title

  • Semantic analysis · CPC title

  • G06F40/35 · Primary

    Discourse or dialogue representation · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12579371B2 cover?
The present disclosure relates to systems, non-transitory computer-readable media, and methods for segmenting unstructured text into salient portions and identifying sentiments expressed in each segment. In particular, the disclosed systems utilize a segmentation machine learning model to segment unstructured text into salient portions and a sentiment identifying machine learning model to ident…
Who is the assignee on this patent?
Qualtrics LLC
What technology area does this patent fall under?
Primary CPC classification G06F40/35. Mapped technology areas include Physics.
When was this patent published?
Publication date Mar 17, 2026 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).