Method and system for confidential sentiment analysis
US-2022122628-A1 · Apr 21, 2022 · US
US12032717B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12032717-B2 |
| Application number | US-202016832976-A |
| Country | US |
| Kind code | B2 |
| Filing date | Mar 27, 2020 |
| Priority date | Mar 27, 2020 |
| Publication date | Jul 9, 2024 |
| Grant date | Jul 9, 2024 |
Official abstract text for this publication.
One example method includes transcribing a portion of the audio component to create a transcription file that includes text, searching the text of the transcription file and identifying information in the text that may include personal information, defining a textual window that includes the information, evaluating the text in the textual window to identify personal information, and masking the personal information in the audio component of the recording. The personal information may be masked with information of a non-personal nature.
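The steps in the abstract (search the transcript, define a window around each hit, evaluate the window, mask) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the patented implementation: the pattern `CANDIDATE`, the cue-word set `CUE_WORDS`, the five-word window size, and the `[REDACTED]` placeholder are all hypothetical, and the sketch masks transcript text rather than the audio component itself.

```python
import re

# Hypothetical pattern: 13 to 16 digits, optionally separated by spaces or dashes.
CANDIDATE = re.compile(r"\b\d(?:[ -]?\d){12,15}\b")

# Hypothetical cue words; a simple stand-in for the evaluation step.
CUE_WORDS = {"card", "credit", "visa"}


def find_windows(text, pattern, n_words=5):
    """Yield (match, window) pairs, where the window holds up to n_words
    of text before and after the matching text."""
    for m in pattern.finditer(text):
        before = text[:m.start()].split()[-n_words:]
        after = text[m.end():].split()[:n_words]
        yield m, " ".join(before + [m.group()] + after)


def looks_personal(window):
    """Assumed evaluation rule: the window contains at least one cue word."""
    words = {w.strip(".,?!").lower() for w in window.split()}
    return bool(words & CUE_WORDS)


def mask_transcript(text, pattern=CANDIDATE, placeholder="[REDACTED]"):
    """Replace each match whose window looks personal with a placeholder."""
    out, last = [], 0
    for m, window in find_windows(text, pattern):
        if looks_personal(window):
            out.append(text[last:m.start()])
            out.append(placeholder)
            last = m.end()
    out.append(text[last:])
    return "".join(out)


if __name__ == "__main__":
    sample = ("my card number is 4111 1111 1111 1111 thanks. "
              "The order id is 1234567890123 ok")
    print(mask_transcript(sample))
```

Both digit runs in the sample match the pattern, but only the first has a cue word in its window, so only that one is replaced. Evaluating the window rather than the match alone is what lets the second, non-personal number pass through.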
Opening claim text (preview).
What is claimed is:
1. A method, comprising: creating a recording that includes an audio component of a client; transcribing a portion of the audio component to create a transcription file; receiving a regex; using the regex to search for a matching text between a portion of the transcription file and the regex; identifying one or more textual windows that include the matching text in the transcription file, wherein each identified textual window includes text preceding and following the matching text, and each identified textual window includes only text associated with the client; evaluating, by a trained machine learning classifier, the text in each identified textual window based on a bag of words model, in which a word in a bag of words has a relatively higher weight than other words in the bag of words based on a strength of correlation between words and personal information sought to be located; inferring, based on the evaluating of the text in each identified textual window, presence of personal information of the client in each identified textual window; and removing the personal information from any identified textual window in which presence of the personal information was inferred.
2. The method as recited in claim 1, wherein the audio component includes words spoken by a human.
3. The method as recited in claim 1, wherein the recording is an audio recording, or an audio/video recording.
4. The method as recited in claim 1, wherein the trained machine learning classifier maps words in the one or more identified textual windows as a vector of real numbers, and the vector is one of a group of vectors in a vector space.
5. The method as recited in claim 1, wherein each textual window comprises a portion of the recording that is bounded by a start time and an end time.
6. The method as recited in claim 1, wherein the method is performed on-the-fly as the recording is being created.
7. The method as recited in claim 1, wherein the personal information does not pertain to any person whose voice is in the recording.
8. The method as recited in claim 1, wherein the removed personal information is replaced with data of a non-personal nature.
9. The method as recited in claim 1, further comprising generating a set of training data and using the training data as a basis for searching the text of the transcription file.
10. The method as recited in claim 9, wherein generating the set of training data comprises: tagging data in the training data as comprising the personal information; automatically learning one or more regexes, including the regex; and training a machine learning classifier to infer presence of the personal information in the identified textual window.
11. A non-transitory storage medium having stored therein instructions that are executable by one or more hardware processors to perform operations comprising: creating a recording that includes an audio component of a client; transcribing a portion of the audio component to create a transcription file; receiving a regex; using the regex to search for a matching text between a portion of the transcription file and the regex; identifying one or more textual windows that include the matching text in the transcription file, wherein each identified textual window includes text preceding and following the matching text, and each identified textual window includes only text associated with the client; evaluating, by a trained machine learning classifier, the text in each textual window based on a bag of words model, in which a word in a bag of words has a relatively higher weight than other words in the bag of words based on a strength of correlation between words and personal information sought to be located; inferring, based on the evaluating of the text in each identified textual window, presence of personal information in each textual window; and removing the personal information of the client from any identified textual window in which presence of the personal information was inferred.
12. The non-transitory storage medium as recited in claim 11, wherein the audio component includes words spoken by a human.
13. The non-transitory storage medium as recited in claim 11, wherein the recording is an audio recording, or an audio/video recording.
14. The non-transitory storage medium as recited in claim 11, wherein the trained machine learning classifier maps words in the one or more identified textual windows as a vector of real numbers, and the vector is one of a group of vectors in a vector space.
15. The non-transitory storage medium as recited in claim 11, wherein each textual window comprises a portion of the recording that is bounded by a start time and an end time.
16. The non-transitory storage medium as recited in claim 11, wherein the operations are performed on-the-fly as the recording is being created.
17. The non-transitory storage medium as recited in claim 11, wherein the personal information does not pertain to any person whose voice is in the recording.
18. The non-transitory storage medium as recited in claim 11, wherein the removed personal information is replaced with data of a non-personal nature.
19. The non-transitory storage medium as recited in claim 11, further comprising generating a set of training data and using the training data as a basis for searching the text of the transcription file.
20. The non-transitory storage medium as recited in claim 19, wherein generating the set of training data comprises: tagging data in the training data as comprising the personal information; automatically learning one or more regexes, including the regex; and training a machine learning classifier to infer presence of the personal information in the identified textual window.
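The evaluating and inferring steps of claims 1 and 11 rely on a bag of words model in which some words carry more weight than others. A rough way to picture this is a linear score over word counts passed through a logistic function. The weights, bias, and threshold below are hypothetical, hand-picked values chosen only for illustration; the claims describe a trained classifier, whose weights would be learned from tagged training data, and they do not specify this particular model.

```python
import math
from collections import Counter

# Hypothetical weights: larger values stand for words assumed to correlate
# more strongly with the personal information being sought.
WEIGHTS = {"card": 2.0, "credit": 1.5, "expiry": 1.5, "number": 1.0, "order": -1.5}
BIAS = -1.5  # hypothetical offset so that an empty window scores below 0.5


def bag_of_words(window):
    """Count lowercase word occurrences in the window, ignoring word order."""
    return Counter(w.strip(".,?!").lower() for w in window.split())


def personal_info_probability(window):
    """Weighted sum of word counts mapped to (0, 1) by a logistic function."""
    counts = bag_of_words(window)
    score = BIAS + sum(WEIGHTS.get(word, 0.0) * n for word, n in counts.items())
    return 1.0 / (1.0 + math.exp(-score))


def infer_personal_info(window, threshold=0.5):
    """Infer presence of personal information when the probability is high enough."""
    return personal_info_probability(window) >= threshold


if __name__ == "__main__":
    for text in ("my credit card number is 4111", "your order number is 4111"):
        print(text, "->", infer_personal_info(text))
```

With these assumed weights, the first window scores 3.0 before the logistic step and is flagged, while the second scores -2.0 and is not, even though both contain the same digits and the word "number".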
Speech to text systems (G10L15/08 takes precedence) · CPC title
Training · CPC title
Probabilistic grammars, e.g. word n-grams · CPC title
Machine learning · CPC title
Semantic analysis · CPC title