Language model customization in speech recognition for speech analytics
US-10186255-B2 · Jan 22, 2019 · US
US10643604B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10643604-B2 |
| Application number | US-201816219537-A |
| Country | US |
| Kind code | B2 |
| Filing date | Dec 13, 2018 |
| Priority date | Jan 16, 2016 |
| Publication date | May 5, 2020 |
| Grant date | May 5, 2020 |
Official abstract text for this publication.
A method for generating a language model for an organization includes: receiving, by a processor, organization-specific training data; receiving, by the processor, generic training data; computing, by the processor, a plurality of similarities between the generic training data and the organization-specific training data; assigning, by the processor, a plurality of weights to the generic training data in accordance with the computed similarities; combining, by the processor, the generic training data with the organization-specific training data in accordance with the weights to generate customized training data; training, by the processor, a customized language model using the customized training data; and outputting, by the processor, the customized language model, the customized language model being configured to compute the likelihood of phrases in a medium.
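The steps in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the abstract does not specify the similarity measure or the model type, so the word-overlap `similarity` function, the per-sentence weighting, and the add-one-smoothed unigram model below are all hypothetical stand-ins.

```python
from collections import Counter
import math


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption; any tokenizer would do)."""
    return text.lower().split()


def similarity(sentence, org_vocab):
    """Assumed measure: fraction of the sentence's distinct words that also
    occur in the organization-specific data."""
    words = set(tokenize(sentence))
    if not words:
        return 0.0
    return len(words & org_vocab) / len(words)


def build_customized_counts(org_sentences, generic_sentences):
    """Combine both data sets into weighted word counts.

    Organization-specific sentences count with weight 1.0; each generic
    sentence counts with a weight equal to its similarity score.
    """
    org_vocab = {w for s in org_sentences for w in tokenize(s)}
    counts = Counter()
    for s in org_sentences:
        for w in tokenize(s):
            counts[w] += 1.0
    for s in generic_sentences:
        weight = similarity(s, org_vocab)
        for w in tokenize(s):
            counts[w] += weight
    return counts


def train_unigram_model(counts):
    """Return a function that scores a phrase with its log-likelihood under
    an add-one-smoothed unigram model built from the weighted counts."""
    total = sum(counts.values())
    vocab_size = len(counts) + 1  # one extra slot for unseen words

    def log_likelihood(phrase):
        return sum(
            math.log((counts.get(w, 0.0) + 1.0) / (total + vocab_size))
            for w in tokenize(phrase)
        )

    return log_likelihood
```

With a handful of support-desk sentences as the organization data, a generic sentence that shares vocabulary with them contributes more to the counts than an unrelated one, so in-domain phrases receive a higher log-likelihood from the resulting model.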
Opening claim text (preview).
What is claimed is:

1. A method for performing voice analytics on interactions with an organization, comprising: training a customized language model for the organization by: receiving, by a speech recognition engine, organization-specific training data and generic training data; computing, by the speech recognition engine, a plurality of similarities between the generic training data and the organization-specific training data; assigning, by the speech recognition engine, a plurality of weights to the generic training data through partitioning the generic training data into a plurality of partitions in accordance with the computed similarities wherein the computed similarities comprise a fixed set of one or more threshold similarities, associating a partition similarity with each of the partitions, the partition similarity corresponding to the average similarity of the data in the partition, and assigning a desired weight to each partition, the desired weight corresponding to the partition similarity of the partition; combining, by the speech recognition engine, the generic training data with the organization-specific training data in accordance with the weights to generate customized training data; training, by the speech recognition engine, the customized language model using the customized training data; and outputting, by the speech recognition engine, the customized language model, the customized language model being configured to compute a likelihood of phrases in a medium; receiving, by the speech recognition engine, an input speech from an interaction between a customer and an agent of the organization; and performing voice analytics on the received input speech.

2. The method of claim 1, wherein a silhouette score is used to determine a number of the plurality of partitions.

3. The method of claim 1, wherein a test set of the generic training data and the organization-specific training data empirically determine a number of the plurality of partitions.

4. The method of claim 1, wherein k-means clustering is used to determine a number of the plurality of partitions.

5. The method of claim 1, wherein the desired weight of a partition is exponentially decreasing with decreasing partition similarity.

6. The method of claim 1, wherein the training a customized language model for the organization further comprise: receiving organization-specific in-medium data; combining the organization-specific in-medium data with the generic training data and the organization-specific training data to generate the customized training data; and retraining the language model in accordance with the customized training data.

7. The method of claim 1, wherein the organization-specific training data comprise at least one of: in-medium data and out-of-medium data.

8. The method of claim 7, wherein the in-medium data comprise speech recognition transcript text and the out-of-medium data comprise non-speech text.

9. A voice analytics system comprising: a speech model training system comprising: a processor; and memory coupled to the processor and storing instructions that, when executed by the processor, cause the processor to: receive organization-specific training data and generic training data; compute a plurality of similarities between the generic training data and the organization-specific training data; assign a plurality of weights to the generic training data through partitioning the generic training data into a plurality of partitions in accordance with the computed similarities wherein the computed similarities comprise a fixed set of one or more threshold similarities, associating a partition similarity with each of the partitions, the partition similarity corresponding to the average similarity of the data in the partition, and assigning a desired weight to each partition, the desired weight corresponding to the partition similarity of the partition; combine the generic training data with the organization-specific training data in accordance with the weights to generate customized training data; train a customized language model using the customized training data; and output the customized language model, the customized language model being configured to compute the likelihood of phrases in a medium; and a speech analytics system configured to: receive an input speech from an interaction between a customer and an agent of the organization; and perform voice analytics on the received input speech.

10. The speech recognition system of claim 9, wherein a silhouette score is used to determine a number of the plurality of partitions.

11. The speech recognition system of claim 9, wherein a test set of the generic training data and the organization-specific training data empirically determine a number of the plurality of partitions.

12. The speech recognition system of claim 9, wherein k-means clustering is used to determine a number of the plurality of partitions.

13. The speech recognition system of claim 9, wherein the desired weight of a partition is exponentially decreasing with decreasing partition similarity.

14. The speech recognition system of claim 9, wherein the memory of the speech training model system further stores instructions that, when executed by the processor, cause the processor to: receive organization-specific in-medium data; combine the organization-specific in-medium data with the generic training data and the organization-specific training data to generate the customized training data; and retrain the language model in accordance with the customized training data.

15. The speech recognition system of claim 9, wherein the organization-specific training data comprise at least one of: in-medium data and out-of-medium data.

16. The speech recognition system of claim 15, wherein the in-medium data comprise speech recognition transcript text and the out-of-medium data comprise non-speech text.
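The weighting scheme recited in claims 1 and 5 (threshold-based partitions, an average similarity per partition, and a weight that falls off exponentially as that average drops) might look roughly like the sketch below. The function names, the `decay` constant, and the specific form `exp(-decay * (1 - avg))` are assumptions chosen for illustration; the claims only require that the weight decrease exponentially with decreasing partition similarity.

```python
import math
from bisect import bisect_right


def partition_by_thresholds(similarities, thresholds):
    """Group item indices into len(thresholds) + 1 partitions.

    Partition 0 holds the least similar items; the last partition holds
    the most similar ones. The thresholds are a fixed, caller-supplied set.
    """
    thresholds = sorted(thresholds)
    partitions = [[] for _ in range(len(thresholds) + 1)]
    for idx, sim in enumerate(similarities):
        partitions[bisect_right(thresholds, sim)].append(idx)
    return partitions


def partition_weights(similarities, partitions, decay=5.0):
    """Assign one weight per partition from its average similarity.

    `decay` is a hypothetical tuning constant. Empty partitions get
    weight 0.0 because they contribute no data.
    """
    weights = []
    for members in partitions:
        if not members:
            weights.append(0.0)
            continue
        avg = sum(similarities[i] for i in members) / len(members)
        # Weight shrinks exponentially as the average similarity decreases.
        weights.append(math.exp(-decay * (1.0 - avg)))
    return weights
```

Every generic item in a partition would then be combined with the organization-specific data using that partition's weight, rather than a weight computed per item.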
Training · CPC title
using context dependencies, e.g. language models · CPC title
Word spotting · CPC title
Threshold criteria for the updating · CPC title
based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO] · CPC title