Quality estimation model for packet loss concealment
US-2024127848-A1 · Apr 18, 2024 · US
US12153648B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12153648-B2 |
| Application number | US-202117502680-A |
| Country | US |
| Kind code | B2 |
| Filing date | Oct 15, 2021 |
| Priority date | Oct 15, 2021 |
| Publication date | Nov 26, 2024 |
| Grant date | Nov 26, 2024 |
Official abstract text for this publication.
This document relates to training and employing of quality estimation models to estimate the quality of different signal characteristics. One example includes a method or technique that can be performed on a computing device. The method or technique can include obtaining training signals exhibiting diverse impairments introduced when the training signals are captured or diverse artifacts introduced by different processing characteristics of a plurality of data enhancement models. The method or technique can also include obtaining quality labels for different signal characteristics of the training signals. The method or technique can also include training at least two different quality estimation models to estimate quality of at least two different signal characteristics based at least on the training signals and the quality labels.
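The training flow described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption for illustration and is not taken from the patent: the `LabeledSignal` type, the two toy features, the linear models, and the plain gradient-descent loop are hypothetical stand-ins for real audio features and neural networks. The one detail that follows the claims is the label routing: the first model is trained on speech quality labels only, while the second is trained on overall, background noise, and speech quality labels.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence


@dataclass
class LabeledSignal:
    features: List[float]      # hypothetical per-clip features: [speech distortion, noise level]
    labels: Dict[str, float]   # quality labels keyed by signal characteristic


def train_linear(signals: Sequence[LabeledSignal], targets: Sequence[str],
                 epochs: int = 2000, lr: float = 0.05) -> Dict[str, List[float]]:
    """Fit one linear output per target with plain gradient descent (toy stand-in for a network)."""
    n_inputs = len(signals[0].features) + 1          # +1 for the bias term
    weights = {t: [0.0] * n_inputs for t in targets}
    for _ in range(epochs):
        for s in signals:
            x = s.features + [1.0]
            for t in targets:
                w = weights[t]
                err = sum(wi * xi for wi, xi in zip(w, x)) - s.labels[t]
                for i, xi in enumerate(x):
                    w[i] -= lr * err * xi
    return weights


def predict(weights: Dict[str, List[float]], features: Sequence[float],
            outputs: Sequence[str]) -> Dict[str, float]:
    """Return estimates only for the requested outputs."""
    x = list(features) + [1.0]
    return {t: sum(wi * xi for wi, xi in zip(weights[t], x)) for t in outputs}


# Toy training set: the labels are invented numbers, not measured ratings.
data = [
    LabeledSignal([d, n], {"speech": 5 - 3 * d,
                           "background": 5 - 3 * n,
                           "overall": 5 - 2 * d - 1.5 * n})
    for d in (0.0, 1.0) for n in (0.0, 1.0)
]

# First model: speech quality labels only.
first_model = train_linear(data, ["speech"])
# Second model: overall, background noise, and speech quality labels.
# Assumption: the speech label is used as an auxiliary training target. With
# independent linear outputs it cannot affect the other outputs; it would
# only matter in a model whose outputs share internal layers.
second_model = train_linear(data, ["overall", "background", "speech"])

clip = [0.5, 0.5]
estimates = {**predict(first_model, clip, ["speech"]),
             **predict(second_model, clip, ["overall", "background"])}
```

In this sketch the second model's speech output is discarded at inference, so each characteristic is estimated by exactly one model.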
Opening claim text (preview).
The invention claimed is:
1. A method comprising: obtaining training signals exhibiting diverse impairments introduced when the training signals are captured or diverse artifacts introduced by different processing characteristics of a plurality of data enhancement models; obtaining quality labels for different signal characteristics of the training signals, the quality labels including speech quality labels, overall quality labels, and background noise quality labels; training a first quality estimation model to estimate speech quality based at least on the speech quality labels; and training a second quality estimation model to estimate overall quality and background noise quality based at least on the overall quality labels, the background noise quality labels, and the speech quality labels.
2. The method of claim 1, the first quality estimation model and the second quality estimation model being neural networks.
3. The method of claim 2, the training signals comprising noise-reduced audio signals produced by a plurality of different noise suppression models.
4. The method of claim 1, wherein the first quality estimation model is trained without using the overall quality labels and without using the background noise quality labels.
5. The method of claim 4, wherein the first quality estimation model and the second quality estimation model comprise deep neural networks sharing multiple intermediate layers and having different output layers and different internal parameters.
6. The method of claim 1, further comprising: providing an overall quality estimation model using the first quality estimation model, the second quality estimation model, and another quality estimation model trained on other training signals exhibiting different impairments.
7. The method of claim 1, the first quality estimation model and the second quality estimation model having respective neural network structures that share one or more convolutional layers and have different output layers.
8. A system comprising: a processor; and a storage medium storing instructions which, when executed by the processor, cause the system to: access a plurality of quality estimation models that have been trained to estimate signal quality of different signal characteristics using training signals, the training signals having corresponding quality labels for the different signal characteristics including speech quality labels, overall quality labels, and background noise quality labels, the training signals exhibiting diverse impairments introduced when the training signals were captured or diverse artifacts introduced by a plurality of data enhancement models, the plurality of quality estimation models including a first quality estimation model trained to estimate speech quality based at least on the speech quality labels and a second quality estimation model trained to estimate overall quality and background noise quality based at least on the overall quality labels, the background noise quality labels, and the speech quality labels; provide an input signal to the plurality of quality estimation models; and process the input signal with the plurality of quality estimation models to obtain a plurality of synthetic quality labels, output by the plurality of quality estimation models, that characterize the different signal characteristics of the input signal.
9. The system of claim 8, wherein the input signal is produced by another data enhancement model and the instructions, when executed by the processor, cause the system to: modify the another data enhancement model based at least on the plurality of synthetic quality labels, output by the plurality of quality estimation models, that characterize the different signal characteristics of the input signal produced by the another data enhancement model.
10. The system of claim 9, wherein the another data enhancement model is configured as at least one of a noise removal model, an echo removal model, a distortion removal model, a codec, or a model for addressing quality degradation caused by room response or network loss/jitter.
11. The system of claim 8, wherein the instructions, when executed by the processor, cause the system to: obtain a plurality of input signals produced by a plurality of other data enhancement models; process the plurality of input signals using the plurality of quality estimation models; and rank the plurality of other data enhancement models based at least on the plurality of synthetic quality labels output by the plurality of quality estimation models.
12. A computer-readable storage medium storing instructions which, when executed by a computing device, cause the computing device to perform acts comprising: obtaining audio training data produced by a plurality of data enhancement models; obtaining quality labels for different characteristics of the audio training data, the quality labels including speech quality labels, overall quality labels, and background noise quality labels; and training a plurality of different quality estimation models using the audio training data to estimate quality of the different characteristics based at least on the quality labels, the plurality of different quality estimation models including a first quality estimation model trained to estimate speech quality based at least on the speech quality labels and a second quality estimation model trained to estimate overall quality and background noise quality based at least on the overall quality labels, the background noise quality labels, and the speech quality labels.
13. The computer-readable storage medium of claim 12, wherein the plurality of data enhancement models include noise suppressors configured to reduce audio noise.
14. The computer-readable storage medium of claim 13, the acts further comprising: employing the first quality estimation model to produce synthetic labels characterizing speech quality of input signals; and employing the second quality estimation model to produce synthetic labels characterizing overall quality and background noise quality of the input signals.
15. The computer-readable storage medium of claim 14, wherein the first quality estimation model is not employed to produce synthetic labels characterizing the overall quality of the input signals and the first quality estimation model is not employed to produce synthetic labels characterizing the background noise quality of the input signals.
16. The computer-readable storage medium of claim 15, wherein the second quality estimation model is not employed to produce synthetic labels characterizing the speech quality of the input signals.
17. The computer-readable storage medium of claim 16, the first quality estimation model comprising a first convolutional network having a first neural network structure with multiple convolution and pooling layers.
18. The computer-readable storage medium of claim 17, the second quality estimation model comprising a second convolutional neural network with a second neural network structure that shares the multiple convolution and pooling layers with the first neural network structure.
19. The computer-readable storage medium of claim 18, the first quality estimation model comprising an output layer that produces the synthetic labels characterizing the speech quality of the input signals.
20. The computer-readable storage medium of claim 19, the second quality estimation model comprising a first output layer that produces the synthetic labels characterizing the
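
The ranking step recited in claim 11 can be sketched as follows. This is an illustrative sketch under stated assumptions, not the patented implementation: the estimator functions are hypothetical placeholders for trained quality estimation models, the names `synthetic_labels` and `rank_enhancers` are invented here, and ranking by the mean synthetic "overall" label is an assumed criterion, since the claim only requires that the ranking be based at least on the synthetic quality labels.

```python
from statistics import mean
from typing import Callable, Dict, List, Sequence

Signal = Sequence[float]
Estimator = Callable[[Signal], Dict[str, float]]


def first_estimator(signal: Signal) -> Dict[str, float]:
    # Hypothetical placeholder for a trained speech quality model.
    return {"speech": 5.0 - 2.0 * mean(abs(v) for v in signal)}


def second_estimator(signal: Signal) -> Dict[str, float]:
    # Hypothetical placeholder for a trained overall/background quality model.
    level = mean(abs(v) for v in signal)
    return {"overall": 5.0 - 4.0 * level, "background": 5.0 - 5.0 * level}


def synthetic_labels(signal: Signal, estimators: Sequence[Estimator]) -> Dict[str, float]:
    """Merge the labels produced by every quality estimation model for one signal."""
    labels: Dict[str, float] = {}
    for estimate in estimators:
        labels.update(estimate(signal))
    return labels


def rank_enhancers(outputs: Dict[str, List[Signal]],
                   estimators: Sequence[Estimator],
                   criterion: str = "overall") -> List[str]:
    """Order enhancement models from best to worst by their mean synthetic label."""
    scores = {
        name: mean(synthetic_labels(sig, estimators)[criterion] for sig in signals)
        for name, signals in outputs.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Toy outputs from two hypothetical noise suppressors (values are invented).
outputs = {
    "suppressor_a": [[0.1, 0.2], [0.0, 0.1]],
    "suppressor_b": [[0.5, 0.6], [0.4, 0.5]],
}
ranking = rank_enhancers(outputs, [first_estimator, second_estimator])
```

Passing a different `criterion`, such as `"speech"` or `"background"`, ranks the same enhancement models by another signal characteristic without re-running the models that produced the signals.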
using machine learning, e.g. neural networks · CPC title
characterised by the process organisation or structure, e.g. boosting cascade · CPC title
Inspection of images, e.g. flaw detection · CPC title
structured as a network, e.g. client-server architectures · CPC title
Video; Image sequence · CPC title