Quality estimation models for various signal characteristics

US12153648B2 · US · B2

Patent metadata
Publication number: US-12153648-B2
Application number: US-202117502680-A
Country: US
Kind code: B2
Filing date: Oct 15, 2021
Priority date: Oct 15, 2021
Publication date: Nov 26, 2024
Grant date: Nov 26, 2024

Abstract

This document relates to training and employing of quality estimation models to estimate the quality of different signal characteristics. One example includes a method or technique that can be performed on a computing device. The method or technique can include obtaining training signals exhibiting diverse impairments introduced when the training signals are captured or diverse artifacts introduced by different processing characteristics of a plurality of data enhancement models. The method or technique can also include obtaining quality labels for different signal characteristics of the training signals. The method or technique can also include training at least two different quality estimation models to estimate quality of at least two different signal characteristics based at least on the training signals and the quality labels.
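
The training flow in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the two-number feature representation, the 1-to-5 label scale, the label formulas, and the names `train_linear` and `predict` are all assumptions made for illustration, and plain linear regressors stand in for whatever quality estimation models an implementer would actually use. The sketch only shows the pattern of fitting two separate estimators to two different sets of quality labels over the same training signals.

```python
def train_linear(features, targets, epochs=1000, lr=0.5):
    """Fit a linear model (weights, bias) by batch gradient descent on squared error."""
    n, n_feat = len(features), len(features[0])
    w, b = [0.0] * n_feat, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_feat, 0.0
        for x, y in zip(features, targets):
            err = b + sum(wj * xj for wj, xj in zip(w, x)) - y
            grad_b += err / n
            for j in range(n_feat):
                grad_w[j] += err * x[j] / n
        b -= lr * grad_b
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
    return w, b


def predict(model, x):
    """Return the estimated quality score for one feature vector."""
    w, b = model
    return b + sum(wj * xj for wj, xj in zip(w, x))


# Hypothetical features per training signal: (speech_distortion, residual_noise),
# each in [0, 1]. Real systems would derive features from the audio itself.
levels = (0.0, 0.5, 1.0)
features = [(d, n) for d in levels for n in levels]

# Hypothetical quality labels on a 1-to-5 scale, one set per signal characteristic.
speech_labels = [5.0 - 4.0 * d for d, n in features]
background_labels = [5.0 - 4.0 * n for d, n in features]

# Two different estimators, each trained on a different characteristic's labels.
speech_model = train_linear(features, speech_labels)
background_model = train_linear(features, background_labels)

# A signal with undistorted speech but heavy residual noise.
clean_speech_noisy_background = (0.0, 1.0)
speech_score = predict(speech_model, clean_speech_noisy_background)
background_score = predict(background_model, clean_speech_noisy_background)
```

For the sample signal, the speech estimator should score high and the background estimator low, which is the point of keeping the characteristics separate: a single overall score would hide which aspect of the signal is degraded.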

First claim

Opening claim text (preview).

The invention claimed is:

1. A method comprising: obtaining training signals exhibiting diverse impairments introduced when the training signals are captured or diverse artifacts introduced by different processing characteristics of a plurality of data enhancement models; obtaining quality labels for different signal characteristics of the training signals, the quality labels including speech quality labels, overall quality labels, and background noise quality labels; training a first quality estimation model to estimate speech quality based at least on the speech quality labels; and training a second quality estimation model to estimate overall quality and background noise quality based at least on the overall quality labels, the background noise quality labels, and the speech quality labels.

2. The method of claim 1, the first quality estimation model and the second quality estimation model being neural networks.

3. The method of claim 2, the training signals comprising noise-reduced audio signals produced by a plurality of different noise suppression models.

4. The method of claim 1, wherein the first quality estimation model is trained without using the overall quality labels and without using the background noise quality labels.

5. The method of claim 4, wherein the first quality estimation model and the second quality estimation model comprise deep neural networks sharing multiple intermediate layers and having different output layers and different internal parameters.

6. The method of claim 1, further comprising: providing an overall quality estimation model using the first quality estimation model, the second quality estimation model, and another quality estimation model trained on other training signals exhibiting different impairments.

7. The method of claim 1, the first quality estimation model and the second quality estimation model having respective neural network structures that share one or more convolutional layers and have different output layers.

8. A system comprising: a processor; and a storage medium storing instructions which, when executed by the processor, cause the system to: access a plurality of quality estimation models that have been trained to estimate signal quality of different signal characteristics using training signals, the training signals having corresponding quality labels for the different signal characteristics including speech quality labels, overall quality labels, and background noise quality labels, the training signals exhibiting diverse impairments introduced when the training signals were captured or diverse artifacts introduced by a plurality of data enhancement models, the plurality of quality estimation models including a first quality estimation model trained to estimate speech quality based at least on the speech quality labels and a second quality estimation model trained to estimate overall quality and background noise quality based at least on the overall quality labels, the background noise quality labels, and the speech quality labels; provide an input signal to the plurality of quality estimation models; and process the input signal with the plurality of quality estimation models to obtain a plurality of synthetic quality labels, output by the plurality of quality estimation models, that characterize the different signal characteristics of the input signal.

9. The system of claim 8, wherein the input signal is produced by another data enhancement model and the instructions, when executed by the processor, cause the system to: modify the another data enhancement model based at least on the plurality of synthetic quality labels, output by the plurality of quality estimation models, that characterize the different signal characteristics of the input signal produced by the another data enhancement model.

10. The system of claim 9, wherein the another data enhancement model is configured as at least one of a noise removal model, an echo removal model, a distortion removal model, a codec, or a model for addressing quality degradation caused by room response or network loss/jitter.

11. The system of claim 8, wherein the instructions, when executed by the processor, cause the system to: obtain a plurality of input signals produced by a plurality of other data enhancement models; process the plurality of input signals using the plurality of quality estimation models; and rank the plurality of other data enhancement models based at least on the plurality of synthetic quality labels output by the plurality of quality estimation models.

12. A computer-readable storage medium storing instructions which, when executed by a computing device, cause the computing device to perform acts comprising: obtaining audio training data produced by a plurality of data enhancement models; obtaining quality labels for different characteristics of the audio training data, the quality labels including speech quality labels, overall quality labels, and background noise quality labels; and training a plurality of different quality estimation models using the audio training data to estimate quality of the different characteristics based at least on the quality labels, the plurality of different quality estimation models including a first quality estimation model trained to estimate speech quality based at least on the speech quality labels and a second quality estimation model trained to estimate overall quality and background noise quality based at least on the overall quality labels, the background noise quality labels, and the speech quality labels.

13. The computer-readable storage medium of claim 12, wherein the plurality of data enhancement models include noise suppressors configured to reduce audio noise.

14. The computer-readable storage medium of claim 13, the acts further comprising: employing the first quality estimation model to produce synthetic labels characterizing speech quality of input signals; and employing the second quality estimation model to produce synthetic labels characterizing overall quality and background noise quality of the input signals.

15. The computer-readable storage medium of claim 14, wherein the first quality estimation model is not employed to produce synthetic labels characterizing the overall quality of the input signals and the first quality estimation model is not employed to produce synthetic labels characterizing the background noise quality of the input signals.

16. The computer-readable storage medium of claim 15, wherein the second quality estimation model is not employed to produce synthetic labels characterizing the speech quality of the input signals.

17. The computer-readable storage medium of claim 16, the first quality estimation model comprising a first convolutional network having a first neural network structure with multiple convolution and pooling layers.

18. The computer-readable storage medium of claim 17, the second quality estimation model comprising a second convolutional neural network with a second neural network structure that shares the multiple convolution and pooling layers with the first neural network structure.

19. The computer-readable storage medium of claim 18, the first quality estimation model comprising an output layer that produces the synthetic labels characterizing the speech quality of the input signals.

20. The computer-readable storage medium of claim 19, the second quality estimation model comprising a first output layer that produces the synthetic labels characterizing the
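
The ranking step recited in claim 11 can be sketched as follows. This is an illustrative sketch only, under stated assumptions: the function `rank_enhancement_models`, the model names `suppressor_a` to `suppressor_c`, the `(distortion, noise)` signal representation, and the toy estimator formulas are all hypothetical stand-ins for trained quality estimation models. The claim says only that the ranking is based at least on the synthetic quality labels, so sorting by mean overall score, with speech and background scores as tie-breakers, is one possible choice and not something the claims prescribe.

```python
from statistics import mean


def rank_enhancement_models(outputs_by_model, speech_model, overall_bak_model):
    """Score each enhancement model's outputs and rank models, best first.

    speech_model(signal) returns a speech quality score.
    overall_bak_model(signal) returns (overall score, background noise score).
    """
    summary = {}
    for name, signals in outputs_by_model.items():
        speech = [speech_model(s) for s in signals]
        overall_bak = [overall_bak_model(s) for s in signals]
        summary[name] = {
            "speech": mean(speech),
            "overall": mean(o for o, _ in overall_bak),
            "background": mean(b for _, b in overall_bak),
        }
    # Assumed ranking rule: mean overall score first, then speech, then background.
    ranking = sorted(
        summary,
        key=lambda n: (summary[n]["overall"], summary[n]["speech"], summary[n]["background"]),
        reverse=True,
    )
    return ranking, summary


# Toy stand-ins for trained estimators. Each "signal" is a hypothetical
# (distortion, noise) pair in [0, 1] instead of real audio.
def toy_speech_model(signal):
    distortion, _ = signal
    return 5.0 - 4.0 * distortion


def toy_overall_bak_model(signal):
    distortion, noise = signal
    return 5.0 - 2.0 * distortion - 2.0 * noise, 5.0 - 4.0 * noise


# Outputs of three hypothetical noise suppressors on the same two inputs.
outputs_by_model = {
    "suppressor_a": [(0.1, 0.5), (0.2, 0.4)],
    "suppressor_b": [(0.3, 0.1), (0.2, 0.1)],
    "suppressor_c": [(0.6, 0.0), (0.7, 0.1)],
}

ranking, summary = rank_enhancement_models(
    outputs_by_model, toy_speech_model, toy_overall_bak_model
)
print(ranking)
```

Note how the per-characteristic scores explain the ranking: in this toy data `suppressor_c` removes the most noise but distorts speech heavily, which a background-only score would not reveal.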

Assignees

Microsoft Technology Licensing LLC

Classifications

  • using machine learning, e.g. neural networks · CPC title

  • characterised by the process organisation or structure, e.g. boosting cascade · CPC title

  • G06T7/0002 · Primary

    Inspection of images, e.g. flaw detection · CPC title

  • structured as a network, e.g. client-server architectures · CPC title

  • Video; Image sequence · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12153648B2 cover?
This document relates to training and employing of quality estimation models to estimate the quality of different signal characteristics. One example includes a method or technique that can be performed on a computing device. The method or technique can include obtaining training signals exhibiting diverse impairments introduced when the training signals are captured or diverse artifacts introd…
Who is the assignee on this patent?
Microsoft Technology Licensing LLC
What technology area does this patent fall under?
Primary CPC classification G06F18/2148. Mapped technology areas include Physics.
When was this patent published?
Publication date: Nov 26, 2024 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 5 related publications on this page (citations in our corpus or others sharing the same primary CPC).