Teacher and student learning for constructing mixed-domain model

US11416741B2 · US · B2

Patent metadata
Publication number: US-11416741-B2
Application number: US-201816003790-A
Country: US
Kind code: B2
Filing date: Jun 8, 2018
Priority date: Jun 8, 2018
Publication date: Aug 16, 2022
Grant date: Aug 16, 2022

Abstract

Official abstract text for this publication.

A technique for constructing a model supporting a plurality of domains is disclosed. In the technique, a plurality of teacher models, each of which is specialized for different one of the plurality of the domains, is prepared. A plurality of training data collections, each of which is collected for different one of the plurality of the domains, is obtained. A plurality of soft label sets is generated by inputting each training data in the plurality of the training data collections into corresponding one of the plurality of the teacher models. A student model is trained using the plurality of the soft label sets.
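
The flow described in the abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the patented implementation: the `LinearModel` class, the domain names, the feature and class counts, and the plain gradient-descent training loop are all hypothetical stand-ins. Only the overall structure (one teacher per domain, one soft label set per domain, a single student trained on all of them) comes from the text above.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax with max-subtraction for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

class LinearModel:
    """Hypothetical stand-in for a teacher or student: one linear layer."""
    def __init__(self, n_features, n_classes, seed):
        rng = np.random.default_rng(seed)
        self.W = rng.normal(scale=0.1, size=(n_features, n_classes))

    def predict_proba(self, x):
        return softmax(x @ self.W)

def generate_soft_labels(teachers, collections):
    """Feed each domain's data only to the teacher for that domain."""
    return {d: teachers[d].predict_proba(x) for d, x in collections.items()}

def cross_entropy(model, x, q):
    """Mean cross-entropy between soft labels q and the model's outputs."""
    p = model.predict_proba(x)
    return float(-(q * np.log(p + 1e-12)).sum(axis=1).mean())

def train_student(student, collections, soft_labels, lr=0.5, epochs=200):
    """Assumed training rule: gradient descent on cross-entropy against the
    soft labels, over the pooled data of all domains."""
    x = np.concatenate([collections[d] for d in collections])
    q = np.concatenate([soft_labels[d] for d in collections])
    for _ in range(epochs):
        p = student.predict_proba(x)
        student.W -= lr * x.T @ (p - q) / len(x)
    return student

rng = np.random.default_rng(0)
domains = ["narrowband", "broadband"]  # hypothetical domain names
teachers = {d: LinearModel(8, 3, seed=i + 1) for i, d in enumerate(domains)}
collections = {d: rng.normal(size=(64, 8)) for d in domains}
soft_labels = generate_soft_labels(teachers, collections)

student = LinearModel(8, 3, seed=99)
x_all = np.concatenate([collections[d] for d in domains])
q_all = np.concatenate([soft_labels[d] for d in domains])
loss_before = cross_entropy(student, x_all, q_all)
train_student(student, collections, soft_labels)
loss_after = cross_entropy(student, x_all, q_all)
```

In this sketch the student never sees hard labels; it is fitted only to the per-domain teacher outputs, which is the point the abstract makes.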

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method for constructing a model supporting a plurality of domains, the method comprising: preparing a plurality of teacher models, each teacher model being specialized for different one of the plurality of the domains; obtaining a plurality of training data collections, each of the plurality of training data collections being collected for a different one of the plurality of the domains; inputting training data from each of the plurality of training data collections into a corresponding one of the plurality of the teacher models to generate a plurality of soft label sets; and training a student model using the plurality of the soft label sets.

2. The method of claim 1, wherein each teacher model is connected to a matched feature extractor for corresponding one of the plurality of the domains and the student model is connected to a unified feature extractor, the unified feature extractor being common at least partially to the plurality of the domains.

3. The method of claim 2, wherein the matched feature extractor of each teacher model extracts a matched feature from an input signal in the corresponding one of the plurality of the domains and the preparing of the plurality of the teacher models comprises: training each teacher model using matched features extracted by the matched feature extractor from teacher training data for the corresponding one of the plurality of the domains.

4. The method of claim 2, wherein the unified feature extractor of the student model extracts an unified feature from an input signal in any one of the plurality of the domains by unifying physical meanings of features between the plurality of the domains.

5. The method of claim 4, wherein the unified feature extractor of the student model includes a hybrid normalization parameter set used in common for the plurality of the domains.

6. The method of claim 4, wherein a first unified feature extracted for a first domain has a plurality of elements and a second unified feature extracted for a second domain has a part of elements corresponding to a part of the elements for the first domain and a remaining part of elements corresponding to a remaining part of the elements for the first domain and having a predetermined value.

7. The method of claim 2, wherein the training of the student model comprises: extracting an unified feature by the unified feature extractor from training data in each of the plurality of training data collections; and using the unified feature and a soft label associated with the unified feature as an input to the student model and privileged information, respectively.

8. The method of claim 2, wherein the plurality of the teacher models and the student model are acoustic models and the plurality of the domains has difference in sampling condition of an input speech signal.

9. The method of claim 2, wherein the plurality of the teacher models and the student model are image processing models and the plurality of the domains has difference in color mode of an input image signal.

10. The method of claim 1, wherein the student model is a neural network based model.

11. A computer system for constructing a model supporting a plurality of domains, the computer system comprising: a memory storing program instructions; a processing circuitry in communications with the memory for executing the program instructions, wherein the program instructions are configured to: prepare a plurality of teacher models, wherein each teacher model is specialized for different one of the plurality of the domains; obtain a plurality of training data collections, wherein each of the plurality of training data collections is collected for a different one of the plurality of the domains; input training data from each of the plurality of training data collections into a corresponding one of the plurality of the teacher models to generate a plurality of soft label sets; and train a student model using the plurality of the soft label sets.

12. The computer system of claim 11, wherein each teacher model is connected to a matched feature extractor for corresponding one of the plurality of the domains and the student model is connected to a unified feature extractor, the unified feature extractor being common at least partially to the plurality of the domains.

13. The computer system of claim 12, wherein the matched feature extractor of each teacher model extracts a matched feature from an input signal in the corresponding one of the plurality of the domains and the processing circuitry is further configured to: train each teacher model using matched features extracted by the matched feature extractor from teacher training data for the corresponding one of the plurality of the domains to prepare the plurality of the teacher models.

14. The computer system of claim 12, wherein the unified feature extractor of the student model extracts an unified feature from an input signal in any one of the plurality of the domains by unifying physical meanings of features between the plurality of the domains.

15. The computer system of claim 14, wherein the unified feature extractor of the student model includes a hybrid normalization parameter set used in common for the plurality of the domains.

16. The computer system of claim 12, wherein the processing circuitry is further configured to: extract an unified feature by the unified feature extractor from training data in each of the plurality of training data collections; and use the unified feature and a soft label associated with the unified feature as an input to the student model and privileged information, respectively.

17. The computer system of claim 12, wherein the plurality of the teacher models and the student model are acoustic models and the plurality of the domains has difference in sampling condition of an input speech signal.

18. A computer program product constructing a model supporting a plurality of domains, the computer program product comprising a computer readable storage medium having program instructions embodied therewith, the program instructions executable by a computer to cause the computer to perform a method comprising: preparing a plurality of teacher models, each teacher model being specialized for different one of the plurality of the domains; obtaining a plurality of training data collections, each of the plurality of training data collections being collected for a different one of the plurality of the domains; inputting training data from each of the plurality of the training data collections into a corresponding one of the plurality of the teacher models to generate a plurality of soft label sets; and training a student model using the plurality of the soft label sets.

19. The computer program product of claim 18, wherein each teacher model is connected to a matched feature extractor for corresponding one of the plurality of the domains and the student model is connected to a unified feature extractor, the unified feature extractor being common at least partially to the plurality of the domains.

20. The computer program product of claim 19, wherein the matched feature extractor of each teacher model extracts a matched feature from an input signal in the corresponding one of the plurality of the domains and the preparing of the plurality of the teacher models comprises: training each teacher model using matched features extracted by the matched feature extractor from teacher training data for the correspondi
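
Claims 5 and 6 describe a unified feature whose unused elements take a predetermined value, normalized with one parameter set shared by all domains. The sketch below illustrates one way this could look. Everything specific in it is an assumption for illustration: the dimensions `UNIFIED_DIM` and `NARROW_DIM`, the pad value of 0.0, placing the valid elements first, and estimating the shared mean and standard deviation from pooled data are not stated in the claim text.

```python
import numpy as np

UNIFIED_DIM = 40  # hypothetical size of the first (wider) domain's feature
NARROW_DIM = 24   # hypothetical number of elements the second domain fills
PAD_VALUE = 0.0   # assumed "predetermined value" for the remaining elements

def to_unified(feature, pad_value=PAD_VALUE, dim=UNIFIED_DIM):
    """Copy a domain feature into the leading elements of a fixed-size
    vector and fill the remaining elements with pad_value."""
    feature = np.asarray(feature, dtype=float)
    if feature.shape[-1] > dim:
        raise ValueError("feature is wider than the unified dimension")
    out = np.full(feature.shape[:-1] + (dim,), pad_value)
    out[..., : feature.shape[-1]] = feature
    return out

def fit_hybrid_normalizer(*unified_batches):
    """Assumed estimation: a single mean/std pair computed over the
    unified features of all domains pooled together."""
    pooled = np.concatenate(unified_batches, axis=0)
    mean = pooled.mean(axis=0)
    std = pooled.std(axis=0)
    std[std == 0.0] = 1.0  # avoid division by zero on constant elements
    return mean, std

def normalize(unified, mean, std):
    """Apply the same normalization parameters regardless of domain."""
    return (unified - mean) / std

rng = np.random.default_rng(0)
wide = to_unified(rng.normal(size=(100, UNIFIED_DIM)))
narrow = to_unified(rng.normal(size=(100, NARROW_DIM)))
mean, std = fit_hybrid_normalizer(wide, narrow)
wide_n = normalize(wide, mean, std)
narrow_n = normalize(narrow, mean, std)
```

Because both domains end up with vectors of the same length and the same normalization, a single student model can consume either one without knowing which domain the input came from.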

Assignees

IBM

Classifications

  • Incorporation of unlabelled data, e.g. multiple instance learning [MIL]

  • G06N3/088 (Primary): Non-supervised learning, e.g. competitive learning

  • Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting

  • Normalisation of the pattern dimensions

  • the classifiers operating on different input data, e.g. multi-modal recognition

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11416741B2 cover?
A technique for constructing a single model that supports a plurality of domains. A teacher model specialized for each domain generates soft labels from that domain's training data collection, and a student model is trained using the resulting soft label sets.
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06N3/088. Mapped technology areas include Physics.
When was this patent published?
Publication date Aug 16, 2022 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).