Granular neural network architecture search over low-level primitives
US-2024428071-A1 · Dec 26, 2024 · US
US2019205748A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2019205748-A1 |
| Application number | US-201815860097-A |
| Country | US |
| Kind code | A1 |
| Filing date | Jan 2, 2018 |
| Priority date | Jan 2, 2018 |
| Publication date | Jul 4, 2019 |
| Grant date | — |
Official abstract text for this publication.
A technique for generating soft labels for training is disclosed. In the method, a teacher model having a teacher side class set is prepared. A collection of class pairs for respective data units is obtained. Each class pair includes classes labelled to a corresponding data unit from among the teacher side class set and from among a student side class set that is different from the teacher side class set. A training input is fed into the teacher model to obtain a set of outputs for the teacher side class set. A set of soft labels for the student side class set is calculated from the set of the outputs by using, for each member of the student side class set, at least an output obtained for a class within a subset of the teacher side class set having relevance to the member of the student side class set, based at least in part on observations in the collection of the class pairs.
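The calculation described in the abstract can be illustrated with a minimal Python sketch. It follows the variant in claims 2 and 3 (map each student class to the teacher class most frequently observed with it, then apply softmax) and is not the patent's reference implementation. All function and variable names are hypothetical, the class labels are toy strings, and treating the teacher outputs as pre-softmax scores is an assumption.

```python
import math
from collections import Counter, defaultdict


def build_cooccurrence(class_pairs):
    """Summarize the class-pair collection: for each student class,
    count how often each teacher class was observed with it."""
    counts = defaultdict(Counter)
    for teacher_cls, student_cls in class_pairs:
        counts[student_cls][teacher_cls] += 1
    return counts


def most_frequent_mapping(counts, student_classes, relevant=None):
    """Map each student class to its most frequently co-observed teacher class.

    `relevant` (hypothetical, optional) maps a student class to the subset of
    teacher classes considered relevant to it; other classes are ignored.
    """
    mapping = {}
    for s in student_classes:
        candidates = counts.get(s, Counter())
        if relevant is not None:
            allowed = relevant.get(s, set())
            candidates = Counter(
                {t: n for t, n in candidates.items() if t in allowed}
            )
        if candidates:
            mapping[s] = candidates.most_common(1)[0][0]
    return mapping


def soft_labels(teacher_outputs, mapping, student_classes):
    """Softmax over the teacher outputs selected for each student class.

    Student classes with no mapped teacher class receive 0.0 (an assumption).
    """
    picked = {s: teacher_outputs[mapping[s]] for s in student_classes if s in mapping}
    peak = max(picked.values())  # subtract the maximum for numerical stability
    exps = {s: math.exp(v - peak) for s, v in picked.items()}
    total = sum(exps.values())
    return {s: exps.get(s, 0.0) / total for s in student_classes}


# Toy data: (teacher class, student class) labels for six data units.
class_pairs = [
    ("t0", "s0"), ("t0", "s0"), ("t1", "s0"),
    ("t2", "s1"), ("t3", "s1"), ("t3", "s1"),
]
student_classes = ["s0", "s1"]
teacher_outputs = {"t0": 2.0, "t1": 1.0, "t2": 0.5, "t3": 0.0}

counts = build_cooccurrence(class_pairs)
mapping = most_frequent_mapping(counts, student_classes)
labels = soft_labels(teacher_outputs, mapping, student_classes)
```

Here `counts` plays the role of the summarizing data structure mentioned in claim 4, and the optional `relevant` argument stands in for the "subset of the teacher side class set having relevance" to each student class. Using only the single most frequent class is the simplest choice; the abstract's wording ("at least an output") leaves room for combining several teacher outputs.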
Opening claim text (preview).
What is claimed is:
1. A computer-implemented method for generating soft labels for training, the method comprising: preparing a teacher model having a teacher side class set; obtaining a collection of class pairs for respective data units, each class pair including classes labeled to a corresponding data unit from among the teacher side class set and from among a student side class set different from the teacher side class set; feeding a training input into the teacher model to obtain a set of outputs for the teacher side class set; and calculating a set of soft labels for the student side class set from the set of the outputs by using, for each member of the student side class set, at least an output obtained for a class within a subset of the teacher side class set having relevance to the member of the student side class set, based at least in part on observations in the collection of the class pairs.
2. The method of claim 1, wherein calculating the set of soft labels for the student side class set comprises: selecting, for each member of the student side class set, a class frequently observed in the collection together with the member of the student side class set from among the subset.
3. The method of claim 2, wherein a class of the subset most frequently observed in the collection together with the member is selected and mapped to the member of the student side class set, the output for the most frequently observed class being used to calculate a soft label corresponding to the member by using softmax function.
4. The method of claim 1, wherein the method further comprises: creating a data structure summarizing, for each member of the student side class set, a distribution of observations in the collection over at least classes of the subset of the teacher side class set observed together with the member of the student side class set, the data structure being used in calculating the set of the soft labels.
5. The method of claim 1, wherein obtaining the collection of the class pairs for the respective data units comprises: preparing a trained model having a class set same as the student side class set; aligning a class to each data unit from among the student side class set by using the trained model; and aligning a class to each data unit from among the teacher side class set by using the teacher model or other model having a class set same as the teacher side class set.
6. The method of claim 1, wherein the training input is fed into the teacher model for each training data in a pool and the set of the soft labels for the student side class set is calculated for each training data in the pool.
7. The method of claim 6, wherein the method further comprises: training a student model having the student side class set by using at least a part of the soft labels calculated for each training input.
8. The method of claim 1, wherein the teacher side class set is a class set of phonetic units having N (N is a positive integer) classes, the student side class set is a class set of phonetic units having M (M is a positive integer) classes, the data unit represents a frame in a speech data and the teacher model includes an acoustic model and the student model is a neural network for an acoustic model.
9. The method of claim 8, wherein the subset of the teacher side class set for each member of the student side class set includes one or more classes having a center phoneme same as the member of the student side class set.
10. The method of claim 8, wherein the subset of the teacher side class set for each member of the student side class set includes one or more classes having a center phoneme and a sub-state same as the member of the student side class set.
11. The method of claim 8, wherein the M classes in the student side class set belong to a phoneme system of a language same as the N classes in the teacher side class set.
12. The method of claim 1, wherein the teacher side class set is an image class set having N (N is a positive integer) image classes, the student side class set is an image class set having M (M is a positive integer) image classes, the data unit represents an image data, and the teacher model includes an image recognition model.
13. The method of claim 12, wherein the subset of the teacher side class set for each member of the student side class set includes one or more classes belonging to a superclass related to the member of the student side class set.
14. A computer system for generating soft labels for training, the computer system comprising: a memory storing program instructions; a processing circuitry in communications with the memory for executing the program instructions, wherein the processing circuitry is configured to: prepare a teacher model having a teacher side class set; obtain a collection of class pairs for respective data units, wherein each class pair includes classes labelled to a corresponding data unit from among the teacher side class set and from among a student side class set different from the teacher side class set; feed a training input into the teacher model to obtain a set of outputs for the teacher side class set; and calculate a set of soft labels for the student side class set from the set of the outputs by using, for each member of the student side class set, at least an output obtained for a class within a subset of the teacher side class set having relevance to the member of the student side class set, based at least in part on observations in the collection of the class pairs.
15. The computer system of claim 14, wherein the processing circuitry is further configured to: select, for each member of the student side class set, a class frequently observed in the collection together with the member of the student side class set from among the subset to calculate the set of soft labels for the student side class set.
16. The computer system of claim 14, wherein the processing circuitry is further configured to: create a data structure summarizing, for each member of the student side class set, a distribution of observations in the collection over at least classes of the subset of the teacher side class set together with the member of the student side class set, wherein the data structure is used to calculate the set of the soft labels.
17. The computer system of claim 14, wherein the processing circuitry is further configured to: prepare a trained model having a class set same as the student side class set; align a class to each data unit from among the student side class set by using the trained model as one for each class pair; and align a class to each data unit from among the teacher side class set by using the teacher model or other model having a class set same as the teacher side class set as other for each class pair.
18. A computer program product for generating soft labels for training, the computer program product comprising a computer readable storage medium having program instructions embodied therewith, the program instructions executable by a computer to cause the computer to perform a method comprising: preparing a teacher model having a teacher side class set; obtaining a collection of class pairs for respective data units, each class pair including classes labelled to a corresponding data unit from among the teacher side class set and from among a student side class set different from the teacher side class set; feeding a training input into the teacher model to obtain a set of outputs for
Recurrent networks, e.g. Hopfield networks · CPC title
Probabilistic graphical models, e.g. probabilistic networks · CPC title
Combinations of networks · CPC title
Learning methods · CPC title
Training · CPC title