US2016379665A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2016379665-A1 |
| Application number | US-201514752323-A |
| Country | US |
| Kind code | A1 |
| Filing date | Jun 26, 2015 |
| Priority date | Jun 26, 2015 |
| Publication date | Dec 29, 2016 |
| Grant date | — |
Official abstract text for this publication.
A method is provided for training a Deep Neural Network (DNN) for acoustic modeling in speech recognition. The method includes reading central frames and side frames as input frames from a memory. The side frames are preceding side frames preceding the central frames and/or succeeding side frames succeeding the central frames. The method further includes executing pre-training for only the central frames or both the central frames and the side frames and fine-tuning for the central frames and the side frames so as to emphasize connections between acoustic features in the central frames and units of the bottom layer in hidden layer of the DNN.
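As a rough illustration of the input described in the abstract, the sketch below builds a context window of a central frame plus preceding and succeeding side frames, and marks which slots of that window count as central. The function names, the edge handling (repeating the first or last frame), and the assumption that the central frames form an odd-sized block in the middle of the window are all hypothetical choices made for this example; the publication text shown here does not specify them.

```python
from typing import List, Sequence


def splice_window(features: Sequence[Sequence[float]],
                  center: int, n_side: int) -> List[List[float]]:
    """Return the frames at positions center - n_side .. center + n_side.

    Assumption: positions that fall outside the utterance are filled by
    repeating the first or last frame (index clamping).
    """
    last = len(features) - 1
    window = []
    for offset in range(-n_side, n_side + 1):
        idx = min(max(center + offset, 0), last)
        window.append(list(features[idx]))
    return window


def central_mask(n_side: int, n_central: int) -> List[int]:
    """Return 1 for window slots treated as central frames, 0 for side frames.

    Assumption: the central frames are an odd-sized block centered in the
    window of 2 * n_side + 1 slots.
    """
    width = 2 * n_side + 1
    if n_central % 2 == 0 or not 1 <= n_central <= width:
        raise ValueError("n_central must be odd and fit inside the window")
    half = n_central // 2
    return [1 if abs(slot - n_side) <= half else 0 for slot in range(width)]


if __name__ == "__main__":
    # Four one-dimensional toy "acoustic feature" frames.
    feats = [[0.0], [1.0], [2.0], [3.0]]
    print(splice_window(feats, center=0, n_side=2))
    print(central_mask(n_side=2, n_central=1))
```

With two side frames on each side, the window has five slots, and `central_mask(2, 1)` marks only the middle slot as central.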
Opening claim text (preview).
What is claimed is:
1. A method in a computer for training a Deep Neural Network for acoustic modeling in speech recognition, said method comprising: reading central frames and side frames as input frames from a memory, the side frames being preceding side frames preceding the central frames and/or succeeding side frames succeeding the central frames; and executing pre-training for only the central frames or both the central frames and the side frames and fine-tuning for the central frames and the side frames so as to emphasize connections between acoustic features in the central frames and units of the bottom layer in hidden layer of the Deep Neural Network.
2. The method of claim 1, wherein executing pre-training and fine-tuning comprises: executing the pre-training only for the central frames; executing the fine-tuning only for the central frames for at least one time; and executing the fine-tuning for both the central frames and the side frames.
3. The method of claim 2, wherein the execution of the fine-tuning is repeated, and said method further comprising: increasing the number of frames of the central frames with each repetition of the execution of the fine-tuning with the central frames.
4. The method of claim 1, wherein executing pre-training and fine-tuning comprises: executing the pre-training for both the central frames and the side frames; and executing the fine-tuning with applying regularization on the connections from the side frames so as to emphasize connections between acoustic features in the central frames and units of the bottom layer of hidden layers of the Deep Neural Network.
5. A computer program product for training a Deep Neural Network for acoustic modeling in speech recognition, the computer program product comprising a computer readable storage medium having program instructions embodied therewith, the program instructions executable by a computer to cause the computer to: read central frames and side frames as input frames from a memory, the side frames being preceding side frames preceding the central frames and/or succeeding side frames succeeding the central frames; and execute pre-training for only the central frames or both the central frames and the side frames and fine-tuning for the central frames and the side frames so as to emphasize connections between acoustic features in the central frames and units of the bottom layer in hidden layer of the Deep Neural Network.
6. The computer program product of claim 5, wherein execute pre-training and fine-tuning comprises: executing the pre-training only for the central frames; executing the fine-tuning only for the central frames for at least one time; and executing the fine-tuning for both the central frames and the side frames.
7. The computer program product of claim 6, wherein the execution of the fine-tuning is repeated, and the program instructions further to cause the computer to: increase the number of frames of the central frames with each repetition of the execution of the fine-tuning with the central frames.
8. The computer program product of claim 5, wherein execute pre-training and fine-tuning comprises: executing the pre-training for both the central frames and the side frames; and executing the fine-tuning with applying regularization on the connections from the side frames so as to emphasize connections between acoustic features in the central frames and units of the bottom layer of hidden layers of the Deep Neural Network.
9. An information processing apparatus comprises: a memory storing central frames and side frames, the side frames being preceding side frames preceding the central frames and/or succeeding side frames succeeding the central frames; and a processor comprising a pre-trainer module and a fine-tuner module, the pre trainer module and the fine-tuner module configured to: read the central frames and the side frames as input frames from the memory; and execute pre-training for only the central frames or both the central frames and the side frames and fine-tuning for the central frames and the side frames so as to emphasize connections between acoustic features in the central frames and units of the bottom layer in hidden layer of a Deep Neural Network.
10. The information processing apparatus of claim 9, wherein execute pre-training and fine-tuning comprises: executing the pre-training only for the central frames; executing the fine-tuning only for the central frames for at least one time; and executing the fine-tuning for both the central frames and the side frames.
11. The information processing apparatus of claim 10, wherein the execution of the fine-tuning is repeated, and the number of frames of the central frames is increased with each repetition of the execution of the fine-tuning with the central frames.
12. The information processing apparatus of claim 9, wherein execute pre-training and fine-tuning comprises: executing the pre-training for both the central frames and the side frames; and executing the fine-tuning with applying regularization on the connections from the side frames so as to emphasize connections between acoustic features in the central frames and units of the bottom layer of hidden layers of the Deep Neural Network.
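The two training variants in the dependent claims can be sketched roughly as follows. The first helper lists a possible sequence of central-frame counts for repeated fine-tuning that ends with the full window (claims 2 and 3); the second computes a penalty applied only to first-layer weights fed by side frames (claim 4). Everything specific here is an assumption for illustration: the function names, the step size of the schedule, the use of an L2 penalty, and the weight layout (one row per hidden unit, columns grouped by frame slot) are not taken from the claims, which only require that the number of central frames increases and that regularization is applied to the side-frame connections.

```python
from typing import List, Sequence


def is_central(slot: int, n_side: int, n_central: int) -> bool:
    """True if a window slot lies in the centered block of n_central frames."""
    return abs(slot - n_side) <= n_central // 2


def central_width_schedule(n_side: int, start: int = 1,
                           step: int = 2) -> List[int]:
    """Central-frame counts for successive fine-tuning passes.

    Assumption: the count grows by `step` each pass and the final pass
    uses the full window, i.e. both central and side frames.
    """
    full = 2 * n_side + 1
    widths = list(range(start, full, step))
    widths.append(full)
    return widths


def side_connection_penalty(weights: Sequence[Sequence[float]],
                            n_side: int, n_central: int,
                            feat_dim: int, lam: float) -> float:
    """L2 penalty on bottom-layer weights connected to side-frame inputs.

    Assumption: column c of each weight row belongs to window slot
    c // feat_dim; weights fed by central frames are left unpenalized.
    """
    total = 0.0
    for row in weights:
        for col, w in enumerate(row):
            if not is_central(col // feat_dim, n_side, n_central):
                total += w * w
    return lam * total


if __name__ == "__main__":
    print(central_width_schedule(n_side=2))
    # One hidden unit, three window slots, one feature per slot.
    print(side_connection_penalty([[1.0, 2.0, 3.0]], n_side=1,
                                  n_central=1, feat_dim=1, lam=0.5))
```

In a real training loop, a penalty of this kind would be added to the fine-tuning loss so that side-frame weights are pushed toward smaller values than central-frame weights; the strength `lam` is a hypothetical hyperparameter.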
Training · CPC title
using neural networks · CPC title
Training of HMMs · CPC title
using lexical or orthographic knowledge sources · CPC title