What technology area does this patent fall under?

Primary CPC classification G10L15/063. Mapped technology areas include Physics.

When was this patent published?

Publication date Dec 29, 2016 (A1). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Training deep neural network for acoustic modeling in speech recognition

US2016379665A1 · US · A1

Patent metadata
Field	Value
Publication number	US-2016379665-A1
Application number	US-201514752323-A
Country	US
Kind code	A1
Filing date	Jun 26, 2015
Priority date	Jun 26, 2015
Publication date	Dec 29, 2016
Grant date	—

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title: What the patent document calls the invention.
Abstract: A short plain-language summary of the technical disclosure.
Assignees and inventors: Who owns or filed the patent and who is credited as inventor.
Key dates: Filing, priority, publication, and grant dates set the timeline.
First independent claim: The legal scope of protection; read this for what is actually claimed.
CPC / IPC classifications: Technology tags used to group this patent with similar filings.
Citations and related patents: Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method is provided for training a Deep Neural Network (DNN) for acoustic modeling in speech recognition. The method includes reading central frames and side frames as input frames from a memory. The side frames are preceding side frames preceding the central frames and/or succeeding side frames succeeding the central frames. The method further includes executing pre-training for only the central frames or both the central frames and the side frames and fine-tuning for the central frames and the side frames so as to emphasize connections between acoustic features in the central frames and units of the bottom layer in hidden layer of the DNN.
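To make the central/side frame terminology concrete, here is a minimal sketch rather than the patent's implementation. It assumes each frame is a fixed-length feature vector, that an input window stacks the preceding side frames, the central frames, and the succeeding side frames in time order, and that training "only for the central frames" can be imitated by zeroing the side frames. The function names `splice_window` and `mask_side_frames`, the edge padding, and all sizes are hypothetical.

```python
import numpy as np

def splice_window(features, t, context):
    """Stack frames t-context .. t+context into one (2*context+1, dim) window.

    Indices outside the utterance are clamped to the first/last frame
    (an assumed edge-padding choice, not taken from the patent).
    """
    num_frames = features.shape[0]
    idx = np.clip(np.arange(t - context, t + context + 1), 0, num_frames - 1)
    return features[idx]

def mask_side_frames(window, num_central):
    """Zero every frame except the num_central frames in the middle of the window."""
    total = window.shape[0]
    if num_central < 1 or num_central > total or (total - num_central) % 2:
        raise ValueError("num_central must leave the same number of side frames on each side")
    side = (total - num_central) // 2
    masked = np.zeros_like(window)
    masked[side:total - side] = window[side:total - side]
    return masked

# Toy utterance: 10 frames of 3-dimensional features (values are made up).
features = np.arange(30, dtype=float).reshape(10, 3)
window = splice_window(features, t=4, context=2)         # frames 2..6
central_only = mask_side_frames(window, num_central=1)   # keep frame 4 only
```

Here `window` holds five frames, and `central_only` keeps the middle frame while the two preceding and two succeeding side frames are set to zero. Other ways of restricting training to the central frames are possible; masking is only one assumed option.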

First claim

Opening claim text (preview).

What is claimed is:
1. A method in a computer for training a Deep Neural Network for acoustic modeling in speech recognition, said method comprising: reading central frames and side frames as input frames from a memory, the side frames being preceding side frames preceding the central frames and/or succeeding side frames succeeding the central frames; and executing pre-training for only the central frames or both the central frames and the side frames and fine-tuning for the central frames and the side frames so as to emphasize connections between acoustic features in the central frames and units of the bottom layer in hidden layer of the Deep Neural Network.
2. The method of claim 1, wherein executing pre-training and fine-tuning comprises: executing the pre-training only for the central frames; executing the fine-tuning only for the central frames for at least one time; and executing the fine-tuning for both the central frames and the side frames.
3. The method of claim 2, wherein the execution of the fine-tuning is repeated, and said method further comprising: increasing the number of frames of the central frames with each repetition of the execution of the fine-tuning with the central frames.
4. The method of claim 1, wherein executing pre-training and fine-tuning comprises: executing the pre-training for both the central frames and the side frames; and executing the fine-tuning with applying regularization on the connections from the side frames so as to emphasize connections between acoustic features in the central frames and units of the bottom layer of hidden layers of the Deep Neural Network.
5. A computer program product for training a Deep Neural Network for acoustic modeling in speech recognition, the computer program product comprising a computer readable storage medium having program instructions embodied therewith, the program instructions executable by a computer to cause the computer to: read central frames and side frames as input frames from a memory, the side frames being preceding side frames preceding the central frames and/or succeeding side frames succeeding the central frames; and execute pre-training for only the central frames or both the central frames and the side frames and fine-tuning for the central frames and the side frames so as to emphasize connections between acoustic features in the central frames and units of the bottom layer in hidden layer of the Deep Neural Network.
6. The computer program product of claim 5, wherein execute pre-training and fine-tuning comprises: executing the pre-training only for the central frames; executing the fine-tuning only for the central frames for at least one time; and executing the fine-tuning for both the central frames and the side frames.
7. The computer program product of claim 6, wherein the execution of the fine-tuning is repeated, and the program instructions further to cause the computer to: increase the number of frames of the central frames with each repetition of the execution of the fine-tuning with the central frames.
8. The computer program product of claim 5, wherein execute pre-training and fine-tuning comprises: executing the pre-training for both the central frames and the side frames; and executing the fine-tuning with applying regularization on the connections from the side frames so as to emphasize connections between acoustic features in the central frames and units of the bottom layer of hidden layers of the Deep Neural Network.
9. An information processing apparatus comprises: a memory storing central frames and side frames, the side frames being preceding side frames preceding the central frames and/or succeeding side frames succeeding the central frames; and a processor comprising a pre-trainer module and a fine-tuner module, the pre-trainer module and the fine-tuner module configured to: read the central frames and the side frames as input frames from the memory; and execute pre-training for only the central frames or both the central frames and the side frames and fine-tuning for the central frames and the side frames so as to emphasize connections between acoustic features in the central frames and units of the bottom layer in hidden layer of a Deep Neural Network.
10. The information processing apparatus of claim 9, wherein execute pre-training and fine-tuning comprises: executing the pre-training only for the central frames; executing the fine-tuning only for the central frames for at least one time; and executing the fine-tuning for both the central frames and the side frames.
11. The information processing apparatus of claim 10, wherein the execution of the fine-tuning is repeated, and the number of frames of the central frames is increased with each repetition of the execution of the fine-tuning with the central frames.
12. The information processing apparatus of claim 9, wherein execute pre-training and fine-tuning comprises: executing the pre-training for both the central frames and the side frames; and executing the fine-tuning with applying regularization on the connections from the side frames so as to emphasize connections between acoustic features in the central frames and units of the bottom layer of hidden layers of the Deep Neural Network.
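The two dependent-claim variants can be illustrated with a short sketch. It is an illustration under stated assumptions, not the claimed implementation: the schedule format, the function names `central_frame_schedule` and `side_frame_penalty`, the growth of two frames per repetition, the L2 form of the penalty, and the weight layout (one contiguous block of `dim` columns per input frame) are all hypothetical choices.

```python
import numpy as np

def central_frame_schedule(total_frames, start_central=1, step=2):
    """Return (stage, num_central) pairs in the spirit of claims 2 and 3.

    Pre-train on the central frames only, fine-tune on a central region
    that grows by `step` frames per repetition, then fine-tune on the
    full window (central plus side frames).
    """
    schedule = [("pre-train", start_central)]
    num_central = start_central
    while num_central < total_frames:
        schedule.append(("fine-tune", num_central))
        num_central += step
    schedule.append(("fine-tune", total_frames))
    return schedule

def side_frame_penalty(weights, total_frames, num_central, dim, strength):
    """L2 penalty on first-layer weights fed by side frames (claim 4 idea).

    `weights` has shape (hidden_units, total_frames * dim), with the
    columns for each input frame stored contiguously (assumed layout).
    """
    side = (total_frames - num_central) // 2
    per_frame = weights.reshape(weights.shape[0], total_frames, dim)
    side_weights = np.concatenate(
        [per_frame[:, :side, :], per_frame[:, total_frames - side:, :]], axis=1
    )
    return strength * np.sum(side_weights ** 2)

schedule = central_frame_schedule(total_frames=5)
weights = np.ones((4, 5 * 3))  # 4 hidden units, 5 frames, 3 features each
penalty = side_frame_penalty(weights, total_frames=5, num_central=1, dim=3, strength=0.5)
```

In a training loop, a penalty of this kind would be added to the fine-tuning loss so that side-frame connections are discouraged relative to central-frame connections. The claims do not specify the regularizer, so the L2 form here is only one plausible choice.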

Assignees

IBM

Inventors

Kurata Gakuto

Classifications

G10L15/063 · Primary · Training
G10L25/30 · Primary · using neural networks
G10L15/144 · Training of HMMs
G10L2015/0633 · using lexical or orthographic knowledge sources

Patent family

Related publications grouped by family.

View patent family 57602699

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2016379665A1 cover?: A method is provided for training a Deep Neural Network (DNN) for acoustic modeling in speech recognition. The method includes reading central frames and side frames as input frames from a memory. The side frames are preceding side frames preceding the central frames and/or succeeding side frames succeeding the central frames. The method further includes executing pre-training for only the cent…
Who is the assignee on this patent?: IBM