Who is the assignee on this patent?

Electronics & Telecommunications Res Inst

What technology area does this patent fall under?

Primary CPC classification G10L15/30. Mapped technology areas include Physics.

When was this patent published?

Publication date Jul 6, 2017 (kind code A1). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Voice recognition terminal, voice recognition server, and voice recognition method for performing personalized voice recognition

US2017194002A1 · US · A1

Patent metadata
Field	Value
Publication number	US-2017194002-A1
Application number	US-201615193216-A
Country	US
Kind code	A1
Filing date	Jun 27, 2016
Priority date	Jan 5, 2016
Publication date	Jul 6, 2017
Grant date	—

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A voice recognition terminal, a voice recognition server, and a voice recognition method for performing personalized voice recognition. The voice recognition terminal includes a feature extraction unit for extracting feature data from an input voice signal, an acoustic score calculation unit for calculating acoustic model scores using the feature data, and a communication unit for transmitting the acoustic model scores and state information to a voice recognition server in units of one or more frames, and receiving transcription data from the voice recognition server, wherein the transcription data is recognized using a calculated path of a language network when the voice recognition server calculates the path of the language network using the acoustic model scores.
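As a rough illustration of the server-side step described in the abstract (calculating a path through a language network from acoustic model scores received frame by frame), the sketch below runs a Viterbi-style search over a toy network. The network contents, the state ids, the function name best_path, and the use of log scores are all assumptions made for illustration; the abstract does not specify the search algorithm or the data layout.

```python
import math

# Hypothetical toy "language network": each word is a left-to-right
# sequence of acoustic state ids. Not taken from the patent.
NETWORK = {
    "yes": [0, 1],
    "no": [2, 3],
}


def best_path(frame_scores, network):
    """Return (word, score) for the word whose state chain best fits the frames.

    frame_scores: list with one dict per frame, mapping state id -> log score.
    States missing from a frame are treated as impossible (-inf).
    """
    best_word, best_score = None, -math.inf
    for word, states in network.items():
        # prev[i] = best log score of any path ending in states[i] so far.
        prev = [-math.inf] * len(states)
        prev[0] = frame_scores[0].get(states[0], -math.inf)
        for frame in frame_scores[1:]:
            cur = [-math.inf] * len(states)
            for i, state in enumerate(states):
                stay = prev[i]
                advance = prev[i - 1] if i > 0 else -math.inf
                cur[i] = max(stay, advance) + frame.get(state, -math.inf)
            prev = cur
        # A complete path must end in the last state of the word.
        if prev[-1] > best_score:
            best_word, best_score = word, prev[-1]
    return best_word, best_score


# Three frames of made-up log scores, as a terminal might send them.
frames = [
    {0: -0.1, 2: -2.0},
    {0: -0.2, 1: -1.5, 2: -2.0, 3: -2.5},
    {1: -0.1, 3: -3.0},
]
word, score = best_path(frames, NETWORK)
print(word, round(score, 3))
```

In this sketch the terminal would only need to send the per-frame score dictionaries; the word-level network stays on the server, which mirrors the split between terminal and server described in the abstract.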

First claim

Opening claim text (preview).

What is claimed is:
1. A voice recognition terminal, comprising: a feature extraction unit for extracting feature data from an input voice signal; an acoustic score calculation unit for calculating acoustic model scores using the feature data; and a communication unit for transmitting the acoustic model scores and state information to a voice recognition server in units of one or more frames, and receiving transcription data from the voice recognition server, wherein the transcription data is recognized using a calculated path of a language network when the voice recognition server calculates the path of the language network using the acoustic model scores.
2. The voice recognition terminal of claim 1, further comprising a data selection unit for selecting acoustic model scores to be transmitted to the voice recognition server.
3. The voice recognition terminal of claim 2, wherein the data selection unit selects only n-best candidates from among the calculated acoustic model scores.
4. The voice recognition terminal of claim 2, wherein the data selection unit selects acoustic model scores corresponding to candidate information, received from the voice recognition server, from among the calculated acoustic model scores.
5. The voice recognition terminal of claim 2, wherein the data selection unit selects n-best state scores of a last hidden layer from among the calculated acoustic model scores.
6. The voice recognition terminal of claim 1, further comprising a storage unit for matching the extracted feature data with the transcription data received from the voice recognition server, and storing a result of matching as adaptation data.
7. The voice recognition terminal of claim 6, further comprising an acoustic model adaptation unit for performing adaptation of an acoustic model using the stored adaptation data.
8. The voice recognition terminal of claim 7, wherein the acoustic model adaptation unit performs the adaptation of the acoustic model during a time corresponding to any one of a preset time, a time during which the voice signal is not input, and a time during which communication with the voice recognition server is not performed.
9. The voice recognition terminal of claim 1, wherein the acoustic model scores are represented in a fixed point format, and the state information is represented by a binary value.
10. A voice recognition server, comprising: a reception unit for receiving, from a voice recognition terminal that extracts feature data from a voice signal and calculates acoustic model scores, both state information and the acoustic model scores that are clustered into units of one or more frames; a voice recognition unit for generating transcription data by applying the received acoustic model scores to a large-capacity language network; and a transmission unit for transmitting the transcription data, generated as a result of voice recognition, to the voice recognition terminal.
11. The voice recognition server of claim 10, wherein the reception unit receives state information, required for calculation, of scores of a higher token, from the voice recognition terminal.
12. The voice recognition server of claim 10, wherein the voice recognition unit calculates a final acoustic model score by applying n-best state scores of a last hidden layer, received from the voice recognition terminal, to a model corresponding to a final output layer, and, performs voice recognition using the calculated final acoustic model score.
13. A voice recognition method using a voice recognition terminal, comprising: extracting feature data from an input voice signal; calculating acoustic model scores using the extracted feature data; transmitting the acoustic model scores and state information to a voice recognition server in units of one or more frames; and receiving transcription data from the voice recognition server, wherein the transcription data is recognized using a calculated path of a language network when the voice recognition server calculates the path of the language network using the acoustic model scores.
14. The voice recognition method of claim 13, further comprising selecting acoustic model scores to be transmitted to the voice recognition server.
15. The voice recognition method of claim 14, wherein selecting the acoustic model scores is configured to select only n-best candidates from among the calculated acoustic model scores.
16. The voice recognition method of claim 14, wherein selecting the acoustic model scores is configured to select acoustic model scores corresponding to candidate information, received from the voice recognition server from among the calculated acoustic model scores.
17. The voice recognition method of claim 14, wherein selecting the acoustic model scores is configured to select n-best state scores of a last hidden layer, from among the calculated acoustic model scores.
18. The voice recognition method of claim 13, further comprising matching the extracted feature data with the transcription data received from the voice recognition server, and storing a result of matching as adaptation data.
19. The voice recognition method of claim 18, further comprising performing adaptation of an acoustic model using the stored, adaptation data.
20. The voice recognition method of claim 19, wherein performing the adaptation of the acoustic model is configured to perform the adaptation of the acoustic model during a time corresponding to any one of a preset time, a time during which the voice signal is not input, and a time during which communication with the voice, recognition server is not performed.
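Claims 3 and 9 mention selecting n-best candidates and representing scores in a fixed point format with binary state information. A minimal sketch of how a terminal might prepare one frame for transmission under those constraints follows. The function names, the 8 fractional bits, and the reading of "state information" as a bitmask of the selected state ids are assumptions made for illustration; the claims do not fix any of these details.

```python
def select_n_best(scores, n):
    """Keep the n highest-scoring states of one frame.

    scores: dict mapping state id -> log acoustic score.
    """
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return dict(ranked[:n])


def to_fixed_point(value, frac_bits=8):
    """Quantize a float to a signed integer with frac_bits fractional bits."""
    return int(round(value * (1 << frac_bits)))


def from_fixed_point(quantized, frac_bits=8):
    """Recover the approximate float from its fixed-point integer."""
    return quantized / (1 << frac_bits)


def state_bitmask(state_ids):
    """Encode which states are present as a single integer bitmask (assumed format)."""
    mask = 0
    for state_id in state_ids:
        mask |= 1 << state_id
    return mask


def pack_frame(scores, n, frac_bits=8):
    """Return (bitmask, fixed-point scores ordered by ascending state id)."""
    selected = select_n_best(scores, n)
    ordered_ids = sorted(selected)
    payload = [to_fixed_point(selected[s], frac_bits) for s in ordered_ids]
    return state_bitmask(ordered_ids), payload


# Made-up log scores for four states in one frame.
frame = {0: -0.5, 1: -3.25, 2: -1.0, 3: -7.0}
mask, payload = pack_frame(frame, n=2)
print(bin(mask), payload)
```

Sending only the selected scores as small integers, plus a mask saying which states they belong to, is one plausible way to reduce the data sent per frame; a real system would also need to agree on the bit width and the ordering on both sides.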

Assignees

Electronics & Telecommunications Res Inst

Inventors

Kim Dong-Hyun

Classifications

G10L15/075
supervised, i.e. under machine guidance · CPC title
G10L15/183
using context dependencies, e.g. language models · CPC title
G10L15/16
using artificial neural networks · CPC title
G10L15/02
Feature extraction for speech recognition; Selection of recognition unit · CPC title
G10L15/30 (Primary)
Distributed recognition, e.g. in client-server systems, for mobile phones or network applications · CPC title

Patent family

Related publications grouped by family.

View patent family 59235774

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2017194002A1 cover?: A voice recognition terminal, a voice recognition server, and a voice recognition method for performing personalized voice recognition. The voice recognition terminal includes a feature extraction unit for extracting feature data from an input voice signal, an acoustic score calculation unit for calculating acoustic model scores using the feature data, and a communication unit for transmitting …