What technology area does this patent fall under?

Primary CPC classification G10L15/063. Mapped technology areas include Physics.

When was this patent published?

Publication date: Jan 24, 2017 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Document transcription system training

US9552809B2 · US · B2

Patent metadata
Field	Value
Publication number	US-9552809-B2
Application number	US-201615066677-A
Country	US
Kind code	B2
Filing date	Mar 10, 2016
Priority date	Aug 20, 2004
Publication date	Jan 24, 2017
Grant date	Jan 24, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title: What the patent document calls the invention.
Abstract: A short plain-language summary of the technical disclosure.
Assignees and inventors: Who owns or filed the patent and who is credited as inventor.
Key dates: Filing, priority, publication, and grant dates set the timeline.
First independent claim: The legal scope of protection; read this for what is actually claimed.
CPC / IPC classifications: Technology tags used to group this patent with similar filings.
Citations and related patents: Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A system is provided for training an acoustic model for use in speech recognition. In particular, such a system may be used to perform training based on a spoken audio stream and a non-literal transcript of the spoken audio stream. Such a system may identify text in the non-literal transcript which represents concepts having multiple spoken forms. The system may attempt to identify the actual spoken form in the audio stream which produced the corresponding text in the non-literal transcript, and thereby produce a revised transcript which more accurately represents the spoken audio stream. The revised, and more accurate, transcript may be used to train the acoustic model, thereby producing a better acoustic model than that which would be produced using conventional techniques, which perform training based directly on the original non-literal transcript.
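To make "concepts having multiple spoken forms" concrete, the following is a minimal sketch of how a date written as M/D/YYYY in a non-literal transcript might be expanded into candidate spoken forms. It is an illustration only, not the patented implementation: the names `DATE`, `spoken_forms`, and `expand` are hypothetical, the three alternatives are examples chosen for this sketch, and a flat list of alternatives stands in for the richer grammar the patent describes.

```python
import re

# Small lookup tables for spelling out numbers (assumption: English only).
ONES = ("zero one two three four five six seven eight nine ten eleven twelve "
        "thirteen fourteen fifteen sixteen seventeen eighteen nineteen").split()
TENS = "twenty thirty forty fifty sixty seventy eighty ninety".split()
MONTHS = ("january february march april may june july august september "
          "october november december").split()
IRREGULAR = {"one": "first", "two": "second", "three": "third", "five": "fifth",
             "eight": "eighth", "nine": "ninth", "twelve": "twelfth"}

# Hypothetical "format associated with the concept": a numeric M/D/YYYY date.
DATE = re.compile(r"\b(\d{1,2})/(\d{1,2})/(\d{4})\b")


def cardinal(n):
    """Spell out an integer in the range 0..99."""
    if n < 20:
        return ONES[n]
    tens, ones = divmod(n, 10)
    return TENS[tens - 2] + (" " + ONES[ones] if ones else "")


def ordinal(n):
    """Spell out an ordinal in the range 1..99 by rewriting the last word."""
    words = cardinal(n).split()
    last = words[-1]
    if last in IRREGULAR:
        last = IRREGULAR[last]
    elif last.endswith("y"):
        last = last[:-1] + "ieth"
    else:
        last += "th"
    return " ".join(words[:-1] + [last])


def year_words(year):
    """Spell a four-digit year as two pairs, e.g. 1993 -> nineteen ninety three."""
    hi, lo = divmod(year, 100)
    if lo == 0:
        return cardinal(hi) + " hundred"
    if lo < 10:
        return cardinal(hi) + " oh " + ONES[lo]
    return cardinal(hi) + " " + cardinal(lo)


def spoken_forms(month, day, year):
    """Return a few illustrative ways a speaker might say the date."""
    m, d, y = MONTHS[month - 1], ordinal(day), year_words(year)
    return [
        f"{m} {d} {y}",
        f"the {d} of {m} {y}",
        f"{cardinal(month)} {cardinal(day)} {cardinal(year % 100)}",
    ]


def expand(text):
    """Return (matched_text, alternatives) pairs for every date found in text."""
    found = []
    for match in DATE.finditer(text):
        month, day, year = map(int, match.groups())
        if 1 <= month <= 12 and 1 <= day <= 31:
            found.append((match.group(0), spoken_forms(month, day, year)))
    return found


print(expand("Seen on 10/1/1993 for follow-up."))
```

In a fuller pipeline, the alternatives for each match would be offered to the recognizer so that the audio itself decides which form was actually spoken.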

First claim

Opening claim text (preview).

What is claimed is:

1. In a system including a first document containing at least some information in common with a spoken audio stream, a method comprising steps of:
(A) determining that text in the first document represents an instance of a concept, comprising determining that the text has a format associated with the concept;
(B) replacing the identified text with a context-free grammar specifying the plurality of spoken forms of the concept to produce a second document;
(C) generating a document-specific language model based on the second document, comprising generating at least some of the document-specific language model based on the context-free grammar;
(D) using the first language model in a speech recognition process to recognize the spoken audio stream and thereby to produce a third document.

2. The method of claim 1, further comprising:
(E) using the third document and the spoken audio stream to train an acoustic model.

3. The method of claim 2, wherein (E) comprises:
(E)(1) filtering text from the third document by reference to the second document to produce a filtered document in which text filtered from the third document is marked as unreliable; and
(E)(2) using the filtered document and the spoken audio stream to train the acoustic model.

4. A non-transitory computer-readable medium comprising computer program instructions executable by at least one computer processor to perform a method for use with a system, the system including a first document containing at least some information in common with a spoken audio stream, the method comprising:
(E) determining that text in the first document represents an instance of a concept, comprising determining that the text has a format associated with the concept;
(F) replacing the identified text with a context-free grammar specifying the plurality of spoken forms of the concept to produce a second document;
(G) generating a document-specific language model based on the second document, comprising generating at least some of the document-specific language model based on the context-free grammar;
(H) using the first language model in a speech recognition process to recognize the spoken audio stream and thereby to produce a third document.

5. The non-transitory computer-readable medium of claim 4, wherein the method further comprises:
(E) using the third document and the spoken audio stream to train an acoustic model.

6. The non-transitory computer-readable medium of claim 5, wherein (E) comprises:
(E)(1) filtering text from the third document by reference to the second document to produce a filtered document in which text filtered from the third document is marked as unreliable; and
(E)(2) using the filtered document and the spoken audio stream to train the acoustic model.
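The filtering step in claims 3 and 6 can be pictured with a small sketch. The claims do not say how the comparison is done; the version below assumes a plain word-level alignment using Python's standard `difflib.SequenceMatcher`, and assumes the reference has already been flattened to a single word sequence. The function name `mark_unreliable` and the sample sentences are hypothetical.

```python
from difflib import SequenceMatcher


def mark_unreliable(recognized, reference):
    """Pair each recognized word with a flag: True if it aligns with the reference.

    recognized: words produced by the recognizer (the "third document").
    reference:  words from the expanded transcript (the "second document"),
                assumed here to be flattened to one spoken form per concept.
    """
    matcher = SequenceMatcher(a=recognized, b=reference, autojunk=False)
    reliable = [False] * len(recognized)
    # Every word inside a matching block is supported by the reference.
    for block in matcher.get_matching_blocks():
        for i in range(block.a, block.a + block.size):
            reliable[i] = True
    return list(zip(recognized, reliable))


recognized = "the patient was seen on october first uh nineteen ninety three".split()
reference = "the patient was seen on october first nineteen ninety three".split()

for word, ok in mark_unreliable(recognized, reference):
    print(word if ok else f"[unreliable: {word}]")
```

Words flagged as unreliable would then be skipped or down-weighted when the filtered document and the audio are used for acoustic model training; how that is done is outside this sketch.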

Assignees

MModal IP LLC

Inventors

Classifications

G10L15/063 · Primary
Training · CPC title
G10L15/193
Formal grammars, e.g. finite state automata, context free grammars or word networks · CPC title
G10L15/26
Speech to text systems (G10L15/08 takes precedence) · CPC title

Patent family

Related publications grouped by family.

View patent family 35910686

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9552809B2 cover?: A system is provided for training an acoustic model for use in speech recognition. In particular, such a system may be used to perform training based on a spoken audio stream and a non-literal transcript of the spoken audio stream. Such a system may identify text in the non-literal transcript which represents concepts having multiple spoken forms. The system may attempt to identify the actual s…
Who is the assignee on this patent?: MModal IP LLC