Who is the assignee on this patent?

Suzuki Hirokazu, Shimogori Nobuhiro, Ikeda Tomoo, and 4 more

What technology area does this patent fall under?

Primary CPC classification G10L15/26. Mapped technology areas include Physics.

When was this patent published?

Publication date: May 28, 2019 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Transcription support system and transcription support method

US10304457B2 · US · B2

Patent metadata
Field	Value
Publication number	US-10304457-B2
Application number	US-201213420827-A
Country	US
Kind code	B2
Filing date	Mar 15, 2012
Priority date	Jul 26, 2011
Publication date	May 28, 2019
Grant date	May 28, 2019

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

According to one embodiment, a transcription support system supports transcription work to convert voice data to text. The system includes a first storage unit configured to store therein the voice data; a playback unit configured to play back the voice data; a second storage unit configured to store therein voice indices, each of which associates a character string obtained from a voice recognition process with voice positional information, for which the voice positional information is indicative of a temporal position in the voice data and corresponds to the character string; a text creating unit that creates the text in response to an operation input of a user; and an estimation unit configured to estimate already-transcribed voice positional information indicative of a position at which the creation of the text is completed in the voice data based on the voice indices.
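
As a rough illustration of the voice indices described in the abstract, the following Python sketch models each index entry as a recognized character string paired with its temporal position in the voice data. The names `VoiceIndexEntry`, `start_ms`, `end_ms`, and `transcribed_position_ms` are hypothetical, and the in-order matching is a simplifying assumption rather than the method disclosed in the patent.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class VoiceIndexEntry:
    text: str      # character string produced by voice recognition
    start_ms: int  # assumed field: where the string starts in the voice data
    end_ms: int    # assumed field: where the string ends in the voice data


def transcribed_position_ms(typed: List[str],
                            index: List[VoiceIndexEntry]) -> Optional[int]:
    """Estimate how far into the audio the typed text reaches.

    Walks the typed strings in order, matching each against the next
    index entry with the same text, and returns the end time of the
    last entry matched. Returns None when nothing matches.
    """
    cursor = 0
    position = None
    for word in typed:
        for j in range(cursor, len(index)):
            if index[j].text == word:
                position = index[j].end_ms
                cursor = j + 1
                break
    return position


# Illustrative data only; the timings are made up.
index = [
    VoiceIndexEntry("thank", 0, 400),
    VoiceIndexEntry("you", 400, 600),
    VoiceIndexEntry("for", 600, 800),
    VoiceIndexEntry("calling", 800, 1300),
]
print(transcribed_position_ms(["thank", "you", "for"], index))  # 800
```

In this sketch the returned position could serve as the point from which playback resumes, since everything before it has already been typed.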

First claim

Opening claim text (preview).

What is claimed is:

1. A text processing device comprising: a memory having computer executable components stored therein; and a processing circuit communicatively coupled to the memory, the processing circuit configured to generate voice indices, each of which associates a character string obtained from a voice recognition process with voice positional information, the voice positional information indicative of a temporal position in voice data and corresponding to the character string; create text in response to an operation input of a user; and when determining that a last character string of the text does not match any of the character strings included in the voice indices and further determining that any of the character strings other than the last character string of the text matches any of the character strings included in the voice indices, retrieve, from the voice indices, the voice positional information corresponding to a basing character string indicative of a character string closest to the last character string among the character strings matched with any of the character strings included in the voice indices, estimate a first playback time indicative of a time necessary to play back mismatched character strings indicative of the character strings from the character string next to the basing character string to the last character string among the character strings constituting the text, estimate already-transcribed voice positional information from the voice positional information corresponding to the basing character string and the first playback time, the already-transcribed voice positional information indicative of a temporal position at which the creation of the text is completed in the voice data, set the temporal position indicated by the estimated already-transcribed voice positional information as a playback starting position, and a playback circuit configured to play back the voice data based on the already-transcribed voice positional information at the first playback time.

2. The device according to claim 1, wherein a unit of each of the character strings constituting the created text is a morpheme.

3. The device according to claim 1, wherein the processing circuit estimates the already-transcribed voice positional information by using a predetermined phoneme duration time.

4. The device according to claim 3, wherein the processing circuit estimates the first playback time based on the predetermined phoneme duration time, and estimates voice positional information the first playback time ahead of the voice positional information corresponding to the basing character string as the already-transcribed voice positional information.

5. A text processing device comprising: a memory having computer executable components stored therein; and a processing circuit communicatively coupled to the memory, the processing circuit configured to generate voice indices, each of which associates a character string obtained from a voice recognition process with voice positional information, the voice positional information indicative of a temporal position in voice data and corresponding to the character string; create text in response to an operation input of a user until a punctuation is input; and when determining that a last character string of the text does not match any of the character strings included in the voice indices and further determining that any of the character strings other than the last character string of the text matches any of the character strings included in the voice indices, retrieve, from the voice indices, the voice positional information corresponding to a basing character string indicative of a character string closest to the last character string among the character strings matched with any of the character strings included in the voice indices, estimate a first playback time indicative of a time necessary to play back mismatched character strings indicative of the character strings from the character string next to the basing character string to the last character string among the character strings constituting the text, estimate already-transcribed voice positional information from the voice positional information corresponding to the basing character string and the first playback time, set the temporal position indicated by the estimated already-transcribed voice positional information as a playback starting position, and a playback circuit configured to play back the voice data based on the already-transcribed voice positional information at the first playback time.

6. The device according to claim 5, wherein a unit of each of the character strings constituting the created text is a morpheme.

7. A text processing method comprising: generating voice indices, each of which associates a character string obtained from a voice recognition process with voice positional information, the voice positional information indicative of a temporal position in voice data and corresponding to the character string; creating text in response to an operation input of a user; and when it is determined that a last character string of the text does not match any of character strings that are included in the voice indices and when it is further determined that any of the character strings other than the last character string of the text matches any of the character strings included in the voice indices, retrieving, from the voice indices, the voice positional information corresponding to a basing character string indicative of a character string closest to the last character string among the character strings matched with any of the character strings included in the voice indices, first estimating a first playback time indicative of a time necessary to play back mismatched character strings indicative of the character strings from the character string next to the basing character string to the last character string among the character strings constituting the text, second estimating already-transcribed voice positional information from the voice positional information corresponding to the basing character string and the first playback time, the already-transcribed voice positional information indicative of a temporal position at which the creation of the text is completed in the voice data, setting the temporal position indicated by the estimated already-transcribed voice positional information as a playback starting position, and playing back the voice data based on the already-transcribed voice positional information.

8. The method according to claim 7, wherein the creating includes creating the text in accordance with an input of the user who listens to the voice data.

9. The method according to claim 7, wherein a unit of each of the character strings constituting the created text is a morpheme.

10. The method according to claim 7, wherein the second estimating includes estimating the already-transcribed voice positional information by using a predetermined phoneme duration time.
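
The mismatch handling recited in claims 1, 3, and 4 can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented implementation: `PHONEME_MS`, `count_phonemes`, and `estimate_playback_start` are hypothetical names, the 100 ms phoneme duration is an arbitrary value, counting one phoneme per letter is a crude stand-in for a real phoneme conversion, and the voice indices are simplified to a dictionary in which each character string appears only once.

```python
from typing import Dict, List, Optional

PHONEME_MS = 100  # assumed "predetermined phoneme duration time", in ms


def count_phonemes(s: str) -> int:
    # Placeholder assumption: one phoneme per letter.
    return len(s)


def estimate_playback_start(text: List[str],
                            voice_index: Dict[str, int]) -> Optional[int]:
    """Estimate the already-transcribed position in milliseconds.

    voice_index maps a recognized character string to its temporal
    position (simplified: each string occurs once).
    """
    if not text:
        return None
    last = text[-1]
    if last in voice_index:
        # Last string matches the index: use its position directly.
        return voice_index[last]
    # Last string is mismatched: search backwards for the basing
    # string, i.e. the matched string closest to the last one.
    for i in range(len(text) - 2, -1, -1):
        if text[i] in voice_index:
            mismatched = text[i + 1:]
            # First playback time: time needed to play the mismatched strings.
            first_playback_ms = PHONEME_MS * sum(
                count_phonemes(s) for s in mismatched
            )
            # Position that is first_playback_ms ahead of the basing string.
            return voice_index[text[i]] + first_playback_ms
    return None  # nothing in the text matches the index


# Illustrative data only; the timings are made up.
voice_index = {"the": 300, "meeting": 900, "starts": 1500}
print(estimate_playback_start(["the", "meeting", "begins"], voice_index))  # 1500
```

Here "begins" is absent from the index, so "meeting" (900 ms) acts as the basing character string, and the six assumed phonemes of "begins" add 600 ms to give the estimated playback starting position.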

Classifications

G10L15/26 (Primary): Speech to text systems (G10L15/08 takes precedence)

Patent family

Related publications grouped by family.

View patent family 47597963
