Apparatus and method for editing speech synthesis, and computer readable medium

US9020821B2 · US · B2

Patent metadata
Field               Value
Publication number  US-9020821-B2
Application number  US-201113235656-A
Country             US
Kind code           B2
Filing date         Sep 19, 2011
Priority date       Mar 17, 2011
Publication date    Apr 28, 2015
Grant date          Apr 28, 2015

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An acquisition unit analyzes a text, and acquires phonemic and prosodic information. An editing unit edits a part of the phonemic and prosodic information. A speech synthesis unit converts the phonemic and prosodic information before editing the part to a first speech waveform, and converts the phonemic and prosodic information after editing the part to a second speech waveform. A period calculation unit calculates a contrast period corresponding to the part in the first speech waveform and the second speech waveform. A speech generation unit generates an output waveform by connecting a first partial waveform and a second partial waveform. The first partial waveform contains the contrast period of the first speech waveform. The second partial waveform contains the contrast period of the second speech waveform.
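
The last step of the abstract, connecting the two partial waveforms, can be sketched as follows. This is a minimal illustration and not the patented implementation: the names `Waveform`, `Period`, `partial_waveform`, and `contrast_output` are hypothetical, waveforms are modeled as plain lists of samples, and a contrast period is assumed to be a half-open range of sample indices.

```python
from typing import List, Tuple

Waveform = List[float]    # assumption: mono samples held in a plain list
Period = Tuple[int, int]  # assumption: (start_sample, end_sample), end exclusive


def partial_waveform(wave: Waveform, period: Period) -> Waveform:
    """Cut the contrast period out of one synthesized waveform."""
    start, end = period
    if not 0 <= start <= end <= len(wave):
        raise ValueError("contrast period lies outside the waveform")
    return wave[start:end]


def contrast_output(first: Waveform, second: Waveform,
                    first_period: Period, second_period: Period) -> Waveform:
    """Connect the before-edit segment and the after-edit segment back to back."""
    return (partial_waveform(first, first_period)
            + partial_waveform(second, second_period))


# Toy data: the edit lengthens the middle of the utterance by one sample.
before = [0.0, 0.1, 0.2, 0.3, 0.4]
after = [0.0, 0.1, 0.9, 0.8, 0.7, 0.4]
print(contrast_output(before, after, (2, 4), (2, 5)))
# prints [0.2, 0.3, 0.9, 0.8, 0.7]
```

The two periods are passed separately because an edit can change durations, so the same part of the text may occupy different sample ranges in the two waveforms.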

First claim

Opening claim text (preview).

What is claimed is:
1. An apparatus for editing speech synthesis, comprising: an acquisition unit, executed by a computer using a program stored in a memory device, configured to analyze a text, and to acquire a phonemic and prosodic information to synthesize a speech corresponding to the text; a display that displays the phonemic and prosodic information; an editing unit, executed by the computer, configured to edit at least a part of the phonemic and prosodic information displayed on the display; a speech synthesis unit, executed by the computer, configured to convert the phonemic and prosodic information in which the part is not edited to a first speech waveform, and to convert the phonemic and prosodic information in which the part is edited to a second speech waveform; a period calculation unit, executed by the computer, configured to specify a partial sequence corresponding to the part not edited in the phonemic and prosodic information, and the part edited in the phonemic and prosodic information respectively, and to calculate a contrast period corresponding to the partial sequence in the first speech waveform and the second speech waveform respectively; a speech generation unit, executed by the computer, configured to generate an output waveform by connecting a first partial waveform and a second partial waveform, the first partial waveform being the contrast period of the first speech waveform, the second partial waveform being the contrast period of the second speech waveform; and a speaker that reproduces the output waveform.
2. The apparatus according to claim 1, wherein the speech generation unit inserts a silent period having a predetermined length between the first partial waveform and the second partial waveform in the output waveform.
3. The apparatus according to claim 1, wherein the acquisition unit comprises a reading/prosodic sign generation unit configured to generate a reading sign and a prosodic sign by analyzing the text, and a synthesized speech control information generation unit configured to generate a synthesized speech control information by analyzing the reading sign and the prosodic sign, and the editing unit edits at least one of the reading sign, the prosodic sign and the synthesized speech control information, or a combination thereof.
4. The apparatus according to claim 3, wherein the period calculation unit calculates the contrast period by using a duration included in the synthesized speech control information.
5. The apparatus according to claim 4, wherein the period calculation unit comprises a partial sequence editing unit configured to edit the partial sequence, and calculates the contrast period corresponding to the partial sequence edited by the partial sequence editing unit.
6. The apparatus according to claim 1, further comprising: wherein the display displays an information representing which of the first partial waveform and the second partial waveform is being outputted by the speaker.
7. A method for editing speech synthesis, comprising: analyzing, by a computer using a program stored in a memory device, a text; acquiring, by the computer, a phonemic and prosodic information to synthesize a speech corresponding to the text; displaying, by the computer, the phonemic and prosodic information via a display; editing, by the computer, at least a part of the phonemic and prosodic information displayed on the display; converting, by the computer, the phonemic and prosodic information in which the part is not edited to a first speech waveform; converting, by the computer, the phonemic and prosodic information in which the part is edited to a second speech waveform; specifying, by the computer, a partial sequence corresponding to the part not edited in the phonemic and prosodic information, and the part edited in the phonemic and prosodic information respectively; calculating, by the computer, a contrast period corresponding to the partial sequence in the first speech waveform and the second speech waveform respectively; generating, by the computer, an output waveform by connecting a first partial waveform and a second partial waveform, the first partial waveform being the contrast period of the first speech waveform, the second partial waveform being the contrast period of the second speech waveform; and reproducing, by the computer, the output waveform via a speaker.
8. A non-transitory computer readable medium for causing a computer to perform a method for editing speech synthesis, the method comprising: analyzing, by the computer using a program stored in a memory device, a text; acquiring, by the computer, a phonemic and prosodic information to synthesize a speech corresponding to the text; displaying, by the computer, the phonemic and prosodic information via a display; editing, by the computer, at least a part of the phonemic and prosodic information displayed on the display; converting, by the computer, the phonemic and prosodic information in which the part is not edited to a first speech waveform; converting, by the computer, the phonemic and prosodic information in which the part is edited to a second speech waveform; specifying, by the computer, a partial sequence corresponding to the part not edited in the phonemic and prosodic information, and the part edited in the phonemic and prosodic information respectively; calculating, by the computer, a contrast period corresponding to the partial sequence in the first speech waveform and the second speech waveform respectively; generating, by the computer, an output waveform by connecting a first partial waveform and a second partial waveform, the first partial waveform being the contrast period of the first speech waveform, the second partial waveform being the contrast period of the second speech waveform; and reproducing, by the computer, the output waveform via a speaker.
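
Claims 2 and 4 describe two details that lend themselves to a short sketch: deriving the contrast period from per-unit durations, and inserting a fixed silent period between the two partial waveforms. The code below is only an illustration under stated assumptions. The function names, the millisecond unit for durations, and the 16 kHz sample rate are hypothetical and do not come from the claims.

```python
from itertools import accumulate
from typing import List, Sequence, Tuple


def contrast_period(durations_ms: Sequence[float], first_unit: int,
                    last_unit: int, sample_rate: int = 16000) -> Tuple[int, int]:
    """Map units first_unit..last_unit (inclusive) to (start, end) sample offsets.

    Assumption: durations_ms holds one duration per phoneme-like unit, in ms.
    """
    if not 0 <= first_unit <= last_unit < len(durations_ms):
        raise ValueError("partial sequence lies outside the unit sequence")
    # Cumulative boundaries: boundaries_ms[i] is where unit i starts.
    boundaries_ms = [0.0, *accumulate(durations_ms)]
    start_ms = boundaries_ms[first_unit]
    end_ms = boundaries_ms[last_unit + 1]
    return (round(start_ms * sample_rate / 1000),
            round(end_ms * sample_rate / 1000))


def join_with_silence(first_part: Sequence[float], second_part: Sequence[float],
                      silence_ms: float, sample_rate: int = 16000) -> List[float]:
    """Connect two partial waveforms with a zero-valued gap of fixed length."""
    silence = [0.0] * round(silence_ms * sample_rate / 1000)
    return list(first_part) + silence + list(second_part)


# Units 1 and 2 were edited; the edit lengthens both of them.
print(contrast_period([100, 50, 80, 120], 1, 2))  # prints (1600, 3680)
print(contrast_period([100, 70, 95, 120], 1, 2))  # prints (1600, 4240)
```

Because the period is computed separately from each duration sequence, the before and after segments can differ in length while still covering the same partial sequence.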

Assignees

Inventors

Classifications

  • G10L13/033 · Primary

    Voice editing, e.g. manipulating the voice of the synthesiser · CPC title

  • Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9020821B2 cover?
An acquisition unit analyzes a text, and acquires phonemic and prosodic information. An editing unit edits a part of the phonemic and prosodic information. A speech synthesis unit converts the phonemic and prosodic information before editing the part to a first speech waveform, and converts the phonemic and prosodic information after editing the part to a second speech waveform. A period calcul…
Who is the assignee on this patent?
Nishiyama Osamu, Toshiba Kk
What technology area does this patent fall under?
Primary CPC classification G10L13/033. Mapped technology areas include Physics.
When was this patent published?
Publication date: Apr 28, 2015 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).