Who is the assignee on this patent?

Chun Byung Ha, Gales Mark John Francis, Toshiba Kk

What technology area does this patent fall under?

Primary CPC classification G10L21/003. Mapped technology areas include Physics.

When was this patent published?

Publication date Jan 6, 2015 (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Voice conversion method and system

Patent metadata
Field	Value
Publication number	US-8930183-B2
Application number	US-201113217628-A
Country	US
Kind code	B2
Filing date	Aug 25, 2011
Priority date	Mar 29, 2011
Publication date	Jan 6, 2015
Grant date	Jan 6, 2015

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method of converting speech from the characteristics of a first voice to the characteristics of a second voice, the method comprising: receiving a speech input from a first voice, dividing said speech input into a plurality of frames; mapping the speech from the first voice to a second voice; and outputting the speech in the second voice, wherein mapping the speech from the first voice to the second voice comprises, deriving kernels demonstrating the similarity between speech features derived from the frames of the speech input from the first voice and stored frames of training data for said first voice, the training data corresponding to different text to that of the speech input and wherein the mapping step uses a plurality of kernels derived for each frame of input speech with a plurality of stored frames of training data of the first voice.
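The first two steps in the abstract (dividing the input into frames, then deriving kernels that measure similarity between input frames and stored training frames) can be sketched as follows. This is only an illustration under stated assumptions: the frame length, hop size, squared-exponential kernel, and the use of raw samples as "speech features" are all hypothetical choices, not taken from the patent, and the random signals stand in for real recordings.

```python
import numpy as np

def split_into_frames(signal, frame_len, hop):
    """Cut a 1-D signal into overlapping frames (one frame per row)."""
    signal = np.asarray(signal, dtype=float)
    n_frames = 1 + (len(signal) - frame_len) // hop
    return np.stack([signal[i * hop : i * hop + frame_len] for i in range(n_frames)])

def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential similarity between every row of a and every row of b.

    Assumed kernel for illustration; the abstract does not name one.
    """
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq_dist / length_scale**2)

# Placeholder signals: stored training data and a new input for the first voice.
rng = np.random.default_rng(0)
train_frames = split_into_frames(rng.standard_normal(400), frame_len=40, hop=20)
input_frames = split_into_frames(rng.standard_normal(200), frame_len=40, hop=20)

# One row per input frame, one column per stored training frame,
# i.e. a plurality of kernels for each frame of input speech.
kernels = rbf_kernel(input_frames, train_frames, length_scale=8.0)
print(kernels.shape)
```

Each row of `kernels` holds the similarities of one input frame to all stored training frames. A real system would compute these on spectral features rather than raw samples.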

First claim

Opening claim text (preview).

The invention claimed is:

1. A method of converting speech from the characteristics of a first voice to the characteristics of a second voice, the method comprising: receiving a speech input from a first voice, dividing said speech input into a plurality of frames; in a processor, mapping the speech from the first voice to a second voice using a Gaussian process; and outputting the speech in the second voice, wherein mapping the speech from the first voice to the second voice comprises, deriving kernels demonstrating the similarity between speech features derived from the frames of the speech input from the first voice and stored frames of training data for said first voice, the training data corresponding to different text to that of the speech input and wherein the mapping step uses a plurality of kernels derived for each frame of input speech with a plurality of stored frames of training data of the first voice and using said plurality of kernels to define a non-parametric Gaussian process prior for said mapping.

2. A method according to claim 1, wherein kernels are derived for both static and dynamic speech features.

3. A method according to claim 1, wherein the speech to be output is determined according to a Gaussian Process predictive distribution:

$p(y_t \mid x_t, x^*, y^*, \mathcal{M}) = \mathcal{N}(\mu(x_t), \Sigma(x_t))$,

where $y_t$ is the speech vector for frame $t$ to be output, $x_t$ is the speech vector for the input speech for frame $t$, $x^*, y^*$ is $\{x_1^*, y_1^*\}, \ldots, \{x_N^*, y_N^*\}$, where $x_t^*$ is the t-th frame of training data for the first voice and $y_t^*$ is the t-th frame of training data for the second voice, $\mathcal{M}$ denotes the model, $\mu(x_t)$ and $\Sigma(x_t)$ are the mean and variance of the predictive distribution for given $x_t$.

4. A method according to claim 3, wherein

$\mu(x_t) = m(x_t) + k_t^T [K^* + \sigma^2 I]^{-1} (y^* - \mu^*)$,

$\Sigma(x_t) = k(x_t, x_t) + \sigma^2 - k_t^T [K^* + \sigma^2 I]^{-1} k_t$,

where $\mu^* = [m(x_1^*) \; m(x_2^*) \; \ldots \; m(x_N^*)]^T$
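The predictive mean and variance in claims 3 and 4 have the form of standard Gaussian process regression, which the following minimal sketch evaluates with NumPy. Several things here are assumptions for illustration and are not stated in the claim preview: the squared-exponential kernel, the zero mean function m(x), the hyperparameter values, and the restriction to a single scalar output per frame. The function name `gp_predict` and the toy sine data are hypothetical.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential kernel between rows of a and rows of b (assumed choice)."""
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq_dist / length_scale**2)

def zero_mean(x):
    """Assumed mean function m(x) = 0."""
    return np.zeros(len(x))

def gp_predict(x_t, x_train, y_train, noise_var=1e-2, length_scale=1.0, mean_fn=zero_mean):
    """Predictive mean and variance for each input frame in x_t.

    x_t: (T, D) input frames; x_train: (N, D) source training frames;
    y_train: (N,) one output dimension of the target training frames.
    """
    K = rbf_kernel(x_train, x_train, length_scale)   # K*
    k_t = rbf_kernel(x_train, x_t, length_scale)     # column t is k_t
    A = K + noise_var * np.eye(len(x_train))         # K* + sigma^2 I
    # mu(x_t) = m(x_t) + k_t^T A^{-1} (y* - mu*)
    alpha = np.linalg.solve(A, y_train - mean_fn(x_train))
    mu = mean_fn(x_t) + k_t.T @ alpha
    # Sigma(x_t) = k(x_t, x_t) + sigma^2 - k_t^T A^{-1} k_t
    k_tt = np.ones(len(x_t))                         # k(x, x) = 1 for this kernel
    var = k_tt + noise_var - np.sum(k_t * np.linalg.solve(A, k_t), axis=0)
    return mu, var

# Toy data standing in for paired source/target frames.
x_train = np.linspace(-3.0, 3.0, 25)[:, None]
y_train = np.sin(x_train[:, 0])
mu, var = gp_predict(x_train[:5], x_train, y_train, noise_var=1e-4)
mu_far, var_far = gp_predict(np.array([[100.0]]), x_train, y_train, noise_var=1e-4)
```

Near the training frames the predictive mean follows the training targets and the variance is small. Far from any training frame the kernel vector vanishes, so the mean falls back to m(x) and the variance returns to its prior value k(x, x) + σ².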

Assignees

Inventors

Classifications

G10L21/007
characterised by the process used · CPC title
G10L13/033
Voice editing, e.g. manipulating the voice of the synthesiser · CPC title
G10L2021/0135
Voice conversion or morphing · CPC title
G10L21/00
Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility (G10L19/00 takes precedence) · CPC title
G10L15/063
Training · CPC title

Patent family

Related publications grouped by family.

View patent family 44067599

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US8930183B2 cover?: A method of converting speech from the characteristics of a first voice to the characteristics of a second voice, the method comprising: receiving a speech input from a first voice, dividing said speech input into a plurality of frames; mapping the speech from the first voice to a second voice; and outputting the speech in the second voice, wherein mapping the speech from the fir…