Dataset shift compensation in machine learning

US2016019883A1 · US · A1

Patent metadata
Field	Value
Publication number	US-2016019883-A1
Application number	US-201414331230-A
Country	US
Kind code	A1
Filing date	Jul 15, 2014
Priority date	Jul 15, 2014
Publication date	Jan 21, 2016
Grant date	—

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title: What the patent document calls the invention.
Abstract: A short plain-language summary of the technical disclosure.
Assignees and inventors: Who owns or filed the patent and who is credited as inventor.
Key dates: Filing, priority, publication, and grant dates set the timeline.
First independent claim: The legal scope of protection; read this for what is actually claimed.
CPC / IPC classifications: Technology tags used to group this patent with similar filings.
Citations and related patents: Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method for inter-dataset variability compensation, the method comprising using at least one hardware processor for: receiving a heterogeneous development dataset comprising multiple samples and metadata associated with at least some of the multiple samples; dividing the multiple samples into multiple homogenous subsets, based on the metadata; averaging high-level features of each of the multiple homogenous subsets, to produce multiple central high-level features for the multiple homogenous subsets, respectively; computing an inter-dataset variability subspace spanned by the multiple central high-level features; removing the inter-dataset variability subspace from the high-level features of the multiple homogenous subsets, to produce denoised samples; and training a machine learning system using the denoised speech samples.
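The steps in the abstract can be illustrated with a short NumPy sketch. This is an illustration under stated assumptions, not the patent's implementation: the function name `remove_center_subspace`, the parameter `n_dims`, and the use of an SVD to perform PCA on the subset centers are choices made here for clarity. Features are assumed to be fixed-length vectors (for example i-vectors), with one metadata-derived subset label per sample.

```python
import numpy as np

def remove_center_subspace(features, subset_labels, n_dims):
    """Project out the subspace spanned by the subset centers.

    features:      (n_samples, dim) array of high-level features
    subset_labels: (n_samples,) array, one homogeneous-subset label per sample
    n_dims:        number of principal directions to remove (hypothetical
                   parameter; it should be smaller than the number of subsets)
    """
    features = np.asarray(features, dtype=float)
    subset_labels = np.asarray(subset_labels)

    # Average the features of each homogeneous subset to get its center.
    centers = np.stack([
        features[subset_labels == label].mean(axis=0)
        for label in np.unique(subset_labels)
    ])

    # PCA on the centers: the right singular vectors of the mean-centered
    # center matrix are the principal directions of inter-dataset variability.
    centered = centers - centers.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:n_dims].T  # (dim, n_dims), orthonormal columns

    # Remove the subspace from every sample: x - U U^T x.
    cleaned = features - features @ basis @ basis.T
    return cleaned, basis
```

The returned `cleaned` array would then be passed to whatever training routine the downstream system uses; the same `basis` can be applied to evaluation data so that both sides are compensated consistently.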

First claim

Opening claim text (preview).

What is claimed is:
1. A method for inter-dataset variability compensation, the method comprising using at least one hardware processor for: receiving a heterogeneous development dataset comprising multiple samples and metadata associated with at least some of the multiple samples; dividing the multiple samples into multiple homogenous subsets, based on the metadata; computing a statistical measure of high-level features of each of the multiple homogenous subsets, to produce multiple central high-level features for the multiple homogenous subsets, respectively; computing an inter-dataset variability subspace spanned by the multiple central high-level features; removing the inter-dataset variability subspace from the high-level features of the multiple homogenous subsets, to produce denoised samples; and training a machine learning system using the denoised speech samples.
2. The method according to claim 1, wherein the high-level features are selected from the group consisting of: i-vectors, GMM (Gaussian Mixture Model) supervectors, HMM (Hidden Markov Model) supervectors, d-vectors, JFA (Joint Factor Analysis) supervectors, LBP (Local Binary Patterns), HOG (Histograms of Oriented Gradients), and EBIF (Early Biologically-Inspired Features).
3. The method according to claim 2, wherein the machine learning system is selected from the group consisting of: a PLDA (Probabilistic Linear Discriminant Analysis)-based system, an SVM (Support Vector Machine)-based system, a neural network-based system, a NAP (Nuisance Attribute Projection)-based system, a WCCN (Within-Speaker Covariance Matrix)-based system, and an LDA (Linear Discriminant Analysis)-based system.
4. The method according to claim 3, wherein the multiple samples are speech samples.
5. The method according to claim 4, wherein the heterogeneous development dataset is devoid of speech samples from a target domain of the speaker recognition.
6. The method according to claim 4, wherein the metadata comprises at least one parameter selected from the group consisting of: speaker gender, spoken language and recordation setting.
7. The method according to claim 4, wherein the computing of the inter-dataset variability subspace comprises PCA (Principal Component Analysis).
8. A method for inter-dataset variability compensation for speaker recognition, the method comprising using at least one hardware processor for: receiving a heterogeneous development dataset comprising multiple speech samples; dividing the multiple speech samples into multiple homogenous subsets; for each subset i of the multiple homogenous subsets: (a) estimating PLDA (Probabilistic Linear Discriminant Analysis) hyper-parameters {μ_i, B_i, W_i}, wherein μ denotes a center of an i-vector space, B denotes a between-speaker covariance matrix and W denotes a within-speaker covariance matrix, and (b) computing an i-vector subspace S_μ corresponding to {μ_i}, an i-vector subspace S_W corresponding to {W_i}, and an i-vector subspace S_B corresponding to {B_i}; joining i-vector subspaces S_μ, S_W and S_B into a single subspace S; removing subspace S from i-vectors of the multiple speech samples, to produce denoised speech samples; and training a PLDA speaker recognition system using the denoised speech samples.
9. The method according to claim 8, further comprising smoothing B by linear interpolation using an estimated diagonal of B.
10. The method according to claim 8, wherein: the heterogeneous development dataset further comprises metadata associated with at least some of the multiple speech samples; and the dividing is based on the metadata.
11. The method according to claim 8, wherein the computing of each of the i-vector subspaces S_μ, S_W and S_B comprises PCA (Principal Component Analysis).
12. The method according to claim 8, further comprising computing an average of squared {W_i}, and finding a k number of largest eigenvalues of the squared {W_i}, wherein the k largest eigenvalues span the i-vector subspace S_W.
13. The method according to claim 12, further comprising whitening the i-vector subspace S_W with respect to W.
14. The method according to claim 8, further comprising computing an average of squared {B_i}, and finding an m number of largest eigenvalues of the squared {B_i}, wherein the k largest eigenvalues span the i-vector subspace S_B.
15. The method according to claim 14, further comprising whitening the i-vector subspace S_B with respect to B.
16. A computer program product for inter-dataset variability compensation for speaker recognition, the computer program product comprising a non-transitory computer-readable storage medium having program code embodied therewith, the program code executable by at least one hardware processor to: receive a heterogeneous development dataset comprising multiple speech samples; divide the multiple speech samples into multiple homogenous subsets; for each subset i of the multiple homogenous subsets: (a) estimate PLDA (Probabilistic Linear Discriminant Analysis) hyper-parameters {μ_i, B_i, W_i}, wherein μ denotes a center of an i-vector space, B denotes a between-speaker covariance matrix and W denotes a within-speaker covariance matrix, and (b) compute an i-vector subspace S_μ corresponding to {μ_i}, an i-vector subspace S_W corresponding to {W_i}, and an i-vector subspace S_B corresponding to {B_i}; join i-vector subspaces S_μ, S_W and S_B into a single subspace S; remove subspace S from i-vectors of the multiple speech samples, to produce denoised speech samples; and train a PLDA speaker recognition system using the denoised speech samples.
17. The computer program product according to claim 16, wherein the program code is further executable by the at least one hardware processor to smooth B by linear interpolation using an estimated diagonal of B.
18. The computer program product according to claim 16, wherein: the heterogeneous development dataset further comprises metadata associated with at least some of the multiple speech samples; and the dividing is based on the metadata.
19. The computer program product according to claim 16, wherein the computing of each of the i-vector subspaces S_μ, S_W and S_B comprises PCA (Principal Component Analysis).
20. The computer program product according to claim 16, wherein the program code is further executable by the at least one hardware processor to: compute an average of squared {W_i}; find a k number of largest eigenvalues of the squared {W_i}, wherein the k largest eigenvalues span the i-vector subspace S_W; whiten the i-vector subspace S_W with respect to W; compute an average of squared {B_i}; find an m number of largest eigenvalues of the squared {B_i}, wherein the m largest eigenvalues span the i-vector subspace S_B; and whiten the i-vector subspace S_B with respect to B.
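The covariance-based steps in claims 8 to 20 can be sketched in NumPy as follows. This is one possible reading, offered as an illustration rather than the claimed implementation. All function names, the interpolation weight `alpha`, and the tolerance `tol` are hypothetical. "Squared" is interpreted here as the matrix product of each covariance matrix with itself, the basis for each subspace is taken as the eigenvectors belonging to the largest eigenvalues, and the whitening steps of claims 13 and 15 are omitted.

```python
import numpy as np

def top_eigvecs_of_mean_square(cov_list, k):
    """Eigenvectors for the k largest eigenvalues of mean(C_i @ C_i).

    cov_list holds the per-subset symmetric covariance matrices
    (the W_i or B_i of the claims).
    """
    mean_sq = np.mean([c @ c for c in cov_list], axis=0)
    _, eigvecs = np.linalg.eigh(mean_sq)  # eigenvalues in ascending order
    return eigvecs[:, ::-1][:, :k]        # reorder to descending, keep k

def join_subspaces(bases, tol=1e-10):
    """Orthonormal basis for the combined span of several bases.

    Each basis is a (dim, r) array whose columns span one subspace,
    for example S_mu, S_W and S_B.
    """
    stacked = np.hstack(bases)
    u, s, _ = np.linalg.svd(stacked, full_matrices=False)
    return u[:, s > tol * s.max()]        # drop numerically null directions

def remove_subspace(vectors, basis):
    """Project each row of `vectors` onto the complement of span(basis)."""
    return vectors - vectors @ basis @ basis.T

def smooth_with_diagonal(cov, alpha):
    """Linear interpolation between cov and its diagonal (assumed form)."""
    return (1.0 - alpha) * cov + alpha * np.diag(np.diag(cov))
```

A typical call sequence under these assumptions would compute `S_W` and `S_B` with `top_eigvecs_of_mean_square`, combine them with a basis for the subset centers via `join_subspaces`, and apply `remove_subspace` to the i-vectors before PLDA training.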

Assignees

IBM

Inventors

Aronowitz Hagai

Classifications

Code	CPC title
G10L17/26	Recognition of special voice characteristics, e.g. for use in lie detectors; Recognition of animal voices
G10L21/0208	Noise filtering
G10L17/04	Training, enrolment or model building
G10L15/20	Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise, of stress induced speech (G10L21/02 takes precedence)
G10L17/20	Pattern transformations or operations aimed at increasing system robustness, e.g. against channel noise or different working conditions

Patent family

Related publications grouped by family.

View patent family 55075077

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2016019883A1 cover?: A method for inter-dataset variability compensation, the method comprising using at least one hardware processor for: receiving a heterogeneous development dataset comprising multiple samples and metadata associated with at least some of the multiple samples; dividing the multiple samples into multiple homogenous subsets, based on the metadata; averaging high-level features of each of the multi…
Who is the assignee on this patent?: IBM
What technology area does this patent fall under?: Primary CPC classification G10L15/063. Mapped technology areas include Physics.
When was this patent published?: Publication date Jan 21, 2016 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).