Who is the assignee on this patent?

Mysore Gautham J, Smaragdis Paris, Duan Zhiyao, and 1 more

What technology area does this patent fall under?

Primary CPC classification G10L21/028. Mapped technology areas include Physics.

When was this patent published?

Publication date May 8, 2018 (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Online source separation

US9966088B2 · US · B2

Patent metadata
Field	Value
Publication number	US-9966088-B2
Application number	US-201113335688-A
Country	US
Kind code	B2
Filing date	Dec 22, 2011
Priority date	Sep 23, 2011
Publication date	May 8, 2018
Grant date	May 8, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Online source separation may include receiving a sound mixture that includes first audio data from a first source and second audio data from a second source. Online source separation may further include receiving pre-computed reference data corresponding to the first source. Online source separation may also include performing online separation of the second audio data from the first audio data based on the pre-computed reference data.
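The idea in the abstract, holding pre-computed reference data for one source fixed while explaining the rest of the mixture with newly learned components, can be illustrated with a small batch sketch. This is not the patent's implementation: it assumes the reference data is a matrix of spectral basis vectors (`W_ref`), works on a whole magnitude spectrogram at once rather than online, and swaps in KL-divergence non-negative matrix factorization with multiplicative updates, a technique closely related to the PLCA named in the claims. The function name `separate_with_reference` and the parameters `n_new` and `n_iter` are hypothetical.

```python
import numpy as np

EPS = 1e-12  # guards against division by zero


def separate_with_reference(V, W_ref, n_new=4, n_iter=200, seed=0):
    """Split a magnitude spectrogram V (n_freq x n_frames) into two parts.

    W_ref holds pre-computed spectral bases for the first source and stays
    fixed; bases for the second source (W_new) are learned from the mixture.
    Illustrative sketch only (KL-NMF stand-in for PLCA).
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    k_ref = W_ref.shape[1]
    W_new = rng.random((n_freq, n_new)) + EPS
    H = rng.random((k_ref + n_new, n_frames)) + EPS

    for _ in range(n_iter):
        W = np.hstack([W_ref, W_new])
        # Multiplicative KL update for the weights of both sources.
        H *= (W.T @ (V / (W @ H + EPS))) / (W.sum(axis=0)[:, None] + EPS)
        # Multiplicative KL update for the second-source bases only;
        # the reference bases are never modified.
        R = V / (W @ H + EPS)
        H_new = H[k_ref:]
        W_new *= (R @ H_new.T) / (H_new.sum(axis=1)[None, :] + EPS)

    W = np.hstack([W_ref, W_new])
    # Soft mask: share of each time-frequency bin explained by the new bases.
    mask_second = (W_new @ H[k_ref:]) / (W @ H + EPS)
    second = mask_second * V
    first = V - second
    return first, second
```

Because the mask lies between 0 and 1, the two returned estimates are non-negative and sum back to the input spectrogram. Turning them into audio would additionally require the mixture phase and an inverse short-time Fourier transform, which this sketch leaves out.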

First claim

Opening claim text (preview).

What is claimed is:
1. A method, comprising: receiving a mono channel signal including a sound mixture that includes first audio data from a first source and second audio data from a second source; receiving pre-computed reference data corresponding to the first source; and performing online separation of the second audio data from the first audio data based on the pre-computed reference data.
2. The method of claim 1, wherein said performing online separation is performed in real-time.
3. The method of claim 1, wherein said performing online separation includes modeling the second audio data with a plurality of basis vectors.
4. The method of claim 1, wherein said performing online separation includes: determining that a frame of the sound mixture includes audio data other than the first audio data; and separating the second audio data from the first audio data for the frame.
5. The method of claim 4, wherein said separating includes: for the frame, determining spectral bases for the second source and determining a plurality of weights for each of the first and second sources; and updating a dictionary for the second source with the determined spectral bases and updating a set of weights with the determined plurality of weights for each of the first and second sources.
6. The method of claim 1, wherein said performing online separation includes: determining that a frame of the sound mixture does not include second audio data; and bypassing updating a dictionary for the second source for the frame.
7. The method of claim 1, wherein said performing online separation is performed using probabilistic latent component analysis (PLCA).
8. The method of claim 1, further comprising reconstructing a signal that includes the second audio data based on said online separation.
9. The method of claim 1, wherein the pre-computed reference data includes a plurality of spectral basis vectors of the first source.
10. The method of claim 1, wherein the pre-computed reference data is computed from different audio data than the first audio data, wherein the different audio data is of a same source type as the first source.
11. The method of claim 1, wherein the sound mixture includes audio data from N sources including the first and second sources, further comprising: receiving pre-computed reference data corresponding to each of the N sources other than the second source; wherein said performing online separation further includes separating the second audio data from audio data from each of the other N−1 sources based on the pre-computed reference data corresponding to each of the other N−1 sources.
12. The method of claim 1, wherein the first audio data is a spectrogram of a signal from the first source, wherein each segment of the spectrogram is represented by a convex combination of spectral components of the pre-computed reference data.
13. The method of claim 1, wherein the first source is a non-stationary noise source.
14. A non-transitory computer-readable storage medium storing program instructions, wherein the program instructions are computer-executable to implement: receiving a sound mixture that includes audio data from a plurality of sources including first audio data from a first source and other audio data from one or more other sources; receiving a pre-computed dictionary corresponding to each source other than the first source; and performing online separation of the first audio data by separating the first audio data from each of the one or more other sources based on the pre-computed dictionaries.
15. The non-transitory computer-readable storage medium of claim 14, wherein said performing online separation is performed in real-time.
16. The non-transitory computer-readable storage medium of claim 14, wherein said performing online separation includes modeling the first audio data with a plurality of basis vectors.
17. The non-transitory computer-readable storage medium of claim 14, wherein to implement said performing online separation, the program instructions are further computer-executable to implement: determining that a frame of the sound mixture includes the other audio data; and separating the first audio data from the other audio data for the frame.
18. The non-transitory computer-readable storage medium of claim 14, wherein to implement said separating, the program instructions are further computer-executable to implement: for the frame, determining spectral bases for the first source and determining a plurality of weights for each of the first and one or more other sources; and updating a dictionary for the first source with the determined spectral bases and updating a set of weights with the determined plurality of weights for each of the first and one or more other sources.
19. The non-transitory computer-readable storage medium of claim 14, wherein to implement said performing online separation, the program instructions are further computer-executable to implement: determining that a frame of the sound mixture does not include the first audio data; and bypassing updating a dictionary for the first source for the frame.
20. A system, comprising: at least one processor; and a memory comprising program instructions, wherein the program instructions are executable by the at least one processors to: receive a sound mixture comprising signals originated from a plurality of sources combined into a lesser number of channels, the sound mixture having first audio data from a first source and second audio data from a second source; receive pre-computed reference data corresponding to the first source; and perform online separation of the second audio data from the first audio data based on the pre-computed reference data.
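The frame-by-frame behaviour described in claims 4 to 6 (decide whether a frame contains anything beyond the first source, then either update the second source's dictionary and weights or bypass the update) can be sketched as a simple loop. Everything specific here is an assumption for illustration: the claims do not say how the frame decision is made, so the sketch uses a hypothetical rule that thresholds the normalized KL divergence left after fitting the frame with the reference bases alone; the dictionary is refreshed on a short buffer of recent mixed frames; and KL-divergence NMF updates again stand in for PLCA. The names `fit_weights`, `kl_divergence`, `online_separate`, `threshold`, and `buffer_len` are invented for this example.

```python
import numpy as np

EPS = 1e-12  # guards against division by zero and log(0)


def fit_weights(v, W, n_iter=50):
    """Non-negative weights h with W @ h ~ v, dictionary W held fixed."""
    h = np.full(W.shape[1], v.sum() / W.shape[1] + EPS)
    col_sums = W.sum(axis=0) + EPS
    for _ in range(n_iter):
        h *= (W.T @ (v / (W @ h + EPS))) / col_sums
    return h


def kl_divergence(v, v_hat):
    """Generalized KL divergence between a frame and its approximation."""
    return float(np.sum(v * np.log((v + EPS) / (v_hat + EPS)) - v + v_hat))


def online_separate(frames, W_ref, n_new=3, threshold=0.1,
                    buffer_len=20, n_iter=30, seed=0):
    """Process magnitude-spectrum frames one at a time (illustrative only)."""
    rng = np.random.default_rng(seed)
    n_freq, k_ref = W_ref.shape
    W_new = rng.random((n_freq, n_new)) + EPS  # second-source dictionary
    buffer, out = [], []

    for v in frames:
        # Hypothetical detection rule: can the reference bases alone
        # explain this frame?
        h_ref = fit_weights(v, W_ref)
        err = kl_divergence(v, W_ref @ h_ref) / (v.sum() + EPS)
        if err < threshold:
            # No second source detected: bypass the dictionary update.
            out.append(np.zeros(n_freq))
            continue

        # Second source present: refresh its dictionary on recent frames.
        buffer = (buffer + [v])[-buffer_len:]
        B = np.stack(buffer, axis=1)
        H = rng.random((k_ref + n_new, B.shape[1])) + EPS
        for _ in range(n_iter):
            W = np.hstack([W_ref, W_new])
            H *= (W.T @ (B / (W @ H + EPS))) / (W.sum(axis=0)[:, None] + EPS)
            R = B / (W @ H + EPS)
            W_new *= (R @ H[k_ref:].T) / (H[k_ref:].sum(axis=1)[None, :] + EPS)

        # Weights for the current frame, then a soft mask for source two.
        W = np.hstack([W_ref, W_new])
        h = fit_weights(v, W)
        out.append((W_new @ h[k_ref:]) / (W @ h + EPS) * v)

    return np.stack(out, axis=1), W_new
```

The function returns the estimated second-source magnitudes, one column per input frame, together with the learned dictionary. Frames judged to contain only the first source yield a column of zeros and leave the dictionary untouched, which mirrors the bypass in claim 6.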

Classifications

G10L21/028 · Primary
using properties of sound source · CPC title

Patent family

Related publications grouped by family.

View patent family 48280670
