US9576587B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9576587-B2 |
| Application number | US-201414301676-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jun 11, 2014 |
| Priority date | Jun 12, 2013 |
| Publication date | Feb 21, 2017 |
| Grant date | Feb 21, 2017 |
Official abstract text for this publication.
A method for cross-modal signal denoising, the method comprising using at least one hardware processor for: providing a first multi-modal signal comprising at least two relatively clear modalities; correlating features exhibited simultaneously in the at least two relatively clear modalities of the first multi-modal signal; providing a second multi-modal signal comprising at least one relatively noisy modality and at least one relatively clear modality; and denoising the at least one relatively noisy modality of the second multi-modal signal by associating between (a) features exhibited in the at least one relatively noisy modality of the second multi-modal signal and (b) the features of the first multi-modal signal.
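The abstract's two stages (correlating features in a clean audio-video recording, then using those correlated features to denoise a second recording whose audio is noisy) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the patent: features are plain lists of floats, one per temporal segment; "associating" is approximated by a nearest-neighbour search over a weighted sum of squared distances; and the names `build_examples`, `denoise`, and `audio_weight` are hypothetical.

```python
from typing import List, Sequence, Tuple

Feature = Sequence[float]
# One stored example: (video feature, clean audio feature) from the same segment.
Example = Tuple[Feature, Feature]


def squared_distance(a: Feature, b: Feature) -> float:
    """Squared Euclidean distance between two equal-length feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_examples(video_feats: Sequence[Feature],
                   audio_feats: Sequence[Feature]) -> List[Example]:
    """Pair features that occur simultaneously in the clean first signal."""
    if len(video_feats) != len(audio_feats):
        raise ValueError("video and audio must have the same number of segments")
    return list(zip(video_feats, audio_feats))


def denoise(examples: Sequence[Example],
            video_feats: Sequence[Feature],
            noisy_audio_feats: Sequence[Feature],
            audio_weight: float = 0.25) -> List[List[float]]:
    """Replace each noisy audio feature with the clean audio of the best example.

    The match cost combines the clear video modality with the noisy audio
    modality. The weighting is an illustrative assumption, not the patent's rule.
    """
    cleaned = []
    for video, noisy_audio in zip(video_feats, noisy_audio_feats):
        best = min(
            examples,
            key=lambda ex: squared_distance(ex[0], video)
            + audio_weight * squared_distance(ex[1], noisy_audio),
        )
        cleaned.append(list(best[1]))
    return cleaned


if __name__ == "__main__":
    examples = build_examples([[0.0], [1.0]], [[1.0, 1.0], [5.0, 5.0]])
    print(denoise(examples, [[0.9], [0.1]], [[4.0, 7.0], [2.0, -1.0]]))
```

In this toy run each noisy audio feature is swapped for the clean audio feature of the example whose video feature is closest, which mirrors the replacement described in claim 2 under the assumptions stated above.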
Opening claim text (preview).
What is claimed is:
1. A method for cross-modal signal denoising, the method comprising using at least one hardware processor for: providing a first multi-modal signal comprising at least two relatively clear modalities; correlating features exhibited simultaneously in the at least two relatively clear modalities of the first multi-modal signal; providing a second multi-modal signal comprising at least one relatively noisy modality and at least one relatively clear modality; and denoising the at least one relatively noisy modality of the second multi-modal signal by associating between (a) features exhibited in the at least one relatively noisy modality of the second multi-modal signal and (b) the correlated features of the first multi-modal signal.
2. The method according to claim 1, wherein said denoising comprises replacing the features exhibited in the at least one relatively noisy modality of the second multi-modal signal with the features exhibited in one of the at least two relatively clear modalities of the first multi-modal signal.
3. The method according to claim 2, wherein said replacing is based on a statistical analysis of the features of: one of the at least two relatively clear modalities of the first multi-modal signal; and features exhibited in the at least one relatively clear modality of the second multi-modal signal.
4. The method according to claim 2, wherein said replacing is based on a pattern recognition of the features of: one of the at least two relatively clear modalities of the first multi-modal signal, and features exhibited in the at least one relatively clear modality of the second multi-modal signal.
5. The method according to claim 1, wherein: the at least two relatively clear modalities of the first multi-modal signal are an audio modality and a video modality; the at least one relatively noisy modality of the second multi-modal signal is an audio modality; and the at least one relatively clear modality of the second multi-modal signal is a video modality.
6. The method according to claim 1, further comprising dividing one of the at least two relatively clear modalities of the first multi-modal signal into a plurality of temporal segments.
7. The method according to claim 6, wherein each of the plurality of temporal segments is between 0.2 and 0.4 seconds long.
8. An apparatus comprising: an image sensor configured for video capture; a microphone; a non-transient memory having stored thereon correlated features exhibited simultaneously in a relatively clear video modality and in a relatively clear audio modality both belonging to a first multi-modal signal; and at least one hardware processor configured to: (a) receive a second multi-modal signal comprising a relatively clear video modality from said image sensor and a relatively noisy audio modality from said microphone, and (b) denoise the relatively noisy audio modality of the second multi-modal signal by associating between (i) features exhibited in the relatively noisy audio modality of the second multi-modal signal and (ii) the correlated features of the first multi-modal signal.
9. The apparatus according to claim 8, wherein said at least one hardware processor is further configured to replace the features exhibited in the relatively noisy audio modality of the second multi-modal signal with the features exhibited in the relatively clear audio modality of the first multi-modal signal.
10. The apparatus according to claim 9, wherein said replace is based on a statistical analysis of the features of: the relatively clear video modality of the first multi-modal signal; and the relatively clear video modality of the second multi-modal signal.
11. The apparatus according to claim 9, wherein said replace is based on a pattern recognition of the features of: the relatively clear video modality of the first multi-modal signal; and the relatively clear video modality of the second multi-modal signal.
12. The apparatus according to claim 9, wherein said at least one hardware processor is further configured to divide the relatively clear audio modality of the first multi-modal signal into a plurality of temporal segments.
13. The apparatus according to claim 12, wherein each of the plurality of temporal segments is between 0.2 and 0.4 seconds long.
14. A computer program product for cross-modal signal denoising, comprising a non-transitory computer-readable storage medium having program code embodied therewith, the program code executable by at least one hardware processor to: provide a first multi-modal signal comprising at least two relatively clear modalities; correlate features exhibited simultaneously in the at least two relatively clear modalities of the first multi-modal signal; provide a second multi-modal signal comprising at least one relatively noisy modality and at least one relatively clear modality; and denoise the at least one relatively noisy modality of the second multi-modal signal by associating between (a) features exhibited in the at least one relatively noisy modality of the second multi-modal signal and (b) the correlated features of the first multi-modal signal.
15. The computer program product according to claim 14, wherein said denoise comprises replacing the features exhibited in the at least one relatively noisy modality of the second multi-modal signal with the features exhibited in one of the at least two relatively clear modalities of the first multi-modal signal.
16. The computer program product according to claim 15, wherein said replacing is based on a statistical analysis of the features of: one of the at least two relatively clear modalities of the first multi-modal signal; and features exhibited in the at least one relatively clear modality of the second multi-modal signal.
17. The computer program product according to claim 16, wherein said replacing is based on a pattern recognition of the features of: one of the at least two relatively clear modalities of the first multi-modal signal, and features exhibited in the at least one relatively clear modality of the second multi-modal signal.
18. The computer program product according to claim 14, wherein: the at least two relatively clear modalities of the first multi-modal signal are an audio modality and a video modality; the at least one relatively noisy modality of the second multi-modal signal is an audio modality; and the at least one relatively clear modality of the second multi-modal signal is a video modality.
19. The computer program product according to claim 14, wherein said program code is further executable to divide one of the at least two relatively clear modalities of the first multi-modal signal into a plurality of temporal segments.
20. The computer program product according to claim 19, wherein each of the plurality of temporal segments is between 0.2 and 0.4 seconds long.
21. A method for cross-modal signal denoising, the method comprising using at least one hardware processor for: providing correlated features exhibited simultaneously in a relatively clear video modality and in a relatively clear audio modality both belonging to a first multi-modal signal; providing a second multi-modal signal comprising at least one relatively noisy modality and at least one relatively clear modality; and denoising the at least one relatively noisy modality of the second multi-modal signal by associating between (a) features exhibited in the at least one relati
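Claims 6, 7, 12, 13, 19, and 20 recite dividing a clear modality into temporal segments between 0.2 and 0.4 seconds long. A minimal sketch of such a split follows. The claims do not say whether segments overlap or how a trailing partial segment is handled, so the choices here (non-overlapping segments, trailing remainder dropped, a 0.3 second default) and the name `split_into_segments` are assumptions for illustration only.

```python
from typing import List, Sequence

# Bounds recited in the claims for the length of each temporal segment.
MIN_SEGMENT_S = 0.2
MAX_SEGMENT_S = 0.4


def split_into_segments(samples: Sequence[float],
                        sample_rate: int,
                        segment_s: float = 0.3) -> List[List[float]]:
    """Split a sampled signal into equal, non-overlapping temporal segments.

    Assumptions (not from the claims): segments do not overlap, and any
    trailing samples that do not fill a whole segment are dropped.
    """
    if not MIN_SEGMENT_S <= segment_s <= MAX_SEGMENT_S:
        raise ValueError("segment length must be between 0.2 and 0.4 seconds")
    segment_len = round(segment_s * sample_rate)
    if segment_len <= 0:
        raise ValueError("sample_rate is too low for the requested segment length")
    full_segments = len(samples) // segment_len
    return [
        list(samples[i * segment_len:(i + 1) * segment_len])
        for i in range(full_segments)
    ]


if __name__ == "__main__":
    # One second of a dummy signal sampled at 100 Hz, split into 0.3 s segments.
    segments = split_into_segments(list(range(100)), sample_rate=100)
    print(len(segments), [len(s) for s in segments])
```

A segment length outside the claimed range raises `ValueError`, which keeps the sketch within the 0.2 to 0.4 second window the claims recite.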
using position of the lips, movement of the lips or face analysis · CPC title
Processing of audio elementary streams · CPC title
Noise filtering · CPC title
for processing of video signals · CPC title
the noise being separate speech, e.g. cocktail party · CPC title