Apparatuses and methods for audio classifying and processing

US9842605B2 · US · B2

Patent metadata
Field: Value
Publication number: US-9842605-B2
Application number: US-201414779322-A
Country: US
Kind code: B2
Filing date: Mar 25, 2014
Priority date: Mar 26, 2013
Publication date: Dec 12, 2017
Grant date: Dec 12, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Apparatus and methods for audio classifying and processing are disclosed. In one embodiment, an audio processing apparatus includes an audio classifier for classifying an audio signal into at least one audio type in real time; an audio improving device for improving experience of audience; and an adjusting unit for adjusting at least one parameter of the audio improving device in a continuous manner based on the confidence value of the at least one audio type.
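
The phrase "in a continuous manner based on the confidence value" can be illustrated with a minimal sketch. It assumes the confidence lies in [0, 1] and that the parameter is interpolated linearly between two bounds; the function name `adjust_parameter`, the bounds, and the linear mapping are illustrative assumptions, not details taken from the patent.

```python
def adjust_parameter(confidence: float, low: float, high: float) -> float:
    """Map a classifier confidence onto a parameter value without hard switching.

    Assumption: confidence is nominally in [0, 1]; values outside are clamped.
    Assumption: a linear mapping is used purely for illustration.
    """
    c = min(max(confidence, 0.0), 1.0)  # clamp to [0, 1]
    return low + c * (high - low)       # linear interpolation between bounds


# Hypothetical usage: scale an enhancement level between 0 dB and 6 dB.
level_db = adjust_parameter(0.5, 0.0, 6.0)
```

Under these assumptions, a small change in confidence produces a small change in the parameter, in contrast to a binary on/off decision driven by a single threshold.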

First claim

Opening claim text (preview).

The invention claimed is:
1. An audio processing apparatus comprising: an audio classifier for classifying an audio signal into at least one audio type in real time, wherein the audio classifier comprises an audio content classifier and an audio context classifier, wherein the audio content classifier performs short-term classification of the audio signal for a short-term segment, wherein the short-term segment includes a plurality of frames of the audio signal, and wherein the audio context classifier performs long-term classification of the audio signal for a long-term segment based on the short-term classification, wherein the long-term segment includes a plurality of short-term segments; an audio improving device for improving experience of audience based upon one or more labels used to steer the audio improving device; and an adjusting unit for adjusting at least one parameter of the audio improving device in a continuous manner based on a confidence value of the at least one audio type, wherein the at least one audio type is based on the short-term classification and the long-term classification, wherein the audio content classifier includes a short-term feature extractor and a short-term classifier, wherein the short-term classifier performs the short-term classification based on an output of the short-term feature extractor, and wherein the audio context classifier includes a statistics extractor, a long-term feature extractor and a long-term classifier, wherein the statistics extractor extracts statistics from an output of the short-term classifier, wherein the long-term feature extractor extracts long-term features based on the output of the short-term feature extractor, and wherein the long-term classifier performs the long-term classification based on the statistics from the statistics extractor and the long-term features.
2. The audio processing apparatus according to claim 1, wherein the at least one audio type comprises at least one of content types of short-term music, speech, background sound and noise, and/or at least one of context types of long-term music, movie-like media, game and VoIP.
3. The audio processing apparatus according to claim 1, wherein the at least one audio type comprises context type of VoIP or non-VoIP.
4. The audio processing apparatus according to claim 1, wherein the at least one audio type comprises context type of high-quality audio or low-quality audio.
5. The audio processing apparatus according to claim 3, wherein the short-term music comprises music without dominant sources or music with dominant sources.
6. The audio processing apparatus according to claim 3, wherein the short-term music comprises at least one genre-based cluster or at least one instrument-based cluster or at least one music cluster classified based on rhythm, tempo, timbre of music and/or any other musical attributes.
7. The audio processing apparatus according to claim 1, where the audio improving device comprises at least one selected from a dialog enhancer, a surround virtualizer, a volume leveler and an equalizer.
8. The audio processing apparatus according to claim 2, wherein the audio improving device comprises at least one selected from a dialog enhancer, a surround virtualizer, a volume leveler and an equalizer.
9. The audio processing apparatus according to claim 2, wherein the audio improving device comprises a dialog enhancer, and wherein the adjusting unit is configured to positively correlate the level of dialog enhancement of the dialog enhancer with the confidence value of movie-like media and/or VoIP, and/or negatively correlate the level of dialog enhancement of the dialog enhancer with the confidence value of the long-term music and/or game.
10. The audio processing apparatus according to claim 2, wherein the audio improving device comprises a dialog enhancer, wherein the adjusting unit is configured to positively correlate the level of dialog enhancement of the dialog enhancer with the confidence value of speech.
11. The audio processing apparatus according to claim 2, wherein the audio improving device comprises a dialog enhancer for enhancing frequency bands above respective thresholds, wherein the adjusting unit is configured to positively correlate the thresholds with a confidence value of short-term music and/or noise and/or background sounds, and/or negatively correlate the thresholds with a confidence value of speech.
12. The audio processing apparatus according to claim 2, wherein the audio improving device comprises a minimum tracking unit for estimating the background level in the audio signal, wherein the adjusting unit is configured to assign an adjustment to the background level estimated by the minimum tracking unit, wherein the adjusting unit is further configured to positively correlate the adjustment with a confidence value of short-term music and/or noise and/or background sound, and/or negatively correlate the adjustment with a confidence value of speech.
13. The audio processing apparatus according to claim 12, wherein the adjusting unit is configured to correlate the adjustment with the confidence value of noise and/or background sound more positively than the short-term music.
14. The audio processing apparatus according to claim 2, wherein the audio improving device comprises a surround virtualizer, wherein the adjusting unit is configured to positively correlate the surround boost amount of the surround virtualizer with a confidence value of noise and/or background sound and/or speech, and/or negatively correlate the surround boost amount with a confidence value of short-term music.
15. The audio processing apparatus according to claim 14, wherein the adjusting unit is configured to correlate the surround boost amount with the confidence value of noise and/or background sound more positively than the content type speech.
16. The audio processing apparatus according to claim 2, wherein the audio improving device comprises a surround virtualizer, wherein the adjusting unit is configured to positively correlate the start frequency of the surround virtualizer with a confidence value of short-term music.
17. The audio processing apparatus according to claim 2, wherein the audio improving device comprises a surround virtualizer, wherein the adjusting unit is configured to positively correlate the surround boost amount of the surround virtualizer with a confidence value of movie-like media and/or game, and/or negatively correlate the surround boost amount with a confidence value of long-term music and/or VoIP.
18. The audio processing apparatus according to claim 17, wherein the adjusting unit is configured to correlate the surround boost amount with the confidence value of movie-like media more positively than of game.
19. The audio processing apparatus according to claim 2, wherein the adjusting unit is configured to adjust the at least one parameter based on the confidence value of at least one content type and the confidence value of at least one context type.
20. The audio processing apparatus according to claim 19, wherein the content type in an audio signal of a different context type is assigned a different weight depending on the context type of the audio signal.
21. The audio processing apparatus according to claim 1, wherein the adjusting unit is configured to consider at least some of the at least one audio type through weighting the confidence values of the at least one audio type based on the importance of the at least one
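
The two-stage structure recited in claim 1 (short-term content classification feeding a long-term context classifier) and the dialog-enhancer steering of claims 9 and 10 can be sketched roughly as follows. This is a toy illustration only: the energy feature, the placeholder classifier rules, the product used to combine confidences, and every function name are assumptions made for the sketch and are not taken from the patent.

```python
from statistics import mean, pvariance
from typing import Dict, List, Sequence, Tuple

Frame = Sequence[float]          # samples of one frame
Segment = List[Frame]            # a short-term segment: several frames
Confidences = Dict[str, float]   # audio type -> confidence in [0, 1]


def short_term_features(segment: Segment) -> List[float]:
    """Short-term feature extractor (assumed feature: mean frame energy)."""
    energies = [sum(x * x for x in frame) / len(frame) for frame in segment]
    return [mean(energies)]


def short_term_classify(features: List[float]) -> Confidences:
    """Short-term classifier (placeholder rule, not from the patent)."""
    speech = min(1.0, max(0.0, features[0]))
    return {"speech": speech, "short_term_music": 1.0 - speech}


def extract_statistics(outputs: List[Confidences]) -> List[float]:
    """Statistics extractor: summarizes short-term classifier outputs."""
    speech = [c["speech"] for c in outputs]
    return [mean(speech), pvariance(speech)]


def long_term_features(feature_history: List[List[float]]) -> List[float]:
    """Long-term features built from the short-term extractor's output."""
    energies = [f[0] for f in feature_history]
    return [mean(energies), pvariance(energies)]


def long_term_classify(stats: List[float], lt: List[float]) -> Confidences:
    """Long-term classifier (placeholder rule using both inputs)."""
    movie = 0.5 * min(1.0, stats[0]) + 0.5 * min(1.0, lt[1])
    return {"movie_like_media": movie, "long_term_music": 1.0 - movie}


def classify(segments: List[Segment]) -> Tuple[Confidences, Confidences]:
    """Return (latest content confidences, context confidences)."""
    feats = [short_term_features(s) for s in segments]
    content = [short_term_classify(f) for f in feats]
    context = long_term_classify(extract_statistics(content),
                                 long_term_features(feats))
    return content[-1], context


def dialog_enhancement_level(content: Confidences, context: Confidences,
                             max_level: float = 1.0) -> float:
    """Level rises with speech and movie-like confidence (assumed product)."""
    return max_level * content["speech"] * context["movie_like_media"]


# Hypothetical long-term segment made of two short-term segments.
segments = [[[1.0, 1.0], [1.0, 1.0]], [[0.0, 0.0], [0.0, 0.0]]]
content, context = classify(segments)
level = dialog_enhancement_level(content, context)
```

In this sketch the long-term classifier receives both the statistics of the short-term decisions and features derived from the short-term feature stream, mirroring the data flow in claim 1; a real implementation would substitute trained classifiers and richer features for the placeholder rules.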

Assignees

Dolby Laboratories Licensing Corp

Inventors

Classifications

  • by filtering complex waveforms (G10H1/14, G10H1/16 take precedence) · CPC title

  • Soundscape or sound field simulation, reproduction or control for musical purposes, e.g. surround or 3D sound; Granular synthesis · CPC title

  • Genre classification, i.e. descriptive metadata for classification or selection of musical pieces according to style · CPC title

  • Volume control · CPC title

  • for differentiation between music and non-music signals, based on the identification of musical parameters, e.g. based on tempo detection · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9842605B2 cover?
Apparatus and methods for audio classifying and processing are disclosed. In one embodiment, an audio processing apparatus includes an audio classifier for classifying an audio signal into at least one audio type in real time; an audio improving device for improving experience of audience; and an adjusting unit for adjusting at least one parameter of the audio improving device in a continuous m…
Who is the assignee on this patent?
Dolby Laboratories Licensing Corp
What technology area does this patent fall under?
Primary CPC classification G10L21/02. Mapped technology areas include Physics.
When was this patent published?
Publication date Dec 12, 2017 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).