Audio Signal Classification Method and Apparatus

US2016155456A1 · US · A1

Patent metadata
Field: Value
Publication number: US-2016155456-A1
Application number: US-201615017075-A
Country: US
Kind code: A1
Filing date: Feb 5, 2016
Priority date: Aug 6, 2013
Publication date: Jun 2, 2016
Grant date: (none listed)

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An audio signal classification method and apparatus, where the method includes determining, according to voice activity of a current audio frame, whether to obtain a frequency spectrum fluctuation of the current audio frame and store the frequency spectrum fluctuation in a frequency spectrum fluctuation memory, and updating, according to whether the audio frame is percussive music or activity of a historical audio frame, frequency spectrum fluctuations stored in the frequency spectrum fluctuation memory, and classifying the current audio frame as a speech frame or a music frame according to statistics of a part or all of effective data of the frequency spectrum fluctuations stored in the frequency spectrum fluctuation memory.
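
The three steps in the abstract (store when the frame is active, update the stored values on percussive music, classify from a statistic of the stored values) can be sketched as follows. This is a minimal illustration, not the patent's implementation: the class name, the `capacity`, `music_threshold`, and `percussive_cap` parameters, the choice to clamp stored values on a percussive frame, and the use of a plain mean are all assumptions. The update based on historical frame activity is left out. The "low average means music" direction follows claim 7.

```python
from collections import deque
from statistics import mean


class FluctuationClassifier:
    """Hypothetical sketch of the store / update / classify flow."""

    def __init__(self, capacity, music_threshold, percussive_cap):
        # capacity, music_threshold and percussive_cap are assumed tuning
        # knobs; the page gives no numeric values for any of them.
        self.memory = deque(maxlen=capacity)
        self.music_threshold = music_threshold
        self.percussive_cap = percussive_cap

    def process(self, fluctuation, is_active, is_percussive):
        """Return "music", "speech", or None when nothing is stored yet."""
        # Step 1: voice activity decides whether the value is stored at all.
        if is_active:
            self.memory.append(fluctuation)

        # Step 2: a percussive frame triggers an update of the stored values.
        # Clamping to a cap is an assumption; the abstract only says the
        # stored fluctuations are updated.
        if is_percussive:
            self.memory = deque(
                (min(value, self.percussive_cap) for value in self.memory),
                maxlen=self.memory.maxlen,
            )

        # Step 3: classify from a statistic (here the mean) of stored data.
        if not self.memory:
            return None
        return "music" if mean(self.memory) < self.music_threshold else "speech"
```

A caller would compute one fluctuation value, one activity flag, and one percussive flag per frame (none of which this sketch derives) and pass them to `process` in frame order.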

First claim

Opening claim text (preview).

What is claimed is:

1. An audio signal classification method, comprising: determining, according to voice activity of a current audio frame, whether to obtain a current frequency spectrum fluctuation parameter of the current audio frame and store the current frequency spectrum fluctuation parameter, wherein a frequency spectrum fluctuation parameter denotes an energy fluctuation of a frequency spectrum of an audio signal; updating, according to whether the audio frame is percussive music, stored one or more frequency spectrum fluctuation parameters; and classifying the current audio frame as a speech frame or a music frame according to statistics of a part or all of effective data of the stored frequency spectrum fluctuation parameters.

2. The method according to claim 1, wherein determining, according to voice activity of the current audio frame, whether to obtain the current frequency spectrum fluctuation parameter of the current audio frame and store the current frequency spectrum fluctuation parameter comprises storing the current frequency spectrum fluctuation parameter of the current audio frame when the current audio frame is an active frame.

3. The method according to claim 1, wherein determining, according to voice activity of the current audio frame, whether to obtain the current frequency spectrum fluctuation parameter of the current audio frame and store the current frequency spectrum fluctuation parameter comprises storing the current frequency spectrum fluctuation parameter of the current audio frame when the current audio frame is an active frame, and the current audio frame does not belong to an energy attack.

4. The method according to claim 1, wherein determining, according to voice activity of the current audio frame, whether to obtain the current frequency spectrum fluctuation parameter of the current audio frame and store the current frequency spectrum fluctuation parameter comprises storing the current frequency spectrum fluctuation parameter of the current audio frame when the current audio frame is an active frame, and none of multiple consecutive frames comprising the current audio frame and a historical frame of the current audio frame belongs to an energy attack.

5. The method according to claim 1, wherein updating, according to whether the current audio frame is percussive music, stored one or more frequency spectrum fluctuation parameters comprises modifying values of the stored frequency spectrum fluctuation parameters when the current audio frame belongs to percussive music.

6. The method according to claim 1, wherein classifying the current audio frame as the speech frame or the music frame according to statistics of the part or all of effective data of the stored frequency spectrum fluctuation parameters comprises: obtaining an average value of the part or all of the effective data of the stored frequency spectrum fluctuation parameters; and classifying the current audio frame as the music frame when the obtained average value satisfies a music classification condition.

7. The method according to claim 1, further comprising: obtaining a frequency spectrum high-frequency-band peakiness parameter, a frequency spectrum correlation degree parameter, and a linear prediction residual energy tilt parameter of the current audio frame, wherein the frequency spectrum high-frequency-band peakiness parameter denotes a peakiness or an energy acutance, on a high frequency band, of a frequency spectrum of the current audio frame, wherein the frequency spectrum correlation degree parameter denotes stability, between adjacent frames, of a signal harmonic structure of the current audio frame, and wherein the linear prediction residual energy tilt parameter denotes an extent to which linear prediction residual energy of the audio signal changes as a linear prediction order increases; determining, according to the voice activity of the current audio frame, whether to store the frequency spectrum high-frequency-band peakiness parameter, the frequency spectrum correlation degree parameter, and the linear prediction residual energy tilt parameter, and wherein classifying the current audio frame as the speech frame or the music frame according to statistics of the part or all of effective data of the stored frequency spectrum fluctuation parameters comprises: obtaining an average value of the part or all of effective data of the stored frequency spectrum fluctuation parameters, an average value of a part or all of effective data of stored frequency spectrum high-frequency-band peakiness parameters, an average value of a part or all of effective data of stored frequency spectrum correlation degrees parameters, and a variance of a part or all of effective data of stored linear prediction residual energy tilt parameters separately; and classifying the current audio frame as the music frame when a music classifying condition comprising one of the following conditions is satisfied: the average value of the effective data of the stored frequency spectrum fluctuation parameters is less than a first threshold; the average value of the effective data of the stored frequency spectrum high-frequency-band peakiness parameters is greater than a second threshold; the average value of the effective data of the stored frequency spectrum correlation degree parameters is greater than a third threshold; and the variance of the effective data of the stored linear prediction residual energy tilt parameters is less than a fourth threshold.

8. The method according to claim 7, wherein the music classifying condition further comprises a voicing_cnt, wherein the voicing_cnt is less than a fifth threshold, and wherein the voicing_cnt denotes a quantity of voicing parameters whose values are greater than a sixth threshold in a voicing historical buffer which is used to store a voicing parameter of the current audio frame when the voicing parameter of the current audio frame is needed to be obtained and stored.

9. The method according to claim 1, wherein the stored one or more frequency spectrum fluctuation parameters are stored in a frequency spectrum fluctuation buffer when the current frequency spectrum fluctuation parameter is determined to be obtained and stored, and wherein the current frequency spectrum fluctuation parameter is stored to the frequency spectrum fluctuation buffer.

10. An audio signal classification method, comprising: determining, according to voice activity of a current audio frame, whether to obtain a current frequency spectrum fluctuation parameter of the current audio frame and store the current frequency spectrum fluctuation parameter, wherein a frequency spectrum fluctuation parameter denotes an energy fluctuation of a frequency spectrum of an audio signal; updating, according to activity of a historical audio frame, stored one or more frequency spectrum fluctuation parameters; and classifying the current audio frame as a speech frame or a music frame according to statistics of a part or all of effective data of the stored frequency spectrum fluctuation parameters.

11. The method according to claim 10, wherein determining, according to voice activity of the current audio frame, whether to obtain the current frequency spectrum fluctuation parameter of the current audio frame and store the current frequency spectrum fluctuation parameter comprises storing the current frequency spectrum fluctuation parameter of the current audio frame when the current audio frame is an active frame.

12. The method according to claim 10, wherein determining, according to voice activity of the current audio frame, whether to obtain the c
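
As a rough illustration of the storage gate in claim 4 and the four-way music condition in claim 7, the sketch below may help. It is not taken from the patent description: `FrameFeatures`, `Thresholds`, `should_store`, and `is_music` are hypothetical names, the thresholds have no defaults because the preview gives no numeric values, and using the population variance for the tilt statistic is an assumption (the claim only says "a variance"). The voicing_cnt condition of claim 8 is not modeled.

```python
from dataclasses import dataclass
from statistics import mean, pvariance


@dataclass
class FrameFeatures:
    """Per-frame parameters named in claim 7 (field names are hypothetical)."""
    fluctuation: float   # frequency spectrum fluctuation
    peakiness: float     # high-frequency-band peakiness
    correlation: float   # frequency spectrum correlation degree
    tilt: float          # linear prediction residual energy tilt


@dataclass
class Thresholds:
    """First to fourth thresholds of claim 7; values must be supplied."""
    first: float    # upper bound on mean fluctuation
    second: float   # lower bound on mean peakiness
    third: float    # lower bound on mean correlation
    fourth: float   # upper bound on tilt variance


def should_store(is_active, attack_flags):
    """Claim 4 style gate: store only for an active frame when none of the
    consecutive frames (current frame included) is an energy attack."""
    return is_active and not any(attack_flags)


def is_music(stored, thresholds):
    """Return True when any one of the four claim 7 conditions holds."""
    if not stored:
        return False  # assumption: no effective data means no music decision
    fluctuations = [f.fluctuation for f in stored]
    peakiness = [f.peakiness for f in stored]
    correlations = [f.correlation for f in stored]
    tilts = [f.tilt for f in stored]
    return (
        mean(fluctuations) < thresholds.first
        or mean(peakiness) > thresholds.second
        or mean(correlations) > thresholds.third
        or pvariance(tilts) < thresholds.fourth
    )
```

Keeping the gate and the decision as separate functions mirrors the claim structure: the gate decides what enters the stored history, and the decision only ever reads that history.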

Assignees

Huawei Tech Co Ltd

Inventors

Classifications

  • the extracted parameters being prediction coefficients · CPC title

  • Determination or coding of the spectral characteristics, e.g. of the short-term prediction coefficients · CPC title

  • G10L25/18 · Primary

    the extracted parameters being spectral information of each sub-band · CPC title

  • G10L25/81 · Primary

    for discriminating voice from music · CPC title

  • Detection of presence or absence of voice signals (switching of direction of transmission by voice frequency in two-way loud-speaking telephone systems H04M9/10) · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2016155456A1 cover?
An audio signal classification method and apparatus, where the method includes determining, according to voice activity of a current audio frame, whether to obtain a frequency spectrum fluctuation of the current audio frame and store the frequency spectrum fluctuation in a frequency spectrum fluctuation memory, and updating, according to whether the audio frame is percussive music or activity o…
Who is the assignee on this patent?
Huawei Tech Co Ltd
What technology area does this patent fall under?
Primary CPC classification G10L25/18. Mapped technology areas include Physics.
When was this patent published?
Publication date Jun 2, 2016 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).