Who is the assignee on this patent?

Cirrus Logic Int Semiconductor Ltd

What technology area does this patent fall under?

Primary CPC classification G10L15/22. Mapped technology areas include Physics.

When was this patent published?

Publication date Apr 2, 2020 (kind code A1). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Sound event detection

US2020105293A1 · US · A1

Patent metadata
Field	Value
Publication number	US-2020105293-A1
Application number	US-201916566162-A
Country	US
Kind code	A1
Filing date	Sep 10, 2019
Priority date	Sep 28, 2018
Publication date	Apr 2, 2020
Grant date	—

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An audio processing system is described for an audio event detection (AED) system. The system includes a feature extraction block configured to derive at least one feature which represents a spectral feature of the input signal.

First claim

Opening claim text (preview).

1. An audio processing system comprising: an input for receiving an input signal, the input signal representing an audio signal; and a feature extraction block configured to determine a measure of the amount of energy in a portion of the input signal, and to derive a matrix representation of the portion of the audio signal, wherein each entry of the matrix comprises the energy in a given frequency band for a given frame of the portion of the input signal, and to concatenate the rows or columns of the matrix to form a supervector, the supervector being a vector representation of the portion of the audio signal.

2. An audio processing system as claim 1, wherein the feature extraction block further comprises: a filter bank comprising a plurality of filters, each filter in the filter bank being configured to determine an energy of at least a portion of the input signal in a given frequency range; and wherein each entry of the matrix comprises the energy in a frequency band according to a given filter in the filter bank for a given frame of the input signal.

3. An audio processing system as claimed in claim 1, further comprising: an energy detection block configured to process the input signal into a plurality of frames; and wherein each entry of the matrix comprises the energy in a given frequency band for a given frame of the plurality of frames of the input signal.

4. An audio processing system as claimed in claim 1, further comprising: an energy detection block configured to process the input signal into L frames; and wherein the feature extraction block further comprises: a filter bank comprising N filters, each filter in the filter bank being configured to determine an energy of at least a portion of the input signal in a given frequency range; and wherein the matrix derived by the feature extraction block is an N×L matrix whose (i,j)th entry comprises the energy of the jth frame in the frequency band defined by the ith filter in the filterbank, and wherein the feature extraction block is configured to concatenate the rows of the matrix to form the supervector.

5. An audio processing system as claimed in claim 1, further comprising: an energy detection block configured to process the input signal into L frames; and wherein the feature extraction block further comprises: a filter bank comprising N filters, each filter in the filter bank being configured to determine an energy of at least a portion of the input signal in a given frequency range; and wherein the matrix derived by the feature extraction block is an L×N matrix whose (i,j)th entry comprises the energy of the ith frame in the frequency band defined by the jth filter in the filterbank, and wherein the feature extraction block is configured to concatenate the columns of the matrix to form the supervector.

6. An audio processing system as claimed in claim 1, further comprising: an energy detection block configured to process the input signal into a plurality of frames, and to process each frame into a plurality of sub-frames; and wherein, the feature extraction block is configured to derive a matrix representation of the audio signal for each frame, wherein, for each frame, each entry of the matrix comprises the energy in a given frequency band for a given sub-frame of the input signal, and to concatenate the rows or columns of each matrix to form a supervector, the supervector being a vector representation of the frame of the audio signal.

7. An audio processing system as claimed in claim 6, further comprising: an energy detection block configured to process each frame into K sub-frames; and wherein the feature extraction block further comprises: a filter bank comprising P filters, each filter in the filter bank being configured to determine an energy of at least a portion of the input signal in a given frequency range; and wherein, for each frame, the matrix derived by the feature extraction block is an P×K matrix whose (i,j)th entry comprises the energy of the jth frame in the frequency band defined by the ith filter in the filterbank, and wherein the feature extraction block is configured to concatenate the rows of the matrix to form the supervector.

8. An audio processing system as claimed in claim 6, further comprising: an energy detection block configured to process each frame into K sub-frames; and wherein the feature extraction block further comprises: a filter bank comprising P filters, each filter in the filter bank being configured to determine an energy of at least a portion of the input signal in a given frequency range; and wherein, for each frame, the matrix derived by the feature extraction block is an K×P matrix whose (i,j)th entry comprises the energy of the ith frame in the frequency band defined by the jth filter in the filterbank, and wherein the feature extraction block is configured to concatenate the columns of the matrix to form the supervector.

9. An audio processing system as claimed in claim 1, further comprising: a classification unit configured to determine a measure of difference between the or each supervector and an element stored in a dictionary, the element being stored as a vector representing a known sound event.

10. An audio processing system as claimed in claim 9 wherein, if the measure of difference between a given supervector and a vector in the dictionary representing a known sound event is below a first predetermined threshold, then the classification unit is configured to output a detection signal indicating that the known sound event has been detected for the portion of the input signal corresponding to the given supervector.

11. An audio processing system as claimed in claim 10 wherein, if a given number of supervectors for which the measure of difference is below the first predetermined threshold is above a second predetermined threshold, then the classification unit is configured to output a detection signal indicating that the known sound event has been detected for the portion of the input signal corresponding to the given number of supervectors.

12. An audio processing system as claimed in claim 9, wherein the classification unit is configured to represent the or each supervector in terms of a weighted sum of elements of a dictionary, each element of the dictionary being stored as a vector representing a known sound event, the dictionary storing the elements as a matrix of vectors, the classification unit thereby being configured to represent the or each supervector as a product of a weight vector and the matrix of vectors.

13. An audio processing system as claimed in claim 12, wherein vector entries in the dictionary matrix are grouped according to the type of known sound, and wherein the classification unit is configured to, for the or each supervector, determine an activated known sound type being the known sound type having the greatest number of vectors having non-zero coefficients when the or each supervector is represented as the weighted sum, the classification unit being configured to sum the coefficients of the vectors in the activated known sound type and compare the sum to a third predetermined threshold, and if the sum is greater than the third predetermined threshold then the classification unit is configured to output a detection signal indicating that the activated known sound type has been detected for the or each supervector.

14. An audio processing system as claimed in claim 12, wherein vector entries in the dictionary matrix are grouped according to the type of known sound, and wherein the classification unit is configured to, for the or each
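To make the data flow in claims 1, 4, 9 and 10 concrete, here is a minimal Python sketch, not the patented implementation. It builds an N×L band-energy matrix, concatenates its rows into a supervector, and compares that supervector against dictionary vectors using a threshold. Several details are assumptions made only for illustration: the function names, the non-overlapping equal-length frames, the naive DFT with equal-width bands standing in for the filter bank, and the Euclidean distance standing in for the "measure of difference", which the claims leave open.

```python
import math

def frame_signal(samples, num_frames):
    """Split samples into num_frames equal, non-overlapping frames (assumed framing)."""
    frame_len = len(samples) // num_frames
    return [samples[k * frame_len:(k + 1) * frame_len] for k in range(num_frames)]

def band_energies(frame, num_bands):
    """Energy of one frame in num_bands equal-width bands of a naive DFT.

    This is a stand-in for the claimed filter bank; the real filter design
    is not specified in the claim text shown on this page.
    """
    n = len(frame)
    half = n // 2
    power = []
    for k in range(half):
        re = sum(x * math.cos(2 * math.pi * k * t / n) for t, x in enumerate(frame))
        im = -sum(x * math.sin(2 * math.pi * k * t / n) for t, x in enumerate(frame))
        power.append(re * re + im * im)
    width = half // num_bands
    return [sum(power[b * width:(b + 1) * width]) for b in range(num_bands)]

def energy_matrix(samples, num_bands, num_frames):
    """N x L matrix: entry (i, j) is the energy of frame j in band i (claim 4 layout)."""
    per_frame = [band_energies(f, num_bands) for f in frame_signal(samples, num_frames)]
    return [[per_frame[j][i] for j in range(num_frames)] for i in range(num_bands)]

def supervector(matrix):
    """Concatenate the rows of the matrix into a single vector."""
    return [value for row in matrix for value in row]

def detect(sv, dictionary, threshold):
    """Return labels whose dictionary vector lies within threshold of sv.

    Euclidean distance is an assumed choice of difference measure.
    """
    hits = []
    for label, element in dictionary.items():
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(sv, element)))
        if dist < threshold:
            hits.append(label)
    return hits

# Usage: a pure tone, N = 4 bands, L = 4 frames, and a toy two-entry dictionary.
samples = [math.sin(2 * math.pi * 2 * t / 16) for t in range(64)]
sv = supervector(energy_matrix(samples, num_bands=4, num_frames=4))
dictionary = {"tone": list(sv), "silence": [0.0] * len(sv)}
print(detect(sv, dictionary, threshold=1.0))
```

Concatenating rows of the N×L matrix (claim 4) and concatenating columns of the L×N matrix (claim 5) give the same ordering of entries, so the sketch implements only the row form.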

Assignees

Cirrus Logic Int Semiconductor Ltd

Inventors

Classifications

G10L17/26
Recognition of special voice characteristics, e.g. for use in lie detectors; Recognition of animal voices · CPC title
G10L15/22 · Primary
Procedures used during a speech recognition process, e.g. man-machine dialogue · CPC title
G10L25/51 · Primary
for comparison or discrimination · CPC title
G10L25/18 · Primary
the extracted parameters being spectral information of each sub-band · CPC title
G10L25/21
the extracted parameters being power information · CPC title

Patent family

Related publications grouped by family.

View patent family 64397481
