Method and apparatus for attenuating undesired content in an audio signal
US-9779753-B2 · Oct 3, 2017 · US
US9349384B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9349384-B2 |
| Application number | US-201314428419-A |
| Country | US |
| Kind code | B2 |
| Filing date | Sep 11, 2013 |
| Priority date | Sep 19, 2012 |
| Publication date | May 24, 2016 |
| Grant date | May 24, 2016 |
Official abstract text for this publication.
In some embodiments, a method for adaptive control of gain applied to an audio signal, including steps of analyzing segments of the signal to identify audio objects (e.g., voices of participants in a voice conference); storing information regarding each distinct identified object; using at least some of the information to determine at least one of a target gain, or a gain change rate for reaching a target gain, for each identified object; and applying gain to segments of the signal indicative of an identified object such that the gain changes (typically, at the gain change rate for the object) from an initial gain to the target gain for the object. The information stored may include a scene description. Aspects of the invention include a system configured (e.g., programmed) to perform any embodiment of the inventive method.
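The gain behavior described in the abstract (a gain that moves from an initial value to a per-object target at a per-object rate) can be illustrated with a minimal sketch. It assumes gains are expressed in dB and the gain change rate in dB per second; the abstract does not fix either unit. The names `ObjectGainState`, `step_gain`, and `apply_gain` are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass


@dataclass
class ObjectGainState:
    """Hypothetical per-object gain state (names and units are assumptions)."""
    gain_db: float         # gain currently applied to this object's segments
    target_gain_db: float  # gain the object should eventually reach
    rate_db_per_s: float   # maximum gain change per second for this object


def step_gain(state: ObjectGainState, segment_s: float) -> float:
    """Move the current gain toward the target, limited by the object's rate."""
    max_step = state.rate_db_per_s * segment_s
    delta = state.target_gain_db - state.gain_db
    # Clamp the change so the gain never moves faster than the allowed rate.
    delta = max(-max_step, min(max_step, delta))
    state.gain_db += delta
    return state.gain_db


def apply_gain(samples: list[float], gain_db: float) -> list[float]:
    """Scale one segment of samples by a gain given in dB."""
    scale = 10.0 ** (gain_db / 20.0)
    return [s * scale for s in samples]


# Two identified objects with independent targets and rates (illustrative values).
states = {
    "talker": ObjectGainState(gain_db=0.0, target_gain_db=6.0, rate_db_per_s=10.0),
    "noise": ObjectGainState(gain_db=0.0, target_gain_db=-12.0, rate_db_per_s=2.0),
}

# Each 20 ms segment is assumed to be labeled with the object it indicates.
for object_id, segment in [("talker", [0.1, -0.1]), ("noise", [0.3, 0.2])]:
    gain = step_gain(states[object_id], segment_s=0.02)
    processed = apply_gain(segment, gain)
```

Keeping one state record per object is what lets each object converge on its own target at its own speed, regardless of which object was active in the preceding segment.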
Opening claim text (preview).
The invention claimed is:
1. A method for adaptive control of gain applied to an audio signal, including the steps of: (a) analyzing segments of the signal to identify audio objects indicated by said signal, including by identifying an audio object indicated by each of the segments; (b) storing information regarding each distinct audio object identified in step (a), including an identification of the object and a measure of level of each of at least one segment of the signal indicative of the object; (c) using at least a subset of the information stored in step (b) to determine, independently for each distinct audio object identified in step (a), target gain and a gain change rate for reaching the target gain for the audio object; and (d) applying gain to the segments of the signal indicative of one audio object identified in step (a), such that said gain changes, at the gain change rate for said object, from an initial gain to the target gain for the object, wherein at least some of the audio objects are voices of participants in a voice conference, and the information stored in step (b) includes a data structure regarding each distinct audio object identified in step (a), said data structure including values indicative of: a degree of confidence with which the object is classified as a voice of a conference participant, a degree of confidence with which the object is classified as a nuisance; a direction measure, a distance measure, and a percentage of time during the conference in which the object has been active.
2. The method of claim 1, wherein the information stored in step (b) includes a scene description indicative of at least one characteristic of the voice conference, the at least one characteristic including data indicating a type or source of each object, and a location or trajectory of at least one source which emits sound comprising the object.
3. The method of claim 1, also including a step of using at least some of the information stored in step (b) to determine independently the target gain for each distinct audio object identified in step (a), such that application of the target gain for each said audio object to a segment of the audio signal indicative of said object is sufficient to move the measure of level of said segment to an output level for the object, where the output level for the object is determined by a predetermined target level and at least a subset of the information stored in step (b).
4. The method of claim 1, wherein step (d) includes the steps of: applying a time-varying gain to segments of the signal indicative of a first audio object, wherein the time-varying gain approaches the target gain for the first audio object at the gain change rate determined in step (c) for said first audio object; and applying a second time-varying gain to segments of the signal indicative of a second audio object, wherein the second time-varying gain approaches the target gain for the second audio object at the gain change rate determined in step (c) for said second audio object.
5. The method of claim 1, wherein the gain change rate for each audio object identified with a first confidence as a voice of a conference participant is different from the gain change rate for each audio object which is not identified with the first confidence as a voice of a conference participant.
6. The method of claim 5, wherein the gain change rate for each audio object identified with a first confidence as a voice of a conference participant is greater than the gain change rate for each audio object which is identified with less than the first confidence as a voice of a conference participant.
7. The method of claim 1, wherein the information stored in step (b) is indicative of at least one of: degree of confidence with which each distinct audio object is identified in step (a); or proportion of time that each distinct audio object identified in step (a) has been present in the audio signal.
8. The method of claim 7, wherein the information stored in step (b) is indicative of the proportion of time that each distinct audio object identified in step (a) has been present in the audio signal, and the gain change rate for each audio object identified in step (a) is determined in response to information stored in step (b) which is indicative of the proportion of time that that the audio object has been present in the audio signal.
9. The method of claim 1, wherein each said measure of level is one of an RMS value, an amplitude, a weighted signal level, and a perceptual loudness measure.
10. The method of claim 2, wherein step (b) includes the step of storing an updated target level for said each distinct object, an updated identification of said each distinct object, and an updated scene description, and wherein step (c) includes a step of determining the target gain for said each distinct object in response to the updated target level for the object, the updated identification of the object, and the updated scene description.
11. A system for adaptive control of gain applied to an audio signal, said system including: a gain stage coupled and configured to apply gain to the signal, including by applying different amounts of gain to different segments of the signal; a signal analysis stage coupled and configured to analyze segments of the signal to identify audio objects indicated by said signal, including by identifying an audio object indicated by each of the segments, wherein the signal analysis stage is configured to store information regarding each distinct identified audio object, said information including an identification of the object and a measure of level of each of at least one segment of the signal indicative of the object; and a gain determination stage coupled to the gain stage and to the signal analysis stage and configured to determine, independently for said each distinct identified audio object and in response to at least a subset of the information, a target gain and a gain change rate for reaching the target gain for said audio object, wherein at least some of the audio objects are voices of participants in a voice conference, and the stored information includes a data structure regarding each distinct identified audio object, said data structure including values indicative of: a degree of confidence with which the object is classified as a voice of a conference participant, a degree of confidence with which the object is classified as a nuisance; a direction measure, a distance measure, and a percentage of time during the conference in which the object has been active.
12. The system of claim 11, wherein the gain determination stage is configured to operate in a mode to determine gain to be applied by the gain stage to a subset of the segments of the signal, each of the segments in the subset is indicative of one said identified audio object, and the gain determined in said mode changes at the gain change rate for the object, from an initial gain to the target gain for the object.
13. The system of claim 11, wherein the information includes a scene description indicative of at least one characteristic of the voice conference, the at least one characteristic including data indicating a type or source of each object, and a location or trajectory of at least one source which emits sound comprising the object.
14. The system of claim 11, wherein the gain determination stage is coupled and configured to operate in a gain limiting mode in response to an indication from the signal analysis stage that the measure of level of a segment of the signal exceeds a predetermined threshold, wherein the ga
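The per-object data structure recited in claims 1 and 11, together with the target-gain idea of claim 3 and the confidence-dependent rate of claims 5 and 6, can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: all field names, the units (degrees, metres, dB), the 0.8 confidence threshold, and the two rate values are hypothetical, and the output level is simplified to equal the predetermined target level, whereas claim 3 also lets stored information influence it.

```python
import math
from dataclasses import dataclass


@dataclass
class AudioObjectRecord:
    """Hypothetical record holding the values listed in claims 1 and 11."""
    object_id: str
    level_db: float             # measure of level (here: RMS expressed in dB)
    voice_confidence: float     # confidence the object is a participant's voice, 0..1
    nuisance_confidence: float  # confidence the object is a nuisance, 0..1
    direction_deg: float        # direction measure (unit is an assumption)
    distance_m: float           # distance measure (unit is an assumption)
    active_fraction: float      # share of conference time the object was active, 0..1


def rms_db(samples: list[float]) -> float:
    """RMS level of a segment in dB; claim 9 lists RMS as one possible measure."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10.0 * math.log10(max(mean_square, 1e-12))  # floor avoids log(0)


def target_gain_db(record: AudioObjectRecord, target_level_db: float) -> float:
    """Gain that moves the object's measured level to the target level."""
    return target_level_db - record.level_db


def gain_change_rate(record: AudioObjectRecord,
                     voice_threshold: float = 0.8,
                     fast_db_per_s: float = 20.0,
                     slow_db_per_s: float = 2.0) -> float:
    """Faster rate for objects confidently classified as voice (claims 5 and 6)."""
    if record.voice_confidence >= voice_threshold:
        return fast_db_per_s
    return slow_db_per_s


talker = AudioObjectRecord("talker-1", rms_db([0.05, -0.05, 0.05, -0.05]),
                           voice_confidence=0.95, nuisance_confidence=0.02,
                           direction_deg=30.0, distance_m=1.2, active_fraction=0.4)
gain = target_gain_db(talker, target_level_db=-20.0)
rate = gain_change_rate(talker)
```

Separating the record from the two decision functions mirrors the split in claim 11 between a signal analysis stage that stores information and a gain determination stage that consumes it.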
Details of processing therefor · CPC title
Voice signal separating · CPC title
Automatic adjustment · CPC title
the control being dependent upon ambient noise level or sound level · CPC title