Source-based sound quality adjustment tool
US-2022269473-A1 · Aug 25, 2022 · US
US11869478B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11869478-B2 |
| Application number | US-202217655511-A |
| Country | US |
| Kind code | B2 |
| Filing date | Mar 18, 2022 |
| Priority date | Mar 18, 2022 |
| Publication date | Jan 9, 2024 |
| Grant date | Jan 9, 2024 |
Official abstract text for this publication.
A device includes one or more processors configured to receive an input audio signal. The one or more processors are also configured to process the input audio signal based on a combined representation of multiple sound sources to generate an output audio signal. The combined representation is used to selectively retain or remove sounds of the multiple sound sources from the input audio signal. The one or more processors are further configured to provide the output audio signal to a second device.
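The retain-or-remove behavior described in the abstract can be illustrated with a toy sketch. This is not the patented method: the claims describe processing that may use a neural network, whereas the example below assumes a hypothetical per-bin profile (`combined_profile`) and a hypothetical fixed `threshold`, and applies a simple binary mask to one frame of frequency-bin magnitudes. The function name and parameters are invented for illustration.

```python
from typing import List


def apply_combined_representation(
    frame: List[float],
    combined_profile: List[float],
    retain: bool,
    threshold: float = 0.5,
) -> List[float]:
    """Toy stand-in for the processing step (assumption, not the patent's model).

    frame: magnitudes per frequency bin (hypothetical input format).
    combined_profile: per-bin score in [0, 1] saying how strongly the
        represented sound sources occupy that bin (hypothetical).
    retain: True keeps the represented sources, False removes them.
    """
    if len(frame) != len(combined_profile):
        raise ValueError("frame and profile must have the same number of bins")

    output = []
    for magnitude, score in zip(frame, combined_profile):
        belongs_to_sources = score >= threshold
        # With retain=True, keep only bins attributed to the represented sources;
        # with retain=False, keep only the remaining bins.
        keep_bin = belongs_to_sources if retain else not belongs_to_sources
        output.append(magnitude if keep_bin else 0.0)
    return output


frame = [0.9, 0.2, 0.7, 0.1]
profile = [1.0, 0.0, 0.8, 0.1]
kept = apply_combined_representation(frame, profile, retain=True)
removed = apply_combined_representation(frame, profile, retain=False)
```

In this sketch the two outputs are complementary: every bin of the input frame appears in exactly one of `kept` and `removed`, which mirrors the two values of the retain flag recited in claim 1.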
Opening claim text (preview).
What is claimed is:
1. A device comprising: one or more processors configured to: receive an input audio signal including a plurality of sound sources; process the input audio signal based on a combined representation of multiple sound sources to generate an output audio signal, wherein at least one of the plurality of sound sources of the input audio signal is included in the multiple sound sources, and wherein: based on a retain flag having a first value, the combined representation is used to retain the at least one of the plurality of sound sources; and based on the retain flag having a second value, the combined representation is used to remove the at least one of the plurality of sound sources; and provide the output audio signal to a second device.
2. The device of claim 1, wherein the one or more processors are configured to, based on the retain flag having the first value, use the combined representation to retain the at least one of the plurality of sound sources and to remove one or more additional sound sources from the input audio signal.
3. The device of claim 2, wherein the one or more processors are configured to, responsive to a detected condition indicating that processing of the input audio signal is to be initiated, set the retain flag to have the first value indicating that the multiple sound sources are to be retained, wherein the first value of the retain flag is based on a user input, a default configuration, a configuration input from an application, a configuration request from another device, or a combination thereof.
4. The device of claim 1, wherein the multiple sound sources include one or more authorized users.
5. The device of claim 1, wherein the multiple sound sources include an emergency vehicle.
6. The device of claim 1, wherein the one or more processors are configured to, based on the retain flag having the second value, use the combined representation to remove the at least one of the plurality of sound sources and to retain one or more additional sound sources from the input audio signal.
7. The device of claim 6, wherein the one or more processors are configured to, responsive to a detected condition indicating that processing of the input audio signal is to be initiated, set the retain flag to have the second value indicating that the multiple sound sources are to be removed, wherein the second value of the retain flag is based on a user input, a default configuration, a configuration input from an application, a configuration request from another device, or a combination thereof.
8. The device of claim 1, wherein the multiple sound sources include traffic, wind, reverberation, channel distortion, another non-speech sound source, a person, or a combination thereof.
9. The device of claim 1, wherein the multiple sound sources are associated with background noise in a particular environment.
10. The device of claim 9, wherein the particular environment corresponds to an interior of a particular type of vehicle.
11. The device of claim 1, wherein the combined representation is based on particular sounds from particular sound sources, and wherein a particular sound source is a same sound source type as one of the multiple sound sources.
12. The device of claim 1, wherein the one or more processors are further configured to update the combined representation based on the sounds of any of the multiple sound sources.
13. The device of claim 1, wherein the one or more processors are further configured to, based on a combination setting, generate the combined representation based on individual representations of the multiple sound sources.
14. The device of claim 13, wherein the one or more processors are further configured to update the combination setting based on a user input, a detected condition, or both.
15. The device of claim 1, wherein the multiple sound sources include at least a first sound source and a second sound source, wherein a first representation of the first sound source indicates a first value of a particular feature, wherein a second representation of the second sound source indicates a second value of the particular feature, and wherein a value of the particular feature indicated by the combined representation is based on the first value and the second value.
16. The device of claim 15, wherein the first representation includes one or more spectrograms that are based on sounds from a particular sound source that is of the same type as the first sound source.
17. The device of claim 15, wherein the combined representation corresponds to a concatenation of a first representation of the first sound source with a second representation of the second sound source.
18. The device of claim 1, wherein the one or more processors are configured to process the input audio signal using a neural network to generate the output audio signal.
19. The device of claim 18, wherein the neural network includes a convolutional neural network (CNN), an autoregressive (AR) generative network, an audio generative network (AGN), an attention network (AN), a long short-term memory (LSTM) network, or a combination thereof.
20. The device of claim 18, further comprising a sound source encoder configured to process sounds from one or more sound sources to generate a representation of the one or more sound sources, wherein the sound source encoder and the neural network are jointly trained.
21. The device of claim 1, further comprising a receiver configured to receive audio data representing the input audio signal.
22. The device of claim 1, further comprising a transmitter configured to transmit audio data to the second device, the audio data based on the output audio signal.
23. A method comprising: receiving an input audio signal at a first device, the input audio signal including a plurality of sound sources; processing the input audio signal based on a combined representation of multiple sound sources to generate an output audio signal, wherein at least one of the plurality of sound sources of the input audio signal is included in the multiple sound sources, and wherein: based on a retain flag having a first value, the combined representation is used to retain the at least one of the plurality of sound sources; and based on the retain flag having a second value, the combined representation is used to remove the at least one of the plurality of sound sources; and providing the output audio signal to a second device.
24. The method of claim 23, wherein, based on the retain flag having the first value, the combined representation is used to retain the at least one of the plurality of sound sources and to remove one or more additional sound sources from the input audio signal.
25. The method of claim 23, wherein the multiple sound sources are associated with background noise in a particular environment.
26. The method of claim 25, wherein the particular environment corresponds to an interior of a particular type of vehicle.
27. A non-transitory computer-readable medium storing instructions that, when executed by one or more processors, cause the one or more processors to: receive an input audio signal at a first device, the input audio signal including a plurality of sound sources; process the input audio signal based on a combined representation of multiple sound sources to generate an output audio signal, wherein at least
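Claims 13, 15, and 17 describe building the combined representation from individual per-source representations under a combination setting. The sketch below is one possible reading, not the patent's implementation: the setting names `"concatenate"` and `"average"` are hypothetical, and element-wise averaging is only an assumed example of a combined feature value that is "based on the first value and the second value". Concatenation follows the wording of claim 17.

```python
from typing import List, Sequence


def combine_representations(
    representations: Sequence[Sequence[float]],
    setting: str = "concatenate",
) -> List[float]:
    """Combine per-source feature vectors (hypothetical helper for illustration)."""
    if not representations:
        raise ValueError("at least one representation is required")

    if setting == "concatenate":
        # Claim 17 style: place the individual representations end to end.
        return [value for rep in representations for value in rep]

    if setting == "average":
        # Assumed reading of claim 15: each combined feature value depends on
        # the corresponding feature values of all individual representations.
        length = len(representations[0])
        if any(len(rep) != length for rep in representations):
            raise ValueError("averaging requires equal-length representations")
        count = len(representations)
        return [sum(column) / count for column in zip(*representations)]

    raise ValueError(f"unknown combination setting: {setting!r}")


siren = [0.25, 0.75]   # hypothetical representation of an emergency vehicle
speech = [0.75, 0.25]  # hypothetical representation of an authorized user
concatenated = combine_representations([siren, speech], "concatenate")
averaged = combine_representations([siren, speech], "average")
```

Concatenation preserves each source's features separately at the cost of a longer vector, while averaging keeps the vector length fixed but blends the sources; which trade-off the claimed device makes is not stated in the text shown here.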
for combining the signals of two or more microphones (specially adapted for hearing aids H04R25/407) · CPC title
Reduction of ambient noise (active noise reduction per se G10K11/175; protective devices for the ear, e.g. providing acoustic protection A61F11/06) · CPC title
characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU] · CPC title
Convolutional networks [CNN, ConvNet] · CPC title
the extracted parameters being spectral information of each sub-band · CPC title