Mitigating voice frequency loss

US11854572B2 · US · B2

Patent metadata
Field · Value
Publication number · US-11854572-B2
Application number · US-202117302981-A
Country · US
Kind code · B2
Filing date · May 18, 2021
Priority date · May 18, 2021
Publication date · Dec 26, 2023
Grant date · Dec 26, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Computer-implemented methods, computer program products, and computer systems for mitigating frequency loss may include one or more processors configured for receiving first audio data corresponding to unobstructed user utterances, receiving second audio data corresponding to first obstructed user utterances, generating a frequency loss (FL) model representing frequency loss between the first audio data and the second audio data, receiving third audio data corresponding to one or more second obstructed user utterances, processing the third audio data using the FL model to generate fourth audio data corresponding to a frequency loss mitigated version of the second obstructed user utterances, and transmitting the fourth audio data to a recipient computing device. The first obstructed user utterances are obstructed by a facemask and the one or more second obstructed user utterances is obstructed by the facemask. The FL model may be executed as an audio plugin in a web conferencing program.
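
As a rough illustration of the flow the abstract describes, the Python sketch below estimates a per-frequency gain from a pair of unobstructed and obstructed recordings, then applies it to new obstructed audio. It is not the patented implementation. The function names (`average_spectrum`, `fit_fl_model`, `mitigate`), the frame length, the gain limit, and the synthetic test tones are all assumptions made for this example, and the "model" here is a plain spectral ratio.

```python
import numpy as np

N_FFT = 512  # assumed frame length; the patent text does not specify one


def average_spectrum(audio, n_fft=N_FFT):
    """Mean magnitude spectrum over non-overlapping, Hann-windowed frames."""
    n_frames = len(audio) // n_fft
    frames = audio[: n_frames * n_fft].reshape(n_frames, n_fft)
    return np.abs(np.fft.rfft(frames * np.hanning(n_fft), axis=1)).mean(axis=0)


def fit_fl_model(unobstructed, obstructed, n_fft=N_FFT, max_gain=10.0, eps=1e-8):
    """Per-bin linear gain that moves the obstructed spectrum toward the clear one.

    Assumption: frequency loss is modelled as a simple magnitude ratio.
    """
    clear = average_spectrum(unobstructed, n_fft)
    masked = average_spectrum(obstructed, n_fft)
    gain = (clear + eps) / (masked + eps)  # ~1.0 where both spectra are empty
    return np.clip(gain, 1.0 / max_gain, max_gain)  # avoid extreme boosts


def mitigate(audio, gain, n_fft=N_FFT):
    """Apply the gains frame by frame (no overlap-add, to keep the sketch short)."""
    n_frames = len(audio) // n_fft
    frames = audio[: n_frames * n_fft].reshape(n_frames, n_fft)
    spectra = np.fft.rfft(frames, axis=1) * gain
    return np.fft.irfft(spectra, n=n_fft, axis=1).reshape(-1)


# Synthetic stand-ins for the four audio streams named in the abstract.
rate = 16000
t = np.arange(rate) / rate
low = np.sin(2 * np.pi * 500 * t)    # component assumed unaffected by the mask
high = np.sin(2 * np.pi * 4000 * t)  # component assumed attenuated by the mask

first_audio = low + high             # unobstructed utterance
second_audio = low + 0.25 * high     # first obstructed utterance
third_audio = low + 0.25 * high      # later obstructed utterance

fl_model = fit_fl_model(first_audio, second_audio)
fourth_audio = mitigate(third_audio, fl_model)  # frequency-loss-mitigated output
```

A production version would more likely use overlapping windows with overlap-add and smooth the gains across neighbouring bins; the non-overlapping frames here are chosen only to keep the example short.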

First claim

Full claim text (claims 1 to 16).

What is claimed is:

1. A computer-implemented method comprising: receiving, by one or more processors, first audio data corresponding to one or more unobstructed utterances of a user, wherein the received first audio data is associated with a received age of the user; receiving, by one or more processors, second audio data corresponding to one or more first obstructed utterances of the user; generating and storing, by one or more processors, a frequency loss model (FLM) for the user representing frequency loss between the first audio data and the second audio data at the received age of the user; adjusting, by one or more processors, the stored FLM for the user based on changes to the user's age and stored measurements for a given age; receiving, by one or more processors, third audio data corresponding to one or more second obstructed user utterances; processing, by one or more processors, the third audio data using the tuned FLM to generate fourth audio data corresponding to a frequency loss mitigated version of the one or more second obstructed user utterances; and transmitting, by one or more processors, the fourth audio data to a recipient computing device.

2. The computer-implemented method of claim 1, wherein the first audio data and the second audio data are captured via a microphone of a computing device.

3. The computer-implemented method of claim 1, wherein generating the FLM further comprises: converting, by one or more processors, the first audio data and the second audio data to frequency domains; determining, by one or more processors, frequency deltas for one or more of a range of frequencies in the frequency domains; determining, by one or more processors, attenuation values for one or more of the range of frequencies for the first audio data and the second audio data; and mapping, by one or more processors, the frequency deltas and the attenuation values in the frequency domains for the first audio data and the second audio data in a matrix representing the FLM.

4. The computer-implemented method of claim 3, further comprising: generating, by one or more processors, a graphical display of the FLM on a user interface of a computing device, wherein the graphical display comprises visualizations of time domain images and frequency domain images of: the first audio data, the second audio data, the attenuation values, and the frequency deltas.

5. The computer-implemented method of claim 1, wherein the recipient computing device is a wearable device configured to process and reproduce the fourth audio data as an audio signal via a speaker of the wearable device.

6. The computer-implemented method of claim 1, wherein the FLM is executed as an audio plugin in a web conferencing program.

7. The computer-implemented method of claim 1, wherein generating the fourth audio data utilizes an adjustment selected from the group consisting of: bell-shaped equalizer response and Brickwall.

8. A computer program product comprising: one or more computer readable storage media and program instructions collectively stored on the one or more computer readable storage media, the stored program instructions comprising: program instructions to receive first audio data corresponding to one or more unobstructed utterances of a user, wherein the received first audio data is associated with a received age of the user; program instructions to receive second audio data corresponding to one or more first obstructed utterances of the user; program instructions to generate and store a frequency loss model (FLM) for the user representing frequency loss between the first audio data and the second audio data at the received age of the user; program instructions to adjust the stored FLM for the user based on changes to the user's age and stored measurements for a given age; program instructions to receive third audio data corresponding to one or more second obstructed user utterances; program instructions to process the third audio data using the tuned FLM to generate fourth audio data corresponding to a frequency loss mitigated version of the one or more second obstructed user utterances; and program instructions to transmit the fourth audio data to a recipient computing device.

9. The computer program product of claim 8, wherein the first audio data and the second audio data are captured via a microphone of a computing device.

10. The computer program product of claim 8, wherein generating the FLM further comprises: program instructions to convert the first audio data and the second audio data to frequency domains; program instructions to determine frequency deltas for each of a range of frequencies in the frequency domains; program instructions to determine attenuation values for each of the range of frequencies for the first audio data and the second audio data; and program instructions to map the frequency deltas and the attenuation values in the frequency domains for the first audio data and the second audio data in a matrix representing the FLM.

11. The computer program product of claim 10, further comprising: program instructions to generate a graphical display of the FLM on a user interface of a computing device, wherein the graphical display comprises visualizations of time domain images and frequency domain images of: the first audio data, the second audio data, the attenuation values, and the frequency deltas.

12. The computer program product of claim 8, wherein the recipient computing device is a wearable device configured to process and reproduce the fourth audio data as an audio signal via a speaker of the wearable device.

13. The computer program product of claim 8, wherein the one or more first obstructed user utterances is obstructed by a facemask and the one or more second obstructed user utterances is obstructed by the facemask.

14. The computer program product of claim 8, wherein the FLM is executed as an audio plugin in a web conferencing program.

15. A computer system comprising: one or more computer processors; one or more computer readable storage media; program instructions collectively stored on the one or more computer readable storage media for execution by at least one of the one or more computer processors, the stored program instructions comprising: program instructions to receive first audio data corresponding to one or more unobstructed utterances of a user, wherein the received first audio data is associated with a received age of the user; program instructions to receive second audio data corresponding to one or more first obstructed utterances of the user; program instructions to generate and store a frequency loss model (FLM) for the user representing frequency loss between the first audio data and the second audio data at the received age of the user; program instructions to adjust the stored FLM for the user based on changes to the user's age and stored measurements for a given age; program instructions to receive third audio data corresponding to one or more second obstructed user utterances; program instructions to process the third audio data using the tuned FLM to generate fourth audio data corresponding to a frequency loss mitigated version of the one or more second obstructed user utterances; and program instructions to transmit the fourth audio data to a recipient computing device.

16. The computer system of claim 15, wherein the first audio data and the second audio data are captured via a microphone of a computing device, and the FL model is executed as an audio plugin in a web conferencing program.
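
The matrix described in claims 3 and 10, and the bell-shaped adjustment named in claim 7, can be sketched in Python as follows. This is one possible reading, not the patent's definition: the claims do not say how "frequency deltas" and "attenuation values" are computed, so the sketch assumes a delta is the difference in band RMS magnitude and an attenuation value is the same comparison in decibels. The names `flm_matrix` and `bell_gain_db`, the band edges, the Gaussian bell shape, and the bell width are likewise assumptions.

```python
import numpy as np


def flm_matrix(first, second, sample_rate, band_edges_hz, eps=1e-12):
    """One row per band: [low_hz, high_hz, delta, attenuation_db].

    Assumes equal-length recordings and at least one FFT bin in every band.
    """
    if len(first) != len(second):
        raise ValueError("recordings must have the same length")
    freqs = np.fft.rfftfreq(len(first), d=1.0 / sample_rate)
    mag_first = np.abs(np.fft.rfft(first))    # unobstructed, frequency domain
    mag_second = np.abs(np.fft.rfft(second))  # obstructed, frequency domain
    rows = []
    for lo, hi in zip(band_edges_hz[:-1], band_edges_hz[1:]):
        in_band = (freqs >= lo) & (freqs < hi)
        rms_first = np.sqrt(np.mean(mag_first[in_band] ** 2))
        rms_second = np.sqrt(np.mean(mag_second[in_band] ** 2))
        delta = rms_first - rms_second  # assumed meaning of "frequency delta"
        attenuation_db = 20.0 * np.log10((rms_first + eps) / (rms_second + eps))
        rows.append([lo, hi, delta, attenuation_db])
    return np.array(rows)


def bell_gain_db(freqs_hz, center_hz, peak_db, width_hz):
    """Gaussian-shaped boost in dB, standing in for a bell equalizer response."""
    return peak_db * np.exp(-0.5 * ((freqs_hz - center_hz) / width_hz) ** 2)


# Synthetic recordings: the 4 kHz component is attenuated in the obstructed take.
rate = 16000
t = np.arange(rate) / rate
low = np.sin(2 * np.pi * 500 * t)
high = np.sin(2 * np.pi * 4000 * t)
first_audio = low + high
second_audio = low + 0.25 * high

matrix = flm_matrix(first_audio, second_audio, rate, [0, 2000, 8000])

# Build a bell-shaped boost for the upper band from its measured attenuation.
low_hz, high_hz, _, loss_db = matrix[1]
freqs = np.fft.rfftfreq(rate, d=1.0 / rate)
boost_db = bell_gain_db(freqs, (low_hz + high_hz) / 2, loss_db, width_hz=1000.0)
```

In this toy case the lower band shows essentially no loss and the upper band shows the full loss of the attenuated tone, so only the upper band receives a boost. Centering the bell on the band midpoint is a simplification; a real equalizer would place it where the measured loss is greatest.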

Assignees

IBM

Inventors

Classifications

  • G10L25/18 · Primary

    the extracted parameters being spectral information of each sub-band · CPC title

  • using distance or distortion measures between unknown speech and reference templates · CPC title

  • using statistical models, e.g. Hidden Markov Models [HMMs] (G10L15/18 takes precedence) · CPC title

  • Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing · CPC title

  • using subband decomposition · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11854572B2 cover?
Computer-implemented methods, computer program products, and computer systems for mitigating frequency loss may include one or more processors configured for receiving first audio data corresponding to unobstructed user utterances, receiving second audio data corresponding to first obstructed user utterances, generating a frequency loss (FL) model representing frequency loss between the first a…
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G10L25/18. Mapped technology areas include Physics.
When was this patent published?
Publication date Dec 26, 2023 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 7 related publications on this page (citations in our corpus or others sharing the same primary CPC).