What technology area does this patent fall under?

Primary CPC classification G10L25/18. Mapped technology areas include Physics.

When was this patent published?

Publication date: Dec 26, 2023 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 7 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Mitigating voice frequency loss

US11854572B2 · US · B2

Patent metadata
Field	Value
Publication number	US-11854572-B2
Application number	US-202117302981-A
Country	US
Kind code	B2
Filing date	May 18, 2021
Priority date	May 18, 2021
Publication date	Dec 26, 2023
Grant date	Dec 26, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Computer-implemented methods, computer program products, and computer systems for mitigating frequency loss may include one or more processors configured for receiving first audio data corresponding to unobstructed user utterances, receiving second audio data corresponding to first obstructed user utterances, generating a frequency loss (FL) model representing frequency loss between the first audio data and the second audio data, receiving third audio data corresponding to one or more second obstructed user utterances, processing the third audio data using the FL model to generate fourth audio data corresponding to a frequency loss mitigated version of the second obstructed user utterances, and transmitting the fourth audio data to a recipient computing device. The first obstructed user utterances are obstructed by a facemask and the one or more second obstructed user utterances is obstructed by the facemask. The FL model may be executed as an audio plugin in a web conferencing program.
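
The flow in the abstract (compare unobstructed and obstructed speech, derive a frequency loss model, then use it to compensate later obstructed speech) can be illustrated with a minimal sketch. This is not the patented implementation. It assumes the FL model is a list of per-bin magnitude ratios, that every signal is a single frame of equal length, and that a naive DFT is accurate enough for a short demo. The names `dft`, `idft`, `fit_fl_model`, `mitigate`, `floor`, and `max_gain` are hypothetical, and the test tones are synthetic rather than recorded speech.

```python
import cmath
import math


def dft(signal):
    """Naive O(n^2) discrete Fourier transform; adequate for short demo frames."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Inverse of dft(); keeps only the real part of each output sample."""
    n = len(spectrum)
    return [
        (sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n).real
        for t in range(n)
    ]


def fit_fl_model(unobstructed, obstructed, floor=1e-6, max_gain=20.0):
    """Assumed model: per-bin gain that maps obstructed magnitude back to unobstructed."""
    clear = [abs(c) for c in dft(unobstructed)]
    muffled = [abs(c) for c in dft(obstructed)]
    gains = []
    for c, m in zip(clear, muffled):
        if c < floor or m < floor:
            gains.append(1.0)  # no usable energy in this bin: leave it untouched
        else:
            gains.append(min(c / m, max_gain))  # cap the boost to avoid amplifying noise
    return gains


def mitigate(obstructed, gains):
    """Scale each frequency bin of a new obstructed frame by the stored gain."""
    spectrum = dft(obstructed)
    return idft([g * c for g, c in zip(gains, spectrum)])


# Synthetic demo: the "mask" leaves the low tone alone and cuts the high tone to 1/4.
n = 64


def tone(k, amp, phase=0.0):
    return [amp * math.sin(2 * math.pi * k * t / n + phase) for t in range(n)]


def mix(a, b):
    return [x + y for x, y in zip(a, b)]


unobstructed = mix(tone(2, 1.0), tone(10, 0.5))    # "first audio data"
obstructed = mix(tone(2, 1.0), tone(10, 0.125))    # "second audio data"
model = fit_fl_model(unobstructed, obstructed)

new_obstructed = mix(tone(2, 0.8, 0.3), tone(10, 0.1, 1.1))  # "third audio data"
restored = mitigate(new_obstructed, model)                    # "fourth audio data"
expected = mix(tone(2, 0.8, 0.3), tone(10, 0.4, 1.1))
```

A real system would more likely work on overlapping windowed frames with an FFT and smooth the gains across neighboring bins; the single-frame version is used here only to keep the example short.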

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method comprising: receiving, by one or more processors, first audio data corresponding to one or more unobstructed utterances of a user, wherein the received first audio data is associated with a received age of the user; receiving, by one or more processors, second audio data corresponding to one or more first obstructed utterances of the user; generating and storing, by one or more processors, a frequency loss model (FLM) for the user representing frequency loss between the first audio data and the second audio data at the received age of the user; adjusting, by one or more processors, the stored FLM for the user based on changes to the user's age and stored measurements for a given age; receiving, by one or more processors, third audio data corresponding to one or more second obstructed user utterances; processing, by one or more processors, the third audio data using the tuned FLM to generate fourth audio data corresponding to a frequency loss mitigated version of the one or more second obstructed user utterances; and transmitting, by one or more processors, the fourth audio data to a recipient computing device.

2. The computer-implemented method of claim 1, wherein the first audio data and the second audio data are captured via a microphone of a computing device.

3. The computer-implemented method of claim 1, wherein generating the FLM further comprises: converting, by one or more processors, the first audio data and the second audio data to frequency domains; determining, by one or more processors, frequency deltas for one or more of a range of frequencies in the frequency domains; determining, by one or more processors, attenuation values for one or more of the range of frequencies for the first audio data and the second audio data; and mapping, by one or more processors, the frequency deltas and the attenuation values in the frequency domains for the first audio data and the second audio data in a matrix representing the FLM.

4. The computer-implemented method of claim 3, further comprising: generating, by one or more processors, a graphical display of the FLM on a user interface of a computing device, wherein the graphical display comprises visualizations of time domain images and frequency domain images of: the first audio data, the second audio data, the attenuation values, and the frequency deltas.

5. The computer-implemented method of claim 1, wherein the recipient computing device is a wearable device configured to process and reproduce the fourth audio data as an audio signal via a speaker of the wearable device.

6. The computer-implemented method of claim 1, wherein the FLM is executed as an audio plugin in a web conferencing program.

7. The computer-implemented method of claim 1, wherein generating the fourth audio data utilizes an adjustment selected from the group consisting of: bell-shaped equalizer response and Brickwall.

8. A computer program product comprising: one or more computer readable storage media and program instructions collectively stored on the one or more computer readable storage media, the stored program instructions comprising: program instructions to receive first audio data corresponding to one or more unobstructed utterances of a user, wherein the received first audio data is associated with a received age of the user; program instructions to receive second audio data corresponding to one or more first obstructed utterances of the user; program instructions to generate and store a frequency loss model (FLM) for the user representing frequency loss between the first audio data and the second audio data at the received age of the user; program instructions to adjust the stored FLM for the user based on changes to the user's age and stored measurements for a given age; program instructions to receive third audio data corresponding to one or more second obstructed user utterances; program instructions to process the third audio data using the tuned FLM to generate fourth audio data corresponding to a frequency loss mitigated version of the one or more second obstructed user utterances; and program instructions to transmit the fourth audio data to a recipient computing device.

9. The computer program product of claim 8, wherein the first audio data and the second audio data are captured via a microphone of a computing device.

10. The computer program product of claim 8, wherein generating the FLM further comprises: program instructions to convert the first audio data and the second audio data to frequency domains; program instructions to determine frequency deltas for each of a range of frequencies in the frequency domains; program instructions to determine attenuation values for each of the range of frequencies for the first audio data and the second audio data; and program instructions to map the frequency deltas and the attenuation values in the frequency domains for the first audio data and the second audio data in a matrix representing the FLM.

11. The computer program product of claim 10, further comprising: program instructions to generate a graphical display of the FLM on a user interface of a computing device, wherein the graphical display comprises visualizations of time domain images and frequency domain images of: the first audio data, the second audio data, the attenuation values, and the frequency deltas.

12. The computer program product of claim 8, wherein the recipient computing device is a wearable device configured to process and reproduce the fourth audio data as an audio signal via a speaker of the wearable device.

13. The computer program product of claim 8, wherein the one or more first obstructed user utterances is obstructed by a facemask and the one or more second obstructed user utterances is obstructed by the facemask.

14. The computer program product of claim 8, wherein the FLM is executed as an audio plugin in a web conferencing program.

15. A computer system comprising: one or more computer processors; one or more computer readable storage media; program instructions collectively stored on the one or more computer readable storage media for execution by at least one of the one or more computer processors, the stored program instructions comprising: program instructions to receive first audio data corresponding to one or more unobstructed utterances of a user, wherein the received first audio data is associated with a received age of the user; program instructions to receive second audio data corresponding to one or more first obstructed utterances of the user; program instructions to generate and store a frequency loss model (FLM) for the user representing frequency loss between the first audio data and the second audio data at the received age of the user; program instructions to adjust the stored FLM for the user based on changes to the user's age and stored measurements for a given age; program instructions to receive third audio data corresponding to one or more second obstructed user utterances; program instructions to process the third audio data using the tuned FLM to generate fourth audio data corresponding to a frequency loss mitigated version of the one or more second obstructed user utterances; and program instructions to transmit the fourth audio data to a recipient computing device.

16. The computer system of claim 15, wherein the first audio data and the second audio data are captured via a microphone of a computing device, and the FL model is executed as an audio plugin in a web conferencing program.
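
Claims 3 and 7 above mention a matrix of frequency deltas and attenuation values, and two adjustment shapes (a bell-shaped equalizer response and Brickwall). The claims do not define these terms numerically, so the sketch below is only one possible reading. It assumes a "frequency delta" is the difference in linear magnitude at a given frequency, an "attenuation value" is the same comparison expressed in decibels, a bell is a Gaussian curve on a log-frequency axis, and a brickwall is a hard step at a cutoff. All function names, parameters, and magnitude values are hypothetical; the numbers are invented for illustration and are not measurements.

```python
import math


def build_flm_matrix(freqs_hz, clear_mag, obstructed_mag):
    """Return rows of (frequency_hz, delta, attenuation_db).

    Assumes every magnitude is strictly positive so the dB ratio is defined.
    """
    rows = []
    for f, c, m in zip(freqs_hz, clear_mag, obstructed_mag):
        delta = c - m                              # linear magnitude lost at this frequency
        attenuation_db = 20.0 * math.log10(m / c)  # negative when the obstruction attenuates
        rows.append((f, delta, attenuation_db))
    return rows


def bell_gain_db(f_hz, center_hz, peak_db, width_octaves):
    """Assumed bell shape: Gaussian in log2(frequency), peaking at center_hz."""
    octaves_from_center = math.log2(f_hz / center_hz)
    return peak_db * math.exp(-0.5 * (octaves_from_center / width_octaves) ** 2)


def brickwall_gain_db(f_hz, cutoff_hz, boost_db):
    """Assumed brickwall shape: full boost at or above the cutoff, none below."""
    return boost_db if f_hz >= cutoff_hz else 0.0


# Invented example magnitudes (not measured data): loss grows with frequency.
freqs = [500.0, 1000.0, 2000.0, 4000.0, 8000.0]
clear = [1.0, 1.0, 1.0, 1.0, 1.0]
obstructed = [1.0, 0.9, 0.7, 0.5, 0.4]

matrix = build_flm_matrix(freqs, clear, obstructed)

# Pick the most attenuated row and compare the two compensation shapes around it.
worst = min(matrix, key=lambda row: row[2])
bell_curve = [bell_gain_db(f, worst[0], -worst[2], 1.0) for f in freqs]
wall_curve = [brickwall_gain_db(f, 2000.0, 6.0) for f in freqs]
```

The bell concentrates the boost around one frequency and tapers off smoothly, while the step applies the same boost to everything at or above the cutoff; which is preferable would depend on how the measured loss is distributed.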

Assignees

Inventors

Classifications

G10L25/18 · Primary
the extracted parameters being spectral information of each sub-band · CPC title
G10L15/10
using distance or distortion measures between unknown speech and reference templates · CPC title
G10L15/14
using statistical models, e.g. Hidden Markov Models [HMMs] (G10L15/18 takes precedence) · CPC title
G10L19/008
Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing · CPC title
G10L19/0204
using subband decomposition · CPC title

Patent family

Related publications grouped by family.

Patent family ID: 84103017

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11854572B2 cover?: Computer-implemented methods, computer program products, and computer systems for mitigating frequency loss may include one or more processors configured for receiving first audio data corresponding to unobstructed user utterances, receiving second audio data corresponding to first obstructed user utterances, generating a frequency loss (FL) model representing frequency loss between the first a…
Who is the assignee on this patent?: IBM

Related patents

Systems and methods for generating labeled data to facilitate configuration of network microphone devices

Respirator acoustic amelioration

Method and system for improving quality of degraded speech

Detection of loudspeaker playback

System and/or method for enhancing hearing using a camera module, processor and/or audio input and/or output devices

Mask assembly

Voice conversion method and system
