What technology area does this patent fall under?

Primary CPC classification G10L21/02. Mapped technology areas include Physics.

When was this patent published?

Publication date Jan 24, 2023 (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 6 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Methods and apparatus for enhancing musical sound during a networked conference

US11562761B2 · US · B2

Patent metadata
Field	Value
Publication number	US-11562761-B2
Application number	US-202016945364-A
Country	US
Kind code	B2
Filing date	Jul 31, 2020
Priority date	Jul 31, 2020
Publication date	Jan 24, 2023
Grant date	Jan 24, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Dynamic adjustment of audio characteristics for enhancing musical sound during a networked conference is disclosed. In an embodiment, a method is provided for sound enhancement performed by a device coupled to a network. The method includes receiving an audio signal to be transmitted over the network, detecting when musical content is present in the audio signal, processing the audio signal to enhance voice characteristics to generate an enhanced audio signal when the musical content is not detected, processing the audio signal to enhance music characteristic to generate the enhanced audio signal when the musical content is detected, and transmitting the enhanced audio signal over the network.
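The flow in the abstract (detect music, then pick voice or music processing before transmission) can be sketched as follows. This is a minimal illustration, not the patented implementation: the `Profile` fields, the threshold values, and the choice of DC removal, a noise gate, and peak gain control as the processing steps are all assumptions made for the example. The music detector itself is not shown and is represented by the `music_detected` argument.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class Profile:
    """Hypothetical processing parameters for one content type."""
    name: str
    noise_gate: float   # samples quieter than this are zeroed (assumed step)
    target_peak: float  # peak level after gain control (assumed step)


# Assumed values: voice gets a noise gate, music is left ungated.
VOICE = Profile("voice", noise_gate=0.05, target_peak=0.9)
MUSIC = Profile("music", noise_gate=0.0, target_peak=0.9)


def remove_dc(frame: Sequence[float]) -> List[float]:
    """Subtract the frame mean so the signal is centred on zero."""
    mean = sum(frame) / len(frame)
    return [s - mean for s in frame]


def gate(frame: Sequence[float], threshold: float) -> List[float]:
    """Zero out samples whose magnitude is below the threshold."""
    return [s if abs(s) >= threshold else 0.0 for s in frame]


def apply_gain(frame: Sequence[float], target_peak: float) -> List[float]:
    """Scale the frame so its largest magnitude equals target_peak."""
    peak = max(abs(s) for s in frame)
    if peak == 0:
        return list(frame)
    scale = target_peak / peak
    return [s * scale for s in frame]


def enhance(frame: Sequence[float], music_detected: bool) -> Tuple[str, List[float]]:
    """Select a profile from the detector result and process the frame."""
    if not frame:
        return ("music" if music_detected else "voice"), []
    profile = MUSIC if music_detected else VOICE
    out = remove_dc(frame)
    out = gate(out, profile.noise_gate)
    out = apply_gain(out, profile.target_peak)
    return profile.name, out
```

Calling `enhance(frame, music_detected=True)` returns the profile name and the processed samples; in a real client the result would then be encoded and sent over the network.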

First claim

Opening claim text (preview).

What is claimed is:

1. A method for sound enhancement performed by a device coupled to a network, the method comprising: receiving an audio signal associated with a current virtual online meeting to be transmitted over the network; detecting whether voice content is present in a first portion of the audio signal; in response to detecting voice content present in the first portion of the audio signal, setting a state flag as representing a first state, the state flag corresponding to a hysteresis wait time interval; initiating an instance of the hysteresis wait time interval responsive to setting the state flag to the first state; upon expiration of the instance of the hysteresis wait time interval that corresponds to the set first state, processing the first portion of the audio signal to enhance voice characteristics of the first portion of audio signal by generating a voice enhanced audio signal; detecting whether musical content is present in a second portion of the audio signal, by: (i) processing the second portion of the audio signal and one or more historical audio segments; (ii) extracting input audio features from the second portion of the audio signal and the one or more historical audio segments, the input audio features corresponding to a neural network; (iii) generating a probability indicator, via feeding the input audio features (“audio features”) into the neural network, that indicates a probability that the second portion of the audio signal includes presence of musical content; in response to detecting musical content present in the second portion of the audio signal, setting the state flag as representing a second state; initiating an instance of the hysteresis wait time interval responsive to setting the state flag to the second state; upon expiration of the instance of the hysteresis wait time interval that corresponds to the set second state, enhancing one or more music characteristics of the second portion of the audio signal by generating a music enhanced audio signal; and transmitting the voice enhanced audio signal and the music enhanced audio signal to the current virtual online meeting over the network at respective different moments during the current virtual online meeting.

2. The method of claim 1, wherein the operation of receiving comprises receiving the audio signal from a microphone.

3. The method of claim 1, wherein the operation of processing the audio signal to enhance the music characteristics comprises retrieving music parameters that identify processing for the audio signal.

4. The method of claim 3, wherein the operation of processing the audio signal to enhance the music characteristics comprises performing at least one of DC removal, noise suppression, echo cancellation, gain control, and encoding on the audio signal based on the music parameters.

5. The method of claim 1, wherein the operation of processing the audio signal to enhance the voice characteristics comprises: retrieving voice parameters; and performing at least one of DC removal, noise suppression, echo cancellation, gain control, and encoding on the audio signal based on the voice parameters.

6. Apparatus for sound enhancement, the apparatus comprising: a detector that: (i) receives an audio signal associated with a current virtual online meeting to be transmitted over the network; (ii) detects whether voice content is present in a first portion of the audio signal; (iii) sets a state flag as representing a first state upon detection of the voice content, the state flag corresponding to a hysteresis wait time interval; (iv) initiates an instance of the hysteresis wait time interval responsive to setting the state flag to the first state; (v) detects whether musical content is present in a second portion of the audio signal by: (a) processing the second portion of the audio signal and one or more historical audio segments captured prior to initiation of the current virtual online meeting; (b) extracting input audio features from the second portion of the audio signal and the one or more historical audio segments, the input audio features corresponding to a neural network; (c) generating a probability indicator, via feeding the input audio features (“audio features”) into the neural network, that indicates a probability that the second portion of the audio signal includes presence of musical content; (vi) sets the state flag as representing a second state upon detection of the music content, the state flag corresponding to a hysteresis wait time interval; and (vii) initiates an instance of the hysteresis wait time interval responsive to setting the state flag to the second state; a processor that: (i) upon expiration of the instance of the hysteresis wait time interval that corresponds to the set first state, processes the first portion of the audio signal, in response to the detector detecting voice content present in the first portion of the audio signal, to enhance voice characteristics of the first portion of audio signal by generating a voice enhanced audio signal; and (ii) upon expiration of the instance of the hysteresis wait time interval that corresponds to the set first state, enhances one or more music characteristics of the second portion of the audio signal, in response to the detector detecting musical content present in the second portion of the audio signal, by generating a music enhanced audio signal; and a transmitter that transmits the voice enhanced audio signal and the music enhanced audio signal to the current virtual online meeting over the network at respective different moments during the current virtual online meeting.

7. The apparatus of claim 6, wherein the detector receives the audio signal from a microphone.

8. The apparatus of claim 6, wherein the processor processes the audio signal to enhance the music characteristics by: performing at least one of DC removal, noise suppression, echo cancellation, and gain control on the audio signal based on music parameters; and performing audio encoding based on the music parameters.

9. The apparatus of claim 6, wherein the processor processes the audio signal to enhance the voice characteristics by: performing at least one of DC removal, noise suppression, echo cancellation, and gain control on the audio signal based on voice parameters; and performing audio encoding based on the voice parameters.

10. A non-transitory computer readable medium on which are stored program instructions that, when executed by a processor, cause the processor to perform operations of: receiving an audio signal associated with a current virtual online meeting to be transmitted over the network; detecting whether voice content is present in a first portion of the audio signal; in response to detecting voice content present in the first portion of the audio signal, setting a state flag as representing a first state, the state flag corresponding to a hysteresis wait time interval; initiating an instance of the hysteresis wait time interval responsive to setting the state flag to the first state; upon expiration of the instance of the hysteresis wait time interval that corresponds to the set first state, processing the first portion of the audio signal to enhance voice characteristics of the first portion of audio signal by generating a voice enhanced audio signal; detecting whether musical content is present in a second portion of the audio signal, by: (i) processing the second portion of the audio signal and one or more historical audio segments; (ii) extracting input audio features from the second portion of the audio signal and the one or more historical audio segments, the input audio features correspo
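The state flag and hysteresis wait interval recited in the claims can be pictured with a small state machine. The sketch below is one possible reading, offered only as an illustration: the class name `HysteresisSwitch`, the 0.5 probability threshold, the 2-second wait, and the rule that a changed detection restarts the wait are all assumptions. The neural network is not modeled; its probability indicator is passed in as `music_probability`.

```python
class HysteresisSwitch:
    """Switch the active enhancement mode only after a detection persists.

    `flag` tracks the most recently detected content type; `active` is the
    mode actually used for enhancement. All defaults are hypothetical.
    """

    def __init__(self, wait_s: float = 2.0, threshold: float = 0.5,
                 initial: str = "voice") -> None:
        self.wait_s = wait_s
        self.threshold = threshold
        self.flag = initial
        self.flag_set_at = None  # start time of the current wait interval
        self.active = initial

    def update(self, music_probability: float, now: float) -> str:
        """Feed one probability indicator and return the active mode."""
        detected = "music" if music_probability >= self.threshold else "voice"
        if detected != self.flag:
            # Set the state flag and start a new wait interval.
            self.flag = detected
            self.flag_set_at = now
        if self.flag != self.active and now - self.flag_set_at >= self.wait_s:
            # The wait interval expired while the flag was unchanged.
            self.active = self.flag
        return self.active


switch = HysteresisSwitch(wait_s=2.0)
timeline = [(0.1, 0.0), (0.9, 1.0), (0.9, 2.0), (0.9, 3.0),
            (0.2, 3.5), (0.9, 4.0), (0.1, 5.0), (0.1, 7.0)]
modes = [switch.update(p, t) for p, t in timeline]
```

In this run the mode changes to music only at t = 3.0, two seconds after music is first detected, and the brief dip at t = 3.5 does not switch it back. The wait interval keeps the enhancement from flapping between voice and music settings on short-lived detections.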

Assignees

Zoom Video Communications Inc

Inventors

Classifications

G10L25/81
for discriminating voice from music · CPC title
G10L25/51
for comparison or discrimination · CPC title
G10L21/02 · Primary
Speech enhancement, e.g. noise reduction or echo cancellation (reducing echo effects in line transmission systems H04B3/20; echo suppression in hands-free telephones H04M9/08) · CPC title
G10L25/30
using neural networks · CPC title
H04M3/568
audio processing specific to telephonic conferencing, e.g. spatial distribution, mixing of participants (echo suppression in two-way loud-speaking telephone systems H04M9/02; sound field processing per se H04S7/30) · CPC title

Patent family

Related publications grouped by family.

View patent family 80003420

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11562761B2 cover?: Dynamic adjustment of audio characteristics for enhancing musical sound during a networked conference is disclosed. In an embodiment, a method is provided for sound enhancement performed by a device coupled to a network. The method includes receiving an audio signal to be transmitted over the network, detecting when musical content is present in the audio signal, processing the audio signal to …
Who is the assignee on this patent?: Zoom Video Communications Inc


Related patents

Method for generating action according to audio signal and electronic device

Audio Device with Speech-Based Audio Signal Processing

Systems and methods for classifying sounds

Multichannel noise cancellation using deep neural network masking

Generating a probability of music using machine learning technology

Adaptive processing of sound data
