Audio-focus for ambient noise cancellation

US2024428818A1 · US · A1

Patent metadata
Field: Value
Publication number: US-2024428818-A1
Application number: US-202418751015-A
Country: US
Kind code: A1
Filing date: Jun 21, 2024
Priority date: Jun 23, 2023
Publication date: Dec 26, 2024
Grant date: (none listed)

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method including identifying an audio capture device and a target direction associated with the audio capture device, detecting first audio associated with the target direction, enhancing the first audio using a machine learning model configured to detect audio associated with the target direction, optionally, detecting second audio associated with a direction different from the target direction, and optionally, diminishing the second audio using the machine learning model.
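To make the enhance/diminish steps of the abstract concrete, here is a minimal sketch. It is not the patented method: the patent relies on a machine learning model, whereas this toy swaps in fixed gains and assumes each sound source's direction is already known. The names `Source`, `focus_mix`, `half_width_deg`, `boost`, and `cut` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Source:
    angle_deg: float      # assumed-known direction of the source relative to the device
    samples: List[float]  # mono audio samples for this source


def angular_distance(a: float, b: float) -> float:
    """Smallest absolute difference between two angles, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def focus_mix(sources: List[Source], target_deg: float,
              half_width_deg: float = 30.0,
              boost: float = 2.0, cut: float = 0.25) -> List[float]:
    """Mix sources, boosting those near target_deg and attenuating the rest.

    Fixed gains stand in for the machine learning model described in the
    abstract; this is an illustrative simplification only.
    """
    length = max((len(s.samples) for s in sources), default=0)
    out = [0.0] * length
    for s in sources:
        in_focus = angular_distance(s.angle_deg, target_deg) <= half_width_deg
        gain = boost if in_focus else cut  # enhance "first audio", diminish "second audio"
        for i, x in enumerate(s.samples):
            out[i] += gain * x
    return out


if __name__ == "__main__":
    talker = Source(angle_deg=10.0, samples=[1.0, 1.0])
    noise = Source(angle_deg=180.0, samples=[1.0, 0.0, 4.0])
    print(focus_mix([talker, noise], target_deg=0.0))
```

Setting `cut` to `0.0` would correspond to eliminating the off-target audio rather than merely attenuating it.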

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: identifying an audio capture device and a target direction associated with the audio capture device; detecting first audio associated with the target direction; and enhancing the first audio using a first machine learning model configured to detect audio associated with the target direction.

2. The method of claim 1, further comprising: detecting second audio associated with a direction different from the target direction; and diminishing the second audio using a second machine learning model.

3. The method of claim 2, wherein diminishing the second audio includes decreasing an amplitude of at least one sound wave associated with the second audio.

4. The method of claim 2, wherein diminishing the second audio includes attenuating the second audio by reducing a signal strength associated with the second audio.

5. The method of claim 2, wherein diminishing the second audio includes eliminating the second audio by removing the second audio from an output of the second machine learning model.

6. The method of claim 2, wherein the first machine learning model and the second machine learning model are a same machine learning model.

7. The method of claim 1, wherein enhancing the first audio includes increasing an amplitude of at least one sound wave associated with the first audio.

8. The method of claim 1, wherein enhancing the first audio includes de-reverbing the first audio by removing resonant frequencies from the first audio.

9. The method of claim 1, wherein enhancing the first audio includes de-noising the first audio by filtering the first audio.

10. The method of claim 1, wherein the target direction is associated with a focus region.

11. The method of claim 1, wherein the first machine learning model is trained to detect the audio associated with the target direction using an impulse response dataset.

12. The method of claim 1, wherein the enhancing of the first audio using the first machine learning model includes: compressing the first audio using a first machine learning model; and decompressing the compressed audio using a second machine learning model.

13. The method of claim 1, wherein the first machine learning model is a neural network model, the neural network model is configured to detect the audio associated with the target direction by training the neural network model, and training the neural network model includes: receiving first training data including at least one first audio signal; receiving second training data including at least one second audio signal; receiving an impulse response dataset; convolving the first training data with a first subset of the impulse response dataset as a first convolved audio, the first subset of the impulse response dataset being associated with the target direction; convolving the second training data with a second subset of the impulse response dataset as a second convolved audio; and training the neural network model based on the first convolved audio and the second convolved audio.

14. The method of claim 13, wherein training the neural network model includes training a first neural network model and a second neural network model, the first neural network model being associated with compressing the first audio as compressed first audio, and the second neural network model being associated with decompressing the compressed first audio.

15. The method of claim 13, wherein the first training data is associated with a focus region, and the first subset of the impulse response dataset represents an impulse response associated with the focus region.

16. A non-transitory computer-readable storage medium comprising instructions stored thereon that, when executed by a processor, are configured to cause a computing system to: identify a audio capture device and a target direction associated with the audio capture device; detect first audio associated with the target direction; enhance the first audio using a machine learning model configured to detect audio associated with the target direction; detect second audio associated with a direction different from the target direction; and diminish the second audio using the machine learning model.

17. The non-transitory computer-readable storage medium of claim 16, wherein diminishing the second audio includes one of decreasing an amplitude of the second audio, attenuating the second audio, or eliminating the second audio.

18. The non-transitory computer-readable storage medium of claim 16, wherein enhancing the first audio includes at least one of increasing an amplitude of the first audio, de-reverbing the first audio and de-noising the first audio.

19. The non-transitory computer-readable storage medium of claim 16, wherein the target direction is associated with a focus region.

20. The non-transitory computer-readable storage medium of claim 16, wherein the enhancing of the first audio using the machine learning model includes: compressing the first audio using a first machine learning model; and decompressing the compressed audio using a second machine learning model.

21. The non-transitory computer-readable storage medium of claim 16, wherein the machine learning model is a neural network model, the neural network model is configured to detect the audio associated with the target direction by training the neural network model, and training the neural network model includes: receiving first training data including at least one first audio signal; receiving second training data including at least one second audio signal; receiving an impulse response dataset; convolving the first training data with a first subset of the impulse response dataset as a first convolved audio, the first subset of the impulse response dataset being associated with the target direction; convolving the second training data with a second subset of the impulse response dataset as a second convolved audio; and training the neural network model based on the first convolved audio and the second convolved audio.

22. The non-transitory computer-readable storage medium of claim 21, wherein training the neural network model includes training a first neural network model and a second neural network model, the first neural network model being associated with compressing the first audio as compressed first audio, and the second neural network model being associated with decompressing the compressed first audio.

23. An apparatus comprising: at least one processor; and at least one memory including computer program code; the at least one memory and the computer program code configured to, with the at least one processor, cause the apparatus to: identify a audio capture device and a target direction associated with the audio capture device; detect first audio associated with the target direction; enhance the first audio using a machine learning model configured to detect audio associated with the target direction; detect second audio associated with a direction different from the target direction; and diminish the second audio using the machine learning model.

24. The apparatus of claim 23, wherein diminishing the second audio includes one of decreasing an amplitude of the second audio, attenuating the second audio, or eliminating the second audio, and enhancing…
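The training-data steps recited in claims 13 and 21 (convolving clean audio with direction-specific impulse responses) can be sketched roughly as follows. This is an illustration under stated assumptions, not the applicant's implementation: the claims only say the model is trained "based on" the two convolved signals, so summing them into a mixture and using the first convolved audio as the training target is an assumption made here. The function names `convolve`, `mix`, and `make_training_example` are hypothetical.

```python
import random
from typing import List, Sequence, Tuple


def convolve(signal: Sequence[float], impulse_response: Sequence[float]) -> List[float]:
    """Full discrete convolution of a signal with an impulse response."""
    if not signal or not impulse_response:
        return []
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def mix(a: Sequence[float], b: Sequence[float]) -> List[float]:
    """Sample-wise sum of two signals, zero-padding the shorter one."""
    n = max(len(a), len(b))
    return [(a[i] if i < len(a) else 0.0) + (b[i] if i < len(b) else 0.0)
            for i in range(n)]


def make_training_example(
    first_audio: Sequence[float],            # first training data (e.g. desired speech)
    second_audio: Sequence[float],           # second training data (e.g. interfering sound)
    target_irs: Sequence[Sequence[float]],   # IR subset associated with the target direction
    other_irs: Sequence[Sequence[float]],    # IR subset for other directions
    rng: random.Random,
) -> Tuple[List[float], List[float]]:
    """Return (model_input, training_target) for one example.

    Assumption: the input is the mixture of both convolved signals and the
    target is the first convolved audio alone.
    """
    first_convolved = convolve(first_audio, rng.choice(target_irs))
    second_convolved = convolve(second_audio, rng.choice(other_irs))
    return mix(first_convolved, second_convolved), first_convolved


if __name__ == "__main__":
    rng = random.Random(0)
    x, y = make_training_example([1.0, 2.0], [1.0], [[1.0, 0.5]], [[0.5]], rng)
    print("input:", x)
    print("target:", y)
```

A real pipeline would feed many such pairs to a neural network; the network architecture and loss are outside what this preview of the claims specifies.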

Assignees

Google LLC

Classifications

Primary CPC classification: G10L21/0364

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2024428818A1 cover?
A method including identifying an audio capture device and a target direction associated with the audio capture device, detecting first audio associated with the target direction, enhancing the first audio using a machine learning model configured to detect audio associated with the target direction, optionally, detecting second audio associated with a direction different from the target direct…
Who is the assignee on this patent?
Google LLC
What technology area does this patent fall under?
Primary CPC classification G10L21/0364. Mapped technology areas include Physics.
When was this patent published?
Publication date Dec 26, 2024 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).