Perspective-based dynamic audio volume adjustment
US-2018210697-A1 · Jul 26, 2018 · US
US11735203B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11735203-B2 |
| Application number | US-202217896785-A |
| Country | US |
| Kind code | B2 |
| Filing date | Aug 26, 2022 |
| Priority date | Oct 28, 2020 |
| Publication date | Aug 22, 2023 |
| Grant date | Aug 22, 2023 |
Official abstract text for this publication.
The audio content (e.g., an audio track, an audio file, an audio signal, etc.) of a content item (e.g., multimedia content, a movie, streaming content, etc.) may be modified to augment and/or include one or more auditory events, such as a sound, a plurality of sounds, a sound effect(s), a voice(s), and/or music.
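As a rough illustration of what "modifying audio content to include an auditory event" could look like at the sample level, the sketch below mixes a short event into a track. It is only an assumption-laden example, not the method disclosed in the patent: the function name `mix_event`, the representation of audio as lists of floats in the range [-1.0, 1.0], and the `offset` and `gain` parameters are all hypothetical.

```python
from typing import List


def mix_event(track: List[float], event: List[float],
              offset: int, gain: float = 1.0) -> List[float]:
    """Return a copy of `track` with `event` added starting at sample `offset`.

    Hypothetical helper: samples are assumed to be floats in [-1.0, 1.0].
    """
    if offset < 0:
        raise ValueError("offset must be non-negative")

    out = list(track)  # leave the caller's track untouched

    # Pad with silence if the event runs past the end of the track.
    needed = offset + len(event)
    if needed > len(out):
        out.extend([0.0] * (needed - len(out)))

    for i, sample in enumerate(event):
        mixed = out[offset + i] + gain * sample
        # Hard-clip to the assumed valid sample range.
        out[offset + i] = max(-1.0, min(1.0, mixed))
    return out
```

Hard clipping is used here only to keep the example short; a real mixer would more likely apply limiting or normalization.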
Opening claim text (preview).
What is claimed:
1. One or more non-transitory computer-readable media storing processor-executable instructions that, when executed by at least one processor, cause the at least one processor to: determine, for a portion of a content item, one or more media elements and one or more auditory events; determine, based on the one or more media elements, one or more candidate auditory events; determine, based on the one or more auditory events and the one or more candidate auditory events, a target auditory event in audio content associated with the portion of the content item; and modify, based on the target auditory event, the audio content.
2. The non-transitory computer-readable media of claim 1, wherein the one or more media elements comprises one or more textual elements, and wherein the processor-executable instructions, when executed by the at least one processor, further cause the at least one processor to determine the one or more media elements based on one or more of natural language processing, optical character recognition, and an output from a machine learning model.
3. The non-transitory computer-readable media of claim 1, wherein the one or more media elements comprises one or more visual elements, and wherein the processor-executable instructions, when executed by the at least one processor, further cause the at least one processor to determine the one or more media elements based on one or more of object recognition, and an output from a machine learning model.
4. The non-transitory computer-readable media of claim 1, wherein the processor-executable instructions, when executed by the at least one processor, further cause the at least one processor to determine the one or more auditory events based on one or more of audio signal analysis, speech recognition, and an output from a machine learning model.
5. The non-transitory computer-readable media of claim 1, wherein the processor-executable instructions, when executed by the at least one processor, further cause the at least one processor to modify the audio content to include the target auditory event.
6. The non-transitory computer-readable media of claim 1, wherein the processor-executable instructions, that when executed by the at least one processor, cause the at least one processor to modify the audio content, further cause the at least one processor to: determine audio data associated with the target auditory event; and update, based on the audio data, a waveform associated with the audio content.
7. The non-transitory computer-readable media of claim 1, wherein the processor-executable instructions, that when executed by the at least one processor, cause the at least one processor to determine the one or more candidate auditory events, further cause the at least one processor to: determine that the one or more media elements are associated with one or more auditory events in an auditory event repository; and determine, based on the one or more media elements being associated with the one or more auditory events in the auditory event repository, the one or more candidate auditory events.
8. The non-transitory computer-readable media of claim 1, wherein the target auditory event comprises a candidate auditory event of the one or more candidate auditory events that is missing from the one or more auditory events.
9. The non-transitory computer-readable media of claim 1, wherein the target auditory event comprises an attenuated auditory event of the one or more auditory events or an accentuated auditory event of the one or more auditory events, wherein the processor-executable instructions, that when executed by the at least one processor, cause the at least one processor to modify the audio content, further cause the at least one processor to increase an audio level associated with the attenuated auditory event or decrease an audio level associated with the accentuated auditory event.
10. The non-transitory computer-readable media of claim 1, wherein the target auditory event comprises an unintelligible auditory event of the one or more auditory events, wherein the processor-executable instructions, that when executed by the at least one processor, cause the at least one processor to modify the audio content, further cause the at least one processor to: determine an audio file associated with the unintelligible auditory event; and update, based on the audio file, a waveform associated with the audio content.
11. One or more non-transitory computer-readable media storing processor-executable instructions that, when executed by at least one processor, cause the at least one processor to: determine, for a portion of a content item, a distribution of visual elements, and a distribution of auditory events; determine, based on the distribution of visual elements, one or more candidate auditory events; determine, based on the distribution of auditory events and the one or more candidate auditory events, a target auditory event in audio content associated with the portion of the content item; and modify, based on the target auditory event, the audio content.
12. The non-transitory computer-readable media of claim 11, wherein the target auditory event comprises a candidate auditory event of the one or more candidate auditory events that is missing from the distribution of auditory events.
13. The non-transitory computer-readable media of claim 11, wherein the processor-executable instructions, when executed by the at least one processor, further cause the at least one processor to modify the audio content to include the target auditory event.
14. The non-transitory computer-readable media of claim 11, wherein the target auditory event comprises an attenuated auditory event of the distribution of auditory events or an accentuated auditory event of the distribution of auditory events, wherein the processor-executable instructions, that when executed by the at least one processor, cause the at least one processor to modify the audio content, further cause the at least one processor to increase an audio level associated with the attenuated auditory event or decrease an audio level associated with the accentuated auditory event.
15. The non-transitory computer-readable media of claim 11, wherein the target auditory event comprises an unintelligible auditory event of the distribution of auditory events, wherein the processor-executable instructions, that when executed by the at least one processor, cause the at least one processor to modify the audio content, further cause the at least one processor to: determine an audio file associated with the unintelligible auditory event; and update, based on the audio file, a waveform associated with the audio content.
16. The non-transitory computer-readable media of claim 11, wherein the processor-executable instructions, that when executed by the at least one processor, cause the at least one processor to determine the one or more candidate auditory events, further cause the at least one processor to: determine that one or more visual elements of the distribution of visual elements are associated with one or more auditory events in an auditory event repository; and determine, based on the one or more visual elements being associated with the one or more auditory events in the auditory event repository, the one or more candidate auditory events.
17. The non-transitory computer-readable media of claim 11, wherein the processor-executable instructions, when executed by the at least one processor, further cause the at least one processor to determine, for
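The selection logic recited in claims 1, 7, 8, and 9 can be read as a lookup followed by a set difference, with an optional level change. The sketch below is one possible reading under stated assumptions, not the patented implementation: the repository contents, the string labels for media elements and auditory events, and the function names `candidate_events`, `missing_events`, and `scale_level` are all hypothetical.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical auditory event repository: media element label -> expected sounds.
EVENT_REPOSITORY: Dict[str, Set[str]] = {
    "dog": {"bark"},
    "car": {"engine", "horn"},
    "rain": {"rainfall"},
}


def candidate_events(media_elements: Iterable[str],
                     repository: Dict[str, Set[str]]) -> Set[str]:
    """Collect the auditory events the repository associates with the elements."""
    candidates: Set[str] = set()
    for element in media_elements:
        # Elements with no repository entry contribute no candidates.
        candidates |= repository.get(element, set())
    return candidates


def missing_events(detected: Iterable[str], candidates: Set[str]) -> List[str]:
    """Candidates that are absent from the detected auditory events (cf. claim 8)."""
    return sorted(candidates - set(detected))


def scale_level(samples: List[float], gain_db: float) -> List[float]:
    """Raise (positive dB) or lower (negative dB) the level of an event (cf. claim 9)."""
    factor = 10 ** (gain_db / 20)  # standard dB-to-amplitude conversion
    return [s * factor for s in samples]
```

For example, a frame in which a dog and a car are recognized but only an engine sound is detected would yield `["bark", "horn"]` as missing events, each of which could then serve as a target auditory event.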
Details of processing therefor · CPC title
Extraction of image or video features · CPC title
Scenes; Scene-specific elements (control of digital cameras H04N23/60) · CPC title
Classification techniques · CPC title
for comparison or discrimination · CPC title