US11456004B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11456004-B2 |
| Application number | US-202017082866-A |
| Country | US |
| Kind code | B2 |
| Filing date | Oct 28, 2020 |
| Priority date | Oct 28, 2020 |
| Publication date | Sep 27, 2022 |
| Grant date | Sep 27, 2022 |
Official abstract text for this publication.
The audio content (e.g., an audio track, an audio file, an audio signal, etc.) of a content item (e.g., multimedia content, a movie, streaming content, etc.) may be modified to augment and/or include one or more auditory events, such as a sound, a plurality of sounds, a sound effect(s), a voice(s), and/or music.
Opening claim text (preview).
What is claimed is:
1. A method comprising: determining, for a portion of a content item, one or more media elements and one or more auditory events; determining, based on the one or more media elements, one or more candidate auditory events; determining, based on the one or more auditory events and the one or more candidate auditory events, a target auditory event in audio content associated with the portion of the content item; and modifying, based on the target auditory event, the audio content.
2. The method of claim 1, wherein the one or more media elements comprises one or more textual elements, and wherein determining the one or more media elements comprises one or more of natural language processing, optical character recognition, and an output from a machine learning model.
3. The method of claim 1, wherein the one or more media elements comprises one or more visual elements, and wherein determining the one or more media elements comprises one or more of object recognition, and an output from a machine learning model.
4. The method of claim 1, wherein determining the one or more auditory events comprises one or more of audio signal analysis, speech recognition, and an output from a machine learning model.
5. The method of claim 1, wherein modifying the audio content comprises modifying the audio content to include the target auditory event.
6. The method of claim 1, wherein modifying the audio content comprises: determining audio data associated with the target auditory event; and updating, based on the audio data, a waveform associated with the audio content.
7. The method of claim 1, wherein determining the one or more candidate auditory events comprises: determining that the one or more media elements are associated with one or more auditory events in an auditory event repository; and determining, based on the one or more media elements being associated with the one or more auditory events in the auditory event repository, the one or more candidate auditory events.
8. The method of claim 1, wherein the target auditory event comprises a candidate auditory event of the one or more candidate auditory events that is missing from the one or more auditory events.
9. The method of claim 1, wherein the target auditory event comprises an attenuated auditory event of the one or more auditory events or an accentuated auditory event of the one or more auditory events, wherein modifying the audio content comprises increasing an audio level associated with the attenuated auditory event or decreasing an audio level associated with the accentuated auditory event.
10. The method of claim 1, wherein the target auditory event comprises an unintelligible auditory event of the one or more auditory events, wherein modifying the audio content comprises: determining an audio file associated with the unintelligible auditory event; and updating, based on the audio file, a waveform associated with the audio content.
11. A method comprising: determining, for a portion of a content item, a distribution of visual elements, and a distribution of auditory events; determining, based on the distribution of visual elements, one or more candidate auditory events; determining, based on the distribution of auditory events and the one or more candidate auditory events, a target auditory event in audio content associated with the portion of the content item; and modifying, based on the target auditory event, the audio content.
12. The method of claim 11, wherein the target auditory event comprises a candidate auditory event of the one or more candidate auditory events that is missing from the distribution of auditory events.
13. The method of claim 11, wherein modifying the audio content comprises modifying the audio content to include the target auditory event.
14. The method of claim 11, wherein the target auditory event comprises an attenuated auditory event of the distribution of auditory events or an accentuated auditory event of the distribution of auditory events, wherein modifying the audio content comprises increasing an audio level associated with the attenuated auditory event or decreasing an audio level associated with the accentuated auditory event.
15. The method of claim 11, wherein the target auditory event comprises an unintelligible auditory event of the distribution of auditory events, wherein modifying the audio content comprises: determining an audio file associated with the unintelligible auditory event; and updating, based on the audio file, a waveform associated with the audio content.
16. The method of claim 11, wherein determining the one or more candidate auditory events comprises: determining that one or more visual elements of the distribution of visual elements are associated with one or more auditory events in an auditory event repository; and determining, based on the one or more visual elements being associated with the one or more auditory events in the auditory event repository, the one or more candidate auditory events.
17. The method of claim 11, further comprising determining, for the portion of the content item, a distribution of textual elements, wherein determining the one or more candidate auditory events comprises: determining that one or more textual elements of the distribution of textual elements are associated with one or more visual elements of the distribution of visual elements; and determining, based on the one or more textual elements associated with the one or more visual elements being associated with one or more auditory events in an auditory event repository, the one or more candidate auditory events.
18. A method comprising: determining, for a portion of a content item, one or more visual elements, and one or more auditory events; determining, based on the one or more visual elements, one or more candidate auditory events; determining, based on the one or more auditory events and the one or more candidate auditory events, that a candidate auditory event of the one or more candidate auditory events is missing from audio content associated with the portion of the content item; and modifying the audio content to include the candidate auditory event.
19. The method of claim 18, wherein determining the one or more candidate auditory events comprises: determining that the one or more visual elements are associated with one or more auditory events in an auditory event repository; and determining, based on the one or more visual elements being associated with the one or more auditory events in the auditory event repository, the one or more candidate auditory events.
20. The method of claim 18, further comprising determining, for the portion of the content item, one or more textual elements, wherein determining the one or more candidate auditory events comprises: determining that the one or more textual elements are associated with the one or more visual elements; and determining, based on the one or more textual elements associated with the one or more visual elements being associated with one or more auditory events in an auditory event repository, the one or more candidate auditory events.
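The flow recited in claims 1 and 5 through 8 (look up candidate auditory events for the detected media elements, find the candidates missing from the detected audio, then update the waveform) can be illustrated with a minimal Python sketch. This is only one possible reading, not the patented implementation: the repository contents, the function names `candidate_events`, `missing_events`, and `mix_in`, and the representation of a waveform as a list of floats in [-1.0, 1.0] are all assumptions made for illustration. The detection steps (object recognition, speech recognition, and so on) are assumed to have already produced plain string labels.

```python
# Hypothetical auditory event repository: maps a media element label
# (visual or textual) to the auditory events it is expected to produce.
AUDITORY_EVENT_REPOSITORY = {
    "dog": {"bark"},
    "car": {"engine", "horn"},
    "rain": {"rainfall"},
}


def candidate_events(media_elements, repository):
    """Collect every auditory event the repository associates with the elements."""
    candidates = set()
    for element in media_elements:
        # Elements with no repository entry contribute no candidates.
        candidates |= repository.get(element, set())
    return candidates


def missing_events(candidates, detected_events):
    """Return candidates that were not detected in the audio (sorted for stable output)."""
    return sorted(candidates - set(detected_events))


def mix_in(waveform, event_samples, offset, gain=1.0):
    """Add event samples into a copy of the waveform, clipping to [-1.0, 1.0]."""
    out = list(waveform)
    for i, sample in enumerate(event_samples):
        j = offset + i
        if j >= len(out):
            break  # Do not write past the end of the original waveform.
        out[j] = max(-1.0, min(1.0, out[j] + gain * sample))
    return out


if __name__ == "__main__":
    candidates = candidate_events(["dog", "car"], AUDITORY_EVENT_REPOSITORY)
    targets = missing_events(candidates, detected_events=["engine"])
    print(targets)
    print(mix_in([0.0, 0.5, 0.9, 0.0], [0.5, 0.5], offset=1))
```

The set difference in `missing_events` corresponds to the "missing" target of claim 8. The attenuated, accentuated, and unintelligible cases of claims 9 and 10 would need per-event level or intelligibility estimates, which this sketch does not model.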
Extraction of image or video features · CPC title
Natural language analysis (semantic analysis of natural language G06F40/30) · CPC title
by using information signals recorded by the same method as the main recording {(G11B27/22 takes precedence)} · CPC title
for synchronising with other signals, e.g. video signals · CPC title
Learning process for intelligent management, e.g. learning user preferences for recommending movies (details of learning user preferences for the retrieval of video data in a video database G06F16/739; computer systems using learning methods G06N3/08) · CPC title