Systems and methods for inserting emoticons within a media asset
US-2022132217-A1 · Apr 28, 2022 · US
US11857877B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11857877-B2 |
| Application number | US-202117561477-A |
| Country | US |
| Kind code | B2 |
| Filing date | Dec 23, 2021 |
| Priority date | Dec 23, 2021 |
| Publication date | Jan 2, 2024 |
| Grant date | Jan 2, 2024 |
Official abstract text for this publication.
An approach is provided for a gaming overlay application to provide automatic in-game subtitles and/or closed captions for video game applications. The overlay application accesses an audio stream and a video stream generated by an executing game application. The overlay application processes the audio stream through a text conversion engine to generate at least one subtitle. The overlay application determines a display position to associate with the at least one subtitle. The overlay application generates a subtitle overlay comprising the at least one subtitle located at the associated display position. The overlay application causes a portion of the video stream to be displayed with the subtitle overlay.
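The flow in the abstract (audio to text, text to position, position to overlay) can be sketched as a small pipeline. This is only an illustrative sketch, not the patented implementation: `Subtitle`, `build_subtitle_overlay`, `fake_to_text`, and `bottom_center` are hypothetical names, and the text conversion engine is stood in for by a trivial stub that decodes bytes.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Size = Tuple[int, int]      # (width, height) of a video frame in pixels
Point = Tuple[int, int]     # (x, y) pixel coordinates


@dataclass
class Subtitle:
    text: str
    position: Point


def build_subtitle_overlay(
    audio_chunk: bytes,
    frame_size: Size,
    to_text: Callable[[bytes], str],
    choose_position: Callable[[Size], Point],
) -> List[Subtitle]:
    """Convert one audio chunk to text and attach a display position.

    Both callables are assumptions: a real system would plug in a
    speech-to-text engine and a video-aware placement routine.
    """
    text = to_text(audio_chunk).strip()
    if not text:
        return []  # nothing was spoken, so the overlay stays empty
    return [Subtitle(text=text, position=choose_position(frame_size))]


def fake_to_text(chunk: bytes) -> str:
    # Stand-in for a text conversion engine: treats the bytes as UTF-8 text.
    return chunk.decode("utf-8")


def bottom_center(frame_size: Size) -> Point:
    # Conventional subtitle anchor: horizontally centred, 90% down the frame.
    width, height = frame_size
    return (width // 2, height * 9 // 10)


overlay = build_subtitle_overlay(b"Follow me!", (1920, 1080), fake_to_text, bottom_center)
```

Passing the engine and the placement rule as callables keeps the sketch independent of any particular speech recognizer or game.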
Opening claim text (preview).
The invention claimed is:

1. A method comprising: accessing an audio stream and a video stream generated by an executing game application; generating, at a text conversion engine circuitry, at least one subtitle based on the audio stream; determining a display position of the at least one subtitle by analyzing the video stream for exclusion areas that contain user interface elements of the executing game application; generating a subtitle overlay comprising the at least one subtitle located at the display position; and causing a portion of the video stream to be displayed with the subtitle overlay.
2. The method of claim 1, wherein processing the audio stream loads the audio stream into size limited buffers for real-time or near real-time processing.
3. The method of claim 1, further comprising determining the display position by analyzing the audio stream to identify an in-game speaker associated with the at least one subtitle and setting the display position proximate to an in-game object associated with the in-game speaker in the video stream.
4. The method of claim 3, wherein analyzing the audio stream to identify the in-game speaker includes matching at least one trait of the in-game speaker to an associated classification in a voice profile database.
5. The method of claim 4, wherein the at least one trait comprises age, gender, and dialect.
6. The method of claim 3, further comprising determining the display position by processing the video stream with computer vision to identify the in-game speaker.
7. The method of claim 3, wherein the audio stream comprises multichannel or positional audio, and wherein analyzing the audio stream to identify the in-game speaker includes locating the in-game object associated with the in-game speaker by triangulation from the multichannel or positional audio.
8. The method of claim 1, wherein generating the subtitle overlay includes configuring one or more visual characteristics of the at least one subtitle.
9. The method of claim 8, wherein the one or more visual characteristics include at least one of: font attribute, font color, font size, and speech bubble type.
10. The method of claim 8, wherein the one or more visual characteristics are determined based on at least one of: stored user preferences, readability when the video stream is displayed with the subtitle overlay, and speaker sentiment analyzed from the audio stream.
11. The method of claim 1, wherein the executing game application is a multiplayer game, and wherein the audio stream includes voice chat from participants in the multiplayer game.
12. The method of claim 1, further comprising determining the display position by accessing stored user preferences for subtitle positioning.
13. A system comprising: one or more processors configured to: access an audio stream and a video stream generated by an executing game application; generate, at a text conversion engine circuitry, at least one subtitle based on the audio stream; determine a display position of the at least one subtitle by analyzing the video stream for exclusion areas that contain user interface elements of the executing game application; generate a subtitle overlay comprising the at least one subtitle located at the display position; and cause a portion of the video stream to be displayed with the subtitle overlay.
14. The system of claim 13, wherein the one or more processors are configured to process the audio stream by loading the audio stream into size limited buffers for real-time or near real-time processing.
15. The system of claim 13, wherein the one or more processors are configured to determine the display position by analyzing the audio stream to identify an in-game speaker associated with the at least one subtitle and setting the display position proximate to an in-game object associated with the in-game speaker in the video stream.
16. The system of claim 15, wherein the one or more processors are configured to access the audio stream by accessing multichannel or positional audio of the audio stream, and wherein the one or more processors are configured to analyze the audio stream to identify the in-game speaker by locating the in-game object associated with the in-game speaker by triangulation from the multichannel or positional audio of the audio stream.
17. One or more non-transitory computer readable media comprising instructions executable by one or more processors, which cause the one or more processors to: access an audio stream and a video stream generated by an executing game application; generate, at a text conversion engine circuitry, at least one subtitle based on the audio stream; determine a display position of the at least one subtitle by analyzing the video stream for exclusion areas that contain user interface elements of the executing game application; generate a subtitle overlay comprising the at least one subtitle located at the display position; and cause a portion of the video stream to be displayed with the subtitle overlay.
18. The one or more non-transitory computer readable media of claim 17, wherein the instructions, when executed by the one or more processors, further cause the processing of the audio stream to load the audio stream into size limited buffers for real-time or near real-time processing.
19. The one or more non-transitory computer readable media of claim 17, wherein the instructions, when executed by the one or more processors, further cause determining of the display position by analyzing the audio stream to identify an in-game speaker associated with the at least one subtitle and set the display position proximate to an in-game object associated with the in-game speaker in the video stream.
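The independent claims place the subtitle by analyzing the video stream for exclusion areas that contain user interface elements. One way such a placement step might look is sketched below, assuming the exclusion areas have already been detected and are given as rectangles. The candidate order (bottom, top, middle), the `margin` value, and the names `Rect`, `intersects`, and `place_subtitle` are illustrative assumptions, not details taken from the claims.

```python
from typing import List, Optional, Tuple

Rect = Tuple[int, int, int, int]  # (x, y, width, height) in pixels


def intersects(a: Rect, b: Rect) -> bool:
    """Return True when two axis-aligned rectangles overlap."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah


def place_subtitle(
    frame: Tuple[int, int],
    box: Tuple[int, int],
    exclusions: List[Rect],
    margin: int = 20,
) -> Optional[Rect]:
    """Pick the first candidate position that avoids every exclusion area.

    Candidates are tried in a fixed, assumed order: bottom centre,
    top centre, then the middle of the frame.
    """
    frame_w, frame_h = frame
    box_w, box_h = box
    x = (frame_w - box_w) // 2  # horizontally centred in every candidate
    candidates = [
        frame_h - box_h - margin,   # bottom centre
        margin,                     # top centre
        (frame_h - box_h) // 2,     # middle of the frame
    ]
    for y in candidates:
        rect = (x, y, box_w, box_h)
        if not any(intersects(rect, area) for area in exclusions):
            return rect
    return None  # every candidate collides with a UI element


# A HUD bar across the bottom of a 1920x1080 frame pushes the subtitle to the top.
hud_bar = (0, 900, 1920, 180)
placed = place_subtitle((1920, 1080), (800, 100), [hud_bar])
```

Returning `None` when every candidate is blocked leaves the fallback policy (shrinking the text, overlapping the least important element, and so on) to the caller.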
Games · CPC title
involving operations for analysing the audio stream, e.g. detecting features or characteristics in audio streams (arrangements characterised by components specially adapted for monitoring, identification or recognition of audio in broadcast systems H04H60/58) · CPC title
for displaying subtitles · CPC title
involving acoustic input signals, e.g. by using the results of pitch or rhythm extraction or voice recognition · CPC title
Speech to text systems (G10L15/08 takes precedence) · CPC title