System to correct closed captioning display using context from audio/video
US-2021136459-A1 · May 6, 2021 · US
US11699289B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11699289-B2 |
| Application number | US-202117336965-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jun 2, 2021 |
| Priority date | Jun 10, 2020 |
| Publication date | Jul 11, 2023 |
| Grant date | Jul 11, 2023 |
Official abstract text for this publication.
A display apparatus for generating multimedia content and an operation method thereof are provided. The display apparatus includes a display, a memory storing one or more instructions, and a processor configured to execute the one or more instructions stored in the memory. The processor is configured to obtain plot information of the multimedia content, and generate sequence information including one or more sequences of the multimedia content corresponding to the plot information by using a first artificial intelligence (AI) model, generate scene information based on the sequence information by using a second AI model, generate the multimedia content based on the scene information, and control the display to output the multimedia content.
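The staged flow in the abstract (plot information, then sequence information, then scene information, then content) can be sketched as a chain of function calls. This is only an illustrative sketch: `first_model`, `second_model`, `Scene`, and `generate_content` are hypothetical names, and the two "models" are trivial stand-ins for trained AI models, which the abstract does not specify.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Scene:
    """Hypothetical container for scene information."""
    background: str
    action: str
    dialogue: str


def first_model(plot: str) -> List[str]:
    """Stand-in for the first AI model: split a plot into sequences.

    Assumption: a real model would be learned; here each sentence
    of the plot simply becomes one sequence.
    """
    return [part.strip() for part in plot.split(".") if part.strip()]


def second_model(sequences: List[str]) -> List[Scene]:
    """Stand-in for the second AI model: one scene per sequence."""
    return [Scene(background="unspecified", action=seq, dialogue="")
            for seq in sequences]


def generate_content(
    plot: str,
    sequence_model: Callable[[str], List[str]],
    scene_model: Callable[[List[str]], List[Scene]],
) -> List[str]:
    """Run plot -> sequences -> scenes -> renderable content items."""
    sequences = sequence_model(plot)
    scenes = scene_model(sequences)
    # "Content" here is just a labelled text line per scene.
    return [f"[{i}] {scene.action}" for i, scene in enumerate(scenes, start=1)]


frames = generate_content(
    "A hero leaves home. The hero returns.", first_model, second_model
)
for frame in frames:
    print(frame)  # stands in for controlling the display to output content
```

Passing the models in as callables keeps each stage replaceable, which mirrors how the abstract treats each stage as a separate AI model.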
Opening claim text (preview).
What is claimed is:

1. A display apparatus for generating multimedia content, the display apparatus comprising: a display; a memory storing one or more instructions; and a processor configured to execute the one or more instructions stored in the memory, wherein the processor is configured to: obtain genre information and emotion information of the multimedia content; generate plot information of the multimedia content based on the genre information and emotion information of the multimedia content by using a first artificial intelligence (AI) model; generate sequence information based on one or more sequences of the multimedia content corresponding to the plot information by using a second (AI) model; generate scene information based on the sequence information by using a third AI model; generate the multimedia content based on the scene information; and control the display to output the multimedia content.

2. The display apparatus of claim 1, wherein the emotion information of the multimedia content comprises valence information represented with respect to a reproduction time of the multimedia content and arousal information represented with respect to the reproduction time of the multimedia content.

3. The display apparatus of claim 1, wherein the processor is further configured to randomly obtain story information of the multimedia content from a story database (DB), and wherein the first AI model is trained to output the plot information based on the story information, the genre information and the emotion information.

4. The display apparatus of claim 1, wherein the second AI model is trained to output the sequence information based on the plot information.

5. The display apparatus of claim 1, wherein the third AI model is trained to output the scene information based on receiving character information of the multimedia content and the sequence information of the multimedia content.

6. The display apparatus of claim 1, wherein the scene information comprises at least one of background information of a scene, information about a behavior of a character appearing in the scene, and conversation contents of the character.

7. The display apparatus of claim 6, wherein the processor is further configured to: select a character from a character database (DB), based on the character information of the multimedia content; and generate the multimedia content based on the selected character and the scene information.

8. The display apparatus of claim 1, wherein the processor is further configured to generate the background audio corresponding to the emotion information and the genre information of the multimedia content by using a fourth AI model, and wherein the fourth AI model is trained to output the background audio based on the emotion information and the genre information of the multimedia content.

9. The display apparatus of claim 1, further comprising: an audio output interface; wherein the processor is further configured to: obtain genre information and emotion information of the multimedia content, and generate the plot information of the multimedia content based on the genre information and the emotion information of the multimedia content by using the first AI model, obtain emotion information about the scene, based on the scene information, and generate a background audio of the scene, based on the genre information of the multimedia content and the emotion information, and control the audio output interface to output the background audio.

10. The display apparatus of claim 9, wherein the processor is further configured to generate the background audio corresponding to the emotion information and the genre information of the multimedia content by using a fourth AI model, wherein the fourth AI model is trained to output the background audio based on the emotion information and the genre information of the multimedia content.

11. The display apparatus of claim 10, wherein the processor is further configured to implement a data learner, wherein the data learner is configured to: train the first AI model by using learning data including sample story information of the multimedia content, sample emotion information of the multimedia content, and sample plot information of the multimedia content by iteratively feeding into an initial first AI model until the initial first AI model satisfies a predetermined condition train the second AI model by using learning data including sample plot information of the multimedia content and sample sequence information of the multimedia content; train the third AI model by using learning data including sample character information of the multimedia content, sample sequence information of the multimedia content, and sample scene information of the multimedia content, train the fourth AI model by using learning data including sample emotion information of a scene and sample genre information of the multimedia content, and store the first AI model, the second AI model, the third AI model and the fourth AI model in the memory of the display apparatus.

12. An operation method of a display apparatus for generating multimedia content, the operation method comprising: obtaining genre information and emotion information of the multimedia content; generating plot information of the multimedia content based on the genre information and emotion information of the multimedia content by using a first artificial intelligence (AI) model; generating sequence information including one or more sequences of the multimedia content corresponding to the plot information by using a second AI model; generating scene information based on the sequence information by using a third AI model; generating the multimedia content based on the scene information; and outputting the multimedia content.

13. The operation method of claim 12, wherein the emotion information of the multimedia content comprises valence information represented with respect to a reproduction time of the multimedia content and arousal information represented with respect to the reproduction time of the multimedia content.

14. The operation method of claim 12, wherein the obtaining of the plot information of the multimedia content further comprises randomly obtaining story information of the multimedia content from a story database (DB), and wherein the first AI model is trained to output the plot information based on the story information, the genre information, and the emotion information.

15. The operation method of claim 12, wherein the second AI model is trained to output the sequence information based on the plot information.

16. The operation method of claim 12, wherein the third AI model is trained to output the scene information based on receiving character information of the multimedia content and the sequence information of the multimedia content.

17. The operation method of claim 12, wherein the scene information comprises at least one of background information of a scene, information about a behavior of a character appearing in the scene, and conversation contents of the character.

18. The operation method of claim 17, wherein the generating of the multimedia content comprises: selecting a character from a character database (DB), based on the character information of the multimedia content; and generating the multimedia content based on the selected character and the scene information.

19. A non-transitory computer-readable recording medium having recorded thereon a computer progra
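Claims 2 and 13 describe emotion information as valence and arousal values expressed against the reproduction time of the content. One possible way to hold such data is a pair of sampled curves that can be queried at any playback time. The sketch below is an assumption-laden illustration, not the patent's representation: the class name `EmotionCurve`, the use of seconds, the sample values, and the choice of linear interpolation between samples are all hypothetical.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class EmotionCurve:
    """Hypothetical valence/arousal samples over reproduction time."""
    times: List[float]    # reproduction time in seconds, strictly increasing
    valence: List[float]  # one valence sample per time point
    arousal: List[float]  # one arousal sample per time point

    def at(self, t: float) -> Tuple[float, float]:
        """Return (valence, arousal) at time t.

        Assumption: values are linearly interpolated between samples
        and clamped to the first/last sample outside the sampled range.
        """
        if t <= self.times[0]:
            return self.valence[0], self.arousal[0]
        if t >= self.times[-1]:
            return self.valence[-1], self.arousal[-1]
        i = bisect_right(self.times, t)  # times[i - 1] <= t < times[i]
        t0, t1 = self.times[i - 1], self.times[i]
        w = (t - t0) / (t1 - t0)
        v = self.valence[i - 1] + w * (self.valence[i] - self.valence[i - 1])
        a = self.arousal[i - 1] + w * (self.arousal[i] - self.arousal[i - 1])
        return v, a


# Illustrative samples only: a dip in valence with a peak in arousal
# at the one-minute mark, resolving by the two-minute mark.
curve = EmotionCurve(
    times=[0.0, 60.0, 120.0],
    valence=[0.0, -0.5, 0.5],
    arousal=[0.25, 0.75, 0.25],
)
print(curve.at(30.0))
```

A per-scene emotion value, such as the one claim 9 feeds into background audio generation, could then be read off the curve at the scene's start time, though the claims do not say how that value is derived.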
Management of the audio stream, e.g. setting of volume, audio stream path · CPC title
Recognising information on displays, dials, clocks · CPC title
Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames · CPC title
Learning methods · CPC title
for processing of video signals · CPC title