Information displaying method, apparatus, electronic device and storage medium
US-2024420201-A1 · Dec 19, 2024 · US
US12513348B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12513348-B2 |
| Application number | US-202418747082-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jun 18, 2024 |
| Priority date | Dec 12, 2023 |
| Publication date | Dec 30, 2025 |
| Grant date | Dec 30, 2025 |
Official abstract text for this publication.
Provided is a video rendering method for a live broadcast scene, relating to the field of live broadcast and the field of large model. The method includes: recording a live broadcast of an anchor to obtain a first video stream; performing speech recognition on live speech in the first video stream to obtain first text information; determining topic popularity of the live broadcast based on audience response information in a process of recording the live broadcast and the first text information; determining corresponding reply text information based on the first text information when the topic popularity of the live broadcast meets a first set condition; rendering virtual characters based on the reply text information to obtain a second video stream; and generating a third video stream of the anchor chatting with the virtual characters based on the first video stream and the second video stream.
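The control flow in the abstract can be sketched as a gated pipeline. This is only an illustrative sketch, not the patented implementation: the `Stages` container, `process_segment`, and the caller-supplied `first_condition` predicate are hypothetical names, and every stage (speech recognition, popularity scoring, reply generation, rendering, mixing) is a placeholder callable, because the abstract does not say how each one is realized or what the first set condition is.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Stages:
    """Hypothetical stand-ins for the components named in the abstract."""
    recognize_speech: Callable[[bytes], str]             # first video stream -> first text
    score_popularity: Callable[[str, List[str]], float]  # first text + audience responses
    generate_reply: Callable[[str], str]                 # first text -> reply text
    render_characters: Callable[[str], bytes]            # reply text -> second video stream
    mix_streams: Callable[[bytes, bytes], bytes]         # first + second -> third video stream


def process_segment(
    first_stream: bytes,
    audience_responses: List[str],
    stages: Stages,
    first_condition: Callable[[float], bool],
) -> Optional[bytes]:
    """Return a third video stream, or None when the first set condition is not met."""
    first_text = stages.recognize_speech(first_stream)
    popularity = stages.score_popularity(first_text, audience_responses)
    # Assumption: the "first set condition" is modelled as an arbitrary predicate
    # on the popularity score; the abstract does not define it.
    if not first_condition(popularity):
        return None
    reply_text = stages.generate_reply(first_text)
    second_stream = stages.render_characters(reply_text)
    return stages.mix_streams(first_stream, second_stream)
```

Passing the condition as a predicate keeps the sketch neutral about whether the virtual characters are meant to join when popularity is low, high, or inside some range.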
Opening claim text (preview).
What is claimed is:
1. A video rendering method for a live broadcast scene, comprising: recording a live broadcast of an anchor to obtain a first video stream; performing speech recognition on live speech in the first video stream to obtain first text information; determining topic popularity of the live broadcast based on audience response information in a process of recording the live broadcast and the first text information; determining corresponding reply text information based on the first text information when the topic popularity of the live broadcast meets a first set condition; rendering virtual characters based on the reply text information to obtain a second video stream; and generating a third video stream of the anchor chatting with the virtual characters based on the first video stream and the second video stream.
2. The method of claim 1, wherein the determining topic popularity of the live broadcast based on audience response information in a process of recording the live broadcast and the first text information, comprises: extracting at least one first text segment from the first text information; for each first text segment, searching the audience response information for a second text segment that can form a key-value pair with the first text segment, and counting the number of key-value pairs; and determining the topic popularity of the live broadcast based on the number of key-value pairs.
3. The method of claim 1, wherein the determining corresponding reply text information based on the first text information when the topic popularity of the live broadcast meets a first set condition, comprises: extracting keywords from the first text information to obtain a plurality of keywords when the topic popularity of the live broadcast meets the first set condition; performing topic classification on the plurality of keywords to obtain at least one topic set; determining a topic repetition degree of the live broadcast based on the number of topic sets; and determining the corresponding reply text information based on the first text information when the topic repetition degree of the live broadcast meets a second set condition.
4. The method of claim 1, wherein the virtual characters comprise N virtual characters, N is a positive integer greater than 1, and the determining the corresponding reply text information based on the first text information comprises: determining a corresponding target text generation model among M text generation models based on a style of a first virtual character among the N virtual characters, wherein M is a positive integer greater than 1; inputting the first text information into the target text generation model corresponding to the first virtual character to obtain reply text information of the first virtual character; for an i-th virtual character among the N virtual characters, performing operations of: determining a corresponding target text generation model among the M text generation models based on a style of the i-th virtual character, wherein i is a positive integer greater than 1; and inputting the first text information and reply text information of the first virtual character to an (i−1)-th virtual character into the target text generation model corresponding to the i-th virtual character to obtain reply text information of the i-th virtual character.
5. The method of claim 4, wherein the rendering the virtual characters based on the reply text information to obtain a second video stream, comprises: rendering each virtual character based on the reply text information of each virtual character to obtain a second video stream of each virtual character.
6. The method of claim 5, wherein the generating a third video stream of the anchor chatting with the virtual characters based on the first video stream and the second video stream, comprises: mixing the first video stream with the second video stream of each virtual character based on a generation order of the reply text information of the virtual characters, to obtain a third video stream of the anchor chatting with each virtual character.
7. The method of claim 1, wherein the determining the corresponding reply text information based on the first text information, comprises: processing the first text information based on styles of the virtual characters to obtain second text information; and processing the second text information based on a text generation model to obtain the reply text information of the virtual characters.
8. An electronic device, comprising: at least one processor; and a memory connected in communication with the at least one processor, wherein the memory stores an instruction executable by the at least one processor, and the instruction, when executed by the at least one processor, enables the at least one processor to execute: recording a live broadcast of an anchor to obtain a first video stream; performing speech recognition on live speech in the first video stream to obtain first text information; determining topic popularity of the live broadcast based on audience response information in a process of recording the live broadcast and the first text information; determining corresponding reply text information based on the first text information when the topic popularity of the live broadcast meets a first set condition; rendering virtual characters based on the reply text information to obtain a second video stream; and generating a third video stream of the anchor chatting with the virtual characters based on the first video stream and the second video stream.
9. A non-transitory computer-readable storage medium storing a computer instruction thereon, wherein the computer instruction is used to cause a computer to execute: recording a live broadcast of an anchor to obtain a first video stream; performing speech recognition on live speech in the first video stream to obtain first text information; determining topic popularity of the live broadcast based on audience response information in a process of recording the live broadcast and the first text information; determining corresponding reply text information based on the first text information when the topic popularity of the live broadcast meets a first set condition; rendering virtual characters based on the reply text information to obtain a second video stream; and generating a third video stream of the anchor chatting with the virtual characters based on the first video stream and the second video stream.
10. The electronic device of claim 8, wherein the determining topic popularity of the live broadcast based on audience response information in a process of recording the live broadcast and the first text information, comprises: extracting at least one first text segment from the first text information; for each first text segment, searching the audience response information for a second text segment that can form a key-value pair with the first text segment, and counting the number of key-value pairs; and determining the topic popularity of the live broadcast based on the number of key-value pairs.
11. The electronic device of claim 8, wherein the determining corresponding reply text information based on the first text information when the topic popularity of the live broadcast meets a first set condition, comprises: extracting keywords from the first text information to obtain a plurality of keywords when the topic popularity of the live broadcast meets the first set condition; performing topic classification on the plurality of keywords to obtain at least one topic set; determining a topic repetition degree o
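Two of the dependent claims above describe steps concrete enough to sketch in code: the key-value pair count of claims 2 and 10, and the chained reply generation of claim 4. The following is a minimal sketch under stated assumptions, not the claimed implementation. The names `count_key_value_pairs`, `shares_a_word`, and `chained_replies` are hypothetical. The claims do not define when two segments "can form a key-value pair", so the pairing test is passed in as a predicate, and the word-overlap test is only a toy example. The sketch also assumes each first text segment contributes at most one pair, and it models each text generation model as a plain callable chosen by character style.

```python
from typing import Callable, Dict, List, Sequence

PairPredicate = Callable[[str, str], bool]
TextModel = Callable[[str, Sequence[str]], str]


def count_key_value_pairs(
    first_segments: Sequence[str],
    audience_segments: Sequence[str],
    can_pair: PairPredicate,
) -> int:
    """Count first text segments that find a matching audience segment."""
    count = 0
    for key in first_segments:
        # Assumption: at most one pair is counted per first text segment.
        if any(can_pair(key, value) for value in audience_segments):
            count += 1
    return count


def shares_a_word(key: str, value: str) -> bool:
    """Toy pairing test (an assumption): the two segments share a word."""
    return bool(set(key.lower().split()) & set(value.lower().split()))


def chained_replies(
    first_text: str,
    character_styles: Sequence[str],
    models_by_style: Dict[str, TextModel],
) -> List[str]:
    """Generate one reply per character, feeding earlier replies forward."""
    replies: List[str] = []
    for style in character_styles:
        model = models_by_style[style]  # target model selected by style
        # The first character sees only the anchor's text; the i-th also
        # sees the replies of characters 1 to i-1.
        replies.append(model(first_text, tuple(replies)))
    return replies
```

The order of the returned list is the generation order of the replies, which is the order claim 6 uses when mixing each character's stream with the anchor's stream.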
Speech to text systems (G10L15/08 takes precedence) · CPC title
Three-dimensional [3D] modelling for computer graphics · CPC title
Speech synthesis; Text to speech systems · CPC title
Multimedia information · CPC title
for supporting social networking services · CPC title