US2020168202A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2020168202-A1 |
| Application number | US-201916682324-A |
| Country | US |
| Kind code | A1 |
| Filing date | Nov 13, 2019 |
| Priority date | Nov 27, 2018 |
| Publication date | May 28, 2020 |
| Grant date | — |
Official abstract text for this publication.
Provided are an electronic device and an operation method thereof. The electronic device may include a memory storing one or more instructions, and a processor configured to execute the one or more instructions stored in the memory to: analyze a meaning of a speech section in audio data included in a content being played on the electronic device, based on an analysis result of the speech section, identify, from among a plurality of image frames included in the content, an image candidate section for generating a highlight image, analyze an object included in an image frame corresponding to the image candidate section, and identify a target section for generating the highlight image based on an analysis result of the image candidate section.
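The second stage of the abstract, narrowing an image candidate section down to a target section by analyzing an object in its frames, can be illustrated with a small sketch. It assumes the object analysis is the "motion variation amount" named in claim 8, that an object detector has already produced one bounding box per frame, and that the target section is the fixed-length window with the most motion. The function names, the box format, and the windowing rule are all hypothetical; the publication does not specify them.

```python
import math


def motion_variation(prev_box, cur_box):
    """Distance moved by a box centre between two frames.

    Boxes are (x, y, width, height) tuples; this format is an assumption.
    """
    px, py = prev_box[0] + prev_box[2] / 2, prev_box[1] + prev_box[3] / 2
    cx, cy = cur_box[0] + cur_box[2] / 2, cur_box[1] + cur_box[3] / 2
    return math.hypot(cx - px, cy - py)


def target_section(boxes, window):
    """Return (first_frame, last_frame) of the window with the most motion.

    `boxes` holds one detected box per frame of the candidate section.
    `window` is the number of frame-to-frame transitions to keep.
    Ties go to the earliest window (hypothetical rule).
    """
    motion = [motion_variation(a, b) for a, b in zip(boxes, boxes[1:])]
    best_start, best_score = 0, -1.0
    for start in range(max(1, len(motion) - window + 1)):
        score = sum(motion[start:start + window])
        if score > best_score:
            best_start, best_score = start, score
    return best_start, best_start + window


# Five static frames followed by five frames of steady horizontal movement.
boxes = [(0, 0, 10, 10)] * 5 + [(10 * i, 0, 10, 10) for i in range(1, 6)]
section = target_section(boxes, window=3)
```

Summing centre displacement is only one possible measure; a real system might also weigh object class or size, which the abstract leaves open.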
Opening claim text (preview).
What is claimed is:

1. An electronic device comprising: a memory storing one or more instructions; and a processor configured to execute the one or more instructions stored in the memory to: analyze a meaning of a speech section in audio data included in a content being played on the electronic device, based on an analysis result of the speech section, identify, from among a plurality of image frames included in the content, an image candidate section for generating a highlight image, analyze an object included in an image frame corresponding to the image candidate section, and identify a target section for generating the highlight image based on an analysis result of the image candidate section.
2. The electronic device of claim 1, wherein the processor is further configured to execute the one or more instructions to: generate the highlight image based on the identified target section, and overlap and reproduce the generated highlight image on the content being played when the highlight image is generated.
3. The electronic device of claim 1, wherein the processor is further configured to execute the one or more instructions to: analyze sound wave characteristics of the audio data included in the content being played.
4. The electronic device of claim 1, wherein the processor is further configured to execute the one or more instructions to: determine whether a predetermined keyword is included in the speech section.
5. The electronic device of claim 1, wherein the processor is further configured to execute the one or more instructions to: adjust a time distance for identifying the image candidate section according to a weight of a predetermined keyword in the speech section.
6. The electronic device of claim 1, further comprising a microphone, wherein the processor is further configured to execute the one or more instructions to: analyze external audio data input from outside the electronic device through the microphone.
7. The electronic device of claim 1, wherein, when a first image candidate section in the plurality of image frames and a second image candidate section in the plurality of image frames overlap at least partially, the processor is further configured to execute the one or more instructions to: identify the image candidate section based on a comparison between a first weight of a first keyword corresponding to the first image candidate section and a second weight of a second keyword corresponding to the second image candidate section.
8. The electronic device of claim 1, wherein the processor is further configured to execute the one or more instructions to: detect the object included in the image frame corresponding to the image candidate section and calculate a motion variation amount of the detected object.
9. The electronic device of claim 1, wherein the processor is further configured to execute the one or more instructions to: generate the highlight image based on the identified target section, display an interface requesting a user input as to whether to display the generated highlight image when the highlight image is generated, in response to the user input, overlap and reproduce the highlight image on the content being played, and reset a weight of a keyword corresponding to the reproduced highlight image.
10. The electronic device of claim 1, wherein the processor is further configured to execute the one or more instructions to: control the display to display a list comprising one or more highlight images generated from the content being played.
11. An operation method of an electronic device, the operation method comprising: analyzing a meaning of a speech section in audio data included in a content being played on the electronic device; based on an analysis result of the speech section, identifying, from among a plurality of image frames included in the content, an image candidate section for generating a highlight image; analyzing an object included in an image frame corresponding to the image candidate section; and identifying a target section for generating the highlight image based on an analysis result of the image candidate section.
12. The operation method of claim 11, further comprising: generating the highlight image based on the identified target section; and overlapping and reproducing the generated highlight image on the content being played when the highlight image is generated.
13. The operation method of claim 11, further comprising: analyzing sound wave characteristics of the audio data included in the content being played back.
14. The operation method of claim 11, further comprising: determining whether a predetermined keyword is included in the speech section.
15. The operation method of claim 11, further comprising: analyzing external audio data input from outside the electronic device through a microphone.
16. The operation method of claim 11, wherein the identifying of the image candidate section comprises: determining that a first image candidate section in the plurality of image frames and a second image candidate section in the plurality of image frames overlap at least partially; and identify the image candidate section based on a comparison between a first weight of a first keyword corresponding to the first image candidate section and a second weight of a second keyword corresponding to the second image candidate section.
17. The operation method of claim 11, wherein the analyzing of the object comprises: detecting the object included in the image frame corresponding to the image candidate section; and calculating a motion variation amount of the detected object.
18. The operation method of claim 11, further comprising: generating the highlight image based on the identified target section; displaying an interface requesting a user input as to whether to display the generated highlight image when the highlight image is generated; in response to the user input, overlapping and reproducing the highlight image on the content being played; and resetting a weight of a keyword corresponding to the reproduced highlight image.
19. The operation method of claim 11, further comprising: controlling a display to display a list comprising one or more highlight images generated from the content being played.
20. A non-transitory computer-readable recording medium having recorded thereon a program for performing the method of claim 11 on a computer.
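The keyword-related claims (4, 5, 7 and their method counterparts) can be read together as a small procedure: spot a predetermined keyword, size a candidate section according to that keyword's weight, and let the heavier keyword win when two candidates overlap. The following is a minimal sketch under stated assumptions, not the applicant's implementation. The keyword table, the base time distance, the linear scaling, and the choice to look backwards from the moment the keyword is spoken are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical keyword weights; the claims only say keywords are "predetermined".
KEYWORD_WEIGHTS = {"goal": 1.0, "shoot": 0.6, "pass": 0.3}
BASE_DISTANCE_S = 4.0  # assumed base time distance, in seconds


@dataclass
class Candidate:
    start: float
    end: float
    keyword: str
    weight: float


def candidate_for(keyword: str, spoken_at: float) -> Candidate:
    """Build a candidate section ending when the keyword was spoken.

    Assumption: the time distance grows linearly with the keyword weight.
    """
    weight = KEYWORD_WEIGHTS[keyword]
    distance = BASE_DISTANCE_S * (1.0 + weight)
    return Candidate(max(0.0, spoken_at - distance), spoken_at, keyword, weight)


def overlaps(a: Candidate, b: Candidate) -> bool:
    """True when two sections share at least part of their time range."""
    return a.start < b.end and b.start < a.end


def resolve(candidates: list[Candidate]) -> list[Candidate]:
    """Where neighbouring candidates overlap, keep the heavier keyword."""
    kept: list[Candidate] = []
    for cand in sorted(candidates, key=lambda c: c.start):
        if kept and overlaps(kept[-1], cand):
            if cand.weight > kept[-1].weight:
                kept[-1] = cand
        else:
            kept.append(cand)
    return kept


# Keyword hits as (keyword, time in seconds), e.g. from a speech recognizer.
hits = [("pass", 10.0), ("goal", 12.0), ("shoot", 40.0)]
sections = resolve([candidate_for(k, t) for k, t in hits])
```

Here the "pass" and "goal" sections overlap, so only the "goal" section survives, while the later "shoot" section is kept unchanged.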
Electricity · mapped topic
Physics · mapped topic
Concept to speech synthesisers; Generation of natural phrases from machine-based concepts (generation of parameters for speech synthesis out of text G10L13/08) · CPC title
Physics · mapped topic
sound input device, e.g. microphone · CPC title