Method and apparatus for managing images using a voice tag
US-9916864-B2 · Mar 13, 2018 · US
US10347296B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10347296-B2 |
| Application number | US-201815918900-A |
| Country | US |
| Kind code | B2 |
| Filing date | Mar 12, 2018 |
| Priority date | Oct 14, 2014 |
| Publication date | Jul 9, 2019 |
| Grant date | Jul 9, 2019 |
Official abstract text for this publication.
An electronic device is provided. The electronic device includes a voice input module which receives a voice from an outside to generate voice data, a memory which stores one or more images or videos, and a processor which is electrically connected to the voice input module and the memory. The memory includes instructions, when executed by the processor, causing the electronic device to link at least one of the voice data, the first metadata information based on the voice data, or second metadata information generated from the voice data and/or the first metadata information with the second image or video.
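The linking step in the abstract can be pictured with a minimal sketch. It assumes a hypothetical in-memory data model (`VoiceTag`, `MediaItem`) and a caller-supplied `is_related` predicate standing in for whatever relation the device determines; none of these names come from the patent, and this is an illustration rather than the patented implementation.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class VoiceTag:
    """Hypothetical container for voice data and metadata derived from it."""
    voice_data: bytes              # raw recording from the voice input module
    text: Optional[str] = None     # e.g. speech-to-text derived from the voice data


@dataclass
class MediaItem:
    """Hypothetical stored image or video with its linked tags."""
    name: str
    tags: List[VoiceTag] = field(default_factory=list)


def link_voice_tag(
    first: MediaItem,
    tag: VoiceTag,
    library: List[MediaItem],
    is_related: Callable[[MediaItem, MediaItem], bool],
) -> List[MediaItem]:
    """Link `tag` to `first`, then to every other item related to `first`.

    Returns the items that received the tag because of the relation.
    """
    first.tags.append(tag)
    linked = []
    for item in library:
        if item is first:
            continue  # the first image or video is already tagged
        if is_related(first, item):
            item.tags.append(tag)
            linked.append(item)
    return linked
```

The relation is passed in as a function so that the same linking step could work with location, time, or image-similarity checks without changing the tagging logic.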
Opening claim text (preview).
What is claimed is:
1. An electronic device comprising: a microphone; a display; a memory; and a processor electrically connected to the microphone and the memory, wherein the memory is configured to store one or more images or videos, wherein the memory comprises instructions, the instructions, when executed by the processor, causing the electronic device to: generate voice data on a voice received through the microphone with respect to a first image or video stored on the memory, link the voice data or first metadata information based on the voice data, with the first image or video, determine a relation between a second image or video stored on the memory, and the first image or video, and link at least one of (1) the voice data, (2) the first metadata information, or (3) second metadata information generated from the voice data and/or the first metadata information with the second image or video, based on at least a part of the relation determined between the second image or video stored on the memory and the first image or video, and wherein the instructions, when executed by the processor, cause the electronic device further to display a list of voice tags including a first voice tag corresponding to the voice data on the display.
2. The electronic device of claim 1, wherein the electronic device links the first metadata information with the first image or video in the form of a tag, and wherein the electronic device is configured to link at least one of (1) the voice data, (2) the first metadata information, or (3) the second metadata information with the second image or video in the form of a tag.
3. The electronic device of claim 1, wherein the first metadata information comprises speech-to-text information extracted from the voice data.
4. The electronic device of claim 1, wherein the electronic device is configured to determine the relation using at least one of an image analysis, location information, time information, text information, or face recognition information associated with the first image or video and the second image or video.
5. An electronic device comprising: a microphone configured to receive a voice from an outside to generate voice data; a transceiver; a display; a memory; and a processor electrically connected to the microphone, the transceiver, and the memory, wherein the memory is configured to store one or more images or videos, and wherein the memory comprises instructions, the instructions, when executed by the processor, causing the electronic device to: generate voice data on a voice received through the microphone with respect to a first image or video stored on the memory, link the voice data or first metadata information based on the voice data, with the first image or video, transmit the first image or video and the linked voice data or the first metadata information to the outside of the electronic device through the transceiver, transmit a request for requiring one or more images or videos associated with the linked voice data or the first metadata information to the outside of the electronic device, and receive one or more images or videos linked with (1) the voice data, (2) the first metadata information, or (3) second metadata information generated from the voice data and/or the first metadata information from the outside of the electronic device; and wherein the instructions, when executed by the processor, cause the electronic device further to display a list of voice tags including a first voice tag corresponding to the voice data on the display.
6. The electronic device of claim 5, wherein the list includes icons or texts corresponding to the voice tags, respectively.
7. An electronic device comprising: a microphone configured to obtain voice data on a specific image; a display; and a processor configured to: analyze the voice data to determine at least one portion of metadata information of the specific image, register the voice data as a voice tag with the specific image; register the voice data as the voice tag with at least one association image, which satisfies a specific reference with respect to the specific image or the determined metadata information, from among a plurality of images; and display a list of voice tags including the voice tag corresponding to the voice data.
8. The electronic device of claim 7, wherein a plurality of metadata information comprises at least one of information on a location or a time where the specific image is captured, information on a device capturing the specific image, or information on a shooting mode of the specific image.
9. The electronic device of claim 7, further comprising: a camera, wherein if the specific image is captured by the camera, the processor is configured to activate the microphone to guide obtaining of the voice data.
10. The electronic device of claim 7, wherein the processor is configured to provide a user interface (UI) for guiding obtaining of the voice data if the specific image is selected.
11. The electronic device of claim 7, wherein the processor is configured to register a text tag, which is obtained by converting the voice data into a text, together with the voice tag with respect to the at least one association image.
12. The electronic device of claim 7, wherein the processor is configured to analyze the voice data using an object appearing at the specific image.
13. The electronic device of claim 7, wherein the processor is configured to determine at least one portion of metadata information among information on the location, the time, the device capturing the specific image, and the shooting mode, based on a relation between an analysis result of the voice data and each of the plurality of information.
14. The electronic device of claim 13, wherein the processor is configured to determine an image, which includes location information belonging within a specific range from a position of the specific image as metadata information, from among the plurality of images as the at least one association image.
15. The electronic device of claim 13, wherein the processor is configured to determine an image, which includes time information belonging within a specific range from the time of the specific image as metadata information, from among the plurality of images as the at least one association image.
16. The electronic device of claim 13, wherein the processor is configured to determine an image, which includes location information having a specific relation with the time of the specific image as metadata information, from among the plurality of images as the at least one association image.
17. The electronic device of claim 7, wherein the processor is configured to determine an image, which has a similarity of a threshold value or more to the specific image, from among the plurality of images as the at least one association image.
18. The electronic device of claim 7, wherein at least a part of the plurality of images is stored on an external device functionally connected with the electronic device, and wherein the electronic device further comprises a transceiver communicating with the external device.
19. The electronic device of claim 7, wherein the processor is further configured to reproduce the voice data in response to selecting the voice tag corresponding the voice data from the list.
20. The electronic device of claim 7, wherein the processor is further configured to search the at least one association image in respo
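The range-based selection of association images described in claims 14 and 15 can be illustrated with a short sketch. Everything here is an assumption made for illustration: the `ImageMeta` type, the haversine distance, the default thresholds (1 km, 2 hours), and the `criterion` switch are hypothetical, since the claims only speak of "a specific range" and do not fix any values or distance measure.

```python
import math
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass
class ImageMeta:
    """Hypothetical capture metadata for one image."""
    lat: float           # latitude in degrees
    lon: float           # longitude in degrees
    taken_at: datetime   # capture time


def distance_km(a: ImageMeta, b: ImageMeta) -> float:
    """Great-circle distance via the haversine formula (an assumed metric)."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(a.lat), math.radians(b.lat)
    dphi = phi2 - phi1
    dlam = math.radians(b.lon - a.lon)
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(h))


def association_images(
    specific: ImageMeta,
    candidates: List[ImageMeta],
    criterion: str,
    max_km: float = 1.0,                      # assumed location range
    max_gap: timedelta = timedelta(hours=2),  # assumed time range
) -> List[ImageMeta]:
    """Select candidates within a location range or a time range of `specific`."""
    if criterion == "location":
        return [c for c in candidates if distance_km(specific, c) <= max_km]
    if criterion == "time":
        return [c for c in candidates if abs(c.taken_at - specific.taken_at) <= max_gap]
    raise ValueError(f"unknown criterion: {criterion!r}")
```

In this sketch the caller picks the criterion; in the claims, which metadata is used follows from the analysis of the voice data (claim 13), which is outside the scope of this example.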
Television signal processing therefor · CPC title
Scanning, transmission or reproduction of documents or the like, e.g. facsimile transmission; Details thereof · CPC title
Motion video recording combined with still video recording (television signal recording H04N5/76) · CPC title
Speech classification or search · CPC title
Indexing; Addressing; Timing or synchronising; Measuring tape travel · CPC title