What technology area does this patent fall under?

Primary CPC classification H04N23/66. Mapped technology areas include Electricity.

When was this patent published?

Publication date: Oct 21, 2025 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 6 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Apparatus and method for controlling a robot photographer with semantic intelligence

US12452531B2 · US · B2

Patent metadata
Field	Value
Publication number	US-12452531-B2
Application number	US-202318373078-A
Country	US
Kind code	B2
Filing date	Sep 26, 2023
Priority date	Sep 29, 2022
Publication date	Oct 21, 2025
Grant date	Oct 21, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title: What the patent document calls the invention.
Abstract: A short plain-language summary of the technical disclosure.
Assignees and inventors: Who owns or filed the patent and who is credited as inventor.
Key dates: Filing, priority, publication, and grant dates set the timeline.
First independent claim: The legal scope of protection; read this for what is actually claimed.
CPC / IPC classifications: Technology tags used to group this patent with similar filings.
Citations and related patents: Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An electronic device for controlling a photographic system may obtain a video stream and a user query for a target event, obtain a set of photos from the video stream, obtain at least one photoshoot suggestion based on the user query via a language model, obtain a snapped photo for the target event based on the at least one photoshoot suggestion, in response to a given video frame included in the video stream satisfying a target content criterion, and output one or more photos selected from the set of photos and the snapped photo as event photos.
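
The flow summarized in the abstract can be pictured roughly as the loop below. This is only an illustrative sketch under stated assumptions, not the patented implementation: `run_photoshoot`, `suggest`, `meets_criterion`, `snap`, `select`, and `sample_every` are hypothetical names introduced here, and plain strings stand in for video frames.

```python
from typing import Callable, Iterable, List


def run_photoshoot(
    frames: Iterable[str],
    user_query: str,
    suggest: Callable[[str], List[str]],          # hypothetical stand-in for the language model
    meets_criterion: Callable[[str, str], bool],  # hypothetical target content criterion
    snap: Callable[[str], str],                   # hypothetical camera trigger
    select: Callable[[List[str]], List[str]],     # hypothetical final photo selection
    sample_every: int = 2,                        # assumed sampling rate, not from the patent
) -> List[str]:
    """Collect photos from a stream and snap extra ones when a suggestion matches."""
    # Ask the (stubbed) language model for photoshoot suggestions.
    suggestions = suggest(user_query)
    stream_photos: List[str] = []
    snapped_photos: List[str] = []
    for index, frame in enumerate(frames):
        # Keep every `sample_every`-th frame as part of the set of photos.
        if index % sample_every == 0:
            stream_photos.append(frame)
        # Snap a photo when the frame satisfies the criterion for any suggestion.
        if any(meets_criterion(frame, s) for s in suggestions):
            snapped_photos.append(snap(frame))
    # Output photos chosen from both the stream photos and the snapped photos.
    return select(stream_photos + snapped_photos)


# Toy usage with stub callables; every value here is made up for illustration.
event_photos = run_photoshoot(
    frames=["guests arriving", "cake on table", "speech", "cake cutting"],
    user_query="birthday party",
    suggest=lambda query: ["cake", "speech"],
    meets_criterion=lambda frame, suggestion: suggestion in frame,
    snap=lambda frame: "snap: " + frame,
    select=lambda photos: photos,
)
```

Passing the language model, criterion, and camera as callables keeps the sketch independent of any particular model or hardware, which the abstract does not specify.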

First claim

Opening claim text (preview).

What is claimed is:

1. An electronic device for controlling a photographic system, the electronic device comprising: a memory storing one or more instructions; and one or more processors configured to: obtain a video stream and a user query for a target event; obtain a set of photos from the video stream; obtain at least one photoshoot suggestion based on the user query via a language model; obtain a snapped photo for the target event based on the at least one photoshoot suggestion, in response to a given video frame included in the video stream satisfying a target content criterion; and output one or more photos selected from the set of photos and the snapped photo as event photos.

2. The electronic device of claim 1, wherein the given video frame meets the target content criterion when a similarity score between a text embedding extracted from the current video frame and an image embedding extracted from the at least one photoshoot suggestion, is greater than similarity scores between each of text embeddings extracted from previous video frames within the video stream and the image embedding extracted from the at least one photoshoot suggestion.

3. The electronic device of claim 1, further comprising a first camera configured to acquire the video stream and a second camera configured to acquire the snapped photo, wherein: the at least one photoshoot suggestion comprises a plurality of photoshoot suggestions, any one or any combination of the one or more processors are configured to: extract an image embedding from the current video frame acquired at a current pose of the first camera; obtain a plurality of text embeddings from the plurality of photoshoot suggestions, respectively; compute similarity scores between the image embedding and each of the plurality of text embeddings; select a first photoshoot suggestion that has a highest similarity score, from among the similarity scores; increment a counter that is initially set for the selected first photoshoot suggestion over time; decrease the similarity score for the selected first photoshoot suggestion over time by reducing the similarity score by a value of the counter that increases over time; select a second photoshoot suggestion that initially had a second-highest similarity score and has surpassed all other photoshoot suggestions in similarity score; and adjust the current pose of the first camera to capture the selected second photoshoot suggestion.

4. The electronic device of claim 1, further comprising a first camera configured to acquire the video stream and a second camera configured to acquire the snapped photo, wherein any one or any combination of the one or more processors are configured to: extract an image embedding from the given video frame that is acquired at a current pose of the first camera; obtain a text embedding from the at least one photoshoot suggestion; acquire translation coordinates and rotation angles of a next pose of the first camera, based on a change in similarity between the image embedding and the text embedding with respect to change in each pixel in the video frame; adjust the pose of the first camera based on the translation coordinates and the rotation angles; and control the first camera to acquire a next video frame in the adjusted pose.

5. The electronic device of claim 1, further comprising a first camera configured to acquire the video stream and a second camera configured to acquire the snapped photo, wherein any one or any combination of the one or more processors are configured to: extract an image embedding from the video frame that is acquired at a current pose of the first camera; obtain a text embedding from the at least one photoshoot suggestion; acquire translation coordinates and rotation angles of a next pose of the first camera, based on a change in similarity between the image embedding and the text embedding with respect to change in camera pose parameters of the current pose of the first camera, adjust the pose of the first camera based on the translation coordinates and the rotation angles; and control the camera to acquire a next video frame in the adjusted pose.

6. The electronic device of claim 1, wherein any one or any combination of the one or more processors are configured to: construct a full query based on the user query; input the full query to the language model; acquire the at least one photoshoot suggestion as an output of the language model; and control a camera to obtain the snapped photo based on the at least one photoshoot suggestion.

7. The electronic device of claim 6, wherein any one or any combination of the one or more processors are configured to: obtain a voice signal during the target event; identify a key event descriptor based on the voice signal acquired during the target event; construct the full query based on the user query and the key event descriptor identified from the voice signal; and input the full query to the language model to acquire the least one photoshoot suggestion that reflects the identified key event descriptor.

8. The electronic device of claim 1, wherein any one or any combination of the one or more processors are configured to: identify a key event descriptor from the set of photos; construct a full query based on the key event descriptor identified from the set of photos and the user query; and input the full query to the language model to acquire the least one photoshoot suggestion that reflects the identified key event descriptor.

9. The electronic device of claim 1, wherein any one or any combination of the one or more processors are configured to: determine whether any one of the at least one photoshoot suggestion includes a photography composition directive; and discard the photoshoot suggestion including the photography composition directive.

10. The electronic device of claim 1, wherein any one or any combination of the one or more processors are configured to: determine whether to use a photo gallery application or a camera application based on device capabilities of the electronic device and the user query; based on the photo gallery application being activated, access a photo gallery of the electronic device to acquire the set of photos that has been stored in the memory; and based on the camera application being activated, acquire the set of photos and the snapped photo to be stored in the memory.

11. A method for controlling a photographic system, the method comprising: obtaining a video stream and a user query for a target event; obtaining a set of photos from the video stream; obtaining at least one photoshoot suggestion based on the user query via a language model; obtaining a snapped photo for the target event based on the at least one photoshoot suggestion, in response to a given video frame included in the video stream satisfying a target content criterion; and outputting one or more photos selected from the set of photos and the snapped photo as event photos.

12. The method of claim 11, further comprising: determining that the given video frame satisfies the target content criterion when a similarity score between a text embedding extracted from the given video frame and an image embedding extracted from the at least one photoshoot suggestion, is greater than similarity scores between each of text embeddings extracted from previous video frames within the video stream and the image embedding extracted from the at least one photoshoot suggestion.

13. The method of claim 11, wherein: the video stream is acquired by a first camera, and the snapped photo is acquired by a second camera, t
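
The scoring rules in claims 2 and 3 may be easier to follow in code. The sketch below is one illustrative reading, not the patented implementation. It assumes cosine similarity (the claims do not name a similarity measure), treats embeddings as plain lists of numbers, and introduces hypothetical names (`cosine`, `frames_meeting_criterion`, `pick_over_time`, `decay`). Keeping a counter for every suggestion that gets selected, rather than only the first, is also an assumption.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def frames_meeting_criterion(
    frame_embeddings: Sequence[Sequence[float]],
    suggestion_embedding: Sequence[float],
) -> List[int]:
    """Claim 2 reading: a frame qualifies when it outscores every earlier frame."""
    best_so_far = float("-inf")
    hits: List[int] = []
    for index, embedding in enumerate(frame_embeddings):
        score = cosine(embedding, suggestion_embedding)
        # The first frame qualifies trivially because it has no predecessors.
        if score > best_so_far:
            hits.append(index)
            best_so_far = score
    return hits


def pick_over_time(scores: Sequence[float], steps: int, decay: float = 0.25) -> List[int]:
    """Claim 3 reading: penalize a selected suggestion by its growing counter."""
    counters = [0] * len(scores)
    picks: List[int] = []
    for _ in range(steps):
        # Effective score = initial similarity minus a counter-based penalty.
        effective = [s - decay * c for s, c in zip(scores, counters)]
        chosen = max(range(len(scores)), key=lambda i: effective[i])
        picks.append(chosen)
        counters[chosen] += 1  # the counter grows while a suggestion stays selected
    return picks


# Toy inputs, made up for illustration.
hits = frames_meeting_criterion([[0, 1], [1, 1], [1, 2], [2, 0.1]], [1, 0])
picks = pick_over_time([0.9, 0.6, 0.2], steps=4)
```

With these toy inputs, `hits` is `[0, 1, 3]` and `picks` is `[0, 0, 1, 0]`: the top suggestion is held for two steps, then the runner-up overtakes it once the penalty has accumulated.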

Assignees

Samsung Electronics Co Ltd

Inventors

Classifications

G06F40/30 · Semantic analysis
B25J13/003 · by means of an audio-responsive input (audible safety signals B25J19/061)
G06T2207/10016 · Video; Image sequence
G06T7/70 · Determining position or orientation of objects or cameras (camera calibration G06T7/80)
B25J19/023 · including video camera means

Patent family

Related publications grouped by family.

View patent family 90470253

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12452531B2 cover?: An electronic device for controlling a photographic system may obtain a video stream and a user query for a target event, obtain a set of photos from the video stream, obtain at least one photoshoot suggestion based on the user query via a language model, obtain a snapped photo for the target event based on the at least one photoshoot suggestion, in response to a given video frame included in t…
Who is the assignee on this patent?: Samsung Electronics Co Ltd

Related patents

Video processing method and electronic device

Systems and methods implementing a machine learning architecture for video processing

In-game dynamic camera angle adjustment

Conditional camera control via automated assistant commands

Weakly Supervised Natural Language Localization Networks

Voice directed context sensitive visual search