Method and apparatus with object tracking

US12430776B2 · US · B2

Patent metadata
Field	Value
Publication number	US-12430776-B2
Application number	US-202318355725-A
Country	US
Kind code	B2
Filing date	Jul 20, 2023
Priority date	Jul 29, 2022
Publication date	Sep 30, 2025
Grant date	Sep 30, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method and apparatus with object tracking is provided. The method includes generating a mixed filter by fusing a short-term filter with a long-term filter; and performing object tracking on a current frame image based on the mixed filter. The short-term filter is dependent on a prediction of the current frame image in a video sequence, and the long-term filter is a previously generated long-term filter or is generated by optimizing the previously generated long-term filter based on an object template feature pool.
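
As a rough illustration of the two steps named in the abstract, the sketch below fuses two filters and then locates a target by correlating the mixed filter with a signal. It is a toy under stated assumptions, not the patented implementation: the weighted-sum fusion rule, the 1-D signal standing in for deep image features, the fixed 0.5/0.5 weights, and the names `fuse_filters`, `correlate`, and `track` are all hypothetical.

```python
from typing import List


def fuse_filters(short: List[float], long: List[float],
                 w_short: float, w_long: float) -> List[float]:
    """Weighted sum of two equally sized filters (assumed fusion rule)."""
    if len(short) != len(long):
        raise ValueError("filters must have the same length")
    return [w_short * s + w_long * l for s, l in zip(short, long)]


def correlate(signal: List[float], kernel: List[float]) -> List[float]:
    """Valid-mode cross-correlation: one response value per kernel offset."""
    n, k = len(signal), len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(n - k + 1)]


def track(signal: List[float], mixed: List[float]) -> int:
    """Return the offset where the mixed filter responds most strongly."""
    response = correlate(signal, mixed)
    return max(range(len(response)), key=response.__getitem__)


# Hypothetical toy data: two 3-tap filters and a 7-sample "frame".
short_term = [0.0, 1.0, 0.0]
long_term = [1.0, 2.0, 1.0]
mixed = fuse_filters(short_term, long_term, 0.5, 0.5)

frame = [0.0, 0.0, 1.0, 3.0, 1.0, 0.0, 0.0]
position = track(frame, mixed)
```

Here `position` is the offset of the strongest response in the toy frame. In a real tracker the filters and the frame would be multi-channel feature maps, and the weights would come from a quality evaluation rather than being fixed.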

First claim

Opening claim text (preview).

What is claimed is:

1. A processor-implemented method, comprising: generating a mixed filter by fusing a short-term filter with a long-term filter; and performing object tracking on a current frame image based on the mixed filter, wherein the short-term filter is dependent on a prediction of the current frame image in a video sequence, and the long-term filter is a previously generated long-term filter or is generated by optimizing the previously generated long-term filter based on an object template feature pool.

2. The method of claim 1, further comprising, prior to the generating of the mixed filter, predicting the short-term filter based on a first frame image of the video sequence, the current frame image and an auxiliary frame image of the video sequence, wherein the auxiliary frame image is an image frame that has a determined greater tracking success confidence than a first threshold value and is closest to the current frame image in time sequence.

3. The method of claim 2, wherein the predicting of the short-term filter comprises: extracting features, through a feature extraction network, for a first search region from the first frame image, an auxiliary search region from the auxiliary frame image, and a current search region from the current frame image, and extracting a first deep feature of the first search region, an auxiliary deep feature of the auxiliary search region, and a current deep feature of the current search region; generating an object state encoding vector by performing object state encoding on the first deep feature, a first bounding box of the first frame image with respect to the object, the auxiliary deep feature, and an auxiliary bounding box of the auxiliary frame image with respect to the object; obtaining a current frame encoding vector by performing encoding on the current deep feature; generate a hidden feature using a trained transformer model provided an input based on the object state encoding vector and the current frame encoding vector to thus; and generating the short-term filter by linearly transforming the hidden feature, wherein the first search region is determined according to the first bounding box, the auxiliary search region is determined according to the auxiliary bounding box, and the current search region is determined according to a predicted bounding box of a predicted object based on N number of frame images prior to the current frame image, wherein N is an integer greater than or equal to 1.

4. The method of claim 1, further comprising, prior to the generating of the mixed filter, in response to the current frame image being determined to be an image frame at a predetermined position in the video sequence, generating the long-term filter by optimizing the previously obtained long-term filter based on the object template feature pool; or in response to the current frame image being determined to not be an image frame at the predetermined position in the video sequence, generating the previously obtained long-term filter as the long-term filter.

5. The method of claim 1, wherein the optimizing of the previously obtained long-term filter comprises: extracting a predetermined number of deep features and bounding boxes of the object corresponding to respective ones of accumulated deep features from the object template feature pool and determining the extracted deep features and bounding boxes to be a filter training set; and training and/or optimizing, based on the filter training set, the previously obtained long-term filter through a filter optimization algorithm.

6. The method of claim 1, wherein the generating of the mixed filter by fusing the short-term filter with the long-term filter comprises: generating a short-term object positioning response map and a long-term object positioning response map by respectively performing correlation processing on the current frame image using the short-term filter and the long-term filter; and generating the mixed filter by fusing the short-term filter with the long-term filter according to the short-term object positioning response map and the long-term object positioning response map.

7. The method of claim 6, wherein the generating of the mixed filter further comprises: evaluating short-term map quality of the short-term object positioning response map, and long-term map quality of the long-term object positioning response map; determining a mixture weight of the short-term filter and a mixture weight of the long-term filter according to a result of comparing a second predetermined threshold value to the short-term map quality and the long-term map quality; and generating the mixed filter by fusing the short-term filter with the long-term filter according to the mixture weight of the short-term filter and the mixture weight of the long-term filter.

8. The method of claim 7, wherein the determining of the mixture weight of the short-term filter and the mixture weight of the long-term filter comprises: in response to the short-term map quality being determined greater than or equal to the second predetermined threshold value and the long-term map quality is less than the second predetermined threshold value, setting the mixture weight of the short-term filter as 1 and the mixture weight of the long-term filter as 0; in response to the short-term map quality being determined less than the second predetermined threshold value and the long-term map quality is greater than or equal to the second predetermined threshold value, setting the mixture weight of the short-term filter as 0 and the mixture weight of the long-term filter as 1; in response to both the mixture weights of the short-term filter and the long-term map being determined to have respective qualities that are less than the second predetermined threshold value, setting each of the mixture weights as a weight value corresponding to a previously obtained mixed filter; or in response to both the mixture weights of the short-term filter and the long-term map being determined to have respective qualities that are greater than or equal to the second predetermined threshold value, setting each of the mixture weights as a mixture weight of a normalized output of a Softmax activation function of the short-term map quality and the long-term map quality.

9. The method of claim 6, wherein the generating of the mixed filter further comprises: generating a mixture weight of the short-term filter and a mixture weight of the long-term filter by using a convolutional neural network and a normalization function, according to the short-term object positioning response map and the long-term object positioning response map; and generating the mixed filter by fusing the short-term filter with the long-term filter according to the mixture weight of the short-term filter and the mixture weight of the long-term filter.

10. The method of claim 9, wherein the generating of the mixture weight of the short-term filter and the mixture weight of the long-term filter further comprises: generating a mixed response map by mixing and processing the short-term object positioning response map and the long-term object positioning response map; extracting a feature from the mixed response map using the convolutional neural network, and generating a mixture weight vector by linearly transforming the extracted feature using a linear transformation layer; and generating the mixture weight of the short-term filter and the mixture weight of the long-term filter by normalizing the mixture weight vector according to a Softmax activation function.

11. The method of claim 1, wherein the performing of the object tracking further comprises: g
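
The four-way rule in claims 7 and 8 may be easier to follow as a small function. The following is a minimal sketch under assumptions, not the patented implementation: it treats each map quality as a single scalar score (the preview does not say how quality is measured), and the names `mixture_weights`, `q_short`, `q_long`, `threshold`, and `previous` are hypothetical.

```python
import math
from typing import Tuple

Weights = Tuple[float, float]  # (short-term weight, long-term weight)


def mixture_weights(q_short: float, q_long: float,
                    threshold: float, previous: Weights) -> Weights:
    """Choose mixture weights from two scalar response-map quality scores."""
    short_ok = q_short >= threshold
    long_ok = q_long >= threshold

    if short_ok and not long_ok:
        return (1.0, 0.0)      # only the short-term map is good enough
    if long_ok and not short_ok:
        return (0.0, 1.0)      # only the long-term map is good enough
    if not short_ok and not long_ok:
        return previous        # both poor: reuse the earlier weights

    # Both at or above the threshold: softmax over the two quality scores.
    # Subtracting the maximum keeps exp() numerically stable.
    m = max(q_short, q_long)
    e_short = math.exp(q_short - m)
    e_long = math.exp(q_long - m)
    total = e_short + e_long
    return (e_short / total, e_long / total)


# Hypothetical usage with a threshold of 0.5 and earlier weights (0.3, 0.7).
w_short, w_long = mixture_weights(0.9, 0.6, 0.5, (0.3, 0.7))
```

With both scores above the threshold, the softmax branch gives the larger weight to the filter whose response map scored higher, and the two weights sum to 1.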

Assignees

Samsung Electronics Co Ltd

Inventors

Classifications

G06T2207/10016
Video; Image sequence · CPC title
G06T2207/20081
Training; Learning · CPC title
G06T2207/20084
Artificial neural networks [ANN] · CPC title
G06T2207/20076
Probabilistic image processing · CPC title
G06T7/215
Motion-based segmentation · CPC title

Patent family

Related publications grouped by family.

View patent family 89769277

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12430776B2 cover? A method and apparatus with object tracking is provided. The method includes generating a mixed filter by fusing a short-term filter with a long-term filter; and performing object tracking on a current frame image based on the mixed filter. The short-term filter is dependent on a prediction of the current frame image in a video sequence, and the long-term filter is a previously generated long-t…
Who is the assignee on this patent? Samsung Electronics Co Ltd
What technology area does this patent fall under? Primary CPC classification G06T7/248. Mapped technology areas include Physics.
When was this patent published? Publication date Sep 30, 2025 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb? We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Related patents

Method and apparatus with adaptive object tracking

Systems and methods for object tracking

System and method of hybrid tracking for match moving
