Equidistant-temporal aggregation for moving object segmentation
US-2024425042-A1 · Dec 26, 2024 · US
US10049277B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10049277-B2 |
| Application number | US-201414568337-A |
| Country | US |
| Kind code | B2 |
| Filing date | Dec 12, 2014 |
| Priority date | Feb 7, 2007 |
| Publication date | Aug 14, 2018 |
| Grant date | Aug 14, 2018 |
Official abstract text for this publication.
A method and apparatus for tracking an object, and a method and apparatus for calculating object pose information are provided. The method of tracking the object obtains object feature point candidates by using a difference between pixel values of neighboring frames. A template matching process is performed in a predetermined region having the object feature point candidates as the center. Accordingly, it is possible to reduce a processing time needed for the template matching process. The method of tracking the object is robust in terms of sudden changes in lighting and partial occlusion. In addition, it is possible to track the object in real time. In addition, since the pose of the object, the pattern of the object, and the occlusion of the object are determined, detailed information on action patterns of the object can be obtained in real time.
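The abstract's two-stage idea (frame differencing to propose candidates, then template matching only near those candidates) can be illustrated with a minimal pure-Python sketch. This is not the patent's implementation: the function names, the sum-of-squared-differences score, the `threshold` and `radius` parameters, and the choice to treat each candidate as the center of a window of patch top-left corners are all assumptions made for illustration.

```python
def difference_candidates(prev, curr, threshold):
    """Return (row, col) positions whose absolute frame difference exceeds threshold."""
    return [
        (r, c)
        for r, (prev_row, curr_row) in enumerate(zip(prev, curr))
        for c, (p, q) in enumerate(zip(prev_row, curr_row))
        if abs(q - p) > threshold
    ]


def ssd(frame, template, top, left):
    """Sum of squared differences between the template and one frame patch."""
    return sum(
        (frame[top + i][left + j] - template[i][j]) ** 2
        for i in range(len(template))
        for j in range(len(template[0]))
    )


def match_near(frame, template, center, radius):
    """Score only patch positions within `radius` of `center` (hypothetical helper).

    Returns (score, (top, left)) for the best patch, or None if no patch fits.
    """
    th, tw = len(template), len(template[0])
    rows, cols = len(frame), len(frame[0])
    best = None
    for top in range(max(0, center[0] - radius), min(rows - th, center[0] + radius) + 1):
        for left in range(max(0, center[1] - radius), min(cols - tw, center[1] + radius) + 1):
            score = ssd(frame, template, top, left)
            if best is None or score < best[0]:
                best = (score, (top, left))
    return best


def track(prev, curr, template, threshold=2, radius=1):
    """Match the template only around difference candidates; return the best hit."""
    hits = [
        match_near(curr, template, cand, radius)
        for cand in difference_candidates(prev, curr, threshold)
    ]
    hits = [h for h in hits if h is not None]
    return min(hits) if hits else None
```

Because the search is limited to small windows around changed pixels, the number of patch comparisons grows with the number of candidates rather than with the frame area, which is the source of the reduced processing time the abstract describes.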
Opening claim text (preview).
What is claimed is:
1. A method of tracking an object comprising: detecting predetermined object feature points in a previous frame; estimating object feature points of a present frame by using a difference between a pixel value of the previous frame and a pixel value of the present frame; estimating whether the estimated object feature points of the present frame are occluded based on a similarity between the estimated object feature points of the present frame and the detected object feature points of the previous frame, wherein the similarity is a distance between the estimated object feature points in the present frame and object features obtained by performing an affine transformation on the detected object feature points of the previous frame; and tracking the object based on the detected object feature points of the previous frame corresponding to the occluded object feature points of the present frame when the estimated object feature points of the present frame are occluded.
2. The method of claim 1, wherein the object feature points of the present frame are estimated by calculating an optical flow depending on the detected object feature points on the basis of the difference between the pixel value of the previous frame and the pixel value of the present frame and using the calculated optical flow.
3. The method of claim 2, wherein the optical flow is calculated by calculating the difference between the pixel value of the previous frame and the pixel value of the present frame based on an LKT (Lucas-Kanade-Tomasi) algorithm, and the object feature points of the present frame are estimated by using the calculated optical flow.
4. The method of claim 1, wherein the object feature points of the previous frame are detected by using AAMs (Active Appearance Models) or ASMs (Active Shape Models).
5. A non-transitory computer-readable recording medium having embodied thereon a computer program for executing the method of claim 1.
6. A method of calculating object pose information comprising: detecting predetermined object feature points in a previous frame; estimating object feature points of a present frame by using a difference between a pixel value of the previous frame and a pixel value of the present frame; calculating object pose information by using coordinate values detected in the previous frame and coordinate values of the object feature points estimated in the present frame; calculating an affine transformation between the coordinate values of the object feature points detected in the previous frame and the coordinate values of the object feature points estimated in the present frame; calculating the object feature points of the present frame by using the coordinate values of the object feature points detected in the previous frame and the calculated affine transformation; and calculating a similarity between the estimated object feature points of the present frame and the calculated object feature points and estimating whether the estimated object feature points of the present frame are occluded on the basis of the calculated similarity, wherein the similarity is a distance between the estimated object feature points in the present frame and the calculated object feature points of the present frame obtained by using the calculated affine transformation, wherein the object pose information further includes occlusion information on whether the object feature points in the present frame are occluded.
7. The method of claim 6, wherein the object feature points are estimated by estimating 2D (X-Y) locations of the object feature points, and the object pose information is calculated by further considering a 3D model of a general object and a table in which information (Z) on heights of the general object is stored.
8. The method of claim 6, wherein the object pose information includes rotation matrix information on directions of the object.
9. The method of claim 6, wherein the object feature points includes feature points related to eyes and a mouth selected from a human face, and the object pose information further includes information on blinking of the eyes or information on a shape of the mouth as pattern information.
10. The method of claim 6, wherein the object feature points include feature points related to an eye selected from a human face, wherein the method of calculating object pose information further comprises calculating an edge of an eye region by applying a Sobel operator to the present frame and the previous frame, generating a gradient map depending on the calculated edge of the eye region, and estimating whether the eye blinks according to a similarity or correlation between the gradient map of the present frame and the gradient map of the previous frame, and wherein the object pose information further includes pattern information on the blinking of the eye.
11. The method of claim 6, wherein the object feature points include feature points related to a mouth selected from a human face, wherein the method of calculating object pose information further comprises converting a predetermined image including the feature points related to the mouth into a binary level image and estimating an ellipse function that represents an edge of the mouth by using an ellipse fitting process, and wherein the object pose information further includes pattern information on a shape of the mouth according to the estimated ellipse function.
12. A non-transitory computer-readable recording medium having embodied thereon a computer program for executing the method of claim 6.
13. An apparatus for tracking a video image object, including at least one processor, configured to initialize, using the at least one processing device, a tracking process by detecting object feature points from a video image, estimate object feature points of a present frame by using a difference between a pixel value of a previous frame and a pixel value of the present frame, estimate whether the estimated object feature points of the present frame are occluded on the basis of a similarity between the estimated object feature points of the present frame and the detected object feature points of the video image from a previous frame, and track the object based on the detected object feature points of the previous frame corresponding to occluded object feature points of the present frame when the estimated object feature points of the present frame are occluded, wherein the similarity is a distance between the estimated object feature points in the present frame and object features obtained by performing an affine transformation on the detected object feature points of the previous frame.
14. An apparatus for calculating object pose information, including at least one processor, configured to initialize a tracking process by detecting object feature points from a video image, detect predetermined object feature points in a previous frame, estimate object feature points of a present frame by using a difference between a pixel value of a previous frame and a pixel value of the present frame, calculate object pose information by using coordinate values of the object feature points detected by initializing the tracking process and coordinate values of the object feature points of the present frame, generate rotation matrix information on directions of an object, generate occlusion information on whether the object feature points of the present frame are occluded, and generate pattern information on blinking of eyes and a shape of a mouth, wherein the at least one processor estimates whether the estimated object feature points of the pre
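The occlusion test recited in claims 1 and 6 (fit an affine transformation from the previous-frame points to the estimated present-frame points, map the previous points through it, and use the distance to each estimated point as the similarity) can be sketched as follows. This is an illustrative reading, not the patented implementation: the ordinary least-squares fit, the Cramer's-rule solver, the function names, and the fixed `threshold` are assumptions. In particular, the fit here includes every point, so a badly displaced point also shifts the fit slightly; a robust estimator might be preferable in practice.

```python
import math


def _det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def _solve3(m, rhs):
    """Solve a 3x3 linear system with Cramer's rule."""
    det = _det3(m)
    if abs(det) < 1e-12:
        raise ValueError("feature points are collinear; affine fit is undefined")
    solution = []
    for col in range(3):
        replaced = [row[:] for row in m]
        for r in range(3):
            replaced[r][col] = rhs[r]
        solution.append(_det3(replaced) / det)
    return solution


def fit_affine(src, dst):
    """Least-squares affine map src -> dst, returned as ((a, b, c), (d, e, f))."""
    rows = [(x, y, 1.0) for x, y in src]
    normal = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    rhs_x = [sum(r[i] * d[0] for r, d in zip(rows, dst)) for i in range(3)]
    rhs_y = [sum(r[i] * d[1] for r, d in zip(rows, dst)) for i in range(3)]
    return tuple(_solve3(normal, rhs_x)), tuple(_solve3(normal, rhs_y))


def apply_affine(affine, point):
    """Map one (x, y) point through the affine parameters."""
    (a, b, c), (d, e, f) = affine
    x, y = point
    return (a * x + b * y + c, d * x + e * y + f)


def occlusion_flags(prev_pts, est_pts, threshold):
    """Flag a point as occluded when it lies farther than `threshold`
    from the affine-mapped position of its previous-frame counterpart."""
    affine = fit_affine(prev_pts, est_pts)
    flags = []
    for p, q in zip(prev_pts, est_pts):
        px, py = apply_affine(affine, p)
        flags.append(math.hypot(q[0] - px, q[1] - py) > threshold)
    return flags
```

For a flagged point, the claims continue tracking from the corresponding previous-frame feature point; under the assumptions above, one plausible choice is to substitute `apply_affine(affine, p)` for the unreliable estimate.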
in video content (extracting overlay text G06V20/62; video retrieval G06F16/70; processing of video elementary streams in video servers H04N21/234; processing of video elementary streams in video clients H04N21/44) · CPC title
using feature-based methods, e.g. the tracking of corners or segments · CPC title
Matching criteria, e.g. proximity measures · CPC title
involving reference images or patches · CPC title
Registration of image sequences · CPC title