US12148182B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12148182-B2 |
| Application number | US-202117476920-A |
| Country | US |
| Kind code | B2 |
| Filing date | Sep 16, 2021 |
| Priority date | Sep 29, 2020 |
| Publication date | Nov 19, 2024 |
| Grant date | Nov 19, 2024 |
Official abstract text for this publication.
A method, an apparatus, and an electronic device for estimating a pose of an object include determining a confidence of a depth image of an object based on a color image and the depth image of the object, estimating a pose of the object based on a three-dimensional (3D) keypoint in response to the depth image being reliable, and estimating the pose of the object based on a two-dimensional (2D) keypoint in response to the depth image being unreliable.
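The branching described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the mean-over-mask aggregation, the `threshold` value of 0.5, and the names `mean_confidence`, `estimate_pose`, `pose_from_3d`, and `pose_from_2d` are all assumptions introduced for illustration, and the two pose estimators are passed in as placeholder callables.

```python
from typing import Callable, List, Sequence, Tuple

Grid = List[List[float]]   # per-pixel values, row-major (hypothetical layout)
Pose = Sequence[float]     # placeholder pose representation


def mean_confidence(conf_map: Grid, mask: Grid) -> float:
    """Average the per-pixel depth confidence over one object's mask.

    Averaging is an assumption; the source only says the per-object
    confidence is determined from a segmentation image and a depth
    confidence image.
    """
    values = [
        c
        for conf_row, mask_row in zip(conf_map, mask)
        for c, m in zip(conf_row, mask_row)
        if m
    ]
    return sum(values) / len(values) if values else 0.0


def estimate_pose(
    conf_map: Grid,
    mask: Grid,
    pose_from_3d: Callable[[], Pose],
    pose_from_2d: Callable[[], Pose],
    threshold: float = 0.5,  # hypothetical reliability cutoff
) -> Tuple[str, Pose]:
    """Use the 3D-keypoint path when depth looks reliable, else the 2D path."""
    if mean_confidence(conf_map, mask) >= threshold:
        return "3d", pose_from_3d()
    return "2d", pose_from_2d()


if __name__ == "__main__":
    conf = [[0.9, 0.8], [0.1, 0.2]]
    top_object = [[1, 1], [0, 0]]
    branch, pose = estimate_pose(conf, top_object, lambda: [0.0] * 6, lambda: [1.0] * 6)
    print(branch, pose)
```

Passing the estimators as callables keeps the sketch independent of any particular keypoint network; only the selected branch is evaluated.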
Opening claim text (preview).
What is claimed is:
1. An object pose estimation method, the method comprising: determining a confidence of a depth image of an object based on a feature determined based on a color image and the depth image of the object; estimating a pose of the object based on a three-dimensional (3D) keypoint in response to the depth image being reliable; and estimating the pose of the object based on a two-dimensional (2D) keypoint in response to the depth image being unreliable.
2. The method of claim 1, wherein the determining of the confidence of the depth image comprises: extracting an image feature based on the color image or based on the color image and the depth image, extracting a point cloud feature based on the depth image, acquiring a fusion feature by fusing the image feature and the point cloud feature, and determining the confidence of the depth image based on the fusion feature.
3. The method of claim 2, wherein the determining of the confidence of the depth image based on the fusion feature comprises: acquiring an object instance segmentation image and a depth confidence image based on the fusion feature; and determining the confidence of the depth image corresponding to each target object in the color image based on the object instance segmentation image and the depth confidence image.
4. The method of claim 1, wherein the determining of the confidence of the depth image comprises extracting an image feature based on the color image and the depth image and determining the confidence of the depth image based on the image feature.
5. The method of claim 4, wherein the determining of the confidence of the depth image based on the image feature comprises: acquiring an object instance segmentation image and a depth confidence image based on the image feature; and determining the confidence of the depth image corresponding to each target object in the color image based on the object instance segmentation image and the depth confidence image.
6. The method of claim 5, wherein the acquiring of the object instance segmentation image and the depth confidence image based on the image feature comprises: acquiring a region image feature of an image region corresponding to the each target object based on the image feature; and determining a depth confidence image of a target object based on a region image feature corresponding to the target object and acquiring the object instance segmentation image based on the region image feature corresponding to the each target object.
7. The method of claim 1, further comprising: acquiring a first appearance feature of each target object and a geometric relationship feature between the respective target objects based on the color image and the depth image; and determining a second appearance feature of a target object based on a first appearance feature of the target object, a first appearance feature of another target object, and a geometric relationship feature between the target object and the other target object, with respect to the each target object.
8. The method of claim 7, wherein the estimating of the pose of the object based on the 3D keypoint comprises estimating a pose of the each target object based on a fusion feature and a second appearance feature of the each target object.
9. The method of claim 7, wherein the estimating of the pose of the object based on the 3D keypoint comprises estimating a pose of the each target object based on an image feature and a second appearance feature of the each target object.
10. The method of claim 7, wherein the acquiring of the first appearance feature of the each target object and the geometric relationship feature between the respective target objects based on the color image and the depth image comprises: extracting an image feature based on the color image or based on the color image and the depth image, extracting a point cloud feature based on the depth image, acquiring a fusion feature by fusing the image feature and the point cloud feature, acquiring the first appearance feature of the each target object and an object instance segmentation image based on the fusion feature, and acquiring a geometric relationship feature between the respective target objects based on the object instance segmentation image.
11. The method of claim 7, wherein the acquiring of the first appearance feature of the each target object and the geometric relationship feature between the respective target objects based on the color image and the depth image comprises: extracting an image feature based on the color image and the depth image, acquiring a region image feature corresponding to an image region of the each target object based on the image feature, acquiring the first appearance feature of the each target object and a corresponding object detection result based on the region image feature corresponding to the image region of the each target object, and acquiring the geometric relationship feature between the respective target objects based on the object detection result of each target object.
12. The method of claim 1, further comprising: detecting whether a target object or a target pose first appears in a video frame and determining whether the video frame is an initial frame.
13. The method of claim 12, wherein the detecting of whether the target object or the target pose first appears in the video frame and the determining of whether the video frame is the initial frame comprises: acquiring an image bounding box of each target object in a corresponding video frame; matching the image bounding box of each target object and an image bounding box corresponding to each object of a pose result list; in response to a matching target object being present in the pose result list, comparing first point cloud data of the image bounding box corresponding to each target object in the corresponding video frame and a second point cloud data frame corresponding to each target object in a previous video frame of the corresponding video frame and determining whether a difference is present between the first point cloud data and second point cloud data, and determining that a pose of an object corresponding to the target object first appears, in response to the difference being present; and determining that the target object first appears, in response to the matching target object being absent in the pose result list.
14. The method of claim 12, further comprising: acquiring a motion parameter corresponding to the video frame and determining a pose result corresponding to the video frame based on the motion parameter and an object pose result of an initial frame corresponding to the video frame, in response to the video frame being a non-initial frame; and updating the object pose result of the initial frame corresponding to the video frame in a pose result list based on the pose result corresponding to the video frame.
15. A non-transitory computer-readable storage medium storing instructions that, when executed by a processor, cause the processor to perform the method of claim 1.
16. An electronic device comprising: a processor configured to determine a confidence of a depth image of an object based on a feature determined based on a color image and the depth image of the object; estimate a pose of the object based on a three-dimensional (3D) keypoint in response to the depth image being reliable; and estimate the pose of the object based on a two-dimensional (2D) keypoint in response to the depth image being unreliable.
17. The electronic device of claim 16, wher
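The initial-frame test of claim 13 can be sketched as follows. The claim does not say how bounding boxes are matched or how a point cloud "difference" is measured, so this sketch assumes intersection-over-union matching and a centroid-distance comparison; the names `iou`, `cloud_differs`, and `is_initial_frame`, the dictionary keys, and both thresholds are hypothetical.

```python
from math import dist
from typing import Dict, List, Sequence, Tuple

Box = Tuple[float, float, float, float]   # (x1, y1, x2, y2)
Cloud = Sequence[Tuple[float, float, float]]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def centroid(points: Cloud) -> Tuple[float, ...]:
    """Mean of a non-empty list of 3D points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


def cloud_differs(current: Cloud, previous: Cloud, tol: float = 0.01) -> bool:
    """Assumed difference test: centroids more than tol apart."""
    return dist(centroid(current), centroid(previous)) > tol


def is_initial_frame(
    box: Box,
    cloud: Cloud,
    pose_results: List[Dict],   # entries with 'box' and 'cloud' from the previous frame
    iou_min: float = 0.5,       # hypothetical matching threshold
    tol: float = 0.01,          # hypothetical point cloud tolerance
) -> Tuple[bool, str]:
    """Return (initial?, reason) for one detected object in the current frame."""
    match = max(pose_results, key=lambda r: iou(box, r["box"]), default=None)
    if match is None or iou(box, match["box"]) < iou_min:
        return True, "new object"   # no match in the pose result list
    if cloud_differs(cloud, match["cloud"], tol):
        return True, "new pose"     # matched object, but its points changed
    return False, "tracked"         # non-initial frame


if __name__ == "__main__":
    previous = [{"box": (0, 0, 10, 10), "cloud": [(0, 0, 1), (1, 0, 1)]}]
    print(is_initial_frame((0, 0, 10, 10), [(0, 0, 2), (1, 0, 2)], previous))
```

For frames classified as non-initial, claim 14 derives the pose from a motion parameter and the stored initial-frame pose instead of rerunning the full estimator; that step is not shown here.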
Analysis of motion (motion estimation for coding, decoding, compressing or decompressing digital video signals H04N19/43, H04N19/51) · CPC title
Image fusion; Image merging · CPC title
Range image; Depth image; 3D point clouds · CPC title
Color image · CPC title
Video; Image sequence · CPC title