Method and apparatus with object pose estimation

US12347141B2 · US · B2

Patent metadata
Publication number: US-12347141-B2
Application number: US-202117537729-A
Country: US
Kind code: B2
Filing date: Nov 30, 2021
Priority date: Dec 18, 2020
Publication date: Jul 1, 2025
Grant date: Jul 1, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method with object pose estimation includes: obtaining an instance segmentation image and a normalized object coordinate space (NOCS) map by processing an input single-frame image using a deep neural network (DNN); obtaining a two-dimensional and three-dimensional (2D-3D) mapping relationship based on the instance segmentation image and the NOCS map; and determining a pose of an object instance in the input single-frame image based on the 2D-3D mapping relationship.
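As a reading aid, the 2D-3D mapping step in the abstract can be sketched in a few lines of Python. This is a minimal sketch, not the patented implementation: it assumes the instance segmentation image is a grid of integer instance labels and the NOCS map is a grid of per-pixel (x, y, z) triples. The data layout and the name `build_2d3d_mapping` are hypothetical and do not come from the patent.

```python
from typing import List, Tuple

Pixel = Tuple[int, int]                 # (u, v) image coordinates
NocsPoint = Tuple[float, float, float]  # (x, y, z) in normalized object space


def build_2d3d_mapping(
    instance_mask: List[List[int]],
    nocs_map: List[List[NocsPoint]],
    instance_id: int,
) -> List[Tuple[Pixel, NocsPoint]]:
    """Pair every pixel of one object instance with its predicted NOCS point."""
    mapping = []
    for v, row in enumerate(instance_mask):
        for u, label in enumerate(row):
            if label == instance_id:
                mapping.append(((u, v), nocs_map[v][u]))
    return mapping


# Toy 2x3 "image": label 0 is background, label 1 is one object instance.
mask = [
    [0, 1, 1],
    [0, 0, 1],
]
nocs = [
    [(0.0, 0.0, 0.0), (0.2, 0.5, 0.1), (0.3, 0.5, 0.1)],
    [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.3, 0.6, 0.1)],
]
pairs = build_2d3d_mapping(mask, nocs, instance_id=1)
```

Each entry of `pairs` links a 2D pixel to a 3D point in the object's normalized coordinate space, which is the kind of correspondence the pose determination step consumes.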

First claim

Opening claim text (preview).

What is claimed is:

1. A method with object pose estimation, comprising: obtaining an instance segmentation image and a normalized object coordinate space (NOCS) map by processing an input single-frame image using a deep neural network (DNN); generating a two-dimensional and three-dimensional (2D-3D) mapping relationship between the instance segmentation image and the NOCS map based on 2D coordinates of a pixel point in the instance segmentation image and 3D coordinates of NOCS point corresponding to the pixel point in the NOCS map; and determining a pose of an object instance in the single-frame image based on the 2D-3D mapping relationship.

2. The method of claim 1, further comprising: obtaining a pixel coordinate error map by processing the single-frame image using the DNN, wherein the obtaining of the 2D-3D mapping relationship comprises: constructing a preliminary 2D-3D mapping relationship of the object instance by obtaining a pixel point in the object instance in the single-frame image and a NOCS point of the pixel point using the instance segmentation image and the NOCS map; and obtaining the 2D-3D mapping relationship by removing abnormal 2D-3D mapping from the preliminary 2D-3D mapping relationship using the pixel coordinate error map.

3. The method of claim 2, wherein each error value among error values of the pixel coordinate error map represents a difference between a predicted NOCS coordinate value and a real NOCS coordinate value for each pixel point among pixel points of the single-frame image.

4. The method of claim 2, wherein the obtaining of the 2D-3D mapping relationship by removing the abnormal 2D-3D mapping from the preliminary 2D-3D mapping relationship using the pixel coordinate error map comprises: determining an error value greater than a preset threshold value in the pixel coordinate error map; and obtaining the 2D-3D mapping relationship by removing, from the preliminary 2D-3D mapping relationship, 2D-3D mapping corresponding to a NOCS point corresponding to the error value greater than the preset threshold value.

5. The method of claim 2, wherein the processing of the input single-frame image using the DNN comprises obtaining a multi-scale image feature by extracting a feature from the input single-frame image using a feature extraction module of the DNN.

6. The method of claim 5, wherein the obtaining of the NOCS map comprises: obtaining a single-scale image feature by fusing the multi-scale image feature using a multi-level feature fusion module of the DNN; and obtaining the NOCS map by performing a convolution on the single-scale image feature using a first convolution module of the DNN.

7. The method of claim 6, wherein the obtaining of the pixel coordinate error map by processing the input single-frame image using the DNN comprises obtaining the pixel coordinate error map by performing a convolution on the single-scale image feature using the first convolution module.

8. The method of claim 6, wherein the obtaining of the instance segmentation image comprises: obtaining a mask feature image by performing a convolution on the single-scale image feature using a second convolution module of the DNN; obtaining an object category image and a mask convolution weight for each of multiple scales through a convolution corresponding to each of multi-scale image features using a third convolution module of the DNN; obtaining a multi-scale instance mask image by performing a convolution on the mask feature image and a multi-scale mask convolution weight; and obtaining the instance segmentation image using the multi-scale instance mask image and a multi-scale object category image.

9. The method of claim 1, wherein the determining of the pose of the object instance in the input single-frame image based on the 2D-3D mapping relationship comprises: in the presence of a depth image corresponding to the input single-frame image, determining a (three-dimensional and three-dimensional) 3D-3D mapping relationship based on the 2D-3D mapping relationship and the depth image, and determining the pose and a size of the object instance using the 3D-3D mapping relationship.

10. The method of claim 1, wherein the determining of the pose of the object instance in the input single-frame image based on the 2D-3D mapping relationship comprises: in the absence of a depth image corresponding to the input single-frame image, determining a three-dimensional (3D) rotation transformation and a 3D translation transformation between a camera coordinate system and an object coordinate system using the 2D-3D mapping relationship, and determining the pose of the object instance in a preset size.

11. A method with object pose estimation, comprising: obtaining an instance segmentation image and a two-dimensional and three-dimensional (2D-3D) mapping relationship of each of frame images, using a deep neural network (DNN); calculating a camera motion parameter between two frame images among the frame images; determining a three-dimensional and three-dimensional (3D-3D) mapping relationship of a same object instance in the two frame images based on the camera motion parameter, the instance segmentation image, and the 2D-3D mapping relationship that correspond to the two frame images based on 2D coordinates of a pixel point in the instance segmentation image and 3D coordinates of NOCS point corresponding to the pixel point in NOCS map; and determining a pose and a size of the same object instance using the 3D-3D mapping relationship.

12. The method of claim 11, wherein the obtaining of the instance segmentation image and the 2D-3D mapping relationship of each of the frame images using the DNN comprises: obtaining the instance segmentation image and a normalized object coordinate space (NOCS) map by processing each of the frame images using the DNN; and obtaining the 2D-3D mapping relationship of each of the frame images based on the instance segmentation image and the NOCS map of each of the frame images.

13. The method of claim 12, further comprising: obtaining a pixel coordinate error map by processing each of the frame images using the DNN, wherein the obtaining of the 2D-3D mapping relationship of each of the frame images comprises: constructing a preliminary 2D-3D mapping relationship of the same object instance by obtaining a pixel point in the same object instance in each of the frame images and a NOCS point of the pixel point using the NOCS map and the instance segmentation image; and obtaining the 2D-3D mapping relationship by removing abnormal 2D-3D mapping from the preliminary 2D-3D mapping relationship using the pixel coordinate error map.

14. The method of claim 11, wherein the determining of the 3D-3D mapping relationship of the same object instance in the two frame images based on the camera motion parameter, the instance segmentation image, and the 2D-3D mapping relationship that correspond to the two frame images comprises: determining a corresponding relationship between pixels in the same object instance in the two frame images based on the instance segmentation image and the 2D-3D mapping relationship; obtaining three-dimensional (3D) coordinates by calculating a depth of a pixel point in the same object instance in a real scene, using the corresponding relationship between the pixels in the same object instance and the camera motion parameter; and constructing the 3D-3D mapping relationship based on the 3D coordinates of the pixel point in the same object instance in the real scene and on the 2D-3D mapping relationship.

15. An apparatus with object pose est
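
The outlier removal of claims 2 and 4 and the depth-based 3D-3D mapping of claim 9 can be illustrated with a short Python sketch. It is illustrative only and rests on assumptions the claim text does not state: a pinhole camera with intrinsics `fx`, `fy`, `cx`, `cy`, depth values in metric units with 0 meaning "no measurement", and the hypothetical function names `remove_abnormal` and `lift_to_3d3d`.

```python
from typing import List, Tuple

Pixel = Tuple[int, int]
Point3 = Tuple[float, float, float]


def remove_abnormal(
    preliminary: List[Tuple[Pixel, Point3]],
    error_map: List[List[float]],
    threshold: float,
) -> List[Tuple[Pixel, Point3]]:
    """Drop 2D-3D pairs whose predicted error exceeds the preset threshold."""
    return [
        ((u, v), p) for (u, v), p in preliminary
        if error_map[v][u] <= threshold
    ]


def lift_to_3d3d(
    mapping: List[Tuple[Pixel, Point3]],
    depth: List[List[float]],
    fx: float, fy: float, cx: float, cy: float,
) -> List[Tuple[Point3, Point3]]:
    """Back-project each pixel with its depth (assumed pinhole model)."""
    out = []
    for (u, v), nocs_pt in mapping:
        z = depth[v][u]
        if z <= 0:  # assumed convention: non-positive depth is missing
            continue
        cam_pt = ((u - cx) * z / fx, (v - cy) * z / fy, z)
        out.append((cam_pt, nocs_pt))
    return out


# Toy data: three preliminary pairs, one with a large predicted error.
preliminary = [
    ((1, 0), (0.2, 0.5, 0.1)),
    ((2, 0), (0.3, 0.5, 0.1)),
    ((2, 1), (0.3, 0.6, 0.1)),
]
error_map = [[0.0, 0.02, 0.40], [0.0, 0.0, 0.05]]
depth = [[0.0, 2.0, 2.0], [0.0, 0.0, 4.0]]

filtered = remove_abnormal(preliminary, error_map, threshold=0.1)
pairs_3d3d = lift_to_3d3d(filtered, depth, fx=2.0, fy=2.0, cx=1.0, cy=0.0)
```

The resulting camera-space and NOCS point pairs are what claim 9 calls the 3D-3D mapping relationship. The claim text shown here does not name the solver that recovers pose and size from those pairs; a similarity-transform fit is one common choice.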

Assignees

Samsung Electronics Co Ltd

Classifications

Primary CPC classification: G06T7/11

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12347141B2 cover?
A method with object pose estimation includes: obtaining an instance segmentation image and a normalized object coordinate space (NOCS) map by processing an input single-frame image using a deep neural network (DNN); obtaining a two-dimensional and three-dimensional (2D-3D) mapping relationship based on the instance segmentation image and the NOCS map; and determining a pose of an object instan…
Who is the assignee on this patent?
Samsung Electronics Co Ltd
What technology area does this patent fall under?
Primary CPC classification G06T7/11. Mapped technology areas include Physics.
When was this patent published?
Publication date Jul 1, 2025 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 10 related publications on this page (citations in our corpus or others sharing the same primary CPC).