What technology area does this patent fall under?

Primary CPC classification G06N3/08. Mapped technology areas include Physics.

When was this patent published?

Publication date: Jul 27, 2021 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).

Detecting and estimating the pose of an object using a neural network model

US11074717B2 · US · B2

Patent metadata
Field	Value
Publication number	US-11074717-B2
Application number	US-201916405662-A
Country	US
Kind code	B2
Filing date	May 7, 2019
Priority date	May 17, 2018
Publication date	Jul 27, 2021
Grant date	Jul 27, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An object detection neural network receives an input image including an object and generates belief maps for vertices of a bounding volume that encloses the object. The belief maps are used, along with three-dimensional (3D) coordinates defining the bounding volume, to compute the pose of the object in 3D space during post-processing. When multiple objects are present in the image, the object detection neural network may also generate vector fields for the vertices. A vector field comprises vectors pointing from the vertex to a centroid of the object enclosed by the bounding volume defined by the vertex. The object detection neural network may be trained using images of computer-generated objects rendered in 3D scenes (e.g., photorealistic synthetic data). Automatically labelled training datasets may be easily constructed using the photorealistic synthetic data. The object detection neural network may be trained for object detection using only the photorealistic synthetic data.
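The post-processing step described in the abstract starts by turning each belief map into candidate vertex locations. The following is a minimal sketch of that idea, not the patented implementation: it assumes a belief map is a plain 2D list of probability values and treats any pixel that exceeds a threshold and is strictly greater than its 8 neighbours as a peak. The function name `find_peaks` and the default threshold are hypothetical choices made for illustration.

```python
def find_peaks(belief, threshold=0.5):
    """Return (row, col, value) for local maxima above `threshold`.

    `belief` is a 2D list of per-pixel probability values (an assumed
    representation; the patent does not prescribe a data layout).
    """
    height, width = len(belief), len(belief[0])
    peaks = []
    for r in range(height):
        for c in range(width):
            value = belief[r][c]
            if value <= threshold:
                continue  # peaks below the threshold are ignored
            # Collect the (up to 8) neighbours, clipped at the image border.
            neighbours = [
                belief[rr][cc]
                for rr in range(max(0, r - 1), min(height, r + 2))
                for cc in range(max(0, c - 1), min(width, c + 2))
                if (rr, cc) != (r, c)
            ]
            # Strict comparison: flat plateaus produce no peak in this sketch.
            if all(value > n for n in neighbours):
                peaks.append((r, c, value))
    return peaks


# Toy 4x4 belief map with one strong peak and one weak peak.
belief = [
    [0.0, 0.1, 0.0, 0.0],
    [0.1, 0.9, 0.1, 0.0],
    [0.0, 0.1, 0.0, 0.3],
    [0.0, 0.0, 0.1, 0.0],
]
peaks = find_peaks(belief, threshold=0.5)
```

With the threshold at 0.5 only the strong peak at row 1, column 1 survives; the weaker peak at row 2, column 3 is discarded. The detected 2D vertex locations would then be passed, together with the known 3D bounding-volume coordinates, to a pose solver.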

First claim

Opening claim text (preview).

What is claimed is:
1. A computer-implemented method, comprising: receiving an image including an object; processing the image, by a neural network model, to generate a belief map corresponding to a location of a keypoint associated with the object, the belief map comprising a probability value for each pixel of the image; processing the image to generate a vector field comprising vectors pointing from a vertex of a bounding volume that encloses the object to a centroid associated with the object; and estimating a pose for the object based on the location and the vector field.
2. The computer-implemented method of claim 1, wherein the keypoint is the vertex of the bounding volume.
3. The computer-implemented method of claim 2, wherein the processing of the image to generate the belief map further comprises generating an additional belief map corresponding to an additional location of the centroid.
4. The computer-implemented method of claim 1, wherein the estimating comprises projecting the vertex and additional vertices of the bounding volume into image space and inferring a pose in three-dimensional space from perspective-n-point.
5. The method of claim 1, wherein the keypoint is the centroid and the location is a geometric center of the object.
6. The method of claim 1, wherein the vector field comprises a vector pointing toward the centroid for each pixel in the image.
7. The computer-implemented method of claim 1, further comprising detecting a vertex location corresponding to the vector field and the vertex.
8. The computer-implemented method of claim 7, wherein the vertex location is identified based on at least one peak in the probability values of the belief map.
9. The computer-implemented method of claim 8, wherein a first peak in the belief map corresponding to the vertex location is greater than a threshold value and a second peak in the belief map is less than the threshold value.
10. The computer-implemented method of claim 7, wherein the pose for the object is computed based on the detected vertex location, intrinsic parameters of a camera configured to capture the image, and dimensions of the object.
11. The computer-implemented method of claim 1, wherein the image includes a second object, and further comprising identifying a second location for a geometric center of the second object.
12. The computer-implemented method of claim 11, further comprising: determining a direction from an additional vertex to the location, the additional vertex corresponding to an additional vector field; determining an angle of the additional vector field evaluated at a vertex location detected for the additional vertex; determining a difference between the angle and the direction is greater than an angular threshold value; determining an additional direction from the additional vertex to the second location; and assigning the additional vertex to the second object when a difference between the angle and the additional direction is less than or equal to the angular threshold value.
13. The computer-implemented method of claim 11, wherein the second location is further from the additional vertex compared with the location.
14. The computer-implemented method of claim 1, further comprising: determining a direction from the vertex to the location; determining an angle of the vector field evaluated at a vertex location detected for the vertex; and assigning the vertex to the object when a difference between the angle and the direction is within an angular threshold value.
15. The computer-implemented method of claim 1, wherein the pose is a six degrees-of-freedom pose defined by a position in three-dimensional (3D) space and an orientation.
16. The computer-implemented method of claim 1, wherein the neural network model is trained using only synthetic data including a combination of domain randomized data and photorealistic data.
17. A system, comprising: a neural network model configured to: receive an image including an object; process the image to generate a belief map corresponding to a location of a keypoint associated with the object, the belief map comprising a probability value for each pixel of the image; process the image to generate a vector field comprising vectors pointing from a vertex of a bounding volume that encloses the object to a centroid associated with the object; and estimate a pose for the object based on the location and the vector field.
18. The system of claim 17, wherein the vector field comprises a vector pointing toward the centroid for each pixel in the image.
19. The system of claim 17, wherein the neural network model is further configured to: determine a direction from the vertex to the location; determine an angle of the vector field evaluated at a vertex location detected for the vertex; and assign the vertex to the object when a difference between the angle and the direction is within an angular threshold value.
20. A non-transitory, computer-readable storage medium storing instructions that, when executed by a processing unit, cause the processing unit to: receive an image including an object; process the image, by a neural network model, to generate a belief map corresponding to a location of a keypoint associated with the object, the belief map comprising a probability value for each pixel of the image; process the image to generate a vector field comprising vectors pointing from a vertex of a bounding volume that encloses the object to a centroid associated with the object; and estimate a pose for the object based on the location and the vector field.
21. A computer-implemented method, comprising: rendering a three-dimensional (3D) object of interest within a 3D scene to produce a rendered image including the object of interest; computing task-specific training data corresponding to the object of interest, wherein the task-specific training data comprises a belief map indicating a centroid location associated with the object of interest and a vector field comprising a vector pointing toward the centroid location for each pixel in the rendered image; and including the task-specific training data corresponding to the object of interest and the input image as a test pair in a synthetic training dataset for training a neural network.
22. The computer-implemented method of claim 21, wherein additional images are rendered as the object of interest is subjected to a gravitational force and interacts with other objects in the 3D scene.
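Claims 12 and 14 describe assigning a detected vertex to an object by comparing the angle of the vector field at the vertex with the direction from the vertex to a candidate centroid. The sketch below illustrates that comparison under stated assumptions: coordinates are 2D image-space pairs, angles are in radians, and candidate centroids are tried nearest-first (an ordering suggested by claim 13 but chosen here for illustration). The names `angle_difference` and `assign_vertex` are hypothetical and do not come from the patent.

```python
import math


def angle_difference(a, b):
    """Smallest absolute difference between two angles, in radians."""
    d = (a - b) % (2 * math.pi)
    return min(d, 2 * math.pi - d)


def assign_vertex(vertex_xy, field_vector, centroids, angular_threshold):
    """Return the index of the centroid this vertex is assigned to, or None.

    vertex_xy:         detected (x, y) location of the vertex
    field_vector:      (dx, dy) of the vector field evaluated at the vertex
    centroids:         list of candidate (x, y) centroid locations
    angular_threshold: maximum allowed angular difference, in radians
    """
    vx, vy = vertex_xy
    field_angle = math.atan2(field_vector[1], field_vector[0])
    # Assumption: try the nearest centroid first, then farther ones.
    order = sorted(
        range(len(centroids)),
        key=lambda i: math.hypot(centroids[i][0] - vx, centroids[i][1] - vy),
    )
    for index in order:
        cx, cy = centroids[index]
        direction = math.atan2(cy - vy, cx - vx)
        if angle_difference(field_angle, direction) <= angular_threshold:
            return index
    return None  # no centroid agrees with the vector field


# Two object centroids; the vector field at the vertex points along (1, 1).
centroids = [(5.0, 0.0), (10.0, 10.0)]
assigned = assign_vertex((0.0, 0.0), (1.0, 1.0), centroids, angular_threshold=0.3)
```

In this toy case the nearer centroid lies at 0 radians from the vertex while the field points at roughly 0.785 radians, so the difference exceeds the 0.3 radian threshold and the vertex falls through to the second centroid, mirroring the sequence of steps in claim 12.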

Assignees

Nvidia Corp

Inventors

Classifications

G06N3/047
Probabilistic or stochastic networks · CPC title
G06N3/048
Activation functions · CPC title
G06N3/045
Combinations of networks · CPC title
G06N3/08 · Primary
Learning methods · CPC title
G06N3/0475
Generative networks · CPC title

Patent family

Related publications grouped by family.

View patent family 68533891

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11074717B2 cover?: An object detection neural network receives an input image including an object and generates belief maps for vertices of a bounding volume that encloses the object. The belief maps are used, along with three-dimensional (3D) coordinates defining the bounding volume, to compute the pose of the object in 3D space during post-processing. When multiple objects are present in the image, the object d…
Who is the assignee on this patent?: Nvidia Corp