Method and System for Compositing an Augmented Reality Scene
US-2018082486-A1 · Mar 22, 2018 · US
US11074717B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11074717-B2 |
| Application number | US-201916405662-A |
| Country | US |
| Kind code | B2 |
| Filing date | May 7, 2019 |
| Priority date | May 17, 2018 |
| Publication date | Jul 27, 2021 |
| Grant date | Jul 27, 2021 |
Official abstract text for this publication.
An object detection neural network receives an input image including an object and generates belief maps for vertices of a bounding volume that encloses the object. The belief maps are used, along with three-dimensional (3D) coordinates defining the bounding volume, to compute the pose of the object in 3D space during post-processing. When multiple objects are present in the image, the object detection neural network may also generate vector fields for the vertices. A vector field comprises vectors pointing from the vertex to a centroid of the object enclosed by the bounding volume defined by the vertex. The object detection neural network may be trained using images of computer-generated objects rendered in 3D scenes (e.g., photorealistic synthetic data). Automatically labelled training datasets may be easily constructed using the photorealistic synthetic data. The object detection neural network may be trained for object detection using only the photorealistic synthetic data.
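The belief-map idea in the abstract can be illustrated with a small sketch. It is not the patented network: it builds a synthetic belief map for one keypoint and then recovers the keypoint by looking for peaks above a threshold. The Gaussian blob shape, the `sigma` value, the 0.5 threshold, and the function names `make_belief_map` and `find_peaks` are all assumptions made for illustration; the source only states that a belief map holds a probability value for each pixel.

```python
import math


def make_belief_map(width, height, keypoint, sigma=2.0, scale=1.0):
    """Return a height x width grid that peaks at keypoint (x, y).

    Assumption: a Gaussian blob stands in for the belief values; the
    source does not specify the shape of the distribution.
    """
    kx, ky = keypoint
    return [
        [scale * math.exp(-((x - kx) ** 2 + (y - ky) ** 2) / (2.0 * sigma ** 2))
         for x in range(width)]
        for y in range(height)
    ]


def find_peaks(belief, threshold=0.5):
    """Return (x, y, value) for strict local maxima above threshold.

    Each pixel is compared with its 8-connected neighbours.
    """
    height, width = len(belief), len(belief[0])
    peaks = []
    for y in range(height):
        for x in range(width):
            value = belief[y][x]
            if value <= threshold:
                continue
            neighbours = [
                belief[ny][nx]
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
                if (nx, ny) != (x, y)
            ]
            if all(value > n for n in neighbours):
                peaks.append((x, y, value))
    return peaks


# One strong response at (5, 7) and a weak one at (12, 2) that stays
# below the threshold, so only the strong peak is reported.
strong = make_belief_map(16, 12, keypoint=(5, 7))
weak = make_belief_map(16, 12, keypoint=(12, 2), scale=0.3)
belief = [[max(a, b) for a, b in zip(row_a, row_b)]
          for row_a, row_b in zip(strong, weak)]
peaks = find_peaks(belief, threshold=0.5)
```

In a full pipeline the detected peak locations for the bounding-volume vertices would be passed, together with the known 3D vertex coordinates and the camera intrinsics, to a perspective-n-point solver; that step is omitted here.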
Opening claim text (preview).
What is claimed is:

1. A computer-implemented method, comprising: receiving an image including an object; processing the image, by a neural network model, to generate a belief map corresponding to a location of a keypoint associated with the object, the belief map comprising a probability value for each pixel of the image; processing the image to generate a vector field comprising vectors pointing from a vertex of a bounding volume that encloses the object to a centroid associated with the object; and estimating a pose for the object based on the location and the vector field.
2. The computer-implemented method of claim 1, wherein the keypoint is the vertex of the bounding volume.
3. The computer-implemented method of claim 2, wherein the processing of the image to generate the belief map further comprises generating an additional belief map corresponding to an additional location of the centroid.
4. The computer-implemented method of claim 1, wherein the estimating comprises projecting the vertex and additional vertices of the bounding volume into image space and inferring the pose in three-dimensional space from perspective-n-point.
5. The method of claim 1, wherein the keypoint is the centroid and the location is a geometric center of the object.
6. The method of claim 1, wherein the vector field comprises a vector pointing toward the centroid for each pixel in the image.
7. The computer-implemented method of claim 1, further comprising detecting a vertex location corresponding to the vector field and the vertex.
8. The computer-implemented method of claim 7, wherein the vertex location is identified based on at least one peak in the probability values of the belief map.
9. The computer-implemented method of claim 8, wherein a first peak in the belief map corresponding to the vertex location is greater than a threshold value and a second peak in the belief map is less than the threshold value.
10. The computer-implemented method of claim 7, wherein the pose for the object is computed based on the detected vertex location, intrinsic parameters of a camera configured to capture the image, and dimensions of the object.
11. The computer-implemented method of claim 1, wherein the image includes a second object, and further comprising identifying a second location for a geometric center of the second object.
12. The computer-implemented method of claim 11, further comprising: determining a direction from an additional vertex to the location, the additional vertex corresponding to an additional vector field; determining an angle of the additional vector field evaluated at a vertex location detected for the additional vertex; determining a difference between the angle and the direction is greater than an angular threshold value; determining an additional direction from the additional vertex to the second location; and assigning the additional vertex to the second object when a difference between the angle and the additional direction is less than or equal to the angular threshold value.
13. The computer-implemented method of claim 11, wherein the second location is further from the additional vertex compared with the location.
14. The computer-implemented method of claim 1, further comprising: determining a direction from the vertex to the location; determining an angle of the vector field evaluated at a vertex location detected for the vertex; and assigning the vertex to the object when a difference between the angle and the direction is within an angular threshold value.
15. The computer-implemented method of claim 1, wherein the pose is a six degrees-of-freedom pose defined by a position in three-dimensional (3D) space and an orientation.
16. The computer-implemented method of claim 1, wherein the neural network model is trained using only synthetic data including a combination of domain randomized data and photorealistic data.
17. A system, comprising: a neural network model configured to: receive an image including an object; process the image to generate a belief map corresponding to a location of a keypoint associated with the object, the belief map comprising a probability value for each pixel of the image; process the image to generate a vector field comprising vectors pointing from a vertex of a bounding volume that encloses the object to a centroid associated with the object; and estimate a pose for the object based on the location and the vector field.
18. The system of claim 17, wherein the vector field comprises a vector pointing toward the centroid for each pixel in the image.
19. The system of claim 17, wherein the neural network model is further configured to: determine a direction from the vertex to the location; determine an angle of the vector field evaluated at a vertex location detected for the vertex; and assign the vertex to the object when a difference between the angle and the direction is within an angular threshold value.
20. A non-transitory, computer-readable storage medium storing instructions that, when executed by a processing unit, cause the processing unit to: receive an image including an object; process the image, by a neural network model, to generate a belief map corresponding to a location of a keypoint associated with the object, the belief map comprising a probability value for each pixel of the image; process the image to generate a vector field comprising vectors pointing from a vertex of a bounding volume that encloses the object to a centroid associated with the object; and estimate a pose for the object based on the location and the vector field.
21. A computer-implemented method, comprising: rendering a three-dimensional (3D) object of interest within a 3D scene to produce a rendered image including the object of interest; computing task-specific training data corresponding to the object of interest, wherein the task-specific training data comprises a belief map indicating a centroid location associated with the object of interest and a vector field comprising a vector pointing toward the centroid location for each pixel in the rendered image; and including the task-specific training data corresponding to the object of interest and the input image as a test pair in a synthetic training dataset for training a neural network.
22. The computer-implemented method of claim 21, wherein additional images are rendered as the object of interest is subjected to a gravitational force and interacts with other objects in the 3D scene.
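The angular test that several claims use to associate a detected vertex with an object can be sketched as follows. This is one possible reading, not the claimed implementation: centroids are tried from nearest to farthest, and the vertex is assigned to the first centroid whose direction agrees with the vector field at the vertex location within an angular threshold. The helper names, the 0.5 radian threshold, and the nearest-first ordering are assumptions. The sketch also includes a per-pixel vector field pointing toward a centroid, in the spirit of the training labels described in the claims.

```python
import math


def centroid_vector_field(width, height, centroid):
    """Return a grid of unit vectors, one per pixel, pointing at centroid."""
    cx, cy = centroid
    field = []
    for y in range(height):
        row = []
        for x in range(width):
            dx, dy = cx - x, cy - y
            norm = math.hypot(dx, dy)
            # The centroid pixel itself has no defined direction.
            row.append((dx / norm, dy / norm) if norm > 0 else (0.0, 0.0))
        field.append(row)
    return field


def angle_difference(a, b):
    """Smallest absolute difference between two angles, in [0, pi]."""
    d = (a - b) % (2.0 * math.pi)
    return min(d, 2.0 * math.pi - d)


def assign_vertex(vertex, field_vector, centroids, angular_threshold=0.5):
    """Return the index of the centroid the vertex is assigned to, or None.

    Assumption: candidates are visited nearest first, and the first one
    whose direction lies within angular_threshold (radians) of the field
    vector's angle wins.
    """
    vx, vy = vertex
    field_angle = math.atan2(field_vector[1], field_vector[0])
    order = sorted(
        range(len(centroids)),
        key=lambda i: math.hypot(centroids[i][0] - vx, centroids[i][1] - vy),
    )
    for index in order:
        cx, cy = centroids[index]
        direction = math.atan2(cy - vy, cx - vx)
        if angle_difference(field_angle, direction) <= angular_threshold:
            return index
    return None


# Two objects; the vertex at (18, 14) is nearer to the first centroid,
# but the field at that pixel points toward the second one.
centroids = [(10, 10), (30, 10)]
field = centroid_vector_field(40, 20, centroids[1])
vertex = (18, 14)
assigned = assign_vertex(vertex, field[vertex[1]][vertex[0]], centroids)
```

Here `assigned` is 1: the nearer centroid is rejected because its direction differs from the field angle by more than the threshold, and the farther centroid is accepted.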
Probabilistic or stochastic networks · CPC title
Activation functions · CPC title
Combinations of networks · CPC title
Learning methods · CPC title
Generative networks · CPC title