Three-dimensional reconstruction method and three-dimensional reconstruction apparatus
US-2022414911-A1 · Dec 29, 2022 · US
US2024257375A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2024257375-A1 |
| Application number | US-202418633275-A |
| Country | US |
| Kind code | A1 |
| Filing date | Apr 11, 2024 |
| Priority date | Jun 24, 2020 |
| Publication date | Aug 1, 2024 |
| Grant date | — |
Official abstract text for this publication.
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for object recognition neural network for amodal center prediction. One of the methods includes receiving an image of an object captured by a camera. The image of the object is processed using an object recognition neural network that is configured to generate an object recognition output. The object recognition output includes data defining a predicted two-dimensional amodal center of the object, wherein the predicted two-dimensional amodal center of the object is a projection of a predicted three-dimensional center of the object under a camera pose of the camera that captured the image.
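The abstract defines the two-dimensional amodal center as the projection of the object's three-dimensional center under the camera pose. A minimal sketch of that projection follows, assuming a pinhole camera model with no lens distortion. The intrinsics `fx`, `fy`, `cx`, `cy`, the world-to-camera convention `X_cam = R @ X_world + t`, and the names `project_center` and `inside_box` are illustrative assumptions, not taken from the publication.

```python
def project_center(center_world, R, t, fx, fy, cx, cy):
    """Project a 3D center to pixel coordinates with a pinhole model.

    Assumption: R (3x3 nested lists) and t (length 3) map world
    coordinates to camera coordinates, X_cam = R @ X_world + t.
    """
    x_cam = [
        sum(R[i][j] * center_world[j] for j in range(3)) + t[i]
        for i in range(3)
    ]
    if x_cam[2] <= 0:
        raise ValueError("point is behind the camera")
    u = fx * x_cam[0] / x_cam[2] + cx
    v = fy * x_cam[1] / x_cam[2] + cy
    return u, v


def inside_box(point, box):
    """Return True if a pixel lies inside box = (x_min, y_min, x_max, y_max)."""
    u, v = point
    x_min, y_min, x_max, y_max = box
    return x_min <= u <= x_max and y_min <= v <= y_max


# Hypothetical 640x480 camera at the world origin looking along +z.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
center_2d = project_center([1.5, 0.0, 2.0], identity, [0.0, 0.0, 0.0],
                           fx=500.0, fy=500.0, cx=320.0, cy=240.0)

# Hypothetical box around the visible part of an object cut off by the right edge.
visible_box = (560.0, 180.0, 640.0, 300.0)
print(center_2d, inside_box(center_2d, visible_box))
```

In this made-up setup the projected center has a horizontal coordinate beyond the 640-pixel image width, so it falls outside the box drawn around the visible part of the object. That is the situation in which an amodal center differs from the center of the visible region.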
Opening claim text (preview).
What is claimed is:

1. A computer-implemented method, the method comprising: receiving an image of an object captured by a camera; and processing the image of the object using an object recognition neural network that is configured to generate an object recognition output comprising: data defining a predicted two-dimensional amodal center of the object, wherein the predicted two-dimensional amodal center of the object is a projection of a predicted three-dimensional center of the object under a camera pose of the camera that captured the image.

2. The method of claim 1, wherein the object recognition output comprises pixel coordinates of the predicted two-dimensional amodal center.

3. The method of claim 2, wherein the object recognition neural network comprises a regression output layer that generates the pixel coordinates of the predicted two-dimensional amodal center.

4. The method of claim 1, wherein the object recognition neural network is a multi-task neural network and the object recognition output also comprises data defining a bounding box for the object in the image.

5. The method of claim 4, wherein the predicted two-dimensional amodal center is outside of the bounding box in the image.

6. The method of claim 1, wherein the object recognition output comprises a truncation score that represents a likelihood that the object is truncated in the image.

7. The method of claim 1, further comprising: obtaining data specifying one or more other predicted two-dimensional amodal centers of the object in one or more other images captured under different camera poses; and determining, from (i) the predicted two-dimensional amodal center of the object in the image and (ii) the one or more other predicted two-dimensional amodal centers of the object, the predicted three-dimensional center of the object.

8. A system comprising one or more computers and one or more storage devices storing instructions that when executed by the one or more computers cause the one or more computers to perform operations comprising: receiving an image of an object captured by a camera; and processing the image of the object using an object recognition neural network that is configured to generate an object recognition output comprising: data defining a predicted two-dimensional amodal center of the object, wherein the predicted two-dimensional amodal center of the object is a projection of a predicted three-dimensional center of the object under a camera pose of the camera that captured the image.

9. The system of claim 8, wherein the object recognition output comprises pixel coordinates of the predicted two-dimensional amodal center.

10. The system of claim 9, wherein the object recognition neural network comprises a regression output layer that generates the pixel coordinates of the predicted two-dimensional amodal center.

11. The system of claim 8, wherein the object recognition neural network is a multi-task neural network and the object recognition output also comprises data defining a bounding box for the object in the image.

12. The system of claim 11, wherein the predicted two-dimensional amodal center is outside of the bounding box in the image.

13. The system of claim 8, wherein the object recognition output comprises a truncation score that represents a likelihood that the object is truncated in the image.

14. The system of claim 8, the operations further comprise: obtaining data specifying one or more other predicted two-dimensional amodal centers of the object in one or more other images captured under different camera poses; and determining, from (i) the predicted two-dimensional amodal center of the object in the image and (ii) the one or more other predicted two-dimensional amodal centers of the object, the predicted three-dimensional center of the object.

15. One or more non-transitory computer-readable storage media storing instructions that when executed by one or more computers cause the one or more computers to perform operations comprising: receiving an image of an object captured by a camera; and processing the image of the object using an object recognition neural network that is configured to generate an object recognition output comprising: data defining a predicted two-dimensional amodal center of the object, wherein the predicted two-dimensional amodal center of the object is a projection of a predicted three-dimensional center of the object under a camera pose of the camera that captured the image.

16. The computer-readable storage media of claim 15, wherein the object recognition output comprises pixel coordinates of the predicted two-dimensional amodal center.

17. The computer-readable storage media of claim 16, wherein the object recognition neural network comprises a regression output layer that generates the pixel coordinates of the predicted two-dimensional amodal center.

18. The computer-readable storage media of claim 15, wherein the object recognition neural network is a multi-task neural network and the object recognition output also comprises data defining a bounding box for the object in the image.

19. The computer-readable storage media of claim 18, wherein the predicted two-dimensional amodal center is outside of the bounding box in the image.

20. The computer-readable storage media of claim 15, wherein the object recognition output comprises a truncation score that represents a likelihood that the object is truncated in the image.
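Claims 7 and 14 recover the predicted three-dimensional center from two-dimensional amodal centers predicted in images taken under different camera poses, but the claim text shown here does not say how. One common option is least-squares ray intersection: back-project each predicted center to a ray and find the point closest to all rays. The sketch below uses that swapped-in technique, not necessarily the method of the publication. It assumes a pinhole camera with the convention `X_cam = R @ X_world + t`, and all function names and numbers are hypothetical.

```python
import math


def pixel_to_ray(u, v, R, t, fx, fy, cx, cy):
    """Back-project a pixel to a world-space ray (origin, unit direction).

    Assumption: X_cam = R @ X_world + t, so the camera center is -R^T t.
    """
    d_cam = [(u - cx) / fx, (v - cy) / fy, 1.0]
    # Multiply by R^T: component j uses column j of R.
    d = [sum(R[i][j] * d_cam[i] for i in range(3)) for j in range(3)]
    o = [-sum(R[i][j] * t[i] for i in range(3)) for j in range(3)]
    norm = math.sqrt(sum(c * c for c in d))
    return o, [c / norm for c in d]


def solve3(A, b):
    """Solve a 3x3 linear system by Gauss-Jordan elimination with pivoting."""
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("rays are nearly parallel; center is not determined")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(3):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][3] / M[i][i] for i in range(3)]


def triangulate_center(rays):
    """Return the point minimizing the summed squared distance to all rays.

    Each ray contributes (I - d d^T) to A and (I - d d^T) o to b.
    """
    A = [[0.0] * 3 for _ in range(3)]
    b = [0.0] * 3
    for o, d in rays:
        for i in range(3):
            for j in range(3):
                p_ij = (1.0 if i == j else 0.0) - d[i] * d[j]
                A[i][j] += p_ij
                b[i] += p_ij * o[j]
    return solve3(A, b)


# Two hypothetical views with identical intrinsics; the second camera sits
# one unit to the right of the first, so its translation is t = (-1, 0, 0).
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
intrinsics = dict(fx=500.0, fy=500.0, cx=320.0, cy=240.0)
rays = [
    pixel_to_ray(695.0, 240.0, identity, [0.0, 0.0, 0.0], **intrinsics),
    pixel_to_ray(445.0, 240.0, identity, [-1.0, 0.0, 0.0], **intrinsics),
]
center_3d = triangulate_center(rays)
print(center_3d)
```

At least two views with non-parallel rays are needed. With noisy predictions the rays will not meet exactly, and the least-squares point gives a compromise among them.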
Convolutional networks [CNN, ConvNet] · CPC title
Supervised learning · CPC title
by matching two-dimensional images to three-dimensional objects · CPC title
in augmented reality scenes · CPC title
Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN] · CPC title