Realistic image perspective transformation using neural networks
US-11107228-B1 · Aug 31, 2021 · US
US12175706B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12175706-B2 |
| Application number | US-202217699657-A |
| Country | US |
| Kind code | B2 |
| Filing date | Mar 21, 2022 |
| Priority date | Oct 7, 2021 |
| Publication date | Dec 24, 2024 |
| Grant date | Dec 24, 2024 |
Official abstract text for this publication.
A method with global localization includes: extracting a feature by applying an input image to a first network; estimating a coordinate map corresponding to the input image by applying the extracted feature to a second network; and estimating a pose corresponding to the input image based on the estimated coordinate map, wherein either one or both of the first network and the second network is trained based on either one or both of: a first generative adversarial network (GAN) loss determined based on a first feature extracted by the first network based on a synthetic image determined by three-dimensional (3D) map data and a second feature extracted by the first network based on a real image; and a second GAN loss determined based on a first coordinate map estimated by the second network based on the first feature and a second coordinate map estimated by the second network based on the second feature.
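The two adversarial terms named in the abstract can be illustrated with a minimal sketch. It assumes a standard binary cross-entropy GAN formulation, which the abstract does not specify. The function names, the label convention (synthetic scored toward 1, real scored toward 0), and the idea that a discriminator emits one probability per sample are all hypothetical choices made for illustration. The same two functions would apply to either the feature-level (first) or the coordinate-map-level (second) GAN loss.

```python
import math


def bce(prob, target):
    """Binary cross-entropy for one probability; clamped to avoid log(0)."""
    eps = 1e-12
    p = min(max(prob, eps), 1.0 - eps)
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))


def discriminator_loss(scores_synth, scores_real):
    """Loss for a discriminator that should output 1 for synthetic-derived
    inputs and 0 for real-derived inputs (assumed label convention)."""
    losses = [bce(s, 1.0) for s in scores_synth]
    losses += [bce(s, 0.0) for s in scores_real]
    return sum(losses) / len(losses)


def adversarial_loss(scores_real):
    """Loss for the network being trained: it is rewarded when the
    discriminator scores real-derived outputs as if they were synthetic."""
    return sum(bce(s, 1.0) for s in scores_real) / len(scores_real)


# Hypothetical discriminator outputs for a small batch.
synthetic_scores = [0.9, 0.8]
real_scores = [0.2, 0.4]
print(discriminator_loss(synthetic_scores, real_scores))
print(adversarial_loss(real_scores))
```

Under these assumptions, driving `adversarial_loss` down pushes features (or coordinate maps) computed from real images toward the distribution of those computed from synthetic images, which is the domain gap the two GAN losses target.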
Opening claim text (preview).
What is claimed is:
1. A method with global localization, the method comprising: extracting a feature by applying an input image to a first network; estimating a coordinate map corresponding to the input image by applying the extracted feature to a second network; and estimating a pose corresponding to the input image based on the estimated coordinate map, wherein either one or both of the first network and the second network is trained based on either one or both of: a first generative adversarial network (GAN) loss determined based on a first feature extracted by the first network based on a synthetic image determined by three-dimensional (3D) map data and a second feature extracted by the first network based on a real image; and a second GAN loss determined based on a first coordinate map estimated by the second network based on the first feature and a second coordinate map estimated by the second network based on the second feature.
2. The method of claim 1, wherein either one or both of the first network and the second network is trained further based on either one or both of: a first loss determined based on the first coordinate map and ground truth data corresponding to the synthetic image; and a second loss determined based on a first pose estimated based on the first coordinate map and the ground truth data corresponding to the synthetic image.
3. The method of claim 2, wherein the ground truth data comprises a pose of a virtual camera that captures the synthetic image and 3D coordinate data corresponding to each pixel of the synthetic image.
4. The method of claim 1, wherein the pose comprises a six-degrees-of-freedom (6DoF) pose of a device that captures the input image.
5. A non-transitory computer-readable storage medium storing instructions that, when executed by one or more processors, configure the one or more processors to perform the method of claim 1.
6. A method with global localization, the method comprising: determining a synthetic data set based on three-dimensional (3D) map data, the synthetic data set comprising a synthetic image captured by a virtual camera corresponding to an arbitrary pose and 3D coordinate data corresponding to each pixel of the synthetic image; determining a first generative adversarial network (GAN) loss based on a first feature determined by applying the synthetic image to the first network and a second feature determined by applying a real image captured by a real camera to the first network; determining a second GAN loss based on a first coordinate map determined by applying the first feature to the second network and a second coordinate map determined by applying the second feature to the second network; determining a first loss based on the first coordinate map and the 3D coordinate data corresponding to the synthetic image; determining a second loss based on a first pose estimated based on the first coordinate map and a pose of the virtual camera; and training either one or both of the first network and the second network based on any one or any combination of any two or more of the first loss, the second loss, the first GAN loss, and the second GAN loss.
7. The method of claim 6, wherein the determining of the synthetic data set further comprises: extracting the first feature by applying the synthetic image to the first network; estimating the first coordinate map corresponding to each pixel of the synthetic image by applying the extracted first feature to the second network; estimating a first pose corresponding to the synthetic image based on the estimated first coordinate map; extracting the second feature by applying the real image to the first network; and estimating the second coordinate map corresponding to each pixel of the synthetic image by applying the extracted second feature to the second network.
8. The method of claim 6, wherein the training of the either one or both of the first network and the second network comprises training the first network and a first discriminator based on the first GAN loss, the first discriminator being configured to discriminate between the first feature extracted from the synthetic image and the second feature extracted from the real image.
9. The method of claim 6, wherein the training of the either one or both of the first network and the second network comprises training the second network and a second discriminator based on the second GAN loss, the second discriminator being configured to discriminate between the first coordinate map estimated from the synthetic image and the second coordinate map estimated from the real image.
10. The method of claim 6, wherein the training of the either one or both of the first network and the second network comprises iteratively back-propagating a gradient determined based on the first loss to the first network and the second network.
11. The method of claim 6, wherein the training of the either one or both of the first network and the second network comprises iteratively back-propagating a gradient determined based on the second loss to the first network and the second network.
12. The method of claim 6, further comprising, in response to the training of the either one or both of the first network and the second network: extracting a feature by applying an input image to the first network; estimating a coordinate map corresponding to the input image by applying the extracted feature to the second network; and estimating a pose corresponding to the input image based on the estimated coordinate map.
13. An apparatus with global localization, the apparatus comprising: one or more processors configured to: extract a feature by applying an input image to a first network of a global localization model; estimate a coordinate map of the input image by applying the extracted feature to a second network of the global localization model; and estimate a pose corresponding to a global localization result by applying the estimated coordinate map to a pose estimator of the global localization model, wherein the global localization model is generated by: determining a synthetic data set based on three-dimensional (3D) map data, the synthetic data set comprising a synthetic image captured by a virtual camera corresponding to an arbitrary pose and 3D coordinate data corresponding to each pixel of the synthetic image; and iteratively back-propagating a gradient determined based on one or more losses associated with the global localization model, to update parameters of the first network and the second network; and wherein a loss associated with the global localization model comprises either one or both of: a first generative adversarial network (GAN) loss determined based on a first feature extracted by the first network based on the synthetic image and a second feature extracted by the first network based on a real image; and a second GAN loss determined based on a first coordinate map estimated by the second network based on the first feature and a second coordinate map estimated by the second network based on the second feature.
14. The apparatus of claim 13, wherein the loss associated with the global localization model further comprises: a first loss determined based on the first coordinate map and ground truth data corresponding to the synthetic image; and a second loss determined based on a first pose estimated by the pose estimator based on the first coordinate map, and the ground truth data corresponding to the synthetic image.
15. The apparatus of claim 13, wherein the iteratively back-propagating of the gradient comprises: iterat
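The supervised terms recited in claims 2, 6, and 14 (a first loss on the coordinate map and a second loss on the estimated pose) can be sketched as follows. The claims do not state the distance measures, so this sketch assumes a mean Euclidean distance for the coordinate map and a translation-plus-rotation-angle error for the pose. The function names, the quaternion pose representation, and the weighting scheme are hypothetical.

```python
import math


def coordinate_loss(pred_map, gt_map):
    """First loss (assumed form): mean Euclidean distance between the
    predicted 3D point and the ground-truth 3D point of each pixel."""
    dists = [math.dist(p, g) for p, g in zip(pred_map, gt_map)]
    return sum(dists) / len(dists)


def pose_loss(pred_t, gt_t, pred_q, gt_q, rot_weight=1.0):
    """Second loss (assumed form): translation distance plus a weighted
    rotation angle in radians between two unit quaternions."""
    dot = abs(sum(a * b for a, b in zip(pred_q, gt_q)))
    angle = 2.0 * math.acos(min(1.0, dot))
    return math.dist(pred_t, gt_t) + rot_weight * angle


def total_loss(first, second, gan1, gan2, weights=(1.0, 1.0, 1.0, 1.0)):
    """Weighted sum of the four losses; a zero weight drops a term, which
    mirrors the 'any one or any combination' wording of claim 6."""
    return sum(w * l for w, l in zip(weights, (first, second, gan1, gan2)))


# Hypothetical two-pixel coordinate map and virtual-camera pose.
pred_map = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)]
gt_map = [(0.0, 0.0, 1.0), (1.0, 1.0, 1.0)]
first = coordinate_loss(pred_map, gt_map)
second = pose_loss((0.0, 0.0, 0.0), (3.0, 4.0, 0.0),
                   (1.0, 0.0, 0.0, 0.0), (1.0, 0.0, 0.0, 0.0))
print(total_loss(first, second, 0.0, 0.0))
```

Both supervised terms rely on ground truth that exists only for the synthetic image (the virtual camera pose and per-pixel 3D coordinates), which is why, in the claims, the real image contributes only through the GAN losses.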
Camera pose · CPC title
Artificial neural networks [ANN] · CPC title
Training; Learning · CPC title
Learning methods · CPC title
Generative networks · CPC title