Pose determination with semantic segmentation

US10546387B2 · US · B2

Patent metadata
Field: Value
Publication number: US-10546387-B2
Application number: US-201715699221-A
Country: US
Kind code: B2
Filing date: Sep 8, 2017
Priority date: Sep 8, 2017
Publication date: Jan 28, 2020
Grant date: Jan 28, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method determines a pose of an image capture device. The method includes accessing an image of a scene captured by the image capture device. A semantic segmentation of the image is performed, to generate a segmented image. An initial pose of the image capture device is generated using a three-dimensional (3D) tracker. A plurality of 3D renderings of the scene are generated, each of the plurality of 3D renderings corresponding to one of a plurality of poses chosen based on the initial pose. A pose is selected from the plurality of poses, such that the 3D rendering corresponding to the selected pose aligns with the segmented image.
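The search loop described in the abstract can be illustrated with a deliberately tiny sketch. Everything here is an assumption made for illustration: the pose is reduced to a single integer offset, the "3D model" is one facade spanning fixed world columns, and `render_labels`, `alignment_score`, `candidate_poses` and `select_pose` are hypothetical names, not functions from the patent. A real system would render a 3D model at full 6-DoF poses and compare against a neural-network segmentation.

```python
# Toy scene: one facade occupying world columns 10..19 (assumed, for illustration).
FACADE_SPAN = (10, 20)


def render_labels(pose, width=16):
    """Hypothetical renderer: the class label expected in each image column
    when the camera sits at integer offset `pose`."""
    return [
        "facade" if FACADE_SPAN[0] <= pose + col < FACADE_SPAN[1] else "background"
        for col in range(width)
    ]


def alignment_score(rendered, segmented):
    """Fraction of columns where the rendering agrees with the segmentation."""
    matches = sum(r == s for r, s in zip(rendered, segmented))
    return matches / len(segmented)


def candidate_poses(initial_pose, radius=3):
    """Poses chosen around the tracker's initial pose (the search space)."""
    return [initial_pose + delta for delta in range(-radius, radius + 1)]


def select_pose(segmented, initial_pose, radius=3):
    """Pick the candidate whose rendering best aligns with the segmented image."""
    return max(
        candidate_poses(initial_pose, radius),
        key=lambda pose: alignment_score(render_labels(pose, len(segmented)), segmented),
    )


# The segmented image is simulated from a known pose; the tracker has drifted to 2.
segmented_image = render_labels(4)
best_pose = select_pose(segmented_image, initial_pose=2)
```

In this toy setup the selected pose could then be fed back to the tracker, in the spirit of claim 2.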

First claim

Opening claim text (preview).

What is claimed is:

1. A method of determining a pose of an image capture device, comprising: accessing an image of a scene captured by the image capture device; performing a semantic segmentation of the image of the scene, to generate a segmented image, the segmented image being divided into a plurality of regions, and each region of the plurality of regions having a plurality of pixels; generating an initial pose of the image capture device using a three-dimensional (3D) tracker; generating a plurality of 3D renderings of the scene, each of the plurality of 3D renderings corresponding to one of a plurality of poses chosen based on the initial pose; determining, for each of the plurality of poses, first probabilities that the pixels within at least one of the plurality of regions of the segmented image belong to a plurality of classes based on a sequence type of the at least one of the plurality of regions; and selecting a pose from the plurality of poses based on the determined first probabilities, such that the 3D rendering corresponding to the selected pose aligns with the segmented image.

2. The method of claim 1, further comprising updating the 3D tracker based on the selected pose.

3. The method of claim 1, further comprising: capturing the image of the scene using the image capture device; and rectifying the image of the scene before the semantic segmentation.

4. The method of claim 1, wherein generating the initial pose includes using calibration data, position data or motion data from one or more sensors.

5. The method of claim 4, wherein the plurality of poses includes the initial pose, and the calibration data, position data or motion data are used to determine a pose search space including the plurality of poses.

6. The method of claim 1, further comprising determining a second probability that each of the plurality of 3D renderings of the scene aligns with the segmented image, wherein the selecting is based on the determined second probability.

7. The method of claim 6, wherein determining one of the first probabilities for one of the plurality of poses includes: dividing the segmented image into the plurality of regions, and combining the first probabilities corresponding to each of the plurality of classes and each of the plurality of regions, for the one of the plurality of poses.

8. The method of claim 7, wherein the plurality of regions are columns of equal size.

9. The method of claim 1, wherein each region of the plurality of regions comprises a plurality of pixels, and the semantic segmentation is performed by a neural network configured to classify each of the plurality of pixels in each region as belonging to a facade, a vertical edge, a horizontal edge or background.

10. The method of claim 1, wherein: the plurality of classes include at least one of facades, vertical edges, horizontal edges or background; and the segmented image defines each of the plurality of regions as having a predetermined number of predetermined sequences of classes, wherein each of the predetermined sequence of classes includes at least one of the facades, vertical edges, horizontal edges or background.

11. The method of claim 10, wherein each of the plurality of regions is a column having a width one pixel wide.

12. The method of claim 1, wherein determining the first probabilities comprises determining a probability that each of the pixels belongs to each of the plurality of classes based on a location of each transition between adjacent pixels belonging to respectively different classes in the plurality of classes.

13. The method of claim 1, further comprising training a neural network to perform the semantic segmentation, by labelling a blocking foreground object in front of a facade as being a part of the facade.

14. The method of claim 1, wherein, the determining comprises determining the first probabilities that the pixels within each of the regions of the segmented image belong to the plurality of classes based on the sequence type of each of the regions.

15. A system for determining a pose of an image capture device, comprising: a processor coupled to access an image of a scene captured by the image capture device; and a non-transitory, machine-readable storage medium coupled to the processor and encoded with computer program code for execution by the processor, the computer program code comprising: code for performing a semantic segmentation of the image of the scene to generate a segmented image, the segmented image being divided into a plurality of regions, and each region of the plurality of regions having a plurality of pixels; code for causing a three-dimensional (3D) tracker to generate an initial pose of the image capture device; code for generating a plurality of 3D renderings of the scene, each of the plurality of 3D renderings corresponding to one of a plurality of poses chosen based on the initial pose; code for determining, for each of the plurality of poses, first probabilities that the pixels within at least one of the plurality of regions of the segmented image belong to a plurality of classes based on a sequence type of the at least one of the plurality of regions; and code for selecting a pose from the plurality of poses based on the determined first probabilities, such that the 3D rendering corresponding to the selected pose aligns with the segmented image.

16. The system of claim 15, wherein the machine-readable storage medium further comprises code for updating the 3D tracker based on the selected pose.

17. The system of claim 15, wherein the program code further comprises code for rectifying the image of the scene before the semantic segmentation.

18. The system of claim 15, wherein the program code further comprises code for determining a second probability that each of the plurality of 3D renderings of the scene aligns with the segmented image, wherein the selecting is based on the determined second probability.

19. The system of claim 18, wherein the code for determining the first probabilities includes: code for dividing the segmented image into the plurality of regions, and code for combining the determined first probabilities corresponding to each of the plurality of classes and each of the plurality of regions for one of the plurality of poses.

20. The system for determining a pose according to claim 19, wherein the code for determining the first probabilities includes code to configure the processor for determining respective first probabilities for each region of the plurality of regions based on a location of each transition between adjacent pixels belonging to respectively different classes in the plurality of classes.

21. The system of claim 15, wherein the plurality of regions are columns of equal size, one pixel wide.

22. The system of claim 15, wherein the code for performing the semantic segmentation is configured to cause a neural network to classify each region as at least one of a facade, a vertical edge, a horizontal edge or a background.

23. The system of claim 15, wherein: the plurality of classes include at least one of facades, vertical edges, horizontal edges or background; and the code for performing a semantic segmentation is adapted to define each of the plurality of regions as having a predetermined number of predetermined sequences of classes, wherein each of the predetermined sequence of classes includes at least one of the facades, vertical edges,
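One possible reading of the column-based scoring in claims 7, 8 and 12 is sketched below. It is an illustration under stated assumptions, not the patent's formula: the segmentation is assumed to output a per-pixel probability for each class, each rendered column is assumed to be described by a class sequence plus the rows where the class changes, and pixels and columns are treated as independent so that log-probabilities simply add. The names `column_labels`, `column_log_likelihood` and `pose_log_likelihood` are hypothetical.

```python
import math

# Classes named in the claims.
CLASSES = ("facade", "vertical_edge", "horizontal_edge", "background")


def column_labels(sequence, transitions, height):
    """Expand a class sequence and its transition rows into one label per pixel.

    Example: sequence ("background", "facade") with transitions [2] and
    height 4 gives background for rows 0-1 and facade for rows 2-3.
    """
    bounds = [0, *transitions, height]
    labels = []
    for cls, start, end in zip(sequence, bounds, bounds[1:]):
        labels.extend([cls] * (end - start))
    return labels


def column_log_likelihood(pixel_probs, labels):
    """Sum of log-probabilities that the segmentation assigns to the rendered
    labels in one column. `pixel_probs` holds one {class: probability} dict
    per pixel; a small floor avoids log(0)."""
    return sum(math.log(max(probs[label], 1e-9))
               for probs, label in zip(pixel_probs, labels))


def pose_log_likelihood(prob_columns, rendered_columns):
    """Combine the column scores for one candidate pose (independence assumed)."""
    total = 0.0
    for pixel_probs, (sequence, transitions) in zip(prob_columns, rendered_columns):
        labels = column_labels(sequence, transitions, len(pixel_probs))
        total += column_log_likelihood(pixel_probs, labels)
    return total
```

Under these assumptions, the candidate pose with the highest combined log-likelihood would be the one selected; working in log space keeps the product of many small probabilities numerically stable.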

Assignees

Qualcomm Inc

Inventors

Classifications

  • Training; Learning · CPC title

  • involving reference images or patches · CPC title

  • G06T7/73 · Primary

    using feature-based methods · CPC title

  • G06T7/11 · Primary

    Region-based segmentation · CPC title

  • Probabilistic image processing · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10546387B2 cover?
A method determines a pose of an image capture device. The method includes accessing an image of a scene captured by the image capture device. A semantic segmentation of the image is performed, to generate a segmented image. An initial pose of the image capture device is generated using a three-dimensional (3D) tracker. A plurality of 3D renderings of the scene are generated, each of the plural…
Who is the assignee on this patent?
Qualcomm Inc
What technology area does this patent fall under?
Primary CPC classification G06T7/73. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jan 28, 2020 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).