Who is the assignee on this patent?

Beijing Didi Infinity Technology & Dev Co Ltd

What technology area does this patent fall under?

Primary CPC classification G06T7/74. Mapped technology areas include Physics.

When was this patent published?

Publication date: Mar 25, 2025 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 7 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Systems and methods for visual positioning

US12260585B2 · US · B2

Patent metadata
Field	Value
Publication number	US-12260585-B2
Application number	US-202217807719-A
Country	US
Kind code	B2
Filing date	Jun 18, 2022
Priority date	Dec 18, 2019
Publication date	Mar 25, 2025
Grant date	Mar 25, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The embodiments of the present disclosure provide a visual positioning method, the method may include obtaining a positioning image collected by an imaging device; obtaining a three-dimensional (3D) point cloud map associated with an area where the imaging device is located; determining a target area associated with the positioning image from the 3D point cloud map based on the positioning image; and determining positioning information of the imaging device based on the positioning image and the target area.
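The abstract's "target area" step can be illustrated with a small sketch. Claims 4 and 5 further down this page describe narrowing the map by a distance from an initial estimated position and by an angular range around the moving direction; the Python below is one possible reading of that idea, not the patented implementation. The function name `select_target_area`, the flat `(x, y, z)` point format, the 2D bearing convention, and all parameters are assumptions made for illustration.

```python
import math


def select_target_area(points, est_xy, max_dist, heading_deg, half_angle_deg):
    """Keep map points near est_xy that lie inside an angular window.

    points         -- iterable of (x, y, z) map points (hypothetical format)
    est_xy         -- (x, y) initial estimated position of the imaging device
    max_dist       -- horizontal distance limit (assumed "second area" test)
    heading_deg    -- moving direction in degrees, measured from the +x axis
    half_angle_deg -- half-width of the angular range (assumed "third area" test)
    """
    selected = []
    for x, y, z in points:
        dx, dy = x - est_xy[0], y - est_xy[1]
        dist = math.hypot(dx, dy)
        if dist > max_dist:
            continue  # too far from the estimated position
        if dist == 0.0:
            selected.append((x, y, z))  # bearing is undefined; keep the point
            continue
        bearing = math.degrees(math.atan2(dy, dx))
        # Signed angular difference wrapped into [-180, 180)
        diff = (bearing - heading_deg + 180.0) % 360.0 - 180.0
        if abs(diff) <= half_angle_deg:
            selected.append((x, y, z))
    return selected


if __name__ == "__main__":
    demo_points = [(1, 0, 0), (0, 1, 0), (-1, 0, 0), (10, 0, 0)]
    print(select_target_area(demo_points, (0, 0), 5.0, 0.0, 45.0))
```

Filtering before feature matching reduces the number of map points that have to be compared against the positioning image, which is the practical motivation for a target area in a sketch like this.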

First claim

Opening claim text (preview).

What is claimed is:

1. A visual positioning system, comprising: at least one storage medium including a set of instructions for visual positioning; at least one processor configured to communicate with the at least one storage medium, wherein when executing the set of instructions, the at least one processor is configured to direct the system to: obtain a positioning image collected by an imaging device; obtain a three-dimensional (3D) point cloud map associated with an area where the imaging device is located; determine, based on the positioning image, a target area associated with the positioning image from the 3D point cloud map; and determine, based on the positioning image and the target area, positioning information of the imaging device, wherein to determine, based on the positioning image and the target area, positioning information of the imaging device, the at least one processor is configured to direct the system to: extract at least one visual feature point in the positioning image; match the at least one visual feature point with feature points in the target area to obtain at least one feature point pair, the at least one feature point pair including at least one of at least one feature point pair with a semantic annotation or at least one feature point pair without a semantic annotation; and calculate, based on the at least one feature point pair, the positioning information of the imaging device.

2. The system of claim 1, wherein to determine, based on the positioning image, a target area associated with the positioning image from the 3D point cloud map, the at least one processor is configured to direct the system to: determine, based on the positioning image, one or more restricting conditions associated with a range of the target area; and determine, based on the one or more restricting conditions, the target area from the 3D point cloud map, wherein the one or more restricting conditions are related to at least one of a scene corresponding to the positioning image, an initial estimated position of the imaging device, or azimuth information of the imaging device.

3. The system of claim 2, wherein to determine, based on the one or more restricting conditions, the target area from the 3D point cloud map, the at least one processor is configured to direct the system to: obtain a first area that matches the scene corresponding to the positioning image in the 3D point cloud map by performing a scene recognition for the positioning image; and determine, based on the first area, the target area.

4. The system of claim 3, wherein to determine, based on the first area, the target area, the at least one processor is configured to direct the system to: obtain the initial estimated position of the imaging device by a positioning module associated with the imaging device; determine, based on the initial estimated position of the imaging device, a second area from the first area, the second area being an area in the first area that is within a distance from the initial estimated position; and determine the target area according to the second area.

5. The system of claim 4, wherein to determine the target area according to the second area, the at least one processor is configured to direct the system to: obtain a moving direction of the imaging device; determine, based on the moving direction, the azimuth information of the imaging device, the azimuth information including an angular range of the moving direction; and determine, based on the angular range, a third area within the angular range from the second area; and designate the third area as the target area.

6. The system of claim 3, wherein to obtain a first area that matches the scene corresponding to the positioning image in the 3D point cloud map by performing a scene recognition for the positioning image, the at least one processor is configured to direct the system to: obtain a plurality of reconstructed images for reconstructing the 3D point cloud map, each of the plurality of reconstructed images corresponding to a scene area; and determine, from the plurality of scene areas, the first area according to similarities between the positioning image and the reconstructed images.

7. The system of claim 3, wherein to obtain a first area that matches the scene corresponding to the positioning image in the 3D point cloud map by performing a scene recognition for the positioning image, the at least one processor is configured to direct the system to: obtain the first area by processing the positioning image and the 3D point cloud map using a scene recognition model, the scene recognition model being obtained through training.

8. The system of claim 1, wherein the 3D point cloud map includes a semantic 3D point cloud map, and to obtain the semantic 3D point cloud map, the at least one processor is configured to direct the system to: obtain a trained neural network model; obtain one or more images that are not labeled with reference objects; input the one or more images that are not labeled with reference objects to the trained neural network model to obtain one or more images labeled with reference objects; and determine, based on the one or more images labeled with reference objects, the semantic 3D point cloud map, wherein the trained neural network model is obtained by training a plurality of groups of training samples, and each group of the plurality of groups of training samples includes one or more sample images that are not labeled with reference objects and training labels including sample images labeled with reference objects.

9. The system of claim 1, wherein the 3D point cloud map includes a semantic 3D point cloud map, and to obtain the semantic 3D point cloud map, the at least one processor is configured to direct the system to: obtain a plurality of images including one or more images labeled with reference objects and one or more images that are not labeled with reference objects; extract visual feature points in the one or more images that are not labeled with reference objects, the visual feature points being associated with the reference objects in the images labeled with reference objects; and obtain, based on the visual feature points in the one or more images that are not labeled with reference objects, the one or more images labeled with reference objects by labeling the images that are not labeled with the reference objects; determine, based on the plurality of images labeled with reference objects, the semantic 3D point cloud map.

10. The system of claim 1, wherein to calculate, based on the at least one feature point pair, the positioning information of the imaging device, the at least one processor is configured to direct the system to: obtain a first count of feature point pairs from the at least one feature point pair to form a solution set, the solution set including feature point pairs without semantic annotations; perform at least one iterative calculation on the solution set using a random sampling consensus algorithm to obtain a pose and a count of interior points corresponding to each iterative calculation, wherein an interior point represents a visual feature point whose reprojection value between the visual feature point and a feature point corresponding to the visual feature point in the 3D point cloud map is within a reprojection deviation; and determine the positioning information of the imaging device according to the pose and the count of interior points.

11. The system of claim 10, wherein the solution set further includes feature point pairs with semantic annotations, and to obtain a first count of feature point pairs from the at least one fea
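Claim 10 describes scoring each candidate pose by its count of interior points (inliers), where a point is an inlier if its reprojection value is within a reprojection deviation. The sketch below shows one way such a count could be computed, assuming a simple pinhole camera model and a pixel-distance error; neither assumption comes from the claim text. The names `project`, `count_inliers`, and `best_pose`, the intrinsics tuple `(fx, fy, cx, cy)`, and the pose format `(rotation, translation)` are hypothetical, and the pose-estimation step that would generate candidates inside a random sampling consensus loop is deliberately left out.

```python
import math


def project(point3d, rotation, translation, fx, fy, cx, cy):
    """Project a 3D map point to pixel coordinates with a pinhole model.

    rotation is a 3x3 nested list, translation a length-3 sequence.
    Returns None when the point falls behind the camera.
    """
    cam = [
        sum(rotation[i][j] * point3d[j] for j in range(3)) + translation[i]
        for i in range(3)
    ]
    if cam[2] <= 0:
        return None
    return (fx * cam[0] / cam[2] + cx, fy * cam[1] / cam[2] + cy)


def count_inliers(pairs, rotation, translation, intrinsics, max_error_px):
    """Count pairs whose reprojection error is within max_error_px.

    pairs is a list of ((u, v), (x, y, z)) feature point pairs.
    """
    fx, fy, cx, cy = intrinsics
    count = 0
    for (u, v), point3d in pairs:
        proj = project(point3d, rotation, translation, fx, fy, cx, cy)
        if proj is None:
            continue
        if math.hypot(proj[0] - u, proj[1] - v) <= max_error_px:
            count += 1
    return count


def best_pose(pairs, candidate_poses, intrinsics, max_error_px):
    """Return the candidate (rotation, translation) with the most inliers."""
    return max(
        candidate_poses,
        key=lambda pose: count_inliers(
            pairs, pose[0], pose[1], intrinsics, max_error_px
        ),
    )
```

In a full pipeline, each iteration would estimate a pose from a random subset of pairs and then call something like `count_inliers` on the whole set; library routines such as OpenCV's `solvePnPRansac` bundle both steps.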

Assignees

Beijing Didi Infinity Technology & Dev Co Ltd

Inventors

Classifications

G06T2207/30244 · Camera pose
G06T2207/30242 · Counting objects in image
G06T2207/20084 · Artificial neural networks [ANN]
G06T2207/20081 · Training; Learning
G06T2207/20021 · Dividing image into blocks, subimages or windows

Patent family

Related publications grouped by family.

View patent family 76476869

External sources

Related patents

Method for activating service based on user scenario perception, terminal device, and system

Method and apparatus for generating a navigation guide

Position and pose determining method, apparatus, smart device, and storage medium

Method for generating a high precision map, apparatus and storage medium

Method, device, and storage medium for laser scanning device calibration

Data processing method, apparatus and terminal

Modeling method and apparatus using three-dimensional (3d) point cloud
