Who is the assignee on this patent?

Beijing Baidu Netcom Sci & Tech Co Ltd

What technology area does this patent fall under?

Primary CPC classification G06F18/2148. Mapped technology areas include Physics.

When was this patent published?

Publication date Sep 5, 2023 (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 5 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Data processing method, data processing apparatus, electronic device and storage medium

US11748449B2 · US · B2

Patent metadata
Field	Value
Publication number	US-11748449-B2
Application number	US-202017105027-A
Country	US
Kind code	B2
Filing date	Nov 25, 2020
Priority date	Nov 25, 2020
Publication date	Sep 5, 2023
Grant date	Sep 5, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The present disclosure provides a data processing method, an apparatus, an electronic device and a medium, which relates to the technical fields of autonomous driving, electronic maps, deep learning, image processing, and the like. The method includes: a computing device inputs a reference image and a captured image into a feature extraction model; obtain, a set of reference descriptors based on the first descriptor map; determine a plurality of sets of training descriptors; obtain a predicted pose of the vehicle by inputting the plurality of training poses and a plurality of similarities into a pose prediction model; and train the feature extraction model and the pose prediction model. When applied to a vehicle localization system, the trained feature extraction model and pose prediction model according to some embodiments of the present disclosure can improve accuracy and robustness of vehicle localization, thereby boosting the performance of the vehicle localization system.
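To make the pose prediction step in the abstract more concrete, here is a minimal Python sketch that follows the wording of claims 2, 6 and 7 in the claim text on this page. It builds candidate poses on a grid around a known pose, turns one similarity per candidate into a probability, and takes the probability-weighted mean as the predicted pose. The function names, the softmax normalisation, and the plain weighted mean of yaw are assumptions made for illustration. The patent text only says that a pose prediction model provides probabilities based on the similarities.

```python
import math
from itertools import product


def candidate_poses(center, unit, max_offset):
    """Grid of (x, y, yaw) poses centred on `center` (cf. claim 7).

    `unit` and `max_offset` are per-axis tuples: step size and maximum
    offset along x, y and yaw. All names here are hypothetical.
    """
    axes = []
    for c, u, m in zip(center, unit, max_offset):
        steps = int(round(m / u))
        axes.append([c + k * u for k in range(-steps, steps + 1)])
    return list(product(*axes))


def softmax(scores):
    """Assumed normalisation from similarities to probabilities."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def expected_pose(poses, similarities):
    """Probability-weighted mean of the candidate poses (cf. claim 6).

    Averaging yaw linearly is only reasonable for small offsets that do
    not wrap around +/- pi; that is an assumption of this sketch.
    """
    probs = softmax(similarities)
    return tuple(
        sum(p * pose[i] for p, pose in zip(probs, poses)) for i in range(3)
    )


def pose_deviation(predicted, real):
    """Sum of absolute per-axis errors, one possible 'deviation' (cf. claim 2)."""
    return sum(abs(a - b) for a, b in zip(predicted, real))


# Toy usage: similarities are faked so that they peak at the real pose.
known = (10.0, 5.0, 0.1)
real = (10.5, 5.0, 0.1)
poses = candidate_poses(known, unit=(0.5, 0.5, 0.05), max_offset=(1.0, 1.0, 0.1))
sims = [-50.0 * sum((a - b) ** 2 for a, b in zip(p, real)) for p in poses]
predicted = expected_pose(poses, sims)
print(len(poses), predicted, pose_deviation(predicted, real))
```

In training, a deviation of this kind would be the quantity to minimise with respect to both models. The sketch stops at computing it and does not attempt any gradient-based update.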

First claim

Opening claim text (preview).

The invention claimed is:

1. A data processing method, comprising: inputting a reference image and a captured image into a feature extraction model, respectively, to obtain a first descriptor map and a second descriptor map, the captured image being obtained by capturing an external environment from a vehicle when the vehicle is in a real pose, the reference image being obtained by pre-capturing the external environment by a capturing device; obtaining, based on the first descriptor map, a set of reference descriptors corresponding to a set of keypoints in the reference image; determining a plurality of sets of training descriptors corresponding to a set of spatial coordinates when the vehicle is in a plurality of training poses, respectively, the plurality of sets of training descriptors belonging to the second descriptor map, the set of spatial coordinates being determined based on the set of keypoints, the plurality of training poses being obtained by offsetting a known pose based on the real pose; obtaining a predicted pose of the vehicle by inputting the plurality of training poses and a plurality of similarities into a pose prediction model, the plurality of similarities being between the plurality of sets of training descriptors and the set of reference descriptors; and training the feature extraction model and the pose prediction model based on a metric representing a difference between the predicted pose and the real pose, in order to apply the trained feature extraction model and the trained pose prediction model to vehicle localization, wherein one of the following: the pose prediction model provides, based on the plurality of similarities, probabilities that the plurality of training poses are real poses, respectively, and the metric comprises a concentration of distribution of the probabilities; or the pose prediction model generates a plurality of regularized similarities based on the plurality of similarities, and the metric is determined based on the plurality of regularized similarities.

2. The method of claim 1, wherein the metric comprises a deviation between the predicted pose and the real pose.

3. The method of claim 1, wherein obtaining the set of reference descriptors comprises: obtaining a set of reference images of the external environment, each of the set of reference images comprising a set of keypoints as well as a set of reference descriptors and a set of spatial coordinates associated with the set of keypoints, the set of spatial coordinates being determined by projecting a laser radar point cloud onto the reference image; selecting, from the set of reference images, the reference image corresponding to the captured image based on the known pose; and obtaining the set of reference descriptors and the set of spatial coordinates stored in association with the set of keypoints in the reference image.

4. The method of claim 1, wherein determining the plurality of sets of training descriptors comprises: determining a set of projection points of the set of spatial coordinates by projecting the set of spatial coordinates onto the captured image based on a first training pose of the plurality of training poses; determining, for a projection point of the set of projection points, a plurality of points adjacent to the projection point in the captured image; determining a plurality of descriptors of the plurality of points in the second descriptor map; and determining, based on the plurality of descriptors, a descriptor of the projection point to obtain a first training descriptor of a set of training descriptors corresponding to the first training pose among the plurality of sets of training descriptors.

5. The method of claim 1, further comprising: determining, for a first set of training descriptors among the plurality of sets of training descriptors, a plurality of differences between a plurality of training descriptors in the first set of training descriptors and corresponding reference descriptors in the set of reference descriptors; and determining, based on the plurality of differences, a similarity between the first set of training descriptors and the set of reference descriptors as a first similarity of the plurality of similarities.

6. The method of claim 1, wherein obtaining the predicted pose comprises: determining, based on the plurality of similarities, probabilities that the plurality of training poses are real poses, respectively, using the pose prediction model; and determining, based on the plurality of training poses and the probabilities, an expected pose of the vehicle as the predicted pose.

7. The method of claim 1, further comprising: determining the plurality of training poses by taking a horizontal coordinate, a longitudinal coordinate and a yaw angle of the known pose as a center and by offsetting from the center in three dimensions of a horizontal axis, a longitudinal axis and a yaw angle axis, with respective predetermined offset units and within respective predetermined maximum offset ranges.

8. The method of claim 1, further comprising: selecting, based on a farthest point sampling algorithm, the set of keypoints from a set of points in the reference image.

9. An electronic device, comprising: at least one processor; and at least one memory coupled to the at least one processor and storing instructions executable by the at least one processor, the instructions, when executed by the at least one processor, causing the electronic device to: input a reference image and a captured image into a feature extraction model, respectively, to obtain a first descriptor map and a second descriptor map, the captured image being obtained by capturing an external environment from a vehicle when the vehicle is in a real pose, the reference image being obtained by pre-capturing the external environment by a capturing device; obtain, based on the first descriptor map, a set of reference descriptors corresponding to a set of keypoints in the reference image; determine a plurality of sets of training descriptors corresponding to a set of spatial coordinates when the vehicle is in a plurality of training poses, respectively, the plurality of sets of training descriptors belonging to the second descriptor map, the set of spatial coordinates being determined based on the set of keypoints, the plurality of training poses being obtained by offsetting a known pose based on the real pose; obtain a predicted pose of the vehicle by inputting the plurality of training poses and a plurality of similarities into a pose prediction model, the plurality of similarities being between the plurality of sets of training descriptors and the set of reference descriptors; and train the feature extraction model and the pose prediction model based on a metric representing a difference between the predicted pose and the real pose, in order to apply the trained feature extraction model and the trained pose prediction model to vehicle localization, wherein one of the following: the pose prediction model provides, based on the plurality of similarities, probabilities that the plurality of training poses are real poses, respectively, and the metric comprises a concentration of distribution of the probabilities: or the pose prediction model generates a plurality of regularized similarities based on the plurality of similarities, and the metric is determined based on the plurality of regularized similarities.

10. The electronic device of claim 9, wherein the metric comprises a deviation between the predicted pose and the real pose.

11. The electronic device of claim 9, wherein the instructions when executed by the at least one processor cause the electr
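For readers who find the claim language dense, the following Python sketch illustrates three of the dependent-claim steps under stated assumptions: keypoint selection by farthest point sampling (claim 8), looking up a descriptor at a projection point from adjacent points in a descriptor map (claim 4), and scoring a set of training descriptors against the reference descriptors (claim 5). All function names are hypothetical. Bilinear interpolation and the negative mean squared difference are illustrative choices, since the claims do not fix either formula. Projecting spatial coordinates into the captured image requires a camera model and is left out, so the pixel location `(u, v)` is assumed to be given.

```python
import math


def farthest_point_sampling(points, k):
    """Pick up to k points, each one farthest from those already chosen."""
    if k <= 0 or not points:
        return []
    chosen = [0]  # assumption: start from the first point in the list
    dist = [math.dist(p, points[0]) for p in points]
    while len(chosen) < min(k, len(points)):
        nxt = max(range(len(points)), key=dist.__getitem__)
        chosen.append(nxt)
        # Keep, for every point, its distance to the nearest chosen point.
        dist = [min(d, math.dist(p, points[nxt])) for d, p in zip(dist, points)]
    return [points[i] for i in chosen]


def bilinear_descriptor(desc_map, u, v):
    """Interpolate a descriptor at sub-pixel (u, v) from its four neighbours.

    `desc_map[row][col]` is a list of floats; `u` is the column coordinate
    and `v` the row coordinate. Coordinates are clamped to the map.
    """
    h, w = len(desc_map), len(desc_map[0])
    u = min(max(u, 0.0), w - 1.0)
    v = min(max(v, 0.0), h - 1.0)
    c0, r0 = int(u), int(v)
    c1, r1 = min(c0 + 1, w - 1), min(r0 + 1, h - 1)
    a, b = u - c0, v - r0
    corners = zip(
        desc_map[r0][c0], desc_map[r0][c1], desc_map[r1][c0], desc_map[r1][c1]
    )
    return [
        (1 - a) * (1 - b) * p00 + a * (1 - b) * p01
        + (1 - a) * b * p10 + a * b * p11
        for p00, p01, p10, p11 in corners
    ]


def similarity(training, reference):
    """Negative mean squared difference between matched descriptors.

    Higher means more similar; 0.0 is a perfect match. This formula is an
    assumption of the sketch, not taken from the patent.
    """
    total = sum(math.dist(t, r) ** 2 for t, r in zip(training, reference))
    return -total / len(reference)


# Toy usage on a 2x2 descriptor map with one-dimensional descriptors.
keypoints = farthest_point_sampling([(0, 0), (1, 0), (10, 0), (5, 0)], k=3)
toy_map = [[[0.0], [2.0]], [[4.0], [6.0]]]
centre = bilinear_descriptor(toy_map, 0.5, 0.5)
print(keypoints, centre, similarity([centre], [[3.0]]))
```

Repeating the descriptor lookup and the similarity for every training pose gives one similarity per pose, which is the input the claims describe for the pose prediction model.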

Assignees

Beijing Baidu Netcom Sci & Tech Co Ltd

Inventors

Classifications

G06N3/09
Supervised learning · CPC title
G06N3/0464
Convolutional networks [CNN, ConvNet] · CPC title
G06F18/2148 · Primary
characterised by the process organisation or structure, e.g. boosting cascade · CPC title
G06F18/22
Matching criteria, e.g. proximity measures · CPC title
G06F18/2415
based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate · CPC title

Patent family

Related publications grouped by family.

View patent family 81658857

External sources

Related patents

Estimating ground truth object keypoint labels for sensor readings

Self-supervised 3d keypoint learning for ego-motion estimation

Object posture estimation method and apparatus

Image-based localization

System and method for multimodal mapping and localization
