Augmented three-dimensional structure generation
US-2024185524-A1 · Jun 6, 2024 · US
US2020334841A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2020334841-A1 |
| Application number | US-202016920058-A |
| Country | US |
| Kind code | A1 |
| Filing date | Jul 2, 2020 |
| Priority date | Sep 7, 2018 |
| Publication date | Oct 22, 2020 |
| Grant date | — |
Official abstract text for this publication.
A device and method perform Simultaneous Localization and Mapping (SLAM). The device includes at least one processor configured to perform the SLAM method, which includes the following operations. Preprocess, in a first processing stage, a received data sequence including multiple images recorded by a camera and sensor readings from multiple sensors in order to obtain a frame sequence. Each frame of the frame sequence includes a visual feature set related to one of the images at a determined time instance and sensor readings from that time instance. Sequentially process, in a second processing stage, each frame of the frame sequence based on the visual feature set and the sensor readings included in that frame in order to generate a sequence mapping graph. Merge, in a third processing stage, the sequence mapping graph with at least one other graph, in order to generate or update a full graph.
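The three stages in the abstract can be pictured as a small data pipeline. The following is a minimal sketch under stated assumptions, not the applicant's implementation: the names `Frame`, `Graph`, `extract_features`, `preprocess`, `build_sequence_graph`, and `merge_graphs` are hypothetical, feature extraction is a placeholder, and the merge is a plain union with no loop detection or graph optimization.

```python
from dataclasses import dataclass, field


@dataclass
class Frame:
    timestamp: float
    visual_features: list   # placeholder for key points, descriptors, depth
    sensor_readings: dict   # readings taken at the same time instance


@dataclass
class Graph:
    nodes: dict = field(default_factory=dict)   # node id -> Frame
    edges: set = field(default_factory=set)     # (node id, node id) pairs


def extract_features(image):
    """Hypothetical stand-in for real feature extraction."""
    return list(image)


def preprocess(images, sensor_log):
    """Stage 1: turn (timestamp, image) pairs plus sensor data into frames."""
    return [
        Frame(t, extract_features(image), sensor_log.get(t, {}))
        for t, image in images
    ]


def build_sequence_graph(frames, sequence_id):
    """Stage 2: process frames in order, linking each frame to the previous one."""
    graph = Graph()
    previous_id = None
    for index, frame in enumerate(frames):
        node_id = (sequence_id, index)
        graph.nodes[node_id] = frame
        if previous_id is not None:
            graph.edges.add((previous_id, node_id))
        previous_id = node_id
    return graph


def merge_graphs(sequence_graph, other):
    """Stage 3: merge a sequence graph with another graph (simple union only)."""
    full = Graph(dict(other.nodes), set(other.edges))
    full.nodes.update(sequence_graph.nodes)
    full.edges |= sequence_graph.edges
    return full
```

Keeping the stages as separate functions that exchange only frames and graphs mirrors the claimed option of running them on different processors, for example stage 1 on a terminal device and stages 2 and 3 on a network device.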
Opening claim text (preview).
What is claimed is:

1. A device for performing simultaneous localization and mapping (SLAM), the device comprising at least one processor configured to: preprocess, in a first processing stage, a received data sequence comprising multiple images recorded by a camera and sensor readings from multiple sensors in order to obtain a frame sequence, each frame of the frame sequence comprising a visual feature set related to one of the images at a determined time and the respective sensor readings from the determined time; sequentially process, in a second processing stage, each frame of the frame sequence based on the visual feature set and the sensor readings comprised in that frame in order to generate a sequence mapping graph; and merge, in a third processing stage, the sequence mapping graph with at least one other graph in order to generate or update a full graph.

2. The device according to claim 1, wherein: the visual feature set comprises an image feature set comprising one or more 2D key points extracted from the related one of the images, descriptors corresponding to the 2D key points, and disparity or depth information of the 2D key points.

3. The device according to claim 2, wherein the at least one processor is configured to, in the first processing stage: extract an image from the data sequence, the image being one of the multiple images; rectify the image; extract the 2D key points from the rectified image; and generate the image feature set based on the extracted 2D key points.

4. The device according to claim 3, wherein the at least one processor is configured to, in the first processing stage: assign one or more semantic labels to pixels of the rectified image; and filter the image feature set based on the semantic labels to remove the 2D key points from the image feature set related to objects labelled as dynamic objects.

5. The device according to claim 4, wherein the at least one processor is further configured to, in the first processing stage: generate the visual feature set by adding a bag-of-words descriptor to the filtered image feature set, and generate a respective frame of the frame sequence by combining the visual feature set with the sensor readings from a same time instance of the image.

6. The device according to claim 1, wherein the at least one processor is configured to, in the second processing stage: perform camera tracking based on the visual feature set included in a respective frame of the frame set by matching 2D key points in the visual feature set to locally stored 3D key points, in order to obtain a camera pose associated with the respective frame.

7. The device according to claim 6, wherein the at least one processor is configured to: determine whether the frame is a key frame based on a number of the matched 2D key points.

8. The device according to claim 7, wherein the at least one processor is further configured to, in the second processing stage, based upon determining that the frame is the key frame: perform a first local bundle adjustment (LBA) based on a camera pose in order to obtain visual odometry information and a LBA graph; calculate a fused camera pose based on the visual odometry information and the sensor readings included in the frame; and perform a second LBA based on the fused camera pose and the LBA graph in order to obtain the sequence mapping graph.

9. The device according to claim 1, wherein the at least one processor is further configured to, in the third processing stage: detect a presence of one or more loops or overlapping areas shared among the sequence mapping graph and the at least one further graph; merge the sequence mapping graph and the at least one further graph in order to obtain an intermediate graph; and perform a graph optimization on the intermediate graph based on the detected loops or the overlapping areas in order to obtain the full graph.

10. The device according to claim 1, wherein at least two of the first processing stage, the second processing stage, or the third processing stage are performed in different processors of the at least one processor.

11. The device according to claim 1, wherein: the device is a distributed device and comprises at least one terminal device and at least one network device, a processor of the terminal device is configured to perform the first processing stage and transmit the obtained frame sequence to the network device, a processor of the network device is configured to perform the second and third processing stages, and the at least one processor comprises the processor of the terminal device and the processor of the network device.

12. The device according to claim 11, wherein the processor of the terminal device is further configured to: perform a real-time localization based on the frame sequence obtained in the first processing stage.

13. The device according to claim 12, wherein the processor of the terminal device is further configured to, in the second processing stage, based upon determining that a frame of the frame sequence is a key frame: perform a first local bundle adjustment (LBA) based on a camera pose in order to obtain visual odometry information and a LBA graph; calculate a fused camera pose based on the visual odometry information and the sensor readings included in the frame; and perform a fusion tracking procedure based on the fused camera pose, the LBA graph, and a current full graph in order to obtain a current camera pose.

14. The device according to claim 11, wherein the terminal device is located in a vehicle, and the vehicle comprises the at least one camera comprising the camera and at least one of the multiple sensors.

15. A method for performing simultaneous localization and mapping (SLAM), the method comprising: preprocessing, in a first processing stage, a received data sequence comprising multiple images recorded by a camera and sensor readings from multiple sensors in order to obtain a frame sequence, each frame of the frame sequence comprises a visual feature set related to one of the images at a determined time and sensor readings from the determined time; sequentially processing, in a second processing stage, each frame of the frame sequence based on the visual feature set and the sensor readings comprised in that frame in order to generate a sequence mapping graph; and merging, in a third processing stage, the sequence mapping graph with at least one other graph in order to generate or update a full graph.

16. The device according to claim 5, wherein the at least one processor is further configured to, in the first processing stage: generate the visual feature set by adding a hash table for searching the 2D key points.
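Two of the dependent claims describe steps that are easy to illustrate in code: removing key points that lie on dynamic objects (claim 4) and deciding whether a frame is a key frame from the number of matched key points (claim 7). The sketch below is illustrative only. The names `KeyPoint`, `filter_dynamic`, and `is_key_frame`, the label set, and the threshold of 50 are hypothetical. The claims do not say in which direction the match count is compared, so the sketch assumes that a low match count triggers a key frame.

```python
from dataclasses import dataclass


@dataclass
class KeyPoint:
    x: int              # pixel column in the rectified image
    y: int              # pixel row in the rectified image
    descriptor: bytes   # placeholder for a feature descriptor


# Assumed set of labels treated as dynamic; not taken from the publication.
DYNAMIC_LABELS = frozenset({"car", "pedestrian", "cyclist"})


def filter_dynamic(keypoints, label_map, dynamic_labels=DYNAMIC_LABELS):
    """Keep only key points whose pixel is not labelled as a dynamic object.

    label_map is a row-major grid of per-pixel semantic labels.
    """
    return [
        kp for kp in keypoints
        if label_map[kp.y][kp.x] not in dynamic_labels
    ]


def is_key_frame(num_matched, min_matches=50):
    """Assumed rule: too few matches against stored 3D points means the
    view has changed enough to warrant a new key frame."""
    return num_matched < min_matches
```

In this sketch, a frame whose key points mostly fall on labelled vehicles would contribute few features to tracking, which is the intended effect of filtering before mapping.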
from motion · CPC title
Vehicle exterior; Vicinity of vehicle · CPC title
Camera pose · CPC title
involving reference images or patches · CPC title
Range image; Depth image; 3D point clouds · CPC title