Robotic control based on 3D bounding shape, for an object, generated using edge-depth values for the object
US-2020376675-A1 · Dec 3, 2020 · US
US11869257B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11869257-B2 |
| Application number | US-202117206255-A |
| Country | US |
| Kind code | B2 |
| Filing date | Mar 19, 2021 |
| Priority date | Mar 19, 2021 |
| Publication date | Jan 9, 2024 |
| Grant date | Jan 9, 2024 |
Official abstract text for this publication.
A method for detecting and labeling a target object in a 2D image includes receiving a plurality of 2D images from a visual sensor, manually marking points of the target object on each of the 2D images, generating from the 2D images a 3D world coordinate system of the environment surrounding the target object, mapping each of the marked points on the 2D images to the 3D world coordinate system using a simultaneous localization and mapping (SLAM) engine, automatically generating a 3D bounding box covering all the marked points mapped to the 3D world coordinate system, mapping the 3D bounding box to each of the 2D images, generating a label for the target object on each of the 2D images using a machine learning object detection model, and training the machine learning object detection model based on the generated label for the target object.
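Two steps of the abstract, generating a 3D bounding box that covers all mapped points and mapping that box back to a 2D image, can be illustrated with a minimal sketch. It is not the patent's implementation: it assumes an axis-aligned box, a simple pinhole camera, and points already expressed in the camera frame (in the described method the camera pose would come from the SLAM engine). The function names and the parameters `focal`, `cx`, and `cy` are hypothetical.

```python
from itertools import product


def bounding_box_3d(points):
    """Return (min_corner, max_corner) of the axis-aligned box covering all points."""
    xs, ys, zs = zip(*points)
    return (min(xs), min(ys), min(zs)), (max(xs), max(ys), max(zs))


def box_corners(min_corner, max_corner):
    """Enumerate the eight corners of an axis-aligned box."""
    # zip pairs each axis as (min, max); product picks one value per axis.
    return [tuple(c) for c in product(*zip(min_corner, max_corner))]


def project(point, focal, cx, cy):
    """Project a camera-frame 3D point to pixel coordinates (pinhole model, assumed)."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point is behind the camera")
    return (focal * x / z + cx, focal * y / z + cy)


def box_to_2d(min_corner, max_corner, focal, cx, cy):
    """Return the 2D rectangle (u_min, v_min, u_max, v_max) enclosing the projected box."""
    pixels = [project(c, focal, cx, cy) for c in box_corners(min_corner, max_corner)]
    us, vs = zip(*pixels)
    return (min(us), min(vs), max(us), max(vs))


# Hypothetical marked points, already mapped into 3D camera coordinates.
marked_points = [(-1, -1, 4), (1, 1, 4), (0, 0.5, 5)]
lo, hi = bounding_box_3d(marked_points)
rect = box_to_2d(lo, hi, focal=100, cx=50, cy=50)
```

Repeating the projection with each image's own camera pose would yield one 2D rectangle per image from a single 3D box, which is what lets a handful of manually marked points produce a label on every frame.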
Opening claim text (preview).
What is claimed is:

1. A computer implemented method for detecting and labeling an object in a 2D image comprising:
   - receiving a plurality of 2D images from a visual sensor, each image of the plurality of 2D images includes an image of a target object in an surrounding environment;
   - detecting features and extracting feature points from the plurality of 2D images;
   - manually marking visible feature points of the target object on each image of the plurality of 2D images;
   - estimating occluded feature points of the target object in at least one of the plurality of 2D images by defining axis lines starting from visible marked points;
   - generating from the plurality of 2D images a 3D world coordinate system of the environment surrounding the target object;
   - mapping each of the marked feature points on the plurality of 2D images to the 3D world coordinate system;
   - generating a 3D map of the surrounding environment using the extracted feature points;
   - automatically generating a 3D bounding box for the target object covering all the marked points mapped to the 3D world coordinate system;
   - determining a ground plane of the 3D world coordinate system;
   - automatically fitting, in the 3D world coordinate system, the 3D bounding box for the target object in each of the plurality of 2D images based on the visible and estimated occluded points, the axis lines and the ground plane;
   - mapping the 3D bounding box back to the plurality of 2D images; and
   - generating a label for the target object surrounded by the 3D bounding box on each of the plurality of 2D images using a machine learning object detection model and projecting the label to corresponding feature points of the target object in the 3D map of the surrounding environment.
2. The computer implemented method of claim 1, further comprising fitting the 3D bounding box for each of the plurality of 2D images on the ground plane in the 3D world coordinate system.
3. The computer implemented method of claim 1, further comprising marking two points in each of the plurality of 2D images that define a main axis of the target object and using the main axis when generating the 3D bounding box.
4. The computer implemented method of claim 1, wherein defining axis lines starting from visible marked points includes defining a first line between two visible marked points, defining a second line between a visible point and an occluded point.
5. The computer implemented method of claim 4, further including forcing the second line to be parallel to the first line.
6. The computer implemented method of claim 1, further comprising training the machine learning object detection model based on the generated label for the target object.
7. The computer implemented method of claim 6, wherein training the machine learning object detection model includes determining correspondence between 2D feature points in the plurality of 2D images and 3D feature points in the 3D map, determining that a set of 2D feature points in the plurality of 2D images belongs to the target object, labeling the set of 2D feature points with the corresponding object and projecting the object label of the 2D feature points to the corresponding set of 3D feature points.
8. A computer system for detecting and labeling an object in a 2D image, comprising: one or more computer processors; one or more non-transitory computer-readable storage media; program instructions, stored on the one or more non-transitory computer-readable storage media, which when implemented by the one or more processors, cause the computer system to perform the steps of:
   - receiving a plurality of 2D images from a visual sensor, each image of the plurality of 2D images includes an image of a target object in an surrounding environment;
   - detecting features and extracting feature points from the plurality of 2D images;
   - manually marking visible feature points of the target object on each image of the plurality of 2D images;
   - estimating occluded feature points of the target object in at least one of the plurality of 2D images by defining axis lines starting from visible marked points;
   - generating from the plurality of 2D images a 3D world coordinate system of the environment surrounding the target object;
   - mapping each of the marked feature points on the plurality of 2D images to the 3D world coordinate system;
   - generating a 3D map of the surrounding environment using the extracted feature points;
   - automatically generating a 3D bounding box for the target object covering all the marked points mapped to the 3D world coordinate system;
   - determining a ground plane of the 3D world coordinate system;
   - automatically fitting, in the 3D world coordinate system, the 3D bounding box for the target object in each of the plurality of 2D images based on the visible and estimated occluded points, the axis lines and the ground plane;
   - mapping the 3D bounding box back to the plurality of 2D images; and
   - generating a label for the target object surrounded by the 3D bounding box on each of the plurality of 2D images using a machine learning object detection model and projecting the label to corresponding feature points of the target object in the 3D map of the surrounding environment.
9. The computer system of claim 8, further comprising fitting the 3D bounding box for each of the plurality of 2D images on the ground plane in the 3D world coordinate system.
10. The computer system of claim 8, further comprising marking two points in each of the plurality of 2D images that define a main axis of the target object and using the main axis when generating the 3D bounding box.
11. The computer system of claim 8, wherein defining axis lines starting from visible marked points includes defining a first line between two visible marked points, defining a second line between a visible point and an occluded point.
12. The computer system of claim 11, further including forcing the second line to be parallel to the first line.
13. The computer system of claim 8, further comprising training the machine learning object detection model based on the generated label for the target object.
14. The computer system of claim 13, wherein training the machine learning object detection model includes determining correspondence between 2D feature points in the plurality of 2D images and 3D feature points in the 3D map, determining that a set of 2D feature points in the plurality of 2D images belongs to the target object, labeling the set of 2D feature points with the corresponding object and projecting the object label of the 2D feature points to the corresponding set of 3D feature points.
15. A computer program product comprising: program instructions on a computer-readable storage medium, where execution of the program instructions using a computer causes the computer to perform a method for detecting and labeling an object in a 2D image, comprising:
   - receiving a plurality of 2D images from a visual sensor, each image of the plurality of 2D images includes an image of a target object in an surrounding environment;
   - detecting features and extracting feature points from the plurality of 2D images;
   - manually marking visible feature points of the target object on each image of the plurality of 2D images;
   - estimating occluded feature points of the target object in at least one of the plurality of 2D images by defining axis lines starting from visible marked points;
   - generating from the plurality of 2D images a 3D world coordinate system of the environment surrounding the target object;
   - mapping each of the marked feature points on the plurality of 2D images to the 3D world coordinate system; au
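Claims 4 and 5 (and their system counterparts, claims 11 and 12) describe estimating an occluded point by drawing a second line from a visible point and forcing it to be parallel to a first line between two visible marked points. The sketch below is one possible reading, not the patented procedure: it assumes 2D pixel coordinates and assumes the second line has the same length as the first unless a `scale` factor is given. Both function names and the `scale` parameter are hypothetical.

```python
def estimate_occluded_point(a, b, c, scale=1.0):
    """Place an occluded point d so that line c->d is parallel to line a->b.

    a, b: visible marked points defining the first line.
    c:    visible marked point where the second line starts.
    scale: assumed length of c->d relative to a->b (1.0 completes a parallelogram).
    """
    direction = tuple(bi - ai for ai, bi in zip(a, b))
    return tuple(ci + scale * di for ci, di in zip(c, direction))


def are_parallel(p0, p1, q0, q1, tol=1e-9):
    """Check whether 2D lines p0->p1 and q0->q1 are parallel (cross product near zero)."""
    ux, uy = p1[0] - p0[0], p1[1] - p0[1]
    vx, vy = q1[0] - q0[0], q1[1] - q0[1]
    return abs(ux * vy - uy * vx) <= tol


# Hypothetical pixel coordinates of three visible corners of a box-like object.
a, b, c = (10, 20), (110, 40), (30, 120)
d = estimate_occluded_point(a, b, c)
half = estimate_occluded_point(a, b, c, scale=0.5)
```

Under perspective projection, edges that are parallel in 3D are generally not exactly parallel in the image, so an estimate like this is a starting guess; the claimed method goes on to fit the box in the 3D world coordinate system using the ground plane as well.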
Recognising image objects characterised by unique random patterns · CPC title
Edge-based segmentation · CPC title
Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components · CPC title
using neural networks · CPC title
exterior to a vehicle by using sensors mounted on the vehicle · CPC title