System and method for three-dimensional (3d) object detection
US-2020082180-A1 · Mar 12, 2020 · US
US11776155B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11776155-B2 |
| Application number | US-202016894123-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jun 5, 2020 |
| Priority date | Dec 10, 2019 |
| Publication date | Oct 3, 2023 |
| Grant date | Oct 3, 2023 |
Official abstract text for this publication.
Embodiments of the present disclosure provide a method and apparatus for detecting a target object in an image. The method includes: performing following prediction operations using a pre-trained neural network: detecting a target object in a two-dimensional image to determine a two-dimensional bounding box of the target object; and determining a relative position constraint relationship between the two-dimensional bounding box of the target object and a three-dimensional projection bounding box obtained by projecting a three-dimensional bounding box of the target object into the two-dimensional image; and the method further including: determining the three-dimensional projection bounding box of the target object, based on the two-dimensional bounding box of the target object and the relative position constraint relationship between the two-dimensional bounding box of the target object and the three-dimensional projection bounding box.
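The last step of the abstract, recovering vertices of the three-dimensional projection bounding box from the two-dimensional bounding box plus the predicted relative positions, can be illustrated with a minimal sketch. It is not the patent's implementation. It assumes each relative position is a fraction between 0 and 1 measured along the width or height of the two-dimensional box; the names `Box2D`, `along`, `decode_vertices`, and the vertex labels `v0` to `v5` are hypothetical. A "first parameter pair" fixes both coordinates of a vertex, while a "second parameter" fixes only one coordinate, so the other is left as `None`.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Box2D:
    """Axis-aligned 2D bounding box in pixel coordinates."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float


# A projected vertex; a coordinate is None when no parameter determines it.
Vertex = Tuple[Optional[float], Optional[float]]


def along(lo: float, hi: float, t: float) -> float:
    """Point at fraction t of the way from lo to hi (assumed encoding)."""
    return lo + t * (hi - lo)


def decode_vertices(
    box: Box2D,
    first_pairs: Dict[str, Tuple[float, float]],
    second_params: Dict[str, Tuple[str, float]],
) -> Dict[str, Vertex]:
    """Turn relative-position parameters into pixel coordinates.

    first_pairs:   vertex -> (fraction along width, fraction along height)
    second_params: vertex -> ("w" or "h", fraction along that direction)
    """
    vertices: Dict[str, Vertex] = {}
    for name, (t_w, t_h) in first_pairs.items():
        vertices[name] = (
            along(box.x_min, box.x_max, t_w),
            along(box.y_min, box.y_max, t_h),
        )
    for name, (axis, t) in second_params.items():
        # First pairs and second parameters describe different vertices.
        if name in vertices:
            raise ValueError(f"vertex {name!r} is already set by a first pair")
        if axis == "w":
            vertices[name] = (along(box.x_min, box.x_max, t), None)
        elif axis == "h":
            vertices[name] = (None, along(box.y_min, box.y_max, t))
        else:
            raise ValueError(f"unknown direction {axis!r}")
    return vertices


box = Box2D(x_min=100.0, y_min=50.0, x_max=300.0, y_max=150.0)
decoded = decode_vertices(
    box,
    first_pairs={"v0": (0.25, 0.5), "v1": (0.75, 0.5)},
    second_params={
        "v2": ("w", 0.5),
        "v3": ("h", 0.25),
        "v4": ("w", 1.0),
        "v5": ("h", 0.0),
    },
)
print(decoded["v0"])  # (150.0, 100.0)
print(decoded["v3"])  # (None, 75.0)
```

The example uses two first pairs and four second parameters, the minimum group size named in claim 1. Filling in the `None` coordinates and the remaining vertices would depend on the posture type and the projection geometry described in claim 3, which this sketch does not attempt.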
Opening claim text (preview).
What is claimed is:
1. A method for detecting a target object in an image, the method comprising: performing following prediction operations using a pre-trained neural network: detecting the target object in a two-dimensional image to determine a two-dimensional bounding box of the target object; and determining a relative position constraint relationship between the two-dimensional bounding box of the target object and a three-dimensional projection bounding box obtained by projecting a three-dimensional bounding box of the target object into the two-dimensional image; and the method further comprising: determining the three-dimensional projection bounding box of the target object, based on the two-dimensional bounding box of the target object and the relative position constraint relationship between the two-dimensional bounding box of the target object and the three-dimensional projection bounding box; wherein determining the relative position constraint relationship between the two-dimensional bounding box of the target object and the three-dimensional projection bounding box obtained by projecting the three-dimensional bounding box of the target object into the two-dimensional image, comprises: determining values of parameters in a preset parameter group corresponding to the target object; wherein, the preset parameter group comprises at least two first parameter pairs and at least four second parameters; wherein each of the first parameter pairs respectively represents a relative position of a vertex of the three-dimensional bounding box and the two-dimensional bounding box, and two parameters in the first parameter pair respectively represent: a relative position of a vertex on the three-dimensional bounding box and two vertices in a height direction of the two-dimensional bounding box, and a relative position of a vertex on the three-dimensional bounding box and two vertexes in a width direction of the two-dimensional bounding box; and wherein each of the second parameters respectively represents a relative position of a vertex of the three-dimensional projection bounding box in a width or height direction of the two-dimensional bounding box, and two vertices of the two-dimensional bounding box in a same direction, and any one of the first parameter pairs and any one of the second parameters represent positions of different vertices of the three-dimensional projection bounding box relative to the two-dimensional bounding box.
2. The method according to claim 1, wherein determining the relative position constraint relationship between the two-dimensional bounding box of the target object and the three-dimensional projection bounding box obtained by projecting the three-dimensional bounding box of the target object into the two-dimensional image, further comprises: determining a posture type of the target object from at least two preset posture types, wherein the posture type of the target object is related to a number of vertices blocked by the target object among vertices of the three-dimensional projection bounding box of the target object; and determining the preset parameter group corresponding to the target object according to the posture type of the target object.
3. The method according to claim 2, wherein the posture type of the target object is further related to an orientation of the target object, and wherein determining the three-dimensional projection bounding box of the target object, based on the two-dimensional bounding box of the target object and the relative position constraint relationship between the two-dimensional bounding box of the target object and the three-dimensional projection bounding box, comprises: determining coordinates of part of vertices of the three-dimensional projection bounding box based on coordinates of the vertices of the two-dimensional bounding box, the values of the parameters in the preset parameter group, and the posture type of the target object; and calculating coordinates of other vertices of the three-dimensional projection bounding box, based on the determined coordinates of the part of vertices of the three-dimensional projection bounding box, and a projection geometric relationship between the three-dimensional projection bounding box and the corresponding three-dimensional bounding box.
4. The method according to claim 1, wherein the prediction operations further comprise: classifying the target object to determine a category of the target object.
5. The method according to claim 1, wherein the pre-trained neural network is trained by: acquiring sample data, the sample data comprising a sample image of a three-dimensional projection bounding box labeling the target object included in the three-dimensional projection bounding box, the three-dimensional projection bounding box being a projection of a corresponding three-dimensional bounding box in the sample image; and performing multiple iteration training on the neural network for detecting the target object based on the sample data; the iteration training comprising: using the current neural network for detecting the target object to perform following operations: detecting the target object in the sample image to obtain a detection result of a two-dimensional bounding box of the target object; and determining a relative position constraint relationship between the two-dimensional bounding box and the three-dimensional projection bounding box of the target object in the sample image; determining a detection result of the three-dimensional projection bounding box of the target object in the sample image, based on the detection result of the two-dimensional bounding box of the target object in the sample image and the relative position constraint relationship between the two-dimensional bounding box and the three-dimensional projection bounding box of the target object in the sample image; and updating parameters of the neural network for detecting the target object, based on a difference between the detection result of the three-dimensional projection bounding box of the target object in the sample image and the three-dimensional projection bounding box of the target object in the sample image.
6. The method according to claim 5, wherein the neural network for detecting the target object comprises a two-dimensional regression branch and a three-dimensional regression branch, wherein the two-dimensional regression branch outputs the detection result of the two-dimensional bounding box of the target object in the sample image, and the three-dimensional regression branch determines the relative position constraint relationship between the two-dimensional bounding box and the three-dimensional projection bounding box of the target object in the sample image.
7. The method according to claim 6, wherein the neural network for detecting the target object further comprises a three-dimensional classification branch, and wherein the iteration training further comprises: determining a posture type of the target object using the three-dimensional classification branch, the posture type of the target object being related to a number of vertices blocked by the target object in vertices of the three-dimensional projection bounding box of the target object, and/or an orientation of the target object; and determining the relative position constraint relationship between the two-dimensional bounding box and the three-dimensional projection bounding box of the target object in the sample image according to the posture type of the target object by the three-dimensional regression branch.
8. The method according to claim 6, wherein the sample data further comprises category labeling information of the target object in the sample image, a
Supervised learning · CPC title
Convolutional networks [CNN, ConvNet] · CPC title
using feature-based methods · CPC title
Generating training patterns; Bootstrap methods, e.g. bagging or boosting · CPC title
Classification techniques · CPC title