Convolutional neural network framework using reverse connections and objectness priors for object detection
US-2020143205-A1 · May 7, 2020 · US
US11222441B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11222441-B2 |
| Application number | US-202016734369-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jan 5, 2020 |
| Priority date | Nov 22, 2017 |
| Publication date | Jan 11, 2022 |
| Grant date | Jan 11, 2022 |
Official abstract text for this publication.
A method for object detection includes: obtaining a plurality of to-be-determined targets in a to-be-detected image; determining confidences of the plurality of to-be-determined targets separately belonging to at least one category, determining categories of the plurality of to-be-determined targets according to the confidences, and determining position offset values corresponding to the respective categories of the plurality of to-be-determined targets; using the position offset values corresponding to the respective categories of the plurality of to-be-determined targets as position offset values of the plurality of to-be-determined targets; and determining position information and a category of at least one to-be-determined target in the to-be-detected image according to the categories of the plurality of to-be-determined targets, the position offset values of the plurality of to-be-determined targets, and the confidences of the plurality of to-be-determined targets belonging to the categories thereof.
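The selection flow in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the function name `decode_detections`, the background index `0`, the fixed threshold `min_confidence`, and the additive pixel-offset convention are all assumptions made for the example.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]      # (left, top, right, bottom)
Offset = Tuple[float, float, float, float]   # (d_top, d_bottom, d_left, d_right)


def decode_detections(
    reference_boxes: Sequence[Box],
    confidences: Sequence[Sequence[float]],
    offsets: Sequence[Sequence[Offset]],
    background: int = 0,            # assumption: category 0 is background
    min_confidence: float = 0.5,    # assumption: a simple threshold requirement
) -> List[Tuple[int, float, Box]]:
    """Return (category, confidence, box) for each kept target.

    confidences[i][c] is the confidence of target i for category c.
    offsets[i][c] holds the four offsets of target i for category c.
    """
    results = []
    for box, conf, per_category in zip(reference_boxes, confidences, offsets):
        # The category with the highest confidence becomes the target's category.
        category = max(range(len(conf)), key=conf.__getitem__)
        score = conf[category]
        if category == background or score < min_confidence:
            continue
        # Only the offsets of the chosen category are used for this target.
        d_top, d_bottom, d_left, d_right = per_category[category]
        left, top, right, bottom = box
        # Assumption: offsets are added directly to the box edges.
        refined = (left + d_left, top + d_top, right + d_right, bottom + d_bottom)
        results.append((category, score, refined))
    return results
```

Keeping one offset tuple per category and picking it only after the category is known mirrors the abstract's step of using the category-specific offsets as the target's offsets. A real system would typically add duplicate suppression, which the abstract does not describe.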
Opening claim text (preview).
The invention claimed is:
1. A method for object detection, comprising: obtaining a plurality of to-be-determined targets in a to-be-detected image; determining, for at least one category, confidences of a plurality of to-be-determined targets respectively; determining categories of the plurality of to-be-determined targets according to the confidences; respectively determining position offset values corresponding to the categories of the plurality of to-be-determined targets; respectively using the position offset values corresponding to the categories of the plurality of to-be-determined targets as position offset values of the plurality of to-be-determined targets; and determining a category and position information of at least one to-be-determined target in the to-be-detected image according to the categories of the plurality of to-be-determined targets, the position offset values of the plurality of to-be-determined targets, and confidences of the categories of the plurality of to-be-determined targets, wherein the operation of obtaining a plurality of to-be-determined targets in a to-be-detected image comprises: obtaining the plurality of to-be-determined targets formed based on at least one predetermined region size in the to-be-detected image, wherein the operation of obtaining the plurality of to-be-determined targets formed based on at least one predetermined region size in the to-be-detected image comprises: obtaining a feature map of the to-be-detected image; forming a reference box of a feature point in the feature map according to reference box configuration information, wherein the reference box configuration information is preset and a number and sizes of reference boxes are defined by the reference box configuration information; using the reference box of the feature point in the feature map as one to-be-determined target; and obtaining, respectively corresponding to a plurality of feature points in the feature map, the plurality of to-be-determined targets.
2. The method according to claim 1, wherein the operation of obtaining a feature map of the to-be-detected image comprises: inputting the to-be-detected image into a backbone network in a convolutional neural network; inputting a feature map output by the backbone network into a filter layer in the convolutional neural network; filtering the feature map output by the backbone network by the filter layer according to a preset sliding window, and using the filtered feature map output by the backbone network as the feature map of the to-be-detected image.
3. The method according to claim 1, wherein the operation of obtaining the plurality of to-be-determined targets formed based on at least one predetermined region size in the to-be-detected image comprises: obtaining a feature map of the to-be-detected image; pooling the feature map based on reference box configuration information to obtain a plurality of new feature maps; and using the plurality of new feature maps as the plurality of to-be-determined targets.
4. The method according to claim 1, wherein the predetermined region size comprises: nine predetermined region sizes formed based on three different lengths and three different aspect ratios; or nine predetermined region sizes formed based on three different widths and three different aspect ratios; or nine predetermined region sizes formed based on three different lengths and widths.
5. The method according to claim 1, wherein the category comprises: two object categories and one background category.
6. The method according to claim 1, wherein the operation of determining, for at least one category, confidences of a plurality of to-be-determined targets respectively, and determining categories of the plurality of to-be-determined targets according to the confidences comprises: for each of the plurality of to-be-determined target, calculating, for the at least one category, a confidence of the to-be-determined target respectively, and using a category corresponding to a highest confidence as a category of the to-be-determined target.
7. The method according to claim 1, wherein the operation of determining position offset values corresponding to the respective categories of the plurality of to-be-determined targets comprises: for each of the plurality of to-be-determined target, calculating, for a category of the to-be-determined target, a top offset value, a bottom offset value, a left offset value, and a right offset value of the to-be-determined target.
8. The method according to claim 1, wherein the position information of at least one to-be-determined target comprises: position information of a bounding box of the at least one to-be-determined target.
9. The method according to claim 8, wherein the operation of determining a category and position information of at least one to-be-determined target in the to-be-detected image according to the categories of the plurality of to-be-determined targets, the position offset values of the plurality of to-be-determined targets, and confidences of the categories of the plurality of to-be-determined targets comprises: selecting, from the plurality of to-be-determined targets, at least one to-be-determined target with confidences meeting a predetermined confidence requirement; forming the position information of the bounding box of the at least one to-be-determined target in the to-be-detected image according to position offset value of the selected at least one to-be-determined target; and respectively using a category of the selected at least one to-be-determined target as a category of the bounding box of the at least one to-be-determined target in the to-be-detected image.
10. The method according to claim 1, wherein the operation of determining, for at least one category, confidences of a plurality of to-be-determined targets respectively, determining categories of the plurality of to-be-determined targets according to the confidences, respectively determining position offset values corresponding to the categories of the plurality of to-be-determined targets comprises: using a convolutional neural network to determine, for at least one category, confidences of the plurality of to-be-determined targets respectively, determine categories of the plurality of to-be-determined targets according to the confidences, and respectively determine position offset values corresponding to the categories of the plurality of to-be-determined targets; and the method further comprises: training the convolutional neural network, wherein the operation of training the convolutional neural network comprises: obtaining, from an image sample set, an image sample annotated with information of at least one standard position and category of the at least one standard position; obtaining a plurality of to-be-determined targets in the image sample; determining, for at least one category, confidences of the plurality of to-be-determined targets separately by one convolutional layer in the convolutional neural network; determining categories of the plurality of to-be-determined targets according to the confidences; respectively determining, by another convolutional layer in the convolutional neural network, position offset values corresponding to the categories of the plurality of to-be-determined targets; respectively using the position offset values corresponding to the categories of the plurality of to-be-determined targets as position offset values of the plurality of to-be-determined targets; calculating standard position offset values of the plurality of to-be-determined targets with respect to the corresponding standard position; calculating a deviation between a posit
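
The reference-box construction recited in claims 1 and 4 (a preset configuration fixing the number and sizes of boxes per feature point, for example nine sizes from three lengths and three aspect ratios) might look like the sketch below. Everything concrete here is an assumption for illustration: the function name `make_reference_boxes`, the `stride` parameter, the sample lengths and ratios, and the reading of "length" as box width with the ratio taken as height over width.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (left, top, right, bottom)


def make_reference_boxes(
    feature_w: int,
    feature_h: int,
    stride: float,                                       # assumption: image pixels per feature point
    lengths: Sequence[float] = (64.0, 128.0, 256.0),     # hypothetical values
    aspect_ratios: Sequence[float] = (0.5, 1.0, 2.0),    # hypothetical values
) -> List[Box]:
    """Form len(lengths) * len(aspect_ratios) boxes around every feature point."""
    boxes: List[Box] = []
    for row in range(feature_h):
        for col in range(feature_w):
            # Map the feature point back to the centre of its image region.
            cx = (col + 0.5) * stride
            cy = (row + 0.5) * stride
            for length in lengths:
                for ratio in aspect_ratios:
                    # Assumption: length is the width, ratio is height / width.
                    width = length
                    height = length * ratio
                    boxes.append(
                        (cx - width / 2, cy - height / 2,
                         cx + width / 2, cy + height / 2)
                    )
    return boxes
```

With the default configuration each feature point yields nine boxes, so a 2 by 3 feature map produces 54 to-be-determined targets. Each box would then serve as one candidate whose category and offsets are predicted downstream.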
exterior to a vehicle by using sensors mounted on the vehicle · CPC title
Recognition of the road, e.g. of lane markings; Recognition of the vehicle driving pattern in relation to the road · CPC title
of vehicle lights or traffic lights · CPC title
Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN] · CPC title
Determination of region of interest [ROI] or a volume of interest [VOI] · CPC title