Who is the assignee on this patent?

Beijing Sensetime Tech Development Co Ltd

What technology area does this patent fall under?

Primary CPC classification G06V20/56. Mapped technology areas include Physics.

When was this patent published?

Publication date Jan 11, 2022 (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Methods and apparatuses for object detection, and devices

US11222441B2 · US · B2

Patent metadata
Field	Value
Publication number	US-11222441-B2
Application number	US-202016734369-A
Country	US
Kind code	B2
Filing date	Jan 5, 2020
Priority date	Nov 22, 2017
Publication date	Jan 11, 2022
Grant date	Jan 11, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title: What the patent document calls the invention.
Abstract: A short plain-language summary of the technical disclosure.
Assignees and inventors: Who owns or filed the patent and who is credited as inventor.
Key dates: Filing, priority, publication, and grant dates set the timeline.
First independent claim: The legal scope of protection; read this for what is actually claimed.
CPC / IPC classifications: Technology tags used to group this patent with similar filings.
Citations and related patents: Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method for object detection includes: obtaining a plurality of to-be-determined targets in a to-be-detected image; determining confidences of the plurality of to-be-determined targets separately belonging to at least one category, determining categories of the plurality of to-be-determined targets according to the confidences, and determining position offset values corresponding to the respective categories of the plurality of to-be-determined targets; using the position offset values corresponding to the respective categories of the plurality of to-be-determined targets as position offset values of the plurality of to-be-determined targets; and determining position information and a category of at least one to-be-determined target in the to-be-detected image according to the categories of the plurality of to-be-determined targets, the position offset values of the plurality of to-be-determined targets, and the confidences of the plurality of to-be-determined targets belonging to the categories thereof.
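As an illustration only, the flow in the abstract can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the patented implementation: the function name `detect`, the background index, the 0.5 threshold, and the choice to apply offsets as additive shifts of the four box edges are all hypothetical, since the abstract does not specify them.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (left, top, right, bottom)


def detect(
    targets: List[Box],
    confidences: List[List[float]],
    offsets: List[List[Box]],
    background: int = 0,          # assumption: index of the background category
    min_confidence: float = 0.5,  # assumption: the "confidence requirement"
) -> List[Tuple[int, float, Box]]:
    """Return (category, confidence, refined box) for each target that is kept."""
    results = []
    for box, conf, per_category_offsets in zip(targets, confidences, offsets):
        # The category of a target is the one with the highest confidence.
        category = max(range(len(conf)), key=conf.__getitem__)
        # Only the offsets predicted for that category are used for the target.
        d_left, d_top, d_right, d_bottom = per_category_offsets[category]
        if category == background or conf[category] < min_confidence:
            continue
        left, top, right, bottom = box
        # Assumption: offsets are additive shifts of the four box edges.
        refined = (left + d_left, top + d_top, right + d_right, bottom + d_bottom)
        results.append((category, conf[category], refined))
    return results


if __name__ == "__main__":
    targets = [(10, 10, 50, 50), (0, 0, 20, 20)]
    confidences = [[0.1, 0.8, 0.1], [0.9, 0.05, 0.05]]
    offsets = [
        [(0, 0, 0, 0), (1, -2, 3, 4), (9, 9, 9, 9)],
        [(0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0)],
    ]
    print(detect(targets, confidences, offsets))
```

In this example the first target is assigned category 1 and shifted by that category's offsets, while the second target is dropped because its highest-confidence category is the assumed background.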

First claim

Opening claim text (preview).

The invention claimed is:

1. A method for object detection, comprising: obtaining a plurality of to-be-determined targets in a to-be-detected image; determining, for at least one category, confidences of a plurality of to-be-determined targets respectively; determining categories of the plurality of to-be-determined targets according to the confidences; respectively determining position offset values corresponding to the categories of the plurality of to-be-determined targets; respectively using the position offset values corresponding to the categories of the plurality of to-be-determined targets as position offset values of the plurality of to-be-determined targets; and determining a category and position information of at least one to-be-determined target in the to-be-detected image according to the categories of the plurality of to-be-determined targets, the position offset values of the plurality of to-be-determined targets, and confidences of the categories of the plurality of to-be-determined targets, wherein the operation of obtaining a plurality of to-be-determined targets in a to-be-detected image comprises: obtaining the plurality of to-be-determined targets formed based on at least one predetermined region size in the to-be-detected image, wherein the operation of obtaining the plurality of to-be-determined targets formed based on at least one predetermined region size in the to-be-detected image comprises: obtaining a feature map of the to-be-detected image; forming a reference box of a feature point in the feature map according to reference box configuration information, wherein the reference box configuration information is preset and a number and sizes of reference boxes are defined by the reference box configuration information; using the reference box of the feature point in the feature map as one to-be-determined target; and obtaining, respectively corresponding to a plurality of feature points in the feature map, the plurality of to-be-determined targets.

2. The method according to claim 1, wherein the operation of obtaining a feature map of the to-be-detected image comprises: inputting the to-be-detected image into a backbone network in a convolutional neural network; inputting a feature map output by the backbone network into a filter layer in the convolutional neural network; filtering the feature map output by the backbone network by the filter layer according to a preset sliding window, and using the filtered feature map output by the backbone network as the feature map of the to-be-detected image.

3. The method according to claim 1, wherein the operation of obtaining the plurality of to-be-determined targets formed based on at least one predetermined region size in the to-be-detected image comprises: obtaining a feature map of the to-be-detected image; pooling the feature map based on reference box configuration information to obtain a plurality of new feature maps; and using the plurality of new feature maps as the plurality of to-be-determined targets.

4. The method according to claim 1, wherein the predetermined region size comprises: nine predetermined region sizes formed based on three different lengths and three different aspect ratios; or nine predetermined region sizes formed based on three different widths and three different aspect ratios; or nine predetermined region sizes formed based on three different lengths and widths.

5. The method according to claim 1, wherein the category comprises: two object categories and one background category.

6. The method according to claim 1, wherein the operation of determining, for at least one category, confidences of a plurality of to-be-determined targets respectively, and determining categories of the plurality of to-be-determined targets according to the confidences comprises: for each of the plurality of to-be-determined target, calculating, for the at least one category, a confidence of the to-be-determined target respectively, and using a category corresponding to a highest confidence as a category of the to-be-determined target.

7. The method according to claim 1, wherein the operation of determining position offset values corresponding to the respective categories of the plurality of to-be-determined targets comprises: for each of the plurality of to-be-determined target, calculating, for a category of the to-be-determined target, a top offset value, a bottom offset value, a left offset value, and a right offset value of the to-be-determined target.

8. The method according to claim 1, wherein the position information of at least one to-be-determined target comprises: position information of a bounding box of the at least one to-be-determined target.

9. The method according to claim 8, wherein the operation of determining a category and position information of at least one to-be-determined target in the to-be-detected image according to the categories of the plurality of to-be-determined targets, the position offset values of the plurality of to-be-determined targets, and confidences of the categories of the plurality of to-be-determined targets comprises: selecting, from the plurality of to-be-determined targets, at least one to-be-determined target with confidences meeting a predetermined confidence requirement; forming the position information of the bounding box of the at least one to-be-determined target in the to-be-detected image according to position offset value of the selected at least one to-be-determined target; and respectively using a category of the selected at least one to-be-determined target as a category of the bounding box of the at least one to-be-determined target in the to-be-detected image.

10. The method according to claim 1, wherein the operation of determining, for at least one category, confidences of a plurality of to-be-determined targets respectively, determining categories of the plurality of to-be-determined targets according to the confidences, respectively determining position offset values corresponding to the categories of the plurality of to-be-determined targets comprises: using a convolutional neural network to determine, for at least one category, confidences of the plurality of to-be-determined targets respectively, determine categories of the plurality of to-be-determined targets according to the confidences, and respectively determine position offset values corresponding to the categories of the plurality of to-be-determined targets; and the method further comprises: training the convolutional neural network, wherein the operation of training the convolutional neural network comprises: obtaining, from an image sample set, an image sample annotated with information of at least one standard position and category of the at least one standard position; obtaining a plurality of to-be-determined targets in the image sample; determining, for at least one category, confidences of the plurality of to-be-determined targets separately by one convolutional layer in the convolutional neural network; determining categories of the plurality of to-be-determined targets according to the confidences; respectively determining, by another convolutional layer in the convolutional neural network, position offset values corresponding to the categories of the plurality of to-be-determined targets; respectively using the position offset values corresponding to the categories of the plurality of to-be-determined targets as position offset values of the plurality of to-be-determined targets; calculating standard position offset values of the plurality of to-be-determined targets with respect to the corresponding standard position; calculating a deviation between a posit
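For readers unfamiliar with reference boxes, the following Python sketch illustrates one way the "nine predetermined region sizes formed based on three different lengths and three different aspect ratios" and the per-feature-point reference boxes could be produced. It is a hedged illustration, not the patent's method: the function names, the lengths 64/128/256, the stride of 16, the reading of aspect ratio as width divided by height, and the area-preserving size formula are all assumptions that the claim text does not state.

```python
import math
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (left, top, right, bottom)
Size = Tuple[float, float]               # (width, height)


def region_sizes(lengths: Sequence[float], aspect_ratios: Sequence[float]) -> List[Size]:
    """Combine each length with each aspect ratio (assumed width / height)."""
    sizes = []
    for length in lengths:
        for ratio in aspect_ratios:
            root = math.sqrt(ratio)
            # Assumption: keep the area at length**2 while changing the shape.
            sizes.append((length * root, length / root))
    return sizes


def reference_boxes(
    feature_width: int, feature_height: int, stride: int, sizes: Sequence[Size]
) -> List[Box]:
    """Form one reference box per (feature point, region size) pair."""
    boxes = []
    for row in range(feature_height):
        for col in range(feature_width):
            # Assumption: a feature point maps to the centre of its image cell.
            cx = (col + 0.5) * stride
            cy = (row + 0.5) * stride
            for width, height in sizes:
                boxes.append(
                    (cx - width / 2, cy - height / 2, cx + width / 2, cy + height / 2)
                )
    return boxes


if __name__ == "__main__":
    sizes = region_sizes([64, 128, 256], [0.5, 1.0, 2.0])  # hypothetical values
    boxes = reference_boxes(feature_width=2, feature_height=3, stride=16, sizes=sizes)
    print(len(sizes), len(boxes))
```

Each reference box produced this way would be one "to-be-determined target" in the sense of claim 1, so a feature map with N feature points yields 9 × N targets under this configuration.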

Assignees

Beijing Sensetime Tech Development Co Ltd

Inventors

Classifications

Code	CPC title	Note
G06V20/56	exterior to a vehicle by using sensors mounted on the vehicle	Primary
G06V20/588	Recognition of the road, e.g. of lane markings; Recognition of the vehicle driving pattern in relation to the road	
G06V20/584	of vehicle lights or traffic lights	Primary
G06V10/454	Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]	
G06V10/25	Determination of region of interest [ROI] or a volume of interest [VOI]	

Patent family

Related publications grouped by family.

View patent family 62652715

External sources


Related patents

Convolutional neural network framework using reverse connections and objectness priors for object detection

Region proposal for image regions that include objects of interest using feature maps from multiple layers of a convolutional neural network model

Object detection and classification in images

Method and apparatus for tracking object
