Methods and apparatuses for determining bounding box of target object, media, and devices

US11348275B2 · US · B2

Patent metadata
Publication number: US-11348275-B2
Application number: US-201916731858-A
Country: US
Kind code: B2
Filing date: Dec 31, 2019
Priority date: Nov 21, 2017
Publication date: May 31, 2022
Grant date: May 31, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Embodiments of the present application disclose methods and apparatuses for determining a bounding box of a target object, media, and devices. The method includes: obtaining attribute information of each of a plurality of key points of a target object; and determining a bounding box position of the target object according to the attribute information of each of the plurality of key points of the target object and to a preset neural network. The implementations of the present application can improve the efficiency and accuracy of determining a bounding box of a target object.
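
As a reading aid, the two steps in the abstract (obtain key point attributes, then pass them to a preset network) can be sketched roughly as follows. This is a minimal illustration, not the patent's implementation: the `KeyPoint` class, the flattened `[x, y, presence]` input layout, and the function names are all assumptions made here, and `placeholder_network` is a stand-in that returns zeros rather than a trained model.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

import numpy as np


@dataclass
class KeyPoint:
    """Attribute information of one key point (field names are hypothetical)."""
    x: float
    y: float
    presence: float  # presence determination value; its numeric range is assumed


def encode_key_points(key_points: Sequence[KeyPoint]) -> np.ndarray:
    """Flatten N key points into a 3N vector [x1, y1, p1, x2, y2, p2, ...].

    The ordering is an assumption; the page does not state an input layout.
    """
    rows = [[kp.x, kp.y, kp.presence] for kp in key_points]
    return np.asarray(rows, dtype=np.float64).ravel()


def determine_bounding_box(
    key_points: Sequence[KeyPoint],
    network: Callable[[np.ndarray], np.ndarray],
) -> np.ndarray:
    """Step 1: gather attribute information. Step 2: hand it to the network."""
    return network(encode_key_points(key_points))


def placeholder_network(vec: np.ndarray) -> np.ndarray:
    """Stand-in only: a real system would use a trained ("preset") network."""
    return np.zeros(4)


if __name__ == "__main__":
    points = [KeyPoint(1.0, 2.0, 0.9), KeyPoint(3.0, 4.0, 0.2)]
    print(encode_key_points(points))
    print(determine_bounding_box(points, placeholder_network))
```

Passing the network in as a callable keeps the sketch independent of any particular model or framework.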

First claim

Opening claim text (preview).

The invention claimed is:

1. A method for determining a bounding box of a target object, comprising: obtaining attribute information of each of a plurality of key points of a target object; determining a bounding box position of the target object based on the attribute information of each of the plurality of key points of the target object and a preset neural network; wherein the attribute information of each of the plurality of key points comprises coordinate information and a presence determination value; wherein the determining the bounding box position of the target object based on the attribute information of each of the plurality of key points of the target object and the preset neural network comprises: determining at least one valid key point from the plurality of key points according to the attribute information of each of the plurality of key points, processing, according to the attribute information of each of the at least one valid key point, the attribute information of the plurality of key points to obtain processed attribute information of the plurality of key points, and inputting the processed attribute information of the plurality of key points to the preset neural network for processing to obtain the bounding box position of the target object; and wherein the processed attribute information of the plurality of key points comprises processed attribute information of each of the at least one valid key point, and attribute information of key points other than the at least one valid key point in the plurality of key points.

2. The method according to claim 1, wherein processing, according to the attribute information of each of the at least one valid key point, the attribute information of the plurality of key points to obtain processed attribute information of the plurality of key points comprises: determining a reference coordinate according to coordinate information comprised in the attribute information of each of the at least one valid key point; and determining coordinate information in the processed attribute information of each valid key point according to the reference coordinate and to the coordinate information in the attribute information of each of the at least one valid key point.

3. The method according to claim 2, wherein determining the reference coordinate according to coordinate information comprised in the attribute information of each of the at least one valid key point comprises: performing averaging processing on coordinates corresponding to the coordinate information of each of the at least one valid key point to obtain the reference coordinate, and/or determining the coordinate information in the processed attribute information of each valid key point according to the reference coordinate and to the coordinate information in the attribute information of each of the at least one valid key point comprises: determining, with the reference coordinate as an origin point, processed coordinate information corresponding to the coordinate information of each of the at least one valid key point.

4. The method according to claim 2, wherein inputting the processed attribute information of the plurality of key points to the preset neural network for processing to obtain the bounding box position of the target object comprises: inputting the processed attribute information of the plurality of key points to the preset neural network for processing to obtain output position information; and determining the bounding box position of the target object according to the reference coordinate and the output position information.

5. The method according to claim 1, wherein the neural network comprises at least two full connection layers.

6. The method according to claim 1, wherein the neural network comprises three full connection layers, wherein an activation function of at least one of the first full connection layer and the second full connection layer of the three full connection layers comprises a Rectified Linear Unit (ReLu) activation function.

7. A non-transitory computer-readable storage medium having instructions stored thereon, wherein the instructions upon execution by a processor cause the processor to perform operations comprising: obtaining attribute information of each of a plurality of key points of a target object; determining a bounding box position of the target object based on the attribute information of each of the plurality of key points of the target object and a preset neural network; wherein the attribute information of each of the plurality of key points comprises coordinate information and a presence determination value; wherein the determining the bounding box position of the target object based on the attribute information of each of the plurality of key points of the target object and the preset neural network comprises: determining at least one valid key point from the plurality of key points according to the attribute information of each of the plurality of key points, processing, according to the attribute information of each of the at least one valid key point, the attribute information of the plurality of key points to obtain processed attribute information of the plurality of key points, and inputting the processed attribute information of the plurality of key points to the preset neural network for processing to obtain the bounding box position of the target object; and wherein the processed attribute information of the plurality of key points comprises processed attribute information of each of the at least one valid key point, and attribute information of key points other than the at least one valid key point in the plurality of key points.

8. An electronic device comprising: a processor; and a computer-readable storage medium having stored thereon instructions that, when executed by the processor, cause the processor to: obtain attribute information of each of a plurality of key points of a target object; determine a bounding box position of the target object according to the attribute information of each of the plurality of key points of the target object and a preset neural network; wherein the attribute information of each of the plurality of key points comprises coordinate information and a presence determination value; wherein the determining the bounding box position of the target object based on the attribute information of each of the plurality of key points of the target object and the preset neural network comprises: determining at least one valid key point from the plurality of key points according to the attribute information of each of the plurality of key points, processing, according to the attribute information of each of the at least one valid key point, the attribute information of the plurality of key points to obtain processed attribute information of the plurality of key points, and inputting the processed attribute information of the plurality of key points to the preset neural network for processing to obtain the bounding box position of the target object; and wherein the processed attribute information of the plurality of key points comprises processed attribute information of each of the at least one valid key point, and attribute information of key points other than the at least one valid key point in the plurality of key points.

9. The electronic device according to claim 8, wherein processing, according to the attribute information of each of the at least one valid key point, the attribute information of the plurality of key points to obtain processed attribute information of the plurality of key points comprises: determining a reference coordinate according to coordinate information comprised in the attri
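
The processing described in claims 1 to 6 can be sketched as below, as an aid to reading the claim language rather than a reproduction of the patented implementation. Several details are assumptions made here and are not stated on this page: a key point counts as valid when its presence determination value exceeds a hypothetical `PRESENCE_THRESHOLD`, the hidden layer widths are arbitrary, the weights are random placeholders rather than a trained ("preset") network, the output is taken to be `[x1, y1, x2, y2]`, and the reference coordinate is combined with the output by simple addition.

```python
import numpy as np

PRESENCE_THRESHOLD = 0.5  # assumed validity rule; not given in the claim preview


def center_on_valid(kps, threshold=PRESENCE_THRESHOLD):
    """kps: (N, 3) array of [x, y, presence]. Returns (processed, reference)."""
    kps = np.asarray(kps, dtype=np.float64)
    valid = kps[:, 2] > threshold
    if not valid.any():
        raise ValueError("at least one valid key point is required")
    # Claim 3: average the valid coordinates to get the reference coordinate.
    reference = kps[valid, :2].mean(axis=0)
    processed = kps.copy()
    # Claim 3: re-express valid coordinates with the reference as the origin.
    # Claim 1: attributes of the other key points are passed through unchanged.
    processed[valid, :2] -= reference
    return processed, reference


def init_network(n_key_points, hidden=(64, 32), seed=0):
    """Random placeholder weights for three fully connected layers (claim 6)."""
    rng = np.random.default_rng(seed)
    sizes = [3 * n_key_points, *hidden, 4]
    return [(rng.normal(0.0, 0.1, (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]


def forward(params, vec):
    """Three fully connected layers; ReLU after the first and second."""
    (w1, b1), (w2, b2), (w3, b3) = params
    h = np.maximum(vec @ w1 + b1, 0.0)
    h = np.maximum(h @ w2 + b2, 0.0)
    return h @ w3 + b3


def determine_box(kps, params):
    """Claim 4: combine the network output with the reference coordinate."""
    processed, reference = center_on_valid(kps)
    offsets = forward(params, processed.ravel())
    # Assumed combination: add the reference back to both box corners.
    return offsets + np.tile(reference, 2)


if __name__ == "__main__":
    key_points = [[10.0, 20.0, 0.9], [30.0, 40.0, 0.8], [0.0, 0.0, 0.1]]
    print(determine_box(key_points, init_network(len(key_points))))
```

Under these assumptions, shifting every valid key point by the same offset shifts the predicted box by that offset, because the network only ever sees coordinates relative to the reference.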

Assignees

Inventors

Classifications

  • using neural networks

  • Determination of region of interest [ROI] or a volume of interest [VOI]

  • using classification, e.g. of video objects

  • G06T7/73 (Primary): using feature-based methods

  • G06N3/08 (Primary): Learning methods

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11348275B2 cover?
Embodiments of the present application disclose methods and apparatuses for determining a bounding box of a target object, media, and devices. The method includes: obtaining attribute information of each of a plurality of key points of a target object; and determining a bounding box position of the target object according to the attribute information of each of the plurality of key points of th…
Who is the assignee on this patent?
Beijing Sensetime Tech Development Co Ltd
What technology area does this patent fall under?
Primary CPC classification G06T7/73. Mapped technology areas include Physics.
When was this patent published?
Publication date: May 31, 2022 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 6 related publications on this page (citations in our corpus or others sharing the same primary CPC).