Method and apparatus for positioning key point, device, and storage medium

US11610389B2 · US · B2

Patent metadata
Field               Value
Publication number  US-11610389-B2
Application number  US-202117201665-A
Country             US
Kind code           B2
Filing date         Mar 15, 2021
Priority date       Jun 12, 2020
Publication date    Mar 21, 2023
Grant date          Mar 21, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method and apparatus for positioning a key point, a device, and a storage medium are provided. The method may include: extracting a first feature map and a second feature map of a to-be-positioned image, the first feature map and the second feature map being different feature maps; determining, based on the first feature map, an initial position of a key point in the to-be-positioned image; determining, based on the second feature map, an offset of the key point; and adding the initial position of the key point with the offset of the key point to obtain a final position of the key point.
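
The two-stage idea in the abstract (a coarse initial position plus a separately predicted offset) can be illustrated with a minimal sketch. This is not the patented implementation: it assumes the initial position is taken as the location of the highest value in a per-key-point heat map, and the names `initial_position`, `final_position`, and the sample `offset` are hypothetical.

```python
from typing import List, Tuple

HeatMap = List[List[float]]  # H x W grid of heat values for one key point


def initial_position(heat_map: HeatMap) -> Tuple[int, int]:
    """Return (x, y) of the highest heat value (assumed selection rule).

    Ties resolve to the first maximum found in row-major order.
    """
    best_x, best_y, best = 0, 0, heat_map[0][0]
    for y, row in enumerate(heat_map):
        for x, value in enumerate(row):
            if value > best:
                best_x, best_y, best = x, y, value
    return best_x, best_y


def final_position(initial: Tuple[int, int],
                   offset: Tuple[float, float]) -> Tuple[float, float]:
    """Add the predicted offset to the initial position, component-wise."""
    return (initial[0] + offset[0], initial[1] + offset[1])


# Toy data: one 3 x 3 heat map and a hypothetical sub-pixel offset.
heat_map = [
    [0.0, 0.1, 0.0],
    [0.2, 0.9, 0.3],
    [0.0, 0.4, 0.1],
]
offset = (0.25, -0.5)

init = initial_position(heat_map)
final = final_position(init, offset)
print(init, final)
```

The point of the split is that the heat map only resolves positions to whole grid cells, while the offset can carry a fractional correction.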

First claim

Opening claim text (preview).

What is claimed is:

1. A method for positioning a key point, comprising: extracting a first feature map and a second feature map of a to-be-positioned image, the first feature map and the second feature map being different feature maps; determining, based on the first feature map, an initial position of a key point in the to-be-positioned image; determining, based on the second feature map, an offset of the key point; and adding the initial position of the key point with the offset of the key point to obtain a final position of the key point, wherein the extracting the first feature map and the second feature map of the to-be-positioned image comprises: inputting a to-be-positioned feature map into a main network to output an initial feature map of the to-be-positioned image; and inputting the initial feature map into a first sub-network and a second sub-network respectively to output the first feature map and the second feature map, wherein the first sub-network and the second sub-network are two different branches of the main network.

2. The method according to claim 1, wherein the determining, based on the first feature map, the initial position of the key point in the to-be-positioned image comprises: generating, based on the first feature map, a heat map of the key point in the to-be-positioned image; and determining, based on a heat value of a point on the heat map, the initial position of the key point.

3. The method according to claim 2, wherein the generating, based on the first feature map, the heat map of the key point in the to-be-positioned image comprises: performing 1×1 convolution on the first feature map to obtain the heat map, wherein channels of the heat map correspond to key points one to one.

4. The method according to claim 1, wherein the determining, based on the second feature map, the offset of the key point comprises: extracting, based on the initial position of the key point, a feature from a corresponding position of the second feature map; and performing offset regression by using the feature to obtain the offset of the key point.

5. An electronic device, comprising: one or more processors; and a storage apparatus storing one or more programs thereon, the one or more programs, when executed by the one or more processors, causing the one or more processors to perform operations comprising: extracting a first feature map and a second feature map of a to-be-positioned image, the first feature map and the second feature map being different feature maps; determining, based on the first feature map, an initial position of a key point in the to-be-positioned image; determining, based on the second feature map, an offset of the key point; and adding the initial position of the key point with the offset of the key point to obtain a final position of the key point, wherein the extracting the first feature map and the second feature map of the to-be-positioned image comprises: inputting a to-be-positioned feature map into a main network to output an initial feature map of the to-be-positioned image; and inputting the initial feature map into a first sub-network and a second sub-network respectively to output the first feature map and the second feature map, wherein the first sub-network and the second sub-network are two different branches of the main network.

6. The electronic device according to claim 5, wherein the determining, based on the first feature map, the initial position of the key point in the to-be-positioned image comprises: generating, based on the first feature map, a heat map of the key point in the to-be-positioned image; and determining, based on a heat value of a point on the heat map, the initial position of the key point.

7. The electronic device according to claim 6, wherein the generating, based on the first feature map, the heat map of the key point in the to-be-positioned image comprises: performing 1×1 convolution on the first feature map to obtain the heat map, wherein channels of the heat map correspond to key points one to one.

8. The electronic device according to claim 5, wherein the determining, based on the second feature map, the offset of the key point comprises: extracting, based on the initial position of the key point, a feature from a corresponding position of the second feature map; and performing offset regression by using the feature to obtain the offset of the key point.

9. A non-transitory computer readable medium, storing a computer program thereon, the computer program, when executed by a processor, causing the processor to perform operations comprising: extracting a first feature map and a second feature map of a to-be-positioned image, the first feature map and the second feature map being different feature maps; determining, based on the first feature map, an initial position of a key point in the to-be-positioned image; determining, based on the second feature map, an offset of the key point; and adding the initial position of the key point with the offset of the key point to obtain a final position of the key point, wherein the extracting the first feature map and the second feature map of the to-be-positioned image comprises: inputting a to-be-positioned feature map into a main network to output an initial feature map of the to-be-positioned image; and inputting the initial feature map into a first sub-network and a second sub-network respectively to output the first feature map and the second feature map, wherein the first sub-network and the second sub-network are two different branches of the main network.

10. The non-transitory computer readable medium according to claim 9, wherein the determining, based on the first feature map, the initial position of the key point in the to-be-positioned image comprises: generating, based on the first feature map, a heat map of the key point in the to-be-positioned image; and determining, based on a heat value of a point on the heat map, the initial position of the key point.

11. The non-transitory computer readable medium according to claim 10, wherein the generating, based on the first feature map, the heat map of the key point in the to-be-positioned image comprises: performing 1×1 convolution on the first feature map to obtain the heat map, wherein channels of the heat map correspond to key points one to one.

12. The non-transitory computer readable medium according to claim 9, wherein the determining, based on the second feature map, the offset of the key point comprises: extracting, based on the initial position of the key point, a feature from a corresponding position of the second feature map; and performing offset regression by using the feature to obtain the offset of the key point.
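
The dependent claims name two concrete operations: a 1×1 convolution that turns the first feature map into one heat-map channel per key point, and offset regression from the feature found at the initial position in the second feature map. The sketch below illustrates both on toy data. It starts from the two feature maps as given (the main network and its two branches are not modeled), and the linear regressor, all weights, and the helper names `conv1x1`, `peak`, `feature_at`, and `regress_offset` are assumptions for illustration; the claims do not specify the form of the regressor.

```python
from typing import List, Tuple

FeatureMap = List[List[List[float]]]  # channels x height x width


def conv1x1(features: FeatureMap, weights: List[List[float]]) -> FeatureMap:
    """1x1 convolution: each output channel is a per-pixel weighted sum
    of the input channels. weights has shape (key points) x (channels)."""
    height, width = len(features[0]), len(features[0][0])
    return [
        [
            [sum(w * features[c][y][x] for c, w in enumerate(kernel))
             for x in range(width)]
            for y in range(height)
        ]
        for kernel in weights
    ]


def peak(heat: List[List[float]]) -> Tuple[int, int]:
    """Return (x, y) of the largest heat value (assumed selection rule)."""
    cells = ((x, y) for y in range(len(heat)) for x in range(len(heat[0])))
    return max(cells, key=lambda p: heat[p[1]][p[0]])


def feature_at(features: FeatureMap, position: Tuple[int, int]) -> List[float]:
    """Collect the channel values of a feature map at one (x, y) position."""
    x, y = position
    return [channel[y][x] for channel in features]


def regress_offset(feature: List[float], weights: List[List[float]],
                   bias: List[float]) -> Tuple[float, float]:
    """Hypothetical linear regressor mapping a feature vector to (dx, dy)."""
    dx, dy = (sum(w * f for w, f in zip(row, feature)) + b
              for row, b in zip(weights, bias))
    return dx, dy


# Toy inputs: two 2-channel, 2 x 2 feature maps standing in for the branch outputs.
first_map = [[[0.0, 1.0], [0.0, 0.0]], [[0.0, 2.0], [1.0, 0.0]]]
second_map = [[[0.0, 4.0], [0.0, 0.0]], [[0.0, 2.0], [0.0, 0.0]]]

heat_maps = conv1x1(first_map, [[1.0, 0.5]])   # one key point -> one channel
initial = peak(heat_maps[0])                   # initial position (x, y)
feature = feature_at(second_map, initial)      # feature at that position
offset = regress_offset(feature, [[0.125, 0.0], [0.0, 0.25]], [0.0, 0.0])
final = (initial[0] + offset[0], initial[1] + offset[1])
print(initial, offset, final)
```

Because the offset is regressed only from the feature at the initial position, the second feature map is read at a single location per key point rather than scanned in full.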

Assignees

Beijing Baidu Netcom Sci & Tech Co Ltd

Inventors

Classifications

  • Convolutional networks [CNN, ConvNet] · CPC title

  • Supervised learning · CPC title

  • G06V10/462 (Primary)

    Salient features, e.g. scale invariant feature transforms [SIFT] · CPC title

  • using classification, e.g. of video objects · CPC title

  • Human being; Person · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11610389B2 cover?
A method and apparatus for positioning a key point, a device, and a storage medium are provided. The method may include: extracting a first feature map and a second feature map of a to-be-positioned image, the first feature map and the second feature map being different feature maps; determining, based on the first feature map, an initial position of a key point in the to-be-positioned image; d…
Who is the assignee on this patent?
Beijing Baidu Netcom Sci & Tech Co Ltd
What technology area does this patent fall under?
Primary CPC classification G06V10/462. Mapped technology areas include Physics.
When was this patent published?
Publication date Mar 21, 2023 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).