Person intention reasoning method, apparatus and device, and storage medium

US2025037495A1 · US · A1

Patent metadata
Publication number: US-2025037495-A1
Application number: US-202218716483-A
Country: US
Kind code: A1
Filing date: Sep 23, 2022
Priority date: Apr 28, 2022
Publication date: Jan 30, 2025
Grant date: (not listed)

Abstract

Official abstract text for this publication.

The person intention reasoning method includes: performing object detection on a to-be-reasoned image to obtain an object detection result; determining that an image portion corresponding to a detection bounding box of each person in the to-be-reasoned image is a to-be-reasoned sub-image of the corresponding person, and acquiring a joint feature and an occlusion probability of a joint of the corresponding person; performing prediction and analysis on the joint feature of the corresponding joint based on the occlusion probability to obtain a corresponding prediction feature, and performing correction based on the joint feature and the prediction feature of the joint of the corresponding person to obtain a corresponding correction feature; and performing person intention reasoning by using the object detection result and the correction feature of the joint of the corresponding person to obtain a corresponding person intention reasoning result.
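The prediction and correction steps in the abstract can be illustrated with a small NumPy sketch. The text shown on this page does not say how the correction combines the two features, so the occlusion-weighted blend in `correct_features` is purely an assumption made for illustration. The single linear graph-convolution step stands in for the trained graph convolutional network mentioned in claim 7; the skeleton edges, feature sizes, and random weights are all hypothetical.

```python
import numpy as np

def normalized_adjacency(edges, num_joints):
    """Symmetric normalization D^-1/2 (A + I) D^-1/2 of a skeleton graph."""
    adj = np.eye(num_joints)  # self-loops keep each joint's own feature
    for i, j in edges:
        adj[i, j] = adj[j, i] = 1.0
    inv_sqrt_deg = 1.0 / np.sqrt(adj.sum(axis=1))
    return adj * inv_sqrt_deg[:, None] * inv_sqrt_deg[None, :]

def predict_features(fused, adj_norm, weight):
    """One linear graph-convolution step: mix each joint with its neighbours.

    A stand-in for the trained occluded joint prediction network.
    """
    return adj_norm @ fused @ weight

def correct_features(joint_feats, predicted, occ_prob):
    """ASSUMPTION: blend observed and predicted features by occlusion probability.

    A joint that is likely occluded leans on the prediction; a visible joint
    keeps its observed feature. The source does not confirm this rule.
    """
    gate = occ_prob[:, None]
    return (1.0 - gate) * joint_feats + gate * predicted

# Hypothetical data: a 5-joint chain with 4-dimensional joint features.
rng = np.random.default_rng(0)
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
joint_feats = rng.standard_normal((5, 4))
occ_prob = np.array([0.0, 0.1, 1.0, 0.2, 0.0])

# Fused input: each joint feature with its occlusion probability appended.
fused = np.concatenate([joint_feats, occ_prob[:, None]], axis=1)  # shape (5, 5)
weight = rng.standard_normal((5, 4))  # untrained, random weights

predicted = predict_features(fused, normalized_adjacency(edges, 5), weight)
corrected = correct_features(joint_feats, predicted, occ_prob)
print(corrected.shape)  # (5, 4)
```

With this blend, joint 2 (occlusion probability 1.0) takes its feature entirely from the prediction, while joints 0 and 4 (probability 0.0) are left unchanged.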

First claim

Opening claim text (preview).

1. A person intention reasoning method, comprising: performing target detection on a to-be-reasoned image to obtain a corresponding target detection result; determining a detection bounding box of each person in the to-be-reasoned image based on the target detection result, determining that an image portion corresponding to each detection bounding box in the to-be-reasoned image is a to-be-reasoned sub-image of a corresponding person respectively, and acquiring a joint feature and an occlusion probability of a joint of the corresponding person in each to-be-reasoned sub-image; performing prediction and analysis on the joint feature of corresponding joint based on the occlusion probability to obtain a corresponding prediction feature, and performing correction based on the joint feature and the corresponding prediction feature of the corresponding joint of the corresponding person in each to-be-reasoned sub-image to obtain a correction feature of the corresponding joint of the corresponding person in each to-be-reasoned sub-image; and performing person intention reasoning by using the target detection result and the correction feature of the corresponding joint of the corresponding person in each to-be-reasoned sub-image to obtain a corresponding person intention reasoning result.

2. The method according to claim 1, wherein the performing prediction and analysis on the joint feature of the corresponding joint based on the occlusion probability to obtain the corresponding prediction feature comprises: taking an arbitrary to-be-reasoned sub-image as a current sub-image, and performing coding fusion on the joint feature and corresponding occlusion probability of each joint in the current sub-image to obtain corresponding fused feature information; and inputting the corresponding fused feature information of the current sub-image into an occluded joint prediction network to obtain a prediction feature of each joint in the current sub-image outputted by the occluded joint prediction network, wherein the occluded joint prediction network is obtained by pre-training based on a plurality of pieces of the corresponding fused feature information of a known prediction feature.

3. The method according to claim 2, wherein the performing coding fusion on the joint feature and the corresponding occlusion probability of each joint in the current sub-image to obtain the corresponding fused feature information comprises: splicing the joint feature of the current sub-image and the corresponding occlusion probability of the current sub-image directly into a corresponding multi-dimensional vector as the corresponding fused feature information of the current sub-image.

4. The method according to claim 2, wherein the performing coding fusion on the joint feature and the corresponding occlusion probability of each joint in the current sub-image to obtain the corresponding fused feature information comprises: extending the corresponding occlusion probability of the current sub-image into a d-dimensional sub-probability, and adding the d-dimensional sub-probability to a d-dimensional joint feature of the current sub-image in one-to-one correspondence to obtain the corresponding fused feature information of the current sub-image.

5. The method according to claim 4, wherein the adding the d-dimensional sub-probability to the d-dimensional joint feature of the current sub-image in one-to-one correspondence to obtain the corresponding fused feature information of the current sub-image comprises: splicing the d-dimensional joint feature and one-dimensional occlusion sub-probability into a (d+1)-dimensional vector to obtain the corresponding fused feature information of the current sub-image.

6. The method according to claim 4, wherein the adding the d-dimensional sub-probability to the d-dimensional joint feature of the current sub-image in one-to-one correspondence to obtain the corresponding fused feature information of the current sub-image comprises: extending occlusion sub-probability into d dimensions, and then adding to the d-dimensional joint feature of the current sub-image in one-to-one correspondence to obtain the corresponding fused feature information of the current sub-image.

7. The method according to claim 2, wherein the method further comprises: acquiring a plurality of images as training images respectively, wherein each of the training images comprises a single person; acquiring fused feature information and a corresponding prediction feature of each of the training images; and inputting the fused feature information and the corresponding prediction feature of each of the training images into a graph convolutional network, and training the graph convolutional network to obtain a trained graph convolutional network, wherein the trained graph convolutional network is the occluded joint prediction network.

8. The method according to claim 1, wherein the acquiring the joint feature of the joint of the corresponding person in each to-be-reasoned sub-image comprises: taking an arbitrary to-be-reasoned sub-image as a current sub-image, and compressing the current sub-image into a multi-dimensional vector by using a convolutional neural network, wherein the multi-dimensional vector comprises specified data obtained by compressing a length and a width of the current sub-image respectively according to a downsampling multiple of the convolutional neural network; and obtaining average pooling of the specified data in the multi-dimensional vector of the current sub-image to obtain a vector of the joint feature of each joint in the current sub-image.

9. The method according to claim 8, wherein the compressing the current sub-image into the multi-dimensional vector by using the convolutional neural network comprises: abstracting the current sub-image into a plurality of joints, and for an extracted image portion of the current sub-image corresponding to each detection bounding box, compressing, by the convolutional neural network, an arbitrary image portion in each extracted image portion into a multi-dimensional vector of [h//s, w//s, N], wherein, s represents the downsampling multiple of the convolutional neural network, // represents a compression operation using the convolutional neural network, N represents a total number of joints contained in the current sub-image, h and w represent a length and a width of the arbitrary image portion respectively, and h//s and w//s are both the specified data.

10. The method according to claim 8, wherein the obtaining average pooling of the specified data in the multi-dimensional vector of the current sub-image to obtain the vector of the joint feature of each joint in the current sub-image comprises: obtaining average pooling of specified data of preceding two dimensional vectors in the multi-dimensional vector to obtain the vector of the joint feature of each joint in the current sub-image.

11. The method according to claim 8, wherein the acquiring the occlusion probability of the joint of the corresponding person in each to-be-reasoned sub-image comprises: inputting the vector of the joint feature of each joint in the current sub-image into an occlusion prediction network to obtain the occlusion probability of each joint in the current sub-image outputted by the occlusion prediction network, wherein the occlusion prediction network is obtained by pre-training based on a vector of a joint feature that is known to be occluded or not.

12. The method according to claim 11, wherein the occlusion prediction network is composed of a fully connected layer and a sigmoid activation function layer.

13
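The tensor operations named in claims 3 to 6 and 8 to 12 can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function names, the array sizes, the count of 17 joints, and the random untrained weights are all hypothetical. In the code, `h // s` is ordinary integer division used only to size the feature map, whereas claim 9 uses `//` to denote the CNN compression itself.

```python
import numpy as np

def pool_joint_features(feature_map):
    """Average-pool a (h//s, w//s, N) map over its first two axes (claims 8-10)."""
    return feature_map.mean(axis=(0, 1))  # shape (N,)

def occlusion_probability(joint_vec, weight, bias):
    """Fully connected layer followed by a sigmoid (claims 11-12)."""
    logits = joint_vec @ weight + bias
    return 1.0 / (1.0 + np.exp(-logits))

def fuse_concat(joint_feats, occ_prob):
    """Splice each d-dim joint feature with its probability: (N, d) -> (N, d+1)."""
    return np.concatenate([joint_feats, occ_prob[:, None]], axis=1)

def fuse_add(joint_feats, occ_prob):
    """Extend each probability to d dimensions and add element-wise: (N, d)."""
    return joint_feats + occ_prob[:, None]  # broadcasting repeats p across d

# Hypothetical sizes: a 64x32 crop, downsampling multiple 8, 17 joints.
h, w, s, num_joints = 64, 32, 8, 17
rng = np.random.default_rng(0)

# Stand-in for the CNN output; a real backbone would produce this map.
feature_map = rng.standard_normal((h // s, w // s, num_joints))
joint_vec = pool_joint_features(feature_map)

# Untrained, random parameters for the occlusion head (illustration only).
weight = rng.standard_normal((num_joints, num_joints))
bias = np.zeros(num_joints)
occ_prob = occlusion_probability(joint_vec, weight, bias)

# Hypothetical d-dimensional per-joint features for the two fusion variants.
d = 4
joint_feats = rng.standard_normal((num_joints, d))
print(fuse_concat(joint_feats, occ_prob).shape)  # (17, 5)
print(fuse_add(joint_feats, occ_prob).shape)     # (17, 4)
```

The concatenation variant keeps the probability as a separate channel at the cost of one extra dimension, while the additive variant preserves the feature width d.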

Assignees

Suzhou Metabrain Intelligent Technology Co Ltd

Inventors

Classifications

  • using classification, e.g. of video objects

  • Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]

  • Pattern recognition

  • of extracted features

  • using neural networks

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2025037495A1 cover?
The person intention reasoning method includes: performing object detection on a to-be-reasoned image to obtain an object detection result; determining that an image portion corresponding to a detection bounding box of each person in the to-be-reasoned image is a to-be-reasoned sub-image of the corresponding person respectively, and acquiring a joint feature and an occlusion probability of a jo…
Who is the assignee on this patent?
Suzhou Metabrain Intelligent Technology Co Ltd
What technology area does this patent fall under?
Primary CPC classification G06V40/10. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jan 30, 2025 (kind code A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).