End-to-end monocular 2D semantic keypoint detector and tracker learning

US12488597B2 · US · B2

Patent metadata
Field	Value
Publication number	US-12488597-B2
Application number	US-202117167570-A
Country	US
Kind code	B2
Filing date	Feb 4, 2021
Priority date	Feb 4, 2021
Publication date	Dec 2, 2025
Grant date	Dec 2, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method for semantic keypoint detection is described. The method includes linking, using a keypoint graph neural network (KGNN), semantic keypoints of an object within a first image of a video stream into a 2D graph structure corresponding to a category of the object. The method also includes embedding descriptors within the semantic keypoints of the 2D graph structure corresponding to the category of the object. The method further includes tracking the object within subsequent images of the video stream using the embedded descriptors within the semantic keypoints of the 2D graph structure corresponding to the category of the object.
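
The three steps in the abstract (link keypoints into a category-specific 2D graph, attach a descriptor to each keypoint, then use those descriptors to follow the object in later frames) can be illustrated with a minimal sketch. The patent describes a learned keypoint graph neural network; the sketch below does not implement that. It swaps in plain cosine-similarity matching of hand-written descriptors, and every name in it (`KeypointGraph`, `CAR_EDGES`, `cosine`, `track`) is hypothetical.

```python
from dataclasses import dataclass
from math import sqrt

# Hypothetical template for a "car" category: the edges say which
# keypoint indices are linked in the 2D graph. Not taken from the patent.
CAR_EDGES = [(0, 1), (2, 3), (0, 2), (1, 3)]


@dataclass
class KeypointGraph:
    category: str
    positions: list    # one (x, y) pixel coordinate per semantic keypoint
    descriptors: list  # one feature vector per semantic keypoint
    edges: list        # pairs of keypoint indices linked in the graph


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def track(graph, candidates):
    """Move each keypoint to the next-frame candidate with the best descriptor match.

    candidates: list of ((x, y), descriptor) pairs detected in the next frame.
    """
    new_positions = []
    for descriptor in graph.descriptors:
        best = max(candidates, key=lambda c: cosine(descriptor, c[1]))
        new_positions.append(best[0])
    # Descriptors and edges are carried forward unchanged in this sketch.
    return KeypointGraph(graph.category, new_positions, graph.descriptors, graph.edges)


# Frame T: four keypoints with made-up descriptors.
graph = KeypointGraph(
    category="car",
    positions=[(10, 40), (30, 40), (10, 20), (30, 20)],
    descriptors=[[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 1.0, 0.0]],
    edges=CAR_EDGES,
)

# Frame T+1: the same keypoints, shifted by one pixel and listed in a different order.
next_frame = [
    ((31, 41), [0.1, 0.9, 0.0]),
    ((11, 41), [0.9, 0.1, 0.0]),
    ((31, 21), [0.7, 0.7, 0.1]),
    ((11, 21), [0.0, 0.1, 0.9]),
]

tracked = track(graph, next_frame)
```

Because each keypoint keeps its index in the graph, the edge list still describes the same structure after tracking. Only the positions change.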

First claim

Opening claim text (preview).

What is claimed is:

1. A method for semantic keypoint detection, comprising: linking, using a first keypoint graph neural network (KGNN) encoder, semantic keypoints of an object within a first frame at time T of a video stream into a first 2D graph structure corresponding to a category of the object; embedding descriptors within the semantic keypoints of the first 2D graph structure corresponding to the category of the object; linking, using a second KGNN encoder, semantic keypoints in a second 2D graph structure representing the object category of the object in a second frame at time T+1 of the video stream; generating, using a shared differential keypoint flow model, an estimated second 2D graph structure corresponding to the category of the object in the second frame at time T+1 of the video stream according to the embedded descriptors in the semantic keypoints of the first 2D graph structure; and tracking the object within subsequent frames of the video stream according to a regression loss from comparing, using a shared matching layer, the estimated 2D graph structure according to the embedded descriptor in the semantic keypoints of the first 2D graph structure with the second 2D graph structure of the object in the second frame at time T+1.

2. The method of claim 1, in which linking the semantic keypoints comprises: linking, using the first KGNN encoder, interest keypoints of the object within the first image of the video stream into the 2D graph structure corresponding to the category of the object within the first frame of the video stream; and detecting, using a first KGNN detector, the semantic keypoints from the linked, interest keypoint within the first 2D graph structure corresponding to the category of the object within the first frame at time T of the video stream.

3. The method of claim 1, in which the first 2D graph structure and the second 2D graph structure are based on a geometric structure of the category associated with the object.

4. The method of claim 1, in which linking comprises: extracting, using a shared image backbone, interest keypoints within the first frame of the video stream based on relevant appearance and geometric features of the first frame; and generating a keypoint heatmap based on the extracted interest keypoints.

5. The method of claim 1, in which embedding comprises: generating descriptors of the semantic keypoints; and embedding, using a first KGNN descriptor head, the generated descriptors within the semantic keypoints of the first 2D graph structure.

6. The method of claim 1, in which the object comprises a vehicle represented by the estimated 2D graph structure to depict geometry/spatial relationships of a rigid-body of the vehicle according to the category of the vehicle.

7. A non-transitory computer-readable medium having program code recorded thereon for semantic keypoint detection, the program code being executed by a processor and comprising: program code to link, using a first keypoint graph neural network (KGNN) encoder, semantic keypoints of an object within a first frame at time T of a video stream into a first 2D graph structure corresponding to a category of the object; program code to embed descriptors within the semantic keypoints of the first 2D graph structure corresponding to the category of the object; linking, using a second KGNN encoder, semantic keypoints in a second 2D graph structure representing the object category of the object in a second frame at time T+1 of the video stream; program code to generate, using a shared differential keypoint flow model, an estimated second 2D graph structure corresponding to the category of the object in the second frame at time T+1 of the video stream according to the embedded descriptors in the semantic keypoints of the first 2D graph structure; and program code to track the object within subsequent frames of the video stream according to a regression loss from comparing, using a shared matching layer, the estimated 2D graph structure according to the embedded descriptor in the semantic keypoints of the first 2D graph structure with the second 2D graph structure of the object in the second frame at time T+1.

8. The non-transitory computer-readable medium of claim 7, in which linking the semantic keypoints comprises: program code to link, using the first KGNN encoder, interest keypoints of the object within the first frame of the video stream into the first 2D graph structure corresponding to the category of the object within the first frame of the video stream; and program code to detect, using a first KGNN detector, the semantic keypoints from the linked, interest keypoint within the first 2D graph structure corresponding to the category of the object within the first frame of the video stream.

9. The non-transitory computer-readable medium of claim 7, in which the first 2D graph structure and the second 2D graph structure are based on a geometric structure of the category associated with the object.

10. The non-transitory computer-readable medium of claim 7, in which the program code to link comprises: program code to extract, using a shared image backbone, interest keypoints within the first frame of the video stream based on relevant appearance and geometric features of the first frame; and program code to generate a keypoint heatmap based on the extracted interest keypoints.

11. The non-transitory computer-readable medium of claim 7, in which the program code to embed comprises: program code to generate the descriptors of the semantic keypoints; and program code to embed, using a first KGNN descriptor head, the descriptors within the semantic keypoints of the first 2D graph structure.

12. The non-transitory computer-readable medium of claim 7, in which the object comprises a vehicle represented by the estimated 2D graph structure to depict geometry/spatial relationships of a rigid-body of the vehicle according to the category of the vehicle.

13. A system for semantic keypoint detection, the system comprising: a semantic keypoint detection module to link, using a first keypoint graph neural network (KGNN) encoder, semantic keypoints of an object within a first frame at time T of a video stream into a first 2D graph structure corresponding to a category of the object, and to link, using a second KGNN encoder, semantic keypoints in a second 2D graph structure representing the object category of the object in a second frame at time T+1 of the video stream; a semantic keypoint descriptor module to embed descriptors within the semantic keypoints of the first 2D graph structure corresponding to the category of the object, and to generate, using a shared differential keypoint flow model, an estimated second 2D graph structure corresponding to the category of the object in the second frame at time T+1 of the video stream according to the embedded descriptors-within in the semantic keypoints of the first 2D graph structure; and a semantic keypoint tracking module to track the object within subsequent frames of the video stream according to a regression loss from comparing, using a shared matching layer, the estimated 2D graph structure according to the embedded descriptor in the semantic keypoints of the first 2D graph structure with the second 2D graph structure of the object in the second frame at time T+1.

14. The system of claim 13, in which the object comprises a vehicle represented by the estimated 2D graph structure to depict geometry/spatial relationships of a rigid-body of the vehicle according to the category of the vehicle.
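
The independent claims compare an estimated graph for the frame at time T+1 with the graph actually detected in that frame, and track according to a regression loss from that comparison. The sketch below shows one way to compute such a comparison, under stated assumptions. The claims do not give the form of the loss, so mean squared distance is used here as a stand-in. The per-keypoint flow vectors are supplied by hand, where the claims call for a learned keypoint flow model. The function names `estimate_next_positions` and `regression_loss` are hypothetical.

```python
def estimate_next_positions(positions, flow):
    """Shift each frame-T keypoint by its (dx, dy) flow vector to estimate frame T+1."""
    return [(x + dx, y + dy) for (x, y), (dx, dy) in zip(positions, flow)]


def regression_loss(estimated, detected):
    """Mean squared Euclidean distance between matched keypoints.

    Assumption: keypoint i in `estimated` corresponds to keypoint i in
    `detected`, i.e. both lists follow the same graph ordering.
    """
    if len(estimated) != len(detected):
        raise ValueError("graphs must have the same number of keypoints")
    total = sum(
        (ex - dx) ** 2 + (ey - dy) ** 2
        for (ex, ey), (dx, dy) in zip(estimated, detected)
    )
    return total / len(estimated)


# Two keypoints at time T and hand-written flow vectors (illustrative values).
positions_t = [(10.0, 40.0), (30.0, 40.0)]
flow = [(1.0, 1.0), (1.0, 0.0)]

estimated_t1 = estimate_next_positions(positions_t, flow)
detected_t1 = [(11.0, 41.0), (31.0, 41.0)]

loss = regression_loss(estimated_t1, detected_t1)
```

In this example the first keypoint is estimated exactly and the second is off by one pixel vertically, so the loss is 0.5. A training loop would reduce this value by adjusting whatever produces the flow vectors.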

Assignees

Toyota Res Inst Inc

Inventors

Classifications

G06V20/41
Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items (segmenting video sequences G06V20/49) · CPC title
B60W60/00274
considering possible movement changes · CPC title
G06F16/9024
Graphs; Linked lists (G06F16/9027 takes precedence) · CPC title
G06N3/04
Architecture, e.g. interconnection topology · CPC title
B60W2554/80
Spatial relation or speed relative to objects · CPC title
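
Each CPC symbol listed above follows the same layout: a section letter, a two-digit class, a subclass letter, a main group, a slash, and a subgroup. The sketch below splits a symbol into those parts. The function name `parse_cpc` is hypothetical, and the section table is deliberately partial: it covers only the two sections that appear on this page.

```python
import re

# Partial lookup: only the CPC sections that appear on this page.
CPC_SECTIONS = {
    "B": "Performing operations; transporting",
    "G": "Physics",
}

# Section letter, 2-digit class, subclass letter, main group, "/", subgroup.
CPC_PATTERN = re.compile(r"^([A-HY])(\d{2})([A-Z])(\d{1,4})/(\d{2,6})$")


def parse_cpc(symbol):
    """Split a CPC symbol such as 'G06V20/41' into its hierarchical parts."""
    match = CPC_PATTERN.match(symbol.strip())
    if match is None:
        raise ValueError(f"not a CPC symbol: {symbol!r}")
    section, cls, subclass, main_group, subgroup = match.groups()
    return {
        "section": section,
        "section_title": CPC_SECTIONS.get(section),  # None if not in the partial table
        "class": section + cls,
        "subclass": section + cls + subclass,
        "main_group": main_group,
        "subgroup": subgroup,
    }


parsed = parse_cpc("G06V20/41")
```

For `G06V20/41` this gives section `G` (Physics), class `G06`, subclass `G06V`, main group `20`, and subgroup `41`.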

Patent family

Related publications grouped by family.

View patent family 82611965

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12488597B2 cover?: A method for semantic keypoint detection is described. The method includes linking, using a keypoint graph neural network (KGNN), semantic keypoints of an object within a first image of a video stream into a 2D graph structure corresponding to a category of the object. The method also includes embedding descriptors within the semantic keypoints of the 2D graph structure corresponding to the cat…
Who is the assignee on this patent?: Toyota Res Inst Inc
What technology area does this patent fall under?: Primary CPC classification G06V20/584. Mapped technology areas include Physics.
When was this patent published?: Publication date Dec 2, 2025 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Related patents

Systems and methods for virtual and augmented reality

Proactive vehicle safety system

Multi-directional structured image array capture on a 2d graph
