End-to-end signalized intersection transition state estimator with scene graphs over semantic keypoints

US12437558B2 · US · B2

Patent metadata
Field               Value
Publication number  US-12437558-B2
Application number  US-202217976761-A
Country             US
Kind code           B2
Filing date         Oct 29, 2022
Priority date       Feb 17, 2021
Publication date    Oct 7, 2025
Grant date          Oct 7, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Systems, methods, computer-readable media, techniques, and methodologies are disclosed for performing end-to-end, learning-based keypoint detection and association. A scene graph of a signalized intersection is constructed from an input image of the intersection. The scene graph includes detected keypoints and linkages identified between the keypoints. The scene graph can be used along with a vehicle's localization information to identify which keypoint that represents a traffic signal is associated with the vehicle's current travel lane. An appropriate vehicle action may then be determined based on a transition state of the traffic signal keypoint and trajectory information for the vehicle. A control signal indicative of this vehicle action may then be output to cause an autonomous vehicle, for example, to implement the appropriate vehicle action.
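The flow described in the abstract (scene graph, then lane lookup, then vehicle action) can be pictured with a small sketch. This is an illustration under stated assumptions, not the patented implementation: the class names, the keypoint labels ("traffic_signal", "lane_boundary"), and the state-to-action table are all hypothetical, the current lane is assumed to have already been resolved from localization into a pair of lane-boundary keypoint IDs, and the trajectory information mentioned in the abstract is left out for brevity.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, Optional, Set, Tuple


@dataclass(frozen=True)
class Keypoint:
    kp_id: int
    kind: str                      # hypothetical labels: "traffic_signal", "lane_boundary"
    state: Optional[str] = None    # transition state, only meaningful for traffic signals


@dataclass
class SceneGraph:
    keypoints: Dict[int, Keypoint] = field(default_factory=dict)
    links: Set[FrozenSet[int]] = field(default_factory=set)

    def add_keypoint(self, kp: Keypoint) -> None:
        self.keypoints[kp.kp_id] = kp

    def add_link(self, a: int, b: int) -> None:
        # Linkages are stored as undirected pairs of keypoint IDs.
        self.links.add(frozenset((a, b)))

    def linked(self, a: int, b: int) -> bool:
        return frozenset((a, b)) in self.links


def signal_for_lane(graph: SceneGraph, lane: Tuple[int, int]) -> Optional[Keypoint]:
    """Return the traffic-signal keypoint linked to both boundaries of the lane."""
    left, right = lane
    for kp in graph.keypoints.values():
        if (kp.kind == "traffic_signal"
                and graph.linked(kp.kp_id, left)
                and graph.linked(kp.kp_id, right)):
            return kp
    return None


# Hypothetical mapping from transition state to a predetermined vehicle action.
ACTION_FOR_STATE = {"red": "stop", "yellow": "prepare_to_stop", "green": "proceed"}


def vehicle_action(graph: SceneGraph, lane: Tuple[int, int]) -> str:
    signal = signal_for_lane(graph, lane)
    if signal is None or signal.state is None:
        return "no_signal_found"
    return ACTION_FOR_STATE.get(signal.state, "unknown_state")


graph = SceneGraph()
graph.add_keypoint(Keypoint(1, "traffic_signal", "red"))
graph.add_keypoint(Keypoint(2, "traffic_signal", "green"))
for boundary_id in (10, 11, 12):
    graph.add_keypoint(Keypoint(boundary_id, "lane_boundary"))
graph.add_link(1, 10)
graph.add_link(1, 11)   # signal 1 is linked to the lane bounded by 10 and 11
graph.add_link(2, 11)
graph.add_link(2, 12)   # signal 2 is linked to the lane bounded by 11 and 12

print(vehicle_action(graph, (11, 12)))
```

In this toy graph the vehicle in the lane bounded by keypoints 11 and 12 is matched to signal 2, so the lookup depends on the linkages rather than on which signal appears closest in the image.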

First claim

Opening claim text (preview).

What is claimed is:
1. A system, comprising: at least one memory storing machine-executable instructions; and at least one processor configured to access the at least one memory and execute the machine-executable instructions to: determine keypoint embeddings and linkage embeddings based on keypoints and linkages detected from an image; generate a scene graph of the image based on the keypoint embeddings and linkage embeddings; and output a vehicle control signal based on the scene graph and localization information for a vehicle.
2. The system of claim 1, wherein the scene graph corresponds to a signalized intersection, wherein the keypoints comprises a first keypoint, and wherein the at least one processor is further configured to execute the machine-executable instructions to: determine a current travel lane of the vehicle based on the localization information; and determine, based on the scene graph, that the first keypoint is associated with the current travel lane, wherein the first keypoint corresponds to a traffic signal in the signalized intersection, the traffic signal having an associated transition state.
3. The system of claim 2, wherein the keypoints further comprise a second keypoint and a third keypoint and the linkages comprises a first linkage and a second linkage, and wherein the at least one processor is configured to determine that the first keypoint is associated with the current travel lane by executing the machine-executable instructions to: determine, based on the scene graph, that the second keypoint corresponds to a first lane boundary of the current travel lane; determine, based on the scene graph, that the third keypoint corresponds to a second lane boundary of the current travel lane; and determine, based on the scene graph, that the first keypoint is associated with the second keypoint via the first linkage and that the first keypoint is associated with the third keypoint via the second linkage.
4. The system of claim 2, wherein the vehicle control signal is indicative of a predetermined vehicle action associated with the transition state of the traffic signal.
5. The system of claim 1, wherein each keypoint is a respective pixel location in the image that is associated with a corresponding candidate object.
6. The system of claim 5, wherein the output comprises a feature map, and wherein the at least one processor is configured to determine the keypoint embeddings by executing the machine-executable instructions to determine, for each keypoint, a respective feature vector in the feature map that corresponds to the keypoint.
7. The system of claim 6, wherein the at least one processor is further configured to execute the machine-executable instructions to: determine a respective classification for the corresponding candidate object associated with each keypoint; and associate, for each keypoint, the respective classification corresponding to the keypoint with the respective feature vector corresponding to the keypoint.
8. The system of claim 1, wherein each linkage is represented as a respective pixel location in the image along a line connecting a respective source candidate object and a respective destination candidate object.
9. The system of claim 8, wherein the at least one processor is configured to determine the linkage embeddings by executing the machine-executable instructions to determine, for each linkage, a respective source object embedding for the respective source candidate object and a respective destination object embedding for the respective destination candidate object.
10. The system of claim 9, wherein the at least one processor is further configured to execute the machine-executable instructions to determine a respective relationship type between the respective source object embedding and the respective destination object embedding for each linkage.
11. The system of claim 9, wherein the output comprises a feature map, and wherein, for a particular linkage, the respective source object embedding is a first feature vector in the feature map and the respective destination object embedding is a second feature vector in the feature map.
12. The system of claim 9, wherein the at least one processor is configured to determine associations between the keypoints and the linkages by executing the machine-executable instructions to: determine, for a particular linkage, a first keypoint embedding that is a closest match to the respective source object embedding of the particular linkage; and determine a second keypoint embedding that is a closest match to the respective destination object embedding of the particular linkage.
13. The system of claim 1, wherein the at least one processor is configured to determine associations between the determined keypoints and the determined linkages by executing the machine-executable instructions to select, among candidate sets of associations, a particular set of associations that minimizes an aggregate loss for the scene graph.
14. A method, comprising: determining, using a machine learning algorithm and based on detected keypoints and detected linkages associated with an image, keypoint embeddings and linkage embeddings; determining an association between a particular keypoint of the detected keypoints and a current travel lane of a vehicle based on associations between the keypoint embeddings and the linkage embeddings; and controlling operation of the vehicle based on a control signal generated based on the determined association between the particular keypoint and the current travel lane of the vehicle.
15. The method of claim 14, further comprising: determining associations between the detected keypoints and the detected linkages based on the keypoint embeddings and the linkage embeddings, wherein the image is an image of a signalized intersection, the method further comprising: generating a scene graph of the signalized intersection, the scene graph identifying associations between the detected keypoints and the detected linkages corresponds to a signalized intersection; determining the current travel lane of the vehicle based on localization information associated with the vehicle; and using the scene graph to determine the association between the particular keypoint and the current travel lane, the particular keypoint representing a traffic signal in the signalized intersection that has an associated transition state.
16. The method of claim 15, wherein the particular keypoint is a first keypoint, the detected keypoints further comprise a second keypoint and a third keypoint, and the detected linkages comprise a first linkage and a second linkage, and wherein determining the association between the first keypoint and the current travel lane comprises: determining, based on the scene graph, that the second keypoint corresponds to a first lane boundary of the current travel lane; determining, based on the scene graph, that the third keypoint corresponds to a second lane boundary of the current travel lane; and determining, based on the scene graph, that the first keypoint is associated with the second keypoint via the first linkage and that the first keypoint is associated with the third keypoint via the second linkage.
17. The method of claim 15, wherein controlling operation of the vehicle comprises controlling operation of the vehicle based on a vehicle control signal that is indicative of a predetermined vehicle action associated with the transition state of the traffic signal.
18. The method of claim 15, wherein determin
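Claims 5, 6, 9, 11 and 12 describe reading keypoint embeddings out of a feature map at each keypoint's pixel location, then attaching each linkage to the keypoints whose embeddings are the closest match to its source and destination embeddings. A minimal sketch of that idea follows. The function names, the nested-list feature map layout, and the use of Euclidean distance as the "closest match" criterion are assumptions made for illustration; the claims do not specify a distance measure.

```python
import math
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]
FeatureMap = List[List[Vector]]  # feature_map[row][col] holds one feature vector


def embedding_at(feature_map: FeatureMap, pixel: Tuple[int, int]) -> Vector:
    """Keypoint embedding = feature vector stored at the keypoint's pixel location."""
    row, col = pixel
    return feature_map[row][col]


def closest_keypoint(target: Vector, keypoint_embeddings: Dict[int, Vector]) -> int:
    """ID of the keypoint whose embedding is nearest to target (Euclidean, assumed)."""
    return min(keypoint_embeddings,
               key=lambda kp_id: math.dist(keypoint_embeddings[kp_id], target))


def associate(feature_map: FeatureMap,
              keypoint_pixels: Dict[int, Tuple[int, int]],
              linkages: List[Tuple[Vector, Vector]]) -> List[Tuple[int, int]]:
    """Turn (source_embedding, destination_embedding) pairs into keypoint-ID edges."""
    kp_emb = {kp_id: embedding_at(feature_map, px)
              for kp_id, px in keypoint_pixels.items()}
    return [(closest_keypoint(src, kp_emb), closest_keypoint(dst, kp_emb))
            for src, dst in linkages]


# Toy 2x3 feature map with 2-dimensional feature vectors.
feature_map = [
    [(0.0, 0.0), (1.0, 0.0), (0.0, 0.0)],
    [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0)],
]
keypoint_pixels = {7: (0, 1), 8: (1, 1), 9: (1, 2)}
linkages = [
    ((0.9, 0.1), (0.1, 1.1)),  # noisy copies of keypoint 7 and keypoint 8 embeddings
    ((0.2, 0.9), (1.0, 0.8)),  # noisy copies of keypoint 8 and keypoint 9 embeddings
]

edges = associate(feature_map, keypoint_pixels, linkages)
print(edges)  # [(7, 8), (8, 9)]
```

Matching each linkage independently, as here, is the simplest reading of claim 12. Claim 13 instead selects the whole set of associations that minimizes an aggregate loss, which this sketch does not attempt.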

Assignees

Toyota Res Inst Inc

Inventors

Classifications

  • based on the proximity to a decision surface, e.g. support vector machines

  • Generating training patterns; Bootstrap methods, e.g. bagging or boosting

  • Matching criteria, e.g. proximity measures

  • Details of control systems for road vehicle drive control not related to the control of a particular sub-unit {, e.g. process diagnostic or vehicle driver interfaces}

  • Recognition of the road, e.g. of lane markings; Recognition of the vehicle driving pattern in relation to the road

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12437558B2 cover?
Systems, methods, computer-readable media, techniques, and methodologies are disclosed for performing end-to-end, learning-based keypoint detection and association. A scene graph of a signalized intersection is constructed from an input image of the intersection. The scene graph includes detected keypoints and linkages identified between the keypoints. The scene graph can be used along with a v…
Who is the assignee on this patent?
Toyota Res Inst Inc
What technology area does this patent fall under?
Primary CPC classification G06V20/584. Mapped technology areas include Physics.
When was this patent published?
Publication date Oct 7, 2025 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).