Road geometry estimation for vehicles

US2025078519A1 · US · A1

Patent metadata
Field               Value
Publication number  US-2025078519-A1
Application number  US-202418816542-A
Country             US
Kind code           A1
Filing date         Aug 27, 2024
Priority date       Aug 29, 2023
Publication date    Mar 6, 2025
Grant date

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The invention relates to a method for determining a representation of one or more road objects of a road for a vehicle traveling on the road. The method includes, for each time step out of a plurality of consecutive time steps, encoding images output from the one or more cameras of a vehicle using machine-learning algorithms trained to output image features of road objects depicted in an image provided as input to the machine-learning algorithms. The method further includes transforming a plurality of image features included in the encoded images to a Bird's Eye View (BEV) representation of the plurality of image features. The method also includes decoding the BEV representation to extract a set of object embeddings using transformer-based machine-learning algorithms. Further, the method includes outputting a position and class of each road object of the one or more road objects by decoding the extracted set of object embeddings.
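As a rough illustration of the last two steps in the abstract (decoding the BEV representation into object embeddings, then decoding those embeddings into a position and class), the sketch below uses a single cross-attention step in the style of query-based detectors. The publication does not specify this exact layer layout; the function names, the single attention head, the one-layer linear heads standing in for trained decoders, and the toy weights are all assumptions made for illustration.

```python
import math
from typing import List, Tuple

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def softmax(scores: Vector) -> Vector:
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights: List[Vector], bias: Vector, x: Vector) -> Vector:
    return [dot(row, x) + b for row, b in zip(weights, bias)]


def cross_attend(queries: List[Vector], bev_cells: List[Vector]) -> List[Vector]:
    """Return one embedding per object query (hypothetical helper).

    Each embedding is the attention-weighted mean of the flattened BEV
    cell features, so it has the same size as the query.
    """
    scale = math.sqrt(len(queries[0]))
    embeddings = []
    for q in queries:
        weights = softmax([dot(q, cell) / scale for cell in bev_cells])
        embeddings.append(
            [sum(w * cell[k] for w, cell in zip(weights, bev_cells))
             for k in range(len(q))]
        )
    return embeddings


def decode_heads(embedding: Vector,
                 pos_w: List[Vector], pos_b: Vector,
                 cls_w: List[Vector], cls_b: Vector) -> Tuple[Vector, int]:
    """Map an embedding to (position, class index) with assumed linear heads."""
    position = linear(pos_w, pos_b, embedding)
    class_probs = softmax(linear(cls_w, cls_b, embedding))
    return position, class_probs.index(max(class_probs))


# Toy inputs: two object queries and two BEV cells with 2-dimensional features.
queries = [[1.0, 0.0], [0.0, 1.0]]
bev_cells = [[10.0, 0.0], [0.0, 10.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
zeros = [0.0, 0.0]

for emb in cross_attend(queries, bev_cells):
    print(decode_heads(emb, identity, zeros, identity, zeros))
```

In a trained system the queries and head weights would be learned parameters; here they are fixed by hand only so that the data flow is easy to follow.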

First claim

Opening claim text (preview).

1. A computer-implemented method for determining a representation of one or more road objects of a road for a vehicle traveling on the road, the vehicle having one or more cameras, the method comprising: for each time step out of a plurality of consecutive time steps: encoding one or more images output from the one or more cameras using one or more machine-learning algorithms trained to output image features of one or more road objects depicted in an image provided as input to the one or more machine-learning algorithms; transforming a plurality of image features comprised in the one or more encoded images to a Bird's Eye View (BEV) representation of the plurality of image features; decoding the BEV representation in order to extract a set of object embeddings from the BEV representation using one or more transformer-based machine-learning algorithms trained to output the set of object embeddings based on an input comprising the BEV representation, a set of object queries, and a set of transformed prior object embeddings extracted at a preceding time step; outputting a position and class of each road object of the one or more road objects by decoding the extracted set of object embeddings.

2. The method according to claim 1, further comprising: for each time step out of the plurality of consecutive time steps: transforming the object embeddings extracted at a previous time step using one or more Multi-Layer Perceptron (MLP) algorithms and motion data of the vehicle.

3. The method according to claim 1, further comprising: for each time step out of the plurality of consecutive time steps, encoding a lidar output dataset from one or more lidars of the vehicle onto the BEV representation using a machine-learning algorithm trained to extract a plurality of features of one or more road objects indicated in a lidar output dataset provided as input to the machine-learning algorithm.

4. The method according to claim 1, wherein the plurality of image features are transformed to the BEV representation using an Inverse Perspective Mapping algorithm and based on a camera pose of each camera of the one or more cameras.

5. The method according to claim 1, wherein the transformer-based machine-learning model is configured to output one object embedding for each object query of the set of object queries, and wherein the object embeddings and object queries are vectors of the same size.

6. The method according to claim 1, further comprising: for each time step out of the plurality of consecutive time steps forming a geometric representation of the one or more road objects based on the output position and class of each road object.

7. The method according to claim 1, wherein the extracted set of object embeddings are decoded by using one or more Multi-Layer Perceptron (MLP) algorithms configured to output the position and class of each road object based on an input comprising object embeddings.

8. The method according to claim 1, further comprising: forming a loss function based on the output position and class of each road object and a ground-truth dataset.

9. The method according to claim 8, further comprising: updating the set of object queries based on the formed loss function.

10. A computer program product comprising instructions which, when executed by a computing device of a vehicle, causes the computing device to carry out the method according to claim 1.

11. A non-transitory computer-readable storage medium storing instructions which, when executed by a computing device of a vehicle, causes the computing device to carry out the method according to claim 1.

12. A system for determining a representation of one or more road objects of a road for a vehicle traveling on the road, the vehicle having one or more cameras, the system comprising one or more memory storage areas comprising program code, the one or more memory storage areas and the program code being configured to, with the one or more processors, cause the system to at least: for each time step out of a plurality of consecutive time steps: encode one or more images output from the one or more cameras using one or more machine-learning algorithms trained to output image features of one or more road objects depicted in an image provided as input to the one or more machine-learning algorithms; transform a plurality of image features comprised in the one or more encoded images to a Bird's Eye View (BEV) representation of the plurality of image features; decode the BEV representation in order to extract a set of object embeddings from the BEV representation using one or more transformer-based machine-learning algorithms trained to output the set of object embeddings based on an input comprising the BEV representation, a set of object queries, and a set of transformed prior object embeddings extracted at a preceding time step; output a position and class of each road object of the one or more road objects by decoding the extracted set of object embeddings.

13. A vehicle comprising a system according to claim 12.
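Claim 4 names Inverse Perspective Mapping (IPM) driven by each camera's pose as the way image features reach the BEV representation. The sketch below shows one common form of IPM, assuming a flat road at height zero, a pinhole camera, and a feature map at the same resolution as the intrinsics. The axis conventions, the `Camera` type, the function names, and the nearest-neighbour sampling are illustrative assumptions and are not taken from the publication.

```python
from typing import Any, List, NamedTuple, Optional, Sequence, Tuple

# Assumed rotation from vehicle axes (x forward, y left, z up) to camera
# axes (x right, y down, z forward) for an unpitched forward-facing camera.
FORWARD_CAMERA_ROTATION = ((0.0, -1.0, 0.0),
                           (0.0, 0.0, -1.0),
                           (1.0, 0.0, 0.0))


class Camera(NamedTuple):
    """Hypothetical camera description: pose plus pinhole intrinsics."""
    rotation: Sequence[Sequence[float]]    # vehicle -> camera rotation
    position: Tuple[float, float, float]   # camera centre in the vehicle frame
    fx: float
    fy: float
    cx: float
    cy: float


def ground_to_pixel(x: float, y: float, cam: Camera) -> Optional[Tuple[float, float]]:
    """Project the ground point (x, y, 0) into the image, or None if behind the camera."""
    d = (x - cam.position[0], y - cam.position[1], 0.0 - cam.position[2])
    xc, yc, zc = (sum(row[i] * d[i] for i in range(3)) for row in cam.rotation)
    if zc <= 1e-6:
        return None
    return (cam.fx * xc / zc + cam.cx, cam.fy * yc / zc + cam.cy)


def ipm_to_bev(features: List[List[Any]], cam: Camera,
               x_range: Tuple[float, float], y_range: Tuple[float, float],
               cell: float) -> List[List[Any]]:
    """Fill a BEV grid by sampling the feature map at each cell centre.

    Rows index forward distance x, columns index lateral offset y.
    Cells that project outside the image stay None.
    """
    rows = int(round((x_range[1] - x_range[0]) / cell))
    cols = int(round((y_range[1] - y_range[0]) / cell))
    bev: List[List[Any]] = [[None] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            x = x_range[0] + (i + 0.5) * cell
            y = y_range[0] + (j + 0.5) * cell
            pixel = ground_to_pixel(x, y, cam)
            if pixel is None:
                continue
            u, v = int(round(pixel[0])), int(round(pixel[1]))
            if 0 <= v < len(features) and 0 <= u < len(features[0]):
                bev[i][j] = features[v][u]  # nearest-neighbour sample
    return bev


# Toy 8x8 "feature map" whose entries record their own (row, column) index.
features = [[(v, u) for u in range(8)] for v in range(8)]
camera = Camera(FORWARD_CAMERA_ROTATION, (0.0, 0.0, 1.0), 4.0, 4.0, 4.0, 2.0)
for row in ipm_to_bev(features, camera, (1.0, 3.0), (-1.0, 1.0), 1.0):
    print(row)
```

With several cameras, the same lookup would be repeated per camera using that camera's own pose, and overlapping cells would need some merge rule; the claims do not state which rule is used.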

Assignees

Inventors

Classifications

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2025078519A1 cover?
The invention relates to a method for determining a representation of one or more road objects of a road for a vehicle traveling on the road. The method includes, for each time step out of a plurality of consecutive time steps, encoding images output from the one or more cameras of a vehicle using machine-learning algorithms trained to output image features of road objects depicted in an image …
Who is the assignee on this patent?
Zenseact Ab
What technology area does this patent fall under?
Primary CPC classification G06V20/588. Mapped technology areas include Physics.
When was this patent published?
Publication date Mar 6, 2025 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).