What technology area does this patent fall under?

Primary CPC classification G06N3/092. Mapped technology areas include Physics.

When was this patent published?

Publication date Jul 31, 2025 (A1). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Systems and methods for foundation models based reward design for autonomous driving

Patent metadata
Field	Value
Publication number	US-2025245516-A1
Application number	US-202418428515-A
Country	US
Kind code	A1
Filing date	Jan 31, 2024
Priority date	Jan 31, 2024
Publication date	Jul 31, 2025
Grant date	—

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Methods and systems for optimizing an action policy of an autonomous vehicle machine learning model. Images are generated corresponding to an environment about a vehicle. These images are passed through an image encoder to generate image-based embeddings of the current state of the vehicle. A text prompt representing a goal of the autonomous vehicle is passed through a text encoder to generate text-based embeddings of the goal. A similarity score is determined, representing a similarity between the image-based embeddings of the current state and the text-based embeddings of the goal. A reinforcement learning model for a closed-loop autonomous driving task is executed, with the similarity score used as the reward function. An action policy corresponding to a control of the vehicle is optimized based on the reward function.
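The data flow described in the abstract can be sketched in a few lines. This is a toy illustration under stated assumptions, not the applicant's implementation: the embedding vectors are hard-coded stand-ins for the outputs of an image encoder and a text encoder, the three action names are hypothetical, and a greedy value-averaging loop stands in for the reinforcement learning model, which the abstract does not specify.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Assumption: stand-in for text_encoder("stay centered in the lane").
GOAL_EMBEDDING = [1.0, 0.0, 0.0]

# Assumption: stand-ins for image_encoder(image of the state reached
# after each hypothetical action).
STATE_EMBEDDING_AFTER = {
    "steer_left": [0.2, 0.9, 0.1],
    "keep_straight": [0.9, 0.1, 0.1],
    "steer_right": [0.3, 0.1, 0.9],
}

def reward(action):
    """Similarity between the resulting state and the goal, used as reward."""
    return cosine_similarity(STATE_EMBEDDING_AFTER[action], GOAL_EMBEDDING)

def train(episodes=50):
    """Toy policy optimization: greedy action choice over running-mean values.

    Values start at 1.0 (the largest possible cosine), so every action is
    tried at least once before the loop settles on the best one.
    """
    actions = list(STATE_EMBEDDING_AFTER)
    values = {a: 1.0 for a in actions}
    counts = {a: 0 for a in actions}
    for _ in range(episodes):
        action = max(actions, key=values.get)
        r = reward(action)
        counts[action] += 1
        values[action] += (r - values[action]) / counts[action]
    return values

values = train()
best_action = max(values, key=values.get)
print(best_action)
```

In this sketch the learned preference settles on the action whose resulting state embedding points most nearly in the same direction as the goal embedding. A real system would replace the lookup tables with encoder calls and the averaging loop with an actual reinforcement learning algorithm.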

First claim

Opening claim text (preview).

What is claimed is:

1. A method for optimizing an action policy of a machine learning model of an autonomous vehicle, the method comprising: generating an image of an environment about an autonomous vehicle based on vehicle sensor data representing a current state of the autonomous vehicle; passing the generated image through an image encoder to generate image-based embeddings of the current state of the autonomous vehicle; receiving a text prompt representing a goal of the autonomous vehicle; passing the text prompt through a text encoder to generate text-based embeddings of the goal; determining a similarity score representing a similarity between the image-based embeddings of the current state and the text-based embeddings of the goal; executing a reinforcement learning model for a closed-loop autonomous driving task, wherein the similarity score is utilized as a reward in the reinforcement learning model; and optimizing an action policy of the reinforcement learning model based on the similarity score utilized as the reward, wherein the action policy is associated with a control command of the autonomous vehicle.

2. The method of claim 1, further comprising: executing a foundation model to perform the determining of the similarity score.

3. The method of claim 2, wherein the similarity score is determined as follows:

$$r = 1 - \frac{FM_{\text{state}}(\text{state description}) \cdot FM_{\text{goal}}(\text{goal description})}{\lvert FM_{\text{goal}}(\text{goal description}) \rvert \cdot \lvert FM_{\text{state}}(\text{state description}) \rvert}$$

wherein r represents the reward utilized in the reinforcement learning model, FM_state represents the image-based embeddings of the current state of the autonomous vehicle, and FM_goal represents the text-based embeddings of the goal.

4. The method of claim 1, wherein the text prompt is a human-crafted text prompt not generated by a machine learning model.

5. The method of claim 1, wherein the determining of the similarity score includes deriving an inverse of the similarity between the image-based embeddings of the current state and the text-based embeddings of the goal.

6. The method of claim 1, wherein the image encoder is part of a vision-language model (VLM) configured to generate a vector representing the generated image in a learned embedding space.

7. The method of claim 6, wherein the text encoder is part of a large language model (LLM) configured to generate a vector representing the goal in a learned embedding space.

8. A system for optimizing an action policy of a machine learning model of an autonomous vehicle, the system comprising: one or more image sensors mounted to an autonomous vehicle and configured to generate images external to the autonomous vehicle representing a current state of the autonomous vehicle; and one or more processors communicatively coupled to the one or more images sensors, the one or more processors programmed to: receive the generated images from the one or more image sensors, execute an image encoder on the generated images to generate image-based embeddings of the current state of the vehicle, receive a text prompt representing a goal of the autonomous vehicle, execute a text encoder on the text prompt to generate text-based embeddings of the goal, determine a similarity score representing a similarity between the image-based embeddings of the current state and the text-based embeddings of the goal, execute a reinforcement learning model for a closed-loop autonomous driving task, wherein the similarity score is utilized as a reward in the reinforcement learning model, and optimize an action policy of the reinforcement learning model based on the similarity score utilized as the reward, wherein the action policy is associated with a control command of the autonomous vehicle.

9. The system of claim 8, wherein the one or more processors are further programmed to: execute a foundation model to perform the determining of the similarity score.

10. The system of claim 9, wherein the similarity score is determined as follows: r = 1 − [FM_state(state description) · FM_goal(goal description)] / [|FM_goal( …
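As a rough numeric check of the formula recited in claim 3, the expression reads as one minus the cosine similarity of the two embeddings. The sketch below evaluates it for three hand-picked vector pairs; the function name `claim3_reward` and the example vectors are assumptions for illustration and do not come from the patent.

```python
import math

def claim3_reward(state_embedding, goal_embedding):
    """r = 1 - (state . goal) / (|goal| * |state|), as the claim 3 formula reads."""
    dot = sum(s * g for s, g in zip(state_embedding, goal_embedding))
    norm_state = math.sqrt(sum(s * s for s in state_embedding))
    norm_goal = math.sqrt(sum(g * g for g in goal_embedding))
    return 1.0 - dot / (norm_goal * norm_state)

# Hypothetical embeddings chosen so the arithmetic is easy to follow.
aligned = claim3_reward([3.0, 4.0], [6.0, 8.0])      # same direction
orthogonal = claim3_reward([1.0, 0.0], [0.0, 1.0])   # unrelated directions
opposed = claim3_reward([1.0, 0.0], [-1.0, 0.0])     # opposite directions

print(aligned, orthogonal, opposed)
```

Read this way, r is 0 when the state and goal embeddings point in the same direction, 1 when they are orthogonal, and 2 when they are opposed, so r shrinks as alignment grows. The claims preview shown here does not say how the reinforcement learning model consumes that value.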

Assignees

Robert Bosch GmbH

Inventors

Classifications

B60W2050/0028
Mathematical models, e.g. for simulation · CPC title
B60W2050/0019
Control system elements or transfer functions · CPC title
B60W50/00
Details of control systems for road vehicle drive control not related to the control of a particular sub-unit {, e.g. process diagnostic or vehicle driver interfaces} · CPC title
B60W60/001
Planning or execution of driving tasks · CPC title
G06F18/214
Generating training patterns; Bootstrap methods, e.g. bagging or boosting · CPC title

Patent family

Related publications grouped by family.

View patent family 96501400

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2025245516A1 cover?: Methods and systems for optimizing an action policy of an autonomous vehicle machine learning model. Images are generated corresponding to an environment about a vehicle. These images are passed through an image encoder to generate image-based embeddings of the current state of the vehicle. A text prompt representing a goal of the autonomous vehicle is passed through a text encoder to generate …
Who is the assignee on this patent?: Robert Bosch GmbH