What technology area does this patent fall under?

Primary CPC classification B60W60/001. Mapped technology areas include Operations & Transport.

When was this patent published?

Publication date Apr 24, 2025 (A1). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Complex network cognition-based federated reinforcement learning end-to-end autonomous driving control system, method, and vehicular device

Patent metadata
Field	Value
Publication number	US-2025128720-A1
Application number	US-202318845007-A
Country	US
Kind code	A1
Filing date	Aug 23, 2023
Priority date	Jul 21, 2023
Publication date	Apr 24, 2025
Grant date	—

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Provided are a federated reinforcement learning (FRL) end-to-end autonomous driving control system and method, as well as vehicular equipment, based on complex network cognition. An FRL algorithm framework is provided, designated as FLDPPO, for dense urban traffic. This framework combines rule-based complex network cognition with end-to-end FRL through the design of a loss function. FLDPPO employs a dynamic driving guidance system to assist agents in learning rules, thereby enabling them to navigate complex urban driving environments and dense traffic scenarios. Moreover, the provided framework utilizes a multi-agent FRL architecture, whereby models are trained through parameter aggregation to safeguard vehicle-side privacy, accelerate network convergence, reduce communication consumption, and achieve a balance between sampling efficiency and high robustness of the model.
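
As an illustration of the parameter aggregation mentioned in the abstract, the following is a minimal sketch of the unweighted averaging rule stated in the first claim (global parameters equal the mean of the N agents' parameters). The function name, the dictionary layout, and the toy agents are hypothetical and not taken from the patent; NumPy is assumed to be available.

```python
import numpy as np


def aggregate_global_parameters(agent_params):
    """Unweighted mean of per-agent parameters: phi*_m = (1/N) * sum_n phi_m^n.

    agent_params is a list with one dict per driving agent, mapping a
    parameter name to a NumPy array (hypothetical layout for this sketch).
    """
    if not agent_params:
        raise ValueError("need at least one agent")
    n_agents = len(agent_params)
    names = agent_params[0].keys()
    # Average each named tensor across all agents.
    return {
        name: sum(params[name] for params in agent_params) / n_agents
        for name in names
    }


# Three hypothetical agents, each holding one weight matrix and one bias vector.
agents = [
    {"w": np.full((2, 2), float(k)), "b": np.array([k, -k], dtype=float)}
    for k in (1, 2, 3)
]
global_params = aggregate_global_parameters(agents)
```

In this toy case every entry of `global_params["w"]` is the mean of 1, 2 and 3. Only parameters are exchanged in such a scheme, not the locally stored experience samples, which is the sense in which the abstract speaks of vehicle-side privacy.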

First claim

Opening claim text (preview).

What is claimed is: 1. A complex network cognition-based federated reinforcement learning (FRL) end-to-end autonomous driving control system, comprising a measurement encoder, an image encoder, a complex network cognition module, a reinforcement learning module, and a federated learning module, wherein:

the measurement encoder is configured to obtain state quantities required by the complex network cognition module and the reinforcement learning module, the state quantities required by the complex network cognition module comprise a x-coordinate, a y-coordinate, a heading angle change and a speed of a driving agent, the state quantities are handed over to the complex network cognition module as an input, the state quantities required by the reinforcement learning module comprise a steering wheel angle, a throttle, a brake, a gear, a lateral speed and a longitudinal speed, the state quantities are given to the reinforcement learning module as part of the inputs after extracting features from a two-layer fully connected network;

the image encoder is configured to obtain an amount of image implicit state required by the reinforcement learning module, an image used is a 15-channel semantic bird's eye view (BEV), $i_{RL} \in [0,1]^{192 \times 192 \times 15}$, 192 is in pixels and the BEV used is 5px/m, 15 channels contain a drivable domain, a desired path, a road edge, 4 frames of other vehicles, 4 frames of pedestrians, and 4 frames of traffic signs, wherein the desired path is calculated using a A* algorithm, the semantic BEV is extracted by multilayer convolutional layers to extract implicit features and then passed to the reinforcement learning module as another part of the inputs;

the complex network cognition module is configured to model a driving situation of a driving subject, and to obtain a maximum risk value of the driving subject in a current driving situation according to the state quantity provided by the measurement encoder, and finally to output dynamic driving suggestions based on the risk value through an activation function;

the reinforcement learning module is configured to integrate the state quantities output from the measurement encoder and the image encoder, output corresponding strategies according to integrated network inputs, and interact with an environment to generate experience samples stored in a local replay buffer in the federated learning module, when the number of experience samples reaches a certain threshold, a batch of sample is taken from the local replay buffer for training, and finally trained neural network parameters are uploaded to the federated learning module; and

the federated learning module is configured to receive the neural network parameters uploaded by the reinforcement learning module of the driving agents, and to aggregate a set of global parameters based on the plurality of neural network parameters, and finally to send the global parameters to the driving agents until a neural network converges, a global parameter aggregation is performed by a following equation:

$$\phi_m^* = \frac{1}{N} \sum_n \phi_m^n$$

wherein $\phi_m^*$ denotes the global parameters at time m, N denotes the number of driving agents, and $\phi_m^n$ denotes the neural network parameters at time m of the nth driving agent;

wherein the activation function is configured to map the risk value, Activate(Risk) represents different activation functions according to different driving suggestions, and the mapped risk value will be used as a basis for guiding the output strategy of the reinforcement learning module:

$$\mathrm{Activate}_{go}(\mathrm{Risk}) = \frac{4}{1 + \exp(-300/\mathrm{Risk})} - 1$$

$$\mathrm{Activate}_{stop}(\mathrm{Risk}) = \frac{4}{1 + \exp(-0.2 \cdot \mathrm{Risk})} - 1$$

wherein $\mathrm{Activate}_{go}(\mathrm{Risk})$ denotes an activation function when the driving suggestion is forward, $\mathrm{Activate}_{stop}(\mathrm{Risk})$ denotes an activation function when the driving suggestion is stop, and Risk denotes a current risk value of a self-vehicle, a dynamic risk suggestion $B_{risk}$;

$$B_{risk} = B(\mathrm{Activate}_{go}(\mathrm{Risk}), \beta_{go}), \quad \text{go}$$

$B_{risk} = B(\alpha_{stop}, \mathrm{Activate}_{stop}$ …
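
To make the two activation functions in the claim preview easier to inspect, here is a small sketch in Python. It assumes the flattened formulas read as 4 / (1 + exp(x)) - 1, which is one plausible reading of the page's text (a reading with a -1 exponent instead of a trailing "minus one" is also possible). The function names, the handling of Risk = 0, and the rejection of negative risk are assumptions for illustration, not part of the claim.

```python
import math


def activate_go(risk):
    """Assumed reading: 4 / (1 + exp(-300 / risk)) - 1 for the 'go' suggestion."""
    if risk < 0:
        raise ValueError("risk is assumed to be non-negative")
    if risk == 0:
        # Assumption: use the limit as risk -> 0+, since 300 / 0 is undefined.
        return 3.0
    return 4.0 / (1.0 + math.exp(-300.0 / risk)) - 1.0


def activate_stop(risk):
    """Assumed reading: 4 / (1 + exp(-0.2 * risk)) - 1 for the 'stop' suggestion."""
    if risk < 0:
        raise ValueError("risk is assumed to be non-negative")
    return 4.0 / (1.0 + math.exp(-0.2 * risk)) - 1.0


# Evaluate both mappings over a few hypothetical risk values.
for risk in (0.0, 10.0, 100.0, 1000.0):
    print(risk, activate_go(risk), activate_stop(risk))
```

Under this reading both functions stay between 1 and 3 for non-negative risk: the "go" value falls toward 1 as risk grows, while the "stop" value rises from 1 toward 3. The claim then uses these values as parameters of the dynamic risk suggestion B_risk, whose remaining definition is cut off in the preview.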

Assignees

Univ Jiangsu

Inventors

Classifications

B60W60/001 (Primary)
Planning or execution of driving tasks · CPC title
G05B13/027
using neural networks only · CPC title
B60W2520/12
Lateral speed · CPC title
B60W2520/10
Longitudinal speed · CPC title
B60W2510/188
Parking lock mechanisms · CPC title

Patent family

Related publications grouped by family.

View patent family 88168903

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2025128720A1 cover?: Provided are a federated reinforcement learning (FRL) end-to-end autonomous driving control system and method, as well as vehicular equipment, based on complex network cognition. An FRL algorithm framework is provided, designated as FLDPPO, for dense urban traffic. This framework combines rule-based complex network cognition with end-to-end FRL through the design of a loss function. FLDPPO …
Who is the assignee on this patent?: Univ Jiangsu