Complex network cognition-based federated reinforcement learning end-to-end autonomous driving control system, method, and vehicular device

US2025128720A1 · US · A1

Patent metadata
Field | Value
Publication number | US-2025128720-A1
Application number | US-202318845007-A
Country | US
Kind code | A1
Filing date | Aug 23, 2023
Priority date | Jul 21, 2023
Publication date | Apr 24, 2025
Grant date | (not listed)

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Provided are a federated reinforcement learning (FRL) end-to-end autonomous driving control system and method, as well as vehicular equipment, based on complex network cognition. An FRL algorithm framework, designated FLDPPO, is provided for dense urban traffic. This framework combines rule-based complex network cognition with end-to-end FRL through the design of a loss function. FLDPPO employs a dynamic driving guidance system to assist agents in learning rules, thereby enabling them to navigate complex urban driving environments and dense traffic scenarios. Moreover, the framework uses a multi-agent FRL architecture in which models are trained through parameter aggregation to safeguard vehicle-side privacy, accelerate network convergence, reduce communication consumption, and balance sampling efficiency against high robustness of the model.
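The parameter aggregation mentioned in the abstract is stated in the first claim as a plain average of the agents' parameters, ϕ*_m = (1/N) Σ_n ϕ^n_m. A minimal sketch of that averaging step follows. The dictionary-of-lists parameter layout, the function name `aggregate_parameters`, and the toy values are assumptions made for illustration and do not come from the patent.

```python
from typing import Dict, List

# Hypothetical layout: each agent's network is a dict of flat parameter lists.
Params = Dict[str, List[float]]


def aggregate_parameters(agent_params: List[Params]) -> Params:
    """Element-wise mean over agents: phi*_m = (1/N) * sum_n phi_m^n."""
    if not agent_params:
        raise ValueError("need parameters from at least one agent")
    n_agents = len(agent_params)
    global_params: Params = {}
    for name in agent_params[0]:
        # Group the i-th entry of this tensor across all agents, then average.
        columns = zip(*(params[name] for params in agent_params))
        global_params[name] = [sum(col) / n_agents for col in columns]
    return global_params


# Toy parameters from three driving agents (illustrative values only).
agents = [
    {"w": [1.0, 2.0], "b": [0.0]},
    {"w": [3.0, 4.0], "b": [1.0]},
    {"w": [5.0, 6.0], "b": [2.0]},
]
global_params = aggregate_parameters(agents)
print(global_params)  # {'w': [3.0, 4.0], 'b': [1.0]}
```

In a federated loop the server would send `global_params` back to every agent after each round. Only parameters cross the network, never raw driving samples, which is the privacy property the abstract refers to.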

First claim

Opening claim text (preview).

What is claimed is:

1. A complex network cognition-based federated reinforcement learning (FRL) end-to-end autonomous driving control system, comprising a measurement encoder, an image encoder, a complex network cognition module, a reinforcement learning module, and a federated learning module, wherein:

the measurement encoder is configured to obtain state quantities required by the complex network cognition module and the reinforcement learning module, the state quantities required by the complex network cognition module comprise an x-coordinate, a y-coordinate, a heading angle change and a speed of a driving agent, the state quantities are handed over to the complex network cognition module as an input, the state quantities required by the reinforcement learning module comprise a steering wheel angle, a throttle, a brake, a gear, a lateral speed and a longitudinal speed, the state quantities are given to the reinforcement learning module as part of the inputs after extracting features from a two-layer fully connected network;

the image encoder is configured to obtain an amount of image implicit state required by the reinforcement learning module, an image used is a 15-channel semantic bird's eye view (BEV), $i_{RL} \in [0,1]^{192 \times 192 \times 15}$, 192 is in pixels and the BEV used is 5 px/m, 15 channels contain a drivable domain, a desired path, a road edge, 4 frames of other vehicles, 4 frames of pedestrians, and 4 frames of traffic signs, wherein the desired path is calculated using an A* algorithm, the semantic BEV is extracted by multilayer convolutional layers to extract implicit features and then passed to the reinforcement learning module as another part of the inputs;

the complex network cognition module is configured to model a driving situation of a driving subject, and to obtain a maximum risk value of the driving subject in a current driving situation according to the state quantity provided by the measurement encoder, and finally to output dynamic driving suggestions based on the risk value through an activation function;

the reinforcement learning module is configured to integrate the state quantities output from the measurement encoder and the image encoder, output corresponding strategies according to integrated network inputs, and interact with an environment to generate experience samples stored in a local replay buffer in the federated learning module, when the number of experience samples reaches a certain threshold, a batch of samples is taken from the local replay buffer for training, and finally trained neural network parameters are uploaded to the federated learning module; and

the federated learning module is configured to receive the neural network parameters uploaded by the reinforcement learning module of the driving agents, and to aggregate a set of global parameters based on the plurality of neural network parameters, and finally to send the global parameters to the driving agents until a neural network converges, a global parameter aggregation is performed by a following equation:

$$\phi_m^{*} = \frac{1}{N} \sum_{n} \phi_m^{n}$$

wherein $\phi_m^{*}$ denotes the global parameters at time $m$, $N$ denotes the number of driving agents, and $\phi_m^{n}$ denotes the neural network parameters at time $m$ of the $n$th driving agent;

wherein the activation function is configured to map the risk value, Activate(Risk) represents different activation functions according to different driving suggestions, and the mapped risk value will be used as a basis for guiding the output strategy of the reinforcement learning module:

$$\mathrm{Activate}_{go}(\mathrm{Risk}) = \frac{4}{1 + \exp(-300 / \mathrm{Risk})} - 1$$

$$\mathrm{Activate}_{stop}(\mathrm{Risk}) = \frac{4}{1 + \exp(-0.2 \cdot \mathrm{Risk})} - 1$$

wherein $\mathrm{Activate}_{go}(\mathrm{Risk})$ denotes an activation function when the driving suggestion is forward, $\mathrm{Activate}_{stop}(\mathrm{Risk})$ denotes an activation function when the driving suggestion is stop, and Risk denotes a current risk value of a self-vehicle, a dynamic risk suggestion $B_{risk}$:

$B_{risk} = B(\mathrm{Activate}_{go}(\mathrm{Risk}), \beta_{go})$, go

$B_{risk} = B(\alpha_{stop}, \mathrm{Activate}_{stop}$ …
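The two activation functions in the claim can be explored numerically. The sketch below takes each activation to have the form 4 / (1 + exp(·)) − 1, which is an assumption about how the flattened formula text should be read. The function names, the handling of non-positive risk, and the sample risk values are also illustrative choices rather than anything the patent specifies.

```python
import math


def activate_go(risk: float) -> float:
    """'Go' activation: close to 3 for low risk, falling toward 1 as risk grows."""
    if risk <= 0.0:
        # Assumption: treat non-positive risk as the zero-risk limit,
        # where exp(-300 / risk) tends to 0 and the result tends to 3.
        return 3.0
    return 4.0 / (1.0 + math.exp(-300.0 / risk)) - 1.0


def activate_stop(risk: float) -> float:
    """'Stop' activation: equals 1 at zero risk, approaching 3 as risk grows."""
    # Assumption: clamp negative risk to 0 so the exponent cannot overflow.
    return 4.0 / (1.0 + math.exp(-0.2 * max(risk, 0.0))) - 1.0


# Illustrative risk values only; the patent preview gives no risk scale.
for risk in (1.0, 10.0, 100.0, 1000.0):
    print(f"risk={risk:7.1f}  go={activate_go(risk):.3f}  stop={activate_stop(risk):.3f}")
```

Under this reading both functions stay within [1, 3] for non-negative risk and move in opposite directions: rising risk weakens the "go" value and strengthens the "stop" value.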

Assignees

Inventors

Classifications

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2025128720A1 cover?
Provided are a federated reinforcement learning (FRL) end-to-end autonomous driving control system and method, as well as vehicular equipment, based on complex network cognition. An FRL algorithm framework, designated FLDPPO, is provided for dense urban traffic. This framework combines rule-based complex network cognition with end-to-end FRL through the design of a loss function. FLDPPO …
Who is the assignee on this patent?
Univ Jiangsu
What technology area does this patent fall under?
Primary CPC classification B60W60/001. Mapped technology areas include Operations & Transport.
When was this patent published?
Publication date: Apr 24, 2025 (kind code A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).