Autonomous driving control method, apparatus and device, and readable storage medium

US11887009B2 · US · B2

Patent metadata
Publication number: US-11887009-B2
Application number: US-202118039271-A
Country: US
Kind code: B2
Filing date: Sep 29, 2021
Priority date: Jun 1, 2021
Publication date: Jan 30, 2024
Grant date: Jan 30, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The present application discloses an automatic driving control method. In the method, parameters are optimally set by using a noisy and noiseless dual-strategy network, identical vehicle traffic environment state information is input into the noisy and noiseless dual-strategy network, a motion space perturbation threshold is set by using a noiseless strategy network as a comparison and a benchmark so as to adaptively adjust noise parameters, and motion noise is indirectly added by adaptively injecting noise into a strategy network parameter space, such that exploration of an environment and a motion space by a deep reinforcement learning algorithm may be effectively improved, automatic driving exploration performance and stability based on deep reinforcement learning is improved, and full consideration of influence of an environment state and driving strategies in vehicle decision-making and motion selection is ensured, thereby improving the stability and safety of an automatic vehicle.
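The dual-network idea in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the tiny linear policy, the root-mean-square distance used as the strategy difference, and the names `policy`, `perturb`, `strategy_difference`, the state features and the weights are all assumptions made for illustration. It only shows the same state being fed to a noiseless network and to a copy whose parameters (not actions) carry injected noise.

```python
import math
import random


def policy(params, state):
    """Tiny linear policy (assumed): one tanh-squashed action per weight row."""
    return [math.tanh(sum(w * s for w, s in zip(row, state))) for row in params]


def perturb(params, sigma, rng):
    """Inject Gaussian noise into the parameter space, not into the action."""
    return [[w + rng.gauss(0.0, sigma) for w in row] for row in params]


def strategy_difference(a, b):
    """Root-mean-square distance between two action vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


rng = random.Random(0)

# Hypothetical traffic-state features and policy weights (two actions).
state = [0.5, -0.2, 0.1]
noiseless_params = [[0.3, -0.1, 0.8], [0.0, 0.5, -0.4]]
noisy_params = perturb(noiseless_params, sigma=0.1, rng=rng)

# The identical state goes into both networks.
noiseless_action = policy(noiseless_params, state)
noisy_action = policy(noisy_params, state)

# The noiseless output is the benchmark against which the perturbation is measured.
diff = strategy_difference(noisy_action, noiseless_action)
print(f"strategy difference: {diff:.4f}")
```

Because the noise is applied to the weights, the resulting action perturbation depends on the state, which is the sense in which the abstract describes motion noise as being added indirectly.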

First claim

Opening claim text (preview).

The invention claimed is:

1. A method for automatic driving control, comprising: initializing a system parameter of a deep-reinforcement-learning automatic driving decision system, wherein the deep-reinforcement-learning automatic driving decision system comprises a noiseless strategic network and a noisy strategic network; obtaining vehicle traffic environmental state information; inputting the vehicle traffic environmental state information into the noiseless strategic network and the noisy strategic network to perform automatic driving strategy generation, to obtain a noiseless strategy and a noisy strategy; adjusting a noise parameter injected into the noisy strategic network within a disturbance threshold according to the noisy strategy and the noiseless strategy, wherein adjusting the noise parameter injected into the noisy strategic network within the disturbance threshold according to the noisy strategy and the noiseless strategy comprises: calculating strategy difference between the noisy strategy and the noiseless strategy; determining whether the strategy difference exceeds a disturbance threshold; taking a quotient of the strategy difference and a modulation factor as the noise parameter when the strategy difference exceeds the disturbance threshold; and taking a product of the strategy difference and the modulation factor as the noise parameter when the strategy difference does not exceed the disturbance threshold; wherein the modulation factor is greater than 1; performing parameter optimization on a system parameter of the noisy strategic network according to the noise parameter to generate an optimized noisy strategic network; and performing automatic driving control according to a driving strategy generated by the optimized noisy strategy network; wherein performing parameter optimization on the system parameter of the noisy strategic network according to the noise parameter comprises: performing parameter optimization on a system parameter of the noiseless strategic network according to the noisy strategy, and taking a system parameter of the optimized noiseless strategic network as an original parameter; and taking a sum of the original parameter and the noise parameter as an optimized system parameter of the noisy strategic network; wherein the system parameter of the deep-reinforcement-learning automatic driving decision system comprises an initial strategy parameter with no noise, an initial strategy parameter with implicit noise, an initial network parameter and initial strategy parameter noise.

2. The method for automatic driving control according to claim 1, wherein before performing the automatic driving control according to the driving strategy generated by the optimized noisy strategy network, the method further comprises: determining execution times of the parameter optimization; determining whether the execution times reach a threshold number of training times; performing the step of performing the automatic driving control according to the driving strategy generated by the optimized noisy strategy network, when the execution times reach the threshold number of training times; and performing the step of obtaining the vehicle traffic environmental state information when the execution times do not reach the threshold number of training times.

3. The method for automatic driving control according to claim 2, wherein the method further comprises: performing the step of initializing the system parameter of the deep-reinforcement-learning automatic driving decision system when a notice of driving accident is received.

4. The method for automatic driving control according to claim 1, wherein the strategic network is a network constructed based on a deep-reinforcement-learning strategy parameter space.

5. The method for automatic driving control according to claim 4, wherein the deep-reinforcement-learning automatic driving decision system further comprises an evaluation network; and performing parameter optimization on the system parameter of the noisy strategic network according to the noise parameter comprises: update a parameter of the evaluation network, a parameter of the noiseless strategic network and a parameter of the strategic network with implicit noise.

6. A device for automatic driving control, comprising: a memory configured for storing a computer program; and a processor configured for implementing steps of the method for automatic driving control according to claim 1 when the computer program is executed.

7. The device for automatic driving control according to claim 6, wherein before performing the automatic driving control according to the driving strategy generated by the optimized noisy strategy network, the method further comprises: determining execution times of the parameter optimization; determining whether the execution times reach a threshold number of training times; performing the step of performing the automatic driving control according to the driving strategy generated by the optimized noisy strategy network, when the execution times reach the threshold number of training times; and performing the step of obtaining the vehicle traffic environmental state information when the execution times do not reach the threshold number of training times.

8. The device for automatic driving control according to claim 7, wherein the method further comprises: performing the step of initializing the system parameter of the deep-reinforcement-learning automatic driving decision system when a notice of driving accident is received.

9. The device for automatic driving control according to claim 6, wherein the noiseless strategic network refers to a strategic network with no noise, and the noisy strategic network refers to a strategic network with implicit noise, and the strategic network is a network constructed based on a deep-reinforcement-learning strategy parameter space.

10. The device for automatic driving control according to claim 9, wherein the deep-reinforcement-learning automatic driving decision system further comprises an evaluation network; and performing parameter optimization on the system parameter of the noisy strategic network according to the noise parameter comprises: update a parameter of the evaluation network, a parameter of the noiseless strategic network and a parameter of the strategic network with implicit noise.

11. A non-transitory readable storage medium, having a computer program stored thereon and the computer program, when executed by a processor, implementing steps of the method for automatic driving control according to claim 1.

12. The non-transitory readable storage medium according to claim 11, wherein before performing the automatic driving control according to the driving strategy generated by the optimized noisy strategy network, the method further comprises: determining execution times of the parameter optimization; determining whether the execution times reach a threshold number of training times; performing the step of performing the automatic driving control according to the driving strategy generated by the optimized noisy strategy network, when the execution times reach the threshold number of training times; and performing the step of obtaining the vehicle traffic environmental state information when the execution times do not reach the threshold number of training times.

13. The non-transitory readable storage medium according to claim 12, wherein the method further comprises: performing the step of initializing the system parameter of the deep-reinforcement-learning automatic driving decision system when a notice of driving ac
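The adjustment rule in claim 1 and the training-count check in claim 2 can be sketched in Python as follows. This is a literal reading of the claim wording under stated assumptions, not the patentee's code: the function names, the flat list of parameters, the scalar noise parameter, and the `optimize_once` callback are hypothetical and chosen only for illustration.

```python
def adjust_noise_parameter(strategy_diff, threshold, modulation_factor):
    """Claim 1 rule, read literally: shrink when the difference exceeds
    the disturbance threshold, grow when it does not."""
    if modulation_factor <= 1:
        raise ValueError("modulation factor must be greater than 1")
    if strategy_diff > threshold:
        # Quotient of the strategy difference and the modulation factor.
        return strategy_diff / modulation_factor
    # Product of the strategy difference and the modulation factor.
    return strategy_diff * modulation_factor


def noisy_parameters(original_params, noise_parameter):
    """Claim 1: sum of the optimized noiseless (original) parameter and the
    noise parameter. A scalar noise parameter is an assumption here."""
    return [p + noise_parameter for p in original_params]


def run_training(threshold_updates, optimize_once):
    """Claim 2: repeat the optimization until the execution count reaches
    the threshold number of training times, then hand over to control."""
    executed = 0
    while executed < threshold_updates:
        optimize_once()  # hypothetical: observe state, adjust noise, optimize
        executed += 1
    return executed


# Usage with made-up numbers.
large = adjust_noise_parameter(0.4, threshold=0.2, modulation_factor=2.0)
small = adjust_noise_parameter(0.1, threshold=0.2, modulation_factor=2.0)
params = noisy_parameters([1.0, -0.5], noise_parameter=0.25)
```

The effect of the rule is that exploration is damped when the noisy strategy drifts too far from the noiseless benchmark and amplified when it stays close, with the modulation factor controlling how quickly the noise parameter changes.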

Assignees

  • Inspur Suzhou Intelligent Technology Co Ltd

Inventors

Classifications

  • G06N3/092 · Primary

    Reinforcement learning · CPC title

  • specially adapted for safety · CPC title

  • Input parameters relating to objects · CPC title

  • G05B13/042 · Primary

    in which a parameter or coefficient is automatically adjusted to optimise the performance · CPC title

  • using neural networks only · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11887009B2 cover?
The present application discloses an automatic driving control method. In the method, parameters are optimally set by using a noisy and noiseless dual-strategy network, identical vehicle traffic environment state information is input into the noisy and noiseless dual-strategy network, a motion space perturbation threshold is set by using a noiseless strategy network as a comparison and a benchm…
Who is the assignee on this patent?
Inspur Suzhou Intelligent Technology Co Ltd
What technology area does this patent fall under?
Primary CPC classification G06N3/092. Mapped technology areas include Physics.
When was this patent published?
Publication date Jan 30, 2024 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 9 related publications on this page (citations in our corpus or others sharing the same primary CPC).