US2018009445A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2018009445-A1 |
| Application number | US-201615205558-A |
| Country | US |
| Kind code | A1 |
| Filing date | Jul 8, 2016 |
| Priority date | Jul 8, 2016 |
| Publication date | Jan 11, 2018 |
| Grant date | — |
Official abstract text for this publication.
A computer-implemented method of adaptively controlling an autonomous operation of a vehicle is provided. The method includes steps of (a) in a critic network in a computing system configured to autonomously control the vehicle, determining, using samples of passively collected data and a state cost, an estimated average cost, and an approximated cost-to-go function that produces a minimum value for a cost-to-go of the vehicle when applied by an actor network; and (b) in an actor network in the computing system and operatively coupled to the critic network, determining a control input to apply to the vehicle that produces the minimum value for the cost-to-go, wherein the actor network is configured to determine the control input by estimating a noise level using the average cost, a cost-to-go determined from the approximated cost-to-go function, a control dynamics for a current state of the vehicle, and the passively collected data.
Opening claim text (preview).
1. A computer-implemented method of adaptively controlling an autonomous operation of a vehicle, the method comprising: a) in a critic network in a computing system configured to autonomously control the vehicle, determining, using samples of passively collected data and a state cost, an estimated average cost, and an approximated cost-to-go function that produces a minimum value for a cost-to-go of the vehicle when applied by an actor network; and b) in an actor network in the computing system and operatively coupled to the critic network, determining a control input to apply to the vehicle which produces the minimum value for the cost-to-go, wherein the actor network is configured to determine the control input by estimating a noise level using the estimated average cost, an estimated cost-to-go determined from the approximated cost-to-go function, a control dynamics for a current state of the vehicle, and the samples of passively collected data.

2. A computer-implemented method of adaptively controlling an autonomous operation of a vehicle, the method comprising: a) in a critic network in a computing system configured to autonomously control the vehicle, determining, using samples of passively collected data and a state cost, an estimated average cost, and an approximated cost-to-go function that produces a minimum value for a cost-to-go of the vehicle when applied by an actor network; and b) in an actor network in the computing system and operatively coupled to the critic network, determining a control input to apply to the vehicle which produces the minimum value for the cost-to-go, wherein the actor network is configured to determine the control input by estimating a noise level using the estimated average cost, an estimated cost-to-go determined from the approximated cost-to-go function, a control dynamics for a current state of the vehicle, and the samples of passively collected data, and wherein the approximated cost-to-go function is determined using a linear combination of weighted radial basis functions in accordance with the following relationship:

$$\hat{Z}(x) := \sum_{j=0}^{N} \omega_j f_j(x)$$

where $\omega$ are weights, $f_j$ are j-th radial basis functions, $N$ is a number of radial basis functions used for determining the approximated cost-to-go function, and $\hat{Z}(x)$ is the approximated cost-to-go function.

3. The method of claim 2 wherein weights $\omega$ used in the approximated cost-to-go function are updated in accordance with the following relationship:

$$\omega^{i+1} = \tilde{\omega}^{i} + \lambda_1 + \sum_{i=0}^{N} \lambda_{2l}\,\delta_{jl} + \lambda_3 \hat{Z}_{\text{avg}} f_k$$

where $\delta_{ij}$ denotes a Dirac delta function, superscript denotes a number of iterations, $\lambda_1$, $\lambda_2$, $\lambda_3$ are Lagrangian multipliers, and $\hat{Z}_{\text{avg}}$ is an estimated average cost.

4. The method of claim 1 further comprising the step of updating parameters of the critic network using an approximated temporal difference error determined using a linearized version of a Bellman equation.

5. The method of claim 4 wherein updating of the critic network parameters is performed when the vehicle is in motion.

6. A computer-implemented method of adaptively controlling an autonomous operation of a vehicle, the method comprising: a) in a critic network in a computing system configured to autonomously control the vehicle, determining, using samples of passively collected data and a state cost, an estimated average cost, and an approximated cost-to-go function that produces a minimum value for a cost-to-go of the vehicle when applied by an actor network; and b) in an actor network in the computing system and operatively coupled to the critic network, determining a control input to apply to the vehicle which produces the minimum value for the cost-to-go, wherein the actor network is configured to determine the control input by estimating a noise level using the estimated average cost, an estimated cost-to-go determined from the approximated cost-to-go function, a control dynamics for a current state of the vehicle, and the samples of passively collected data, the method further comprising the step of updating parameters of the critic network using an approximated temporal difference error determined using a linearized version of a Bellman equation, and wherein the estimated average cost determined by the critic network is updated in accordance with the following relationship:

$$\hat{Z}_{\text{avg}}^{i+1} = \hat{Z}_{\text{avg}}^{i} - \alpha_{Z}^{i}\, e_k\, \hat{Z}_k$$

where β is a learning rate, $e_k$ is the approximated temporal difference error, $\hat{Z}_k$ is an estimated cost determined from the approximated cost-to-go function, $\hat{Z}_{\text{avg}}^{i}$ is an estimated average cost in state i, and $\hat{Z}_{\text{avg}}^{i+1}$ is an estimated average cost in state i+1.

7. The method of claim 4, wherein passively-collected data is the only data used during updating of the critic network parameters.

8. A computer-implemented method of adaptively controlling an autonomous operation of a vehicle, the method comprising: a) in a critic network in a computing system configured to autonomously control the vehicle, determining, using samples of passively collected d
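For readers who want a concrete picture of two relationships recited in the claims, the following is a minimal illustrative sketch, not the patented implementation. It shows the claim 2 approximation (a weighted sum of radial basis functions) and a single step of the claim 6 average-cost update. The Gaussian form of the basis functions, the `width` parameter, and all function names and numeric values are assumptions made for illustration; the claims do not specify them.

```python
import math


def gaussian_rbf(x, center, width):
    """Gaussian radial basis function (an assumed choice for f_j)."""
    sq_dist = sum((xi - ci) ** 2 for xi, ci in zip(x, center))
    return math.exp(-sq_dist / (2.0 * width ** 2))


def approx_cost_to_go(x, weights, centers, width):
    """Z_hat(x) = sum_j w_j * f_j(x): the linear combination in claim 2."""
    return sum(w * gaussian_rbf(x, c, width) for w, c in zip(weights, centers))


def update_average_cost(z_avg, learning_rate, td_error, z_hat_k):
    """One step of the claim 6 update: Z_avg <- Z_avg - rate * e_k * Z_hat_k."""
    return z_avg - learning_rate * td_error * z_hat_k


# Hypothetical two-dimensional state with two basis functions.
centers = [(0.0, 0.0), (1.0, 0.0)]
weights = [0.5, 2.0]
z_hat = approx_cost_to_go((0.0, 0.0), weights, centers, width=1.0)

# Hypothetical values for the current average cost, rate, and TD error.
new_z_avg = update_average_cost(1.0, 0.1, 2.0, 3.0)
```

Because the approximation is linear in the weights, evaluating it costs one basis-function evaluation per term. The sketch omits the weight update of claim 3, whose Lagrangian multipliers are not defined in this preview.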
Probabilistic graphical models, e.g. probabilistic networks · CPC title
Combinations of networks · CPC title
Data processing systems or methods, management, administration · CPC title
the criterion being a learning criterion · CPC title
in which a variable is automatically adjusted to optimise the performance · CPC title