Apparatus and method for preventing incorrect boarding of autonomous driving vehicle
US-2020357285-A1 · Nov 12, 2020 · US
US11537954B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11537954-B2 |
| Application number | US-201816237103-A |
| Country | US |
| Kind code | B2 |
| Filing date | Dec 31, 2018 |
| Priority date | Sep 4, 2018 |
| Publication date | Dec 27, 2022 |
| Grant date | Dec 27, 2022 |
Official abstract text for this publication.
Systems and methods are provided for ride order dispatching and vehicle repositioning. A method for ride order dispatching and vehicle repositioning, comprises: obtaining information comprising a location of a vehicle, current orders, and a current time; inputting the obtained information to a trained model; and determining action information for the vehicle based on an output of the trained model, the action information comprising: re-positioning the vehicle or accepting a ride order. The model is configured with: receiving information of drivers and information of orders as inputs; obtaining a global state based on the information of drivers, the information of orders, and a global time; and querying a plurality of driver-order pairs and driver-reposition pairs based at least on the obtained global state to determine the action information as the output.
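The selection step the abstract describes (query driver-order pairs and driver-reposition pairs, then pick an action from the resulting values) can be illustrated with a minimal sketch. This is not the patented implementation. The `Action` type, the `select_action` function, and the two scoring callables are hypothetical names introduced here, and the scoring functions stand in for whatever trained model produces the Q-values.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Action:
    """Hypothetical action record: accept an order or reposition."""
    kind: str    # "order" or "reposition"
    driver: int  # index of the driver
    target: int  # index of the order or of the repositioning move


def select_action(
    num_drivers: int,
    num_orders: int,
    num_moves: int,
    q_order: Callable[[int, int], float],       # assumed stand-in for a trained scorer
    q_reposition: Callable[[int, int], float],  # assumed stand-in for a trained scorer
) -> Tuple[Action, float]:
    """Score every driver-order and driver-reposition pair, return the best."""
    candidates: List[Tuple[float, Action]] = []
    for d in range(num_drivers):
        for o in range(num_orders):
            candidates.append((q_order(d, o), Action("order", d, o)))
        for m in range(num_moves):
            candidates.append((q_reposition(d, m), Action("reposition", d, m)))
    if not candidates:
        raise ValueError("no driver-order or driver-reposition pairs to query")
    # Pick the pair with the maximum Q-value.
    best_q, best_action = max(candidates, key=lambda c: c[0])
    return best_action, best_q
```

Calling `select_action(2, 2, 1, q_order, q_reposition)` with toy scoring functions returns the single highest-scoring pair together with its value. A real system would presumably batch these queries through a neural network rather than loop in Python.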
Opening claim text (preview).
The invention claimed is:
1. A method for ride order dispatching and vehicle repositioning, comprising: obtaining information comprising a location of a vehicle, current orders, and a current time; inputting the obtained information to a trained model of a model; and determining action information for the vehicle based on an output of the trained model, the action information comprising: re-positioning the vehicle or accepting a ride order, wherein: the model comprises a single-driver deep-Q network (SD-DQN) and is configured with model instructions for performing: receiving information of drivers and information of orders as inputs; obtaining a global state based on the information of drivers, the information of orders, and a global time, each state transition of the global state being from a single driver completing a trip to the single driver completing a next trip; querying a plurality of driver-order pairs and driver-reposition pairs based at least on the obtained global state to determine a Q-value of the SD-DQN for the single driver; determining the action information as the output based at least on the determined Q-value to optimize a return for the single driver; and controlling the vehicle to execute the determined action information.
2. The method of claim 1, further comprising: providing the action information to the vehicle, wherein the action information maximizes a reward for the vehicle's driver.
3. The method of claim 1, wherein: for each of the drivers, the information of drivers comprises two-dimensional location coordinates and a step-left scalar value; and the step-left scalar value is based on a time of arrival at a destination of a currently dispatched order.
4. The method of claim 1, wherein: for each of the orders, the information of orders comprises two-dimensional start location coordinates, two-dimensional end location coordinates, a price scale value, and a time waiting scalar value; and the time waiting scalar value is a time since the each of the orders started requesting a vehicle dispatch.
5. The method of claim 1, wherein obtaining the global state based on the information of drivers, the information of orders, and the global time comprises: embedding the information of drivers and the information of orders in memory cells to obtain driver embedding and order embedding; performing a round of attention of an attention mechanism to obtain driver context based on the driver embedding and obtain order context based on the order embedding; and concatenating the driver embedding, the order embedding, and the global time to obtain the global state.
6. The method of claim 5, wherein querying the plurality of driver-order pairs and driver-reposition pairs based at least on the obtained global state to determine the action information as the output comprises: querying all driver-order pairs with respect to the drivers and the orders to obtain first Q-values respectively, and querying all driver-reposition pairs with respect to the drivers and repositioning movements to obtain second Q-values respectively; obtaining a maximum Q-value among the first and second Q-values; determining an optimal driver-order pair or an optimal driver-reposition pair associated with the maximum Q-value; and determining the action information as dispatching a corresponding driver to fulfill a corresponding order according to the optimal driver-order pair or repositioning a corresponding driver according to the optimal driver-reposition pair.
7. The method of claim 6, wherein repositioning the corresponding driver comprises: staying at a current location of the corresponding driver.
8. The method of claim 6, wherein querying all the driver-order pairs with respect to the drivers and the orders to obtain the first Q-values respectively comprises: determining the first Q-values respectively based on a first neural network; and the first neural network takes the driver embedding, the order embedding, and the global state as inputs.
9. The method of claim 6, wherein querying all the driver-reposition pairs with respect to the drivers and the repositioning movements to obtain the second Q-values respectively comprises: determining the second Q-values respectively based on a second neural network; and the second neural network takes the driver embedding, repositioning movement embedding, and the global state as inputs, wherein the repositioning movement embedding is obtained by embedding the repositioning movements.
10. A system for ride order dispatching and vehicle repositioning, comprising a processor and a non-transitory computer-readable storage medium storing instructions that, when executed by the processor, cause the system to perform operations comprising: obtaining information comprising a location of a vehicle, current orders, and a current time; inputting the obtained information to a trained model of a model; and determining action information for the vehicle based on an output of the trained model, the action information comprising: re-positioning the vehicle or accepting a ride order, wherein: the model comprises a single-driver deep-Q network (SD-DQN) and is configured with model instructions for performing: receiving information of drivers and information of orders as inputs; obtaining a global state based on the information of drivers, the information of orders, and a global time, each state transition of the global state being from a single driver completing a trip to the single driver completing a next trip; querying a plurality of driver-order pairs and driver-reposition pairs based at least on the obtained global state to determine a Q-value of the SD-DQN for the single driver; determining the action information as the output based at least on the determined Q-value to optimize a return for the single driver; and controlling the vehicle to execute the determined action information.
11. The system of claim 10, wherein the operations further comprise: providing the action information to the vehicle, wherein the action information maximizes a reward for the vehicle's driver.
12. The system of claim 10, wherein: for each of the drivers, the information of drivers comprises two-dimensional location coordinates and a step-left scalar value; and the step-left scalar value is based on a time of arrival at a destination of a currently dispatched order.
13. The system of claim 10, wherein: for each of the orders, the information of orders comprises two-dimensional start location coordinates, two-dimensional end location coordinates, a price scale value, and a time waiting scalar value; and the time waiting scalar value is a time since the each of the orders started requesting a vehicle dispatch.
14. The system of claim 10, wherein obtaining the global state based on the information of drivers, the information of orders, and the global time comprises: embedding the information of drivers and the information of orders in memory cells to obtain driver embedding and order embedding; performing a round of attention of an attention mechanism to obtain driver context based on the driver embedding and obtain order context based on the order embedding; and concatenating the driver embedding, the order embedding, and the global time to obtain the global state.
15. The system of claim 14, wherein querying the plurality of driver-order pairs and driver-reposition pairs based at least on the obtained global state to determine the action information as the output comprises: querying all driver-order pairs with respect to the drivers and th
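Claims 3 to 5 (mirrored in claims 12 to 14) describe per-driver and per-order feature vectors, an embedding step, one round of attention, and a concatenation with the global time. The sketch below illustrates that data flow only. Several parts are assumptions rather than anything the claims specify: the `embed`, `attention_context`, and `global_state` names, the tanh linear embedding, the dot-product scoring against a `query` vector, and the choice to concatenate the fixed-size attention contexts (the claim text recites concatenating the embeddings themselves).

```python
import math
from typing import List, Sequence

Vector = List[float]


def embed(features: Sequence[float], weights: Sequence[Sequence[float]]) -> Vector:
    """Hypothetical embedding: one tanh(linear) output per weight row."""
    return [math.tanh(sum(w * f for w, f in zip(row, features))) for row in weights]


def attention_context(embeddings: List[Vector], query: Vector) -> Vector:
    """One round of softmax attention: a weighted average of the embeddings.

    Scores are dot products with `query` (an assumed scoring rule).
    """
    if not embeddings:
        raise ValueError("need at least one embedding")
    scores = [sum(q * e for q, e in zip(query, emb)) for emb in embeddings]
    peak = max(scores)  # subtract the peak for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    dim = len(embeddings[0])
    return [
        sum((w / total) * emb[i] for w, emb in zip(exps, embeddings))
        for i in range(dim)
    ]


def global_state(driver_ctx: Vector, order_ctx: Vector, global_time: float) -> Vector:
    """Concatenate driver context, order context, and the global time."""
    return list(driver_ctx) + list(order_ctx) + [global_time]
```

In this sketch a driver's feature vector would be `[x, y, steps_left]` and an order's would be `[start_x, start_y, end_x, end_y, price, wait_time]`, following the fields listed in claims 3 and 4. Averaging through attention gives a state of fixed length regardless of how many drivers and orders are present, which is one plausible reason to use contexts in the concatenation.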
based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO] · CPC title
Dispatching vehicles on the basis of a location, e.g. taxi dispatching · CPC title
Combinations of networks · CPC title
Rendezvous; Ride sharing · CPC title
Reservations, e.g. for tickets, services or events · CPC title