Agent behavior model for simulation control
US-2021370972-A1 · Dec 2, 2021 · US
US12091042B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12091042-B2 |
| Application number | US-202117391099-A |
| Country | US |
| Kind code | B2 |
| Filing date | Aug 2, 2021 |
| Priority date | Aug 2, 2021 |
| Publication date | Sep 17, 2024 |
| Grant date | Sep 17, 2024 |
Official abstract text for this publication.
Methods and systems for training an autonomous vehicle (AV) motion planning model are disclosed. The system receives a log of data representing objects detected by an AV over time. The system identifies a group of sample times in the log. Each sample time represents a time at which the AV made a choice in response to a state of an object. For each of the sample times, the system will generate candidate trajectories for the AV, and it will output the candidate trajectories on a display. The system will receive a label with a rating for each candidate trajectory. The system will then save, to a data set, each of the candidate trajectories in association with its label and the data from the log for its corresponding sample time. The system may then apply the data set to an AV motion planning model to train the model.
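The data flow in the abstract (log, sample times, candidate trajectories, ratings, saved data set) can be sketched as follows. This is a minimal illustration, not the patented implementation: the names `LogFrame`, `LabeledSample`, `generate_candidates`, and `example_rating` are hypothetical, the candidate generator is a toy straight-line model, and the rating callback stands in for labels that the disclosure says arrive through a user interface.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Point = Tuple[float, float]


@dataclass
class LogFrame:
    """One entry in the data log (hypothetical structure)."""
    time: float
    objects: Dict[str, str]   # object id -> state, e.g. {"signal_1": "red"}
    planner_decided: bool     # True if the motion planner made a choice here


@dataclass
class LabeledSample:
    """A candidate trajectory saved with its label and the log data."""
    sample_time: float
    log_data: Dict[str, str]
    trajectory: List[Point]
    rating: int


def identify_sample_times(log: List[LogFrame]) -> List[float]:
    # Sample times are the times at which the planner made a choice.
    return [frame.time for frame in log if frame.planner_decided]


def generate_candidates(frame: LogFrame, n: int = 3) -> List[List[Point]]:
    # Assumption: toy straight-line candidates at speeds 0, 1, ..., n-1.
    return [[(float(speed * t), 0.0) for t in range(4)] for speed in range(n)]


def build_data_set(
    log: List[LogFrame],
    rate: Callable[[LogFrame, List[Point]], int],
) -> List[LabeledSample]:
    frames = {frame.time: frame for frame in log}
    data_set: List[LabeledSample] = []
    for t in identify_sample_times(log):
        frame = frames[t]
        for trajectory in generate_candidates(frame):
            rating = rate(frame, trajectory)  # stands in for the UI label
            data_set.append(LabeledSample(t, frame.objects, trajectory, rating))
    return data_set


def example_rating(frame: LogFrame, trajectory: List[Point]) -> int:
    # Stand-in for a human rater: reward stopping only when a state is "red".
    stopped = trajectory[-1][0] == 0.0
    must_stop = "red" in frame.objects.values()
    return 1 if stopped == must_stop else 0


log = [
    LogFrame(0.0, {"signal_1": "green"}, planner_decided=False),
    LogFrame(1.0, {"signal_1": "red"}, planner_decided=True),
    LogFrame(2.0, {"signal_1": "green"}, planner_decided=True),
]
data_set = build_data_set(log, example_rating)
```

Each saved record keeps the log data for its sample time next to the trajectory and rating, which is what allows the data set to be applied to a model later.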
Opening claim text (preview).
The invention claimed is:
1. A computer-implemented method of training an autonomous vehicle motion planning model, the method comprising: receiving a data log comprising data representing one or more objects detected by an autonomous vehicle (AV) over a time period; identifying a group of sample times in the data log, in which each of the sample times represents a time at which a motion planning system of the AV made a choice in response to a state of one or more of the objects; for each of the sample times, generating a plurality of candidate trajectories for the AV; outputting the plurality of candidate trajectories on a display device; receiving, via a user interface, a label for each of the candidate trajectories, wherein the label for each candidate trajectory includes a rating for that candidate trajectory; saving, to a data set, each of the candidate trajectories in association with its label and the data from the log for its corresponding sample time; applying the data set to an AV motion planning model to train the AV motion planning model; in response to receiving, via the user interface, a request to play the data log, causing the display device to play a scene that includes the AV moving along a route and one or more of actors; pausing the scene at one of the sample times and outputting the plurality of candidate trajectories while the pausing occurs; and resuming the scene after receiving the labels for the plurality of candidate trajectories.
2. The method of claim 1 further comprising, before receiving the data log: by a perception system of the AV as the AV travels through a scene, using sensors to collect the data representing the one or more objects in the scene; and generating the data log from the collected data.
3. The method of claim 1 further comprising, before receiving the data log: using a simulation system to generate the data representing the one or more objects in a simulated scene that includes the AV; and applying the simulated scene to the motion planning model to generate simulated responses of the AV to the states of one or more of the objects.
4. The method of claim 1 further comprising generating a loss value for each of the candidate trajectories.
5. The method of claim 1, wherein applying the data set to an AV motion planning model to train the AV motion planning model comprises using a cost function to train the AV motion planning model to minimize the cost function or maximize a value function.
6. The method of claim 1 further comprising, before outputting the plurality of candidate trajectories on the display device: using one or more parameters detected in the scene to generate proposed labels for at least some of the candidate trajectories; and presenting the proposed labels on the display device for user acceptance or rejection.
7. The method of claim 6, wherein the one or more parameters comprise a sound emitted by another actor in the scene.
8. The method of claim 1, wherein: the one or more objects comprise a traffic signal; and the state of the traffic signal comprises a traffic light state.
9. The method of claim 1, wherein: the one or more objects comprise a moving actor; and the state of the moving actor comprises a predicted trajectory.
10. The method of claim 1, further comprising: by an AV motion planning system, using the AV motion planning model to generate a real-world trajectory for a vehicle, and by an AV control system, causing the vehicle to follow the real-world trajectory; receiving feedback that includes a rating of the real-world trajectory; saving the feedback to the data set to yield an updated data set; and using the updated data set to refine training of the AV motion planning model.
11. A system for training an autonomous vehicle motion planning model, the system comprising: a display device; a processor; and a memory containing programming instructions that are configured to instruct the processor to: receive a data log comprising data representing one or more objects detected by an autonomous vehicle (AV) over a time period, identify a group of sample times in the data log, in which each of the sample times represents a time at which a motion planning system of the AV made a choice in response to a state of one or more of the objects, for each of the sample times, generate a plurality of candidate trajectories for the AV, output the plurality of candidate trajectories on the display device, receive, via a user interface, a label for each of the candidate trajectories, wherein the label for each candidate trajectory includes a rating for that candidate trajectory, save, to a data set, each of the candidate trajectories in association with its label and the data from the log for its corresponding sample time, apply the data set to an AV motion planning model to train the AV motion planning model, in response to receiving, via the user interface, a request to play the data log, play on the display device, a scene that includes the AV moving along a route and one or more of actors, pause the scene at one of the sample times and output the plurality of candidate trajectories while the pausing occurs, and resume the scene after receiving the labels for the plurality of candidate trajectories.
12. The system of claim 11, further comprising a vehicle that includes a perception system that is configured to, as the vehicle travels through a scene: use sensors to collect the data representing the one or more objects in the scene; and generate the data log from the collected data.
13. The system of claim 11, further comprising additional programming instructions that are configured to cause the processor to: use a simulation system to generate the data representing the one or more objects in a simulated scene that includes the AV; and apply the simulated scene to the motion planning model to generate simulated responses of the AV to the states of one or more of the objects.
14. The system of claim 11, further comprising additional programming instructions that are configured to cause the processor to implement one or more of the following: generate a loss value for each of the candidate trajectories; or use a cost function to train the AV motion planning model to minimize the cost function.
15. The system of claim 11, further comprising additional programming instructions that are configured to cause the processor to, before outputting the plurality of candidate trajectories on the display device: use one or more parameters detected in the scene to generate proposed labels for at least some of the candidate trajectories; and present the proposed labels on the display device for user acceptance or rejection.
16. The system of claim 11, further comprising a vehicle that includes: a vehicle motion planning system that is configured to use the AV motion planning model to generate a real-world trajectory for a vehicle, and a vehicle control system that is configured to cause the vehicle to follow the real-world trajectory; wherein the system further comprises programming instructions that are configured to cause the processor to: receive feedback that includes a rating of the real-world trajectory, save the feedback to the data set to yield an updated data set, and use the updated data set to refine training of the AV motion planning model.
17. A computer program product comprising a memory device that stores programming instructions that are configured to cause a processor to train an autonomous vehicle motion planning model
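Claims 5 and 14 refer to training the model by minimizing a cost function but do not specify the model or the cost. As a rough illustration only, the sketch below assumes a linear scoring model over two hypothetical trajectory features (path length and a bias term), a mean-squared-error cost against the human rating, and plain gradient descent. The names `features`, `cost`, and `train`, the learning rate, and the sample ratings are all assumptions made for this example.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Sample = Tuple[List[Point], float]  # (candidate trajectory, human rating)


def features(trajectory: List[Point]) -> List[float]:
    # Hypothetical features: total path length and a constant bias term.
    length = sum(
        ((x2 - x1) ** 2 + (y2 - y1) ** 2) ** 0.5
        for (x1, y1), (x2, y2) in zip(trajectory, trajectory[1:])
    )
    return [length, 1.0]


def predict(weights: List[float], trajectory: List[Point]) -> float:
    return sum(w * f for w, f in zip(weights, features(trajectory)))


def cost(weights: List[float], samples: List[Sample]) -> float:
    # Mean squared error between the predicted score and the rating.
    return sum((predict(weights, t) - r) ** 2 for t, r in samples) / len(samples)


def train(samples: List[Sample], lr: float = 0.01, steps: int = 500) -> List[float]:
    weights = [0.0, 0.0]
    for _ in range(steps):
        grads = [0.0] * len(weights)
        for trajectory, rating in samples:
            f = features(trajectory)
            err = predict(weights, trajectory) - rating
            for i, x in enumerate(f):
                grads[i] += 2.0 * err * x / len(samples)
        # Gradient descent step that lowers the cost.
        weights = [w - lr * g for w, g in zip(weights, grads)]
    return weights


# Illustrative ratings for a scene where stopping was the preferred choice.
samples: List[Sample] = [
    ([(0.0, 0.0)] * 4, 1.0),                              # stays put
    ([(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)], 0.5),  # moves slowly
    ([(0.0, 0.0), (2.0, 0.0), (4.0, 0.0), (6.0, 0.0)], 0.0),  # moves fast
]
initial_cost = cost([0.0, 0.0], samples)
weights = train(samples)
final_cost = cost(weights, samples)
```

With these made-up ratings the learned weight on path length ends up negative, so longer candidates score lower. A real planner model would use richer features and scene context from the log.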
Radar; Laser, e.g. lidar · CPC title
Image sensing, e.g. optical camera · CPC title
Special cost functions, i.e. other than distance or default speed limit of road segments · CPC title
Display means · CPC title
Means for informing the driver, warning the driver or prompting a driver intervention · CPC title