Reinforcement and model learning for vehicle operation

US11027751B2 · US · B2

Patent metadata
FieldValue
Publication numberUS-11027751-B2
Application numberUS-201716757936-A
CountryUS
Kind codeB2
Filing dateOct 31, 2017
Priority dateOct 31, 2017
Publication dateJun 8, 2021
Grant dateJun 8, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Methods and vehicles may be configured to gain experience in the form of state-action and/or action-observation histories for an operational scenario as the vehicle traverses a vehicle transportation network. The histories may be incorporated into a model in the form of learning to improve the model over time. The learning may be used to improve integration with human behavior. Driver feedback may be used in the learning examples to improve future performance and to integrate with human behavior. The learning may be used to create customized scenario solutions. The learning may be used to transfer a learned solution and apply the learned solution to a similar scenario.

First claim

Opening claim text (preview).

What is claimed is: 1. A method for use in traversing a vehicle transportation network, the method comprising: traversing, by an autonomous vehicle, a vehicle transportation network, wherein traversing the vehicle transportation network includes: determining a route of the autonomous vehicle within the vehicle transportation network; executing the route of the autonomous vehicle; detecting an operational scenario based on the route of the autonomous vehicle and a location of the autonomous vehicle; determining a particular scenario-specific operation control evaluation module based on the operational scenario, wherein the scenario-specific operation control evaluation module includes models that determine a candidate vehicle control action based on an operational environment of the autonomous vehicle, wherein the models include an exploration model based on selecting a low probability action in a semi-random manner and an exploitation model based on selecting a high probability action; instantiating a scenario-specific operational control evaluation module instance based on the particular scenario-specific operation control evaluation module; traversing a portion of the vehicle transportation network by executing the candidate vehicle control action using the exploration model or the exploitation model; observing a state resulting from the execution of the candidate vehicle control action; updating the scenario-specific operational control evaluation module instance based on the state; generating a state-action history entry based on the candidate vehicle control action and the state; and storing the state-action history entry in a scenario-specific operation control database. 2. The method of claim 1 , wherein the exploration model and the exploitation model apply a stored state-action history. 3. The method of claim 1 , wherein the exploitation model is based on selecting the candidate vehicle control action using the particular scenario-specific operation control evaluation module. 4. The method of claim 1 , further comprising: determining that the operational scenario is not complete; and determining a second particular scenario-specific operation control evaluation module based on the operational scenario. 5. The method of claim 1 , further comprising: determining that the operational scenario is complete; determining driver-initiated vehicle control actions, wherein a number of driver-initiated vehicle control actions is greater than a threshold; and creating a customized operational scenario based on the observed state and the driver-initiated vehicle control actions. 6. The method of claim 1 , wherein the state is associated with an operational environment. 7. The method of claim 1 , wherein the state is associated with a vehicle state of the autonomous vehicle. 8. The method of claim 1 , wherein the state is associated with a vehicle state of another vehicle in the vehicle transportation network. 9. The method of claim 1 further comprising: solving a second operational scenario based on the stored state-action history. 10. An autonomous vehicle comprising: a processor configured to execute instructions stored on a non-transitory computer readable medium to: determine a route of the autonomous vehicle within a vehicle transportation network; execute the route of the autonomous vehicle; detect an operational scenario based on the route of the autonomous vehicle and a location of the autonomous vehicle; determine a particular scenario-specific operation control evaluation module based on the operational scenario, wherein the scenario-specific operation control evaluation module includes a model that determines a candidate vehicle control action based on an operational environment of the autonomous vehicle, wherein the model is based on a low probability random selection of the candidate vehicle control action; instantiate a scenario-specific operational control evaluation module instance based on the particular scenario-specific operation control evaluation module; traverse a portion of the vehicle transportation network based on an execution of the candidate vehicle control action using the model; observe a state resulting from the execution of the candidate vehicle control action; and generate a state-action history entry based on the candidate vehicle control action and the state; and a memory configured to store the state-action history entry in a scenario-specific operation control database. 11. The autonomous vehicle of claim 10 , wherein the model applies the stored state-action history such that a probability of the random selection of the candidate vehicle control action diminishes over time proportional to a volume of the stored state-action history. 12. The autonomous vehicle of claim 10 , wherein the model is based on selecting the candidate vehicle control action using the particular scenario-specific operation control evaluation module. 13. The autonomous vehicle of claim 10 , wherein the processor is further configured to execute instructions stored on the non-transitory computer readable medium to: determine that the operational scenario is not complete; and determine a second particular scenario-specific operation control evaluation module based on the operational scenario. 14. The autonomous vehicle of claim 10 , wherein the processor is further configured to execute instructions stored on the non-transitory computer readable medium to: determine that the operational scenario is complete; determine driver-initiated vehicle control actions, wherein a number of driver-initiated vehicle control actions is greater than a threshold; and create a customized operational scenario based on the observed state and the driver-initiated vehicle control actions. 15. The autonomous vehicle of claim 10 , wherein the processor is further configured to execute instructions stored on the non-transitory computer readable medium to: solve a second operational scenario based on the stored state-action history.

Assignees

Inventors

Classifications

  • specially adapted for specific operations · CPC title

  • B60W60/001Primary

    Planning or execution of driving tasks · CPC title

  • involving control alternatives for a single driving scenario, e.g. planning several paths to avoid obstacles · CPC title

  • Historical data · CPC title

  • for active traffic, e.g. moving vehicles, pedestrians, bikes · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11027751B2 cover?
Methods and vehicles may be configured to gain experience in the form of state-action and/or action-observation histories for an operational scenario as the vehicle traverses a vehicle transportation network. The histories may be incorporated into a model in the form of learning to improve the model over time. The learning may be used to improve integration with human behavior. Driver feedback …
Who is the assignee on this patent?
Nissan North America Inc, Univ Massachusetts, Renault Sas
What technology area does this patent fall under?
Primary CPC classification B60W60/0025. Mapped technology areas include Operations & Transport.
When was this patent published?
Publication date Tue Jun 08 2021 00:00:00 GMT+0000 (Coordinated Universal Time) (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).