Multi-task machine-learned models for object intention determination in autonomous driving

US11794785B2 · US · B2

Patent metadata
Field: Value
Publication number: US-11794785-B2
Application number: US-202217749841-A
Country: US
Kind code: B2
Filing date: May 20, 2022
Priority date: Jun 15, 2018
Publication date: Oct 24, 2023
Grant date: Oct 24, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Generally, the disclosed systems and methods utilize multi-task machine-learned models for object intention determination in autonomous driving applications. For example, a computing system can receive sensor data obtained relative to an autonomous vehicle and map data associated with a surrounding geographic environment of the autonomous vehicle. The sensor data and map data can be provided as input to a machine-learned intent model. The computing system can receive a jointly determined prediction from the machine-learned intent model for multiple outputs including at least one detection output indicative of one or more objects detected within the surrounding environment of the autonomous vehicle, a first corresponding forecasting output descriptive of a trajectory indicative of an expected path of the one or more objects towards a goal location, and/or a second corresponding forecasting output descriptive of a discrete behavior intention determined from a predefined group of possible behavior intentions.
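
As an illustration only, the three jointly predicted outputs described in the abstract can be pictured as one record per detected object. The sketch below is not taken from the patent: the class names `Box`, `ActorPrediction`, and `Intention`, the box parameterization (center, size, heading), and the helper `pick_intention` are all assumptions made for this example. The eight intention labels are the ones listed in the claims on this page.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Sequence, Tuple


class Intention(Enum):
    """Discrete behavior intentions, using the labels listed in the claims."""
    KEEP_LANE = "keep lane"
    TURN_LEFT = "turn left"
    TURN_RIGHT = "turn right"
    LEFT_CHANGE_LANE = "left change lane"
    RIGHT_CHANGE_LANE = "right change lane"
    STOPPED = "stopped"
    PARKED = "parked"
    REVERSE_DRIVING = "reverse driving"


@dataclass
class Box:
    """Hypothetical bounding box: center, size, and heading (assumed layout)."""
    x: float
    y: float
    length: float
    width: float
    heading: float


@dataclass
class ActorPrediction:
    """One object's jointly predicted outputs (illustrative container)."""
    detection_score: float                # detection output: class likelihood
    box: Box                              # detection output: current box
    trajectory: List[Tuple[float, Box]]   # first forecast: (timestamp, box) pairs
    intention: Intention                  # second forecast: discrete intention


def pick_intention(scores: Sequence[float]) -> Intention:
    """Return the intention with the highest score.

    `scores` is assumed to follow the definition order of `Intention`.
    """
    labels = list(Intention)
    if len(scores) != len(labels):
        raise ValueError("expected one score per intention label")
    best = max(range(len(labels)), key=lambda i: scores[i])
    return labels[best]
```

For example, `pick_intention([0.1, 0.6, 0.1, 0.05, 0.05, 0.05, 0.03, 0.02])` selects `Intention.TURN_LEFT`, since the second score is the largest.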

First claim

Opening claim text (preview).

What is claimed is:

1. A method for detecting and forecasting an actor in an environment of an autonomous vehicle, comprising: obtaining sensor data descriptive of the environment of the autonomous vehicle, the environment containing the actor; obtaining map data associated with the environment; and processing the sensor data and the map data with a single forward pass through a machine-learned model that is trained to jointly implement detection and forecasting to generate: a detection output descriptive of the actor in the environment; a first forecasting output descriptive of a trajectory indicative of an expected path of the actor towards a goal location; and a second forecasting output descriptive of a discrete behavior intention determined from a predefined group of possible behavior intentions for the actor.

2. The method of claim 1, wherein the detection output comprises a detection score indicative of a likelihood of the actor being in one of a plurality of predetermined classes.

3. The method of claim 1, wherein the detection output comprises a bounding box associated with the actor.

4. The method of claim 1, wherein the first forecasting output is represented by trajectory data comprising a sequence of bounding shapes at a plurality of time stamps.

5. The method of claim 1, wherein the predefined group of possible behavior intentions for the actor comprises at least one of keep lane, turn left, turn right, left change lane, right change lane, stopped, parked, or reverse driving.

6. The method of claim 1, wherein the machine-learned model comprises a plurality of shared layers that are used at least in part for determining the detection output, the first forecasting output, and the second forecasting output.

7. The method of claim 1, wherein processing the sensor data and the map data with the machine-learned model comprises processing a fused representation of a given view of the sensor data obtained relative to the autonomous vehicle and the map data in the given view.

8. The method of claim 7, wherein the given view of the sensor data and the map data comprises a birds-eye view.

9. The method of claim 7, wherein the given view of the sensor data is represented as a multi-dimensional tensor having at least one of a height dimension or a time dimension stacked into a channel dimension with the multi-dimensional tensor.

10. An autonomous vehicle (AV) control system, comprising: one or more processors; and one or more non-transitory computer-readable media that store instructions for execution by the one or more processors that cause the AV control system to perform operations, the operations comprising: obtaining sensor data descriptive of an environment of an autonomous vehicle, the environment containing an actor; obtaining map data associated with the environment; and processing the sensor data and the map data with a single forward pass through a machine-learned model that is trained to jointly implement detection and forecasting to generate: a detection output descriptive of the actor in the environment; a first forecasting output descriptive of a trajectory indicative of an expected path of the actor towards a goal location; and a second forecasting output descriptive of a discrete behavior intention determined from a predefined group of possible behavior intentions for the actor.

11. The AV control system of claim 10, wherein processing the sensor data and the map data with the single forward pass through the machine-learned model comprises processing a fused representation of a given view of the sensor data obtained relative to the autonomous vehicle and the map data in the given view.

12. The AV control system of claim 10, wherein the detection output comprises a detection score indicative of a likelihood of the actor being in one of a plurality of predetermined classes.

13. The AV control system of claim 10, wherein the first forecasting output is represented by trajectory data comprising a sequence of bounding shapes at a plurality of time stamps.

14. The AV control system of claim 10, wherein the predefined group of possible behavior intentions for the actor comprises at least one of keep lane, turn left, turn right, left change lane, right change lane, stopped, parked, or reverse driving.

15. The AV control system of claim 10, wherein the machine-learned model comprises a plurality of shared layers that are used at least in part for determining the detection output, the first forecasting output, and the second forecasting output.

16. An autonomous vehicle, comprising: one or more sensors that generate sensor data relative to the autonomous vehicle; one or more processors; and one or more non-transitory computer-readable media that store instructions for execution by the one or more processors that cause the one or more processors to perform operations, the operations comprising: obtaining sensor data descriptive of an environment of an autonomous vehicle, the environment containing an actor; obtaining map data associated with the environment; and processing the sensor data and the map data with a single forward pass through a machine-learned model that is trained to jointly implement detection and forecasting to generate: a detection output descriptive of the actor in the environment; a first forecasting output descriptive of a trajectory indicative of an expected path of the actor towards a goal location; and a second forecasting output descriptive of a discrete behavior intention determined from a predefined group of possible behavior intentions for the actor.

17. The autonomous vehicle of claim 16, wherein processing the sensor data and the map data with the machine-learned model comprises processing a fused representation of a given view of the sensor data obtained relative to the autonomous vehicle and the map data in the given view.

18. The autonomous vehicle of claim 17, wherein the given view of the sensor data and the map data comprises a birds-eye view.

19. The autonomous vehicle of claim 17, wherein the given view of the sensor data is represented as a multi-dimensional tensor having at least one of a height dimension or a time dimension stacked into a channel dimension with the multi-dimensional tensor.

20. The autonomous vehicle of claim 16, the operations further comprising controlling the autonomous vehicle to execute a motion plan determined at least in part from the detection output, the first forecasting output, and the second forecasting output.

Assignees

Inventors

Classifications

  • Convolutional networks [CNN, ConvNet]

  • Supervised learning

  • for two or more other traffic participants

  • using trajectory prediction for other traffic participants

  • the prediction being responsive to traffic or environmental parameters

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11794785B2 cover?
Generally, the disclosed systems and methods utilize multi-task machine-learned models for object intention determination in autonomous driving applications. For example, a computing system can receive sensor data obtained relative to an autonomous vehicle and map data associated with a surrounding geographic environment of the autonomous vehicle. The sensor data and map data can be provided as…
Who is the assignee on this patent?
Uatc Llc
What technology area does this patent fall under?
Primary CPC classification B60W60/0027. Mapped technology areas include Operations & Transport.
When was this patent published?
Publication date: Oct 24, 2023 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 6 related publications on this page (citations in our corpus or others sharing the same primary CPC).