Systems and Methods for Latent Distribution Modeling for Scene-Consistent Motion Forecasting

US2021276595A1 · US · A1

Patent metadata
Field               Value
Publication number  US-2021276595-A1
Application number  US-202117150995-A
Country             US
Kind code           A1
Filing date         Jan 15, 2021
Priority date       Mar 5, 2020
Publication date    Sep 9, 2021
Grant date          (none listed)

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A computer-implemented method for determining scene-consistent motion forecasts from sensor data can include obtaining scene data including one or more actor features. The computer-implemented method can include providing the scene data to a latent prior model, the latent prior model configured to generate scene latent data in response to receipt of scene data, the scene latent data including one or more latent variables. The computer-implemented method can include obtaining the scene latent data from the latent prior model. The computer-implemented method can include sampling latent sample data from the scene latent data. The computer-implemented method can include providing the latent sample data to a decoder model, the decoder model configured to decode the latent sample data into a motion forecast including one or more predicted trajectories of the one or more actor features. The computer-implemented method can include receiving the motion forecast including one or more predicted trajectories of the one or more actor features from the decoder model.
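
The flow in the abstract (latent prior, then sampling, then decoding) can be sketched in a few lines. This is a toy illustration only, not the patented models: `latent_prior`, `sample_latents`, `decode`, and `forecast` are hypothetical names, the prior is assumed to output one Gaussian (mean, standard deviation) pair per actor, and the first two entries of each actor feature are assumed to be an (x, y) position.

```python
import random
from typing import List, Tuple

Vector = List[float]
Trajectory = List[Tuple[float, float]]


def latent_prior(actor_features: List[Vector]) -> List[Tuple[float, float]]:
    """Hypothetical prior: one Gaussian (mean, std) latent variable per actor."""
    return [(sum(f) / len(f), 1.0) for f in actor_features]


def sample_latents(scene_latent: List[Tuple[float, float]],
                   rng: random.Random) -> Vector:
    """Draw one latent sample per actor from the scene latent data."""
    return [rng.gauss(mean, std) for mean, std in scene_latent]


def decode(actor_features: List[Vector], latent_sample: Vector,
           horizon: int = 3) -> List[Trajectory]:
    """Toy deterministic decoder: the same latent sample always yields the same forecast.

    Each actor moves along x with its own latent and along y with the mean of
    all latents, so every trajectory depends on the whole scene sample.
    """
    shared = sum(latent_sample) / len(latent_sample)
    forecasts = []
    for feat, z in zip(actor_features, latent_sample):
        x0, y0 = feat[0], feat[1]  # assumption: first two features are a position
        forecasts.append([(x0 + z * t, y0 + shared * t)
                          for t in range(1, horizon + 1)])
    return forecasts


def forecast(actor_features: List[Vector], rng: random.Random,
             horizon: int = 3) -> List[Trajectory]:
    """Prior -> sample -> decode, mirroring the steps listed in the abstract."""
    scene_latent = latent_prior(actor_features)
    latent_sample = sample_latents(scene_latent, rng)
    return decode(actor_features, latent_sample, horizon)


if __name__ == "__main__":
    rng = random.Random(0)
    actors = [[0.0, 0.0], [1.0, 2.0]]
    print(forecast(actors, rng))  # one scene-level forecast
    print(forecast(actors, rng))  # a second sample gives a different forecast
```

In this sketch all randomness lives in the latent sample, so drawing a second sample and decoding again produces a second joint forecast for the same scene.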

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method for determining scene-consistent motion forecasts from sensor data, the method comprising: obtaining, by a computing system comprising one or more computing devices, scene data comprising one or more actor features; providing, by the computing system, the scene data to a latent prior model, the latent prior model configured to generate scene latent data in response to receipt of scene data, the scene latent data comprising one or more latent variables; obtaining, by the computing system, the scene latent data from the latent prior model; sampling, by the computing system, latent sample data from the scene latent data; providing, by the computing system, the latent sample data to a decoder model, the decoder model configured to decode the latent sample data into a motion forecast comprising one or more predicted trajectories of the one or more actor features; and receiving, by the computing system, the motion forecast comprising one or more predicted trajectories of the one or more actor features from the decoder model.

2. The computer-implemented method of claim 1, wherein the decoder model comprises a deterministic decoder model.

3. The computer-implemented method of claim 1, wherein the decoder model comprises a specified and tractable conditional likelihood.

4. The computer-implemented method of claim 1, wherein the one or more latent variables are respective to the one or more actor features such that each actor feature has an associated latent variable of the scene latent data that is anchored to the actor feature.

5. The computer-implemented method of claim 1, wherein the one or more latent variables comprise one or more continuous latent variables.

6. The computer-implemented method of claim 1, further comprising: obtaining, by the computing system, one or more scene observations; providing, by the computing system, the one or more scene observations to a scene feature extraction model, the scene feature extraction model comprising one or more neural networks configured to extract one or more scene features from the one or more scene observations; receiving, by the computing system, the one or more scene features from the scene feature extraction model; providing, by the computing system, the one or more scene features to an actor feature recognition model, the actor feature recognition model configured to: extract spatial feature maps for bounding boxes from the one or more scene features by rotated region of interest align; pool a region around each spatial feature map to produce pooled actor features; downsample the pooled actor features by applying one or more downsampling convolutional neural networks; and max-pool along spatial dimensions to reduce each pooled actor feature to a respective actor feature of the one or more actor features; and receiving, by the computing system, the one or more actor features from the actor feature recognition model.

7. The computer-implemented method of claim 1, further comprising: sampling, by the computing system, second latent sample data from the scene latent data; providing, by the computing system, the second latent sample data to the decoder model; and receiving, by the computing system, a second motion forecast comprising one or more second predicted trajectories of the one or more actor features from the decoder model.

8. The computer-implemented method of claim 1, wherein at least one of the latent prior model or the decoder model comprises a scene interaction model configured to model the latent distribution as an interaction graph comprising one or more nodes representative of the one or more actor features and one or more edges representative of interactions between the one or more actor features.

9. The computer-implemented method of claim 8, wherein the scene interaction model comprises one or more graph neural networks.

10. The computer-implemented method of claim 9, wherein a message function of the one or more graph neural networks comprises a multi-layer perceptron model that takes as input one or more terminal nodes of the one or more nodes at a previous propagation step of the one or more graph neural networks.

11. The computer-implemented method of claim 9, wherein an aggregation function of the one or more graph neural networks comprises a feature-wise max-pooling aggregation function.

12. The computer-implemented method of claim 9, wherein a gated recurrent unit cell is configured to update a state of the one or more nodes.

13. The computer-implemented method of claim 1, wherein the one or more actor features comprise data descriptive of a context of one or more traffic participants.

14. A computer-implemented method of training a motion forecasting system, the method comprising: obtaining, by a computing system comprising one or more computing devices, a training dataset comprising one or more training examples labeled with ground truth data, the one or more training examples comprising one or more actor features and the ground truth data comprising a ground truth context of the one or more actor features; providing, by the computing system, the one or more training examples labeled with ground truth data to a latent encoder model, the latent encoder model configured to produce a first latent distribution in response to receipt of the one or more training examples and the ground truth data; providing, by the computing system, the one or more training examples to a latent prior model, the latent prior model configured to produce a second latent distribution in response to receipt of the one or more training examples; determining, by the computing system, a training loss based at least in part on the first latent distribution and the second latent distribution; and backpropagating, by the computing system, the training loss through at least the latent prior model to train at least the latent prior model.

15. The computer-implemented method of claim 14, wherein the training loss comprises a KL divergence loss between the first latent distribution and the second latent distribution.

16. The computer-implemented method of claim 14, wherein the method further comprises: providing, by the computing system, training scene observations to a feature extraction model; receiving, by the computing system, one or more predicted features from the feature extraction model; and determining, by the computing system, a feature loss between the one or more predicted features and the ground truth data; wherein the training loss comprises the feature loss.

17. The computer-implemented method of claim 16, wherein the feature loss comprises a cross-entropy loss between the one or more predicted features and one or more training features of the ground truth data and a regression loss between bounding boxes of the one or more predicted features and bounding boxes of the ground truth data.

18. The computer-implemented method of claim 14, further comprising: sampling, by the computing system, the first latent distribution to produce one or more first latent samples; sampling, by the computing system, the second latent distribution to produce one or more second latent samples; providing, by the computing system, the one or more first latent samples to a decoder model; receiving, by the computing system, one or more first predicted trajectories from the decoder model; providing, by the computing system, the one or more second latent samples to the decoder m
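
Claims 9 to 12 name three ingredients of the graph neural network: a multi-layer perceptron message function, feature-wise max-pooling aggregation, and a gated recurrent unit (GRU) update. The sketch below shows one propagation step built from those ingredients in plain Python. It is an illustration under assumptions, not the claimed implementation: the helper names, layer sizes, random weights, the omission of biases, and the choice to feed both endpoints of each edge to the message function are all hypothetical.

```python
import math
import random
from typing import Dict, List, Tuple

Vec = List[float]
Mat = List[Vec]


def matvec(w: Mat, x: Vec) -> Vec:
    """Multiply matrix w (rows x cols) by vector x."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def rand_mat(rows: int, cols: int, rng: random.Random) -> Mat:
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def init_params(dim: int, hidden: int, rng: random.Random) -> Dict[str, Mat]:
    """Random weights for the message MLP (W1, W2) and the GRU cell."""
    p = {"W1": rand_mat(hidden, 2 * dim, rng), "W2": rand_mat(dim, hidden, rng)}
    for name in ("Wz", "Uz", "Wr", "Ur", "Wn", "Un"):
        p[name] = rand_mat(dim, dim, rng)
    return p


def mlp_message(h_src: Vec, h_dst: Vec, p: Dict[str, Mat]) -> Vec:
    """Two-layer MLP over the concatenated endpoint states of one edge."""
    hidden = [max(0.0, v) for v in matvec(p["W1"], h_src + h_dst)]  # ReLU
    return matvec(p["W2"], hidden)


def max_pool(messages: List[Vec]) -> Vec:
    """Feature-wise max over all incoming messages."""
    return [max(column) for column in zip(*messages)]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def gru_cell(h: Vec, m: Vec, p: Dict[str, Mat]) -> Vec:
    """GRU update of node state h given aggregated message m (biases omitted)."""
    z = [sigmoid(a + b) for a, b in zip(matvec(p["Wz"], m), matvec(p["Uz"], h))]
    r = [sigmoid(a + b) for a, b in zip(matvec(p["Wr"], m), matvec(p["Ur"], h))]
    gated = [ri * hi for ri, hi in zip(r, h)]
    n = [math.tanh(a + b)
         for a, b in zip(matvec(p["Wn"], m), matvec(p["Un"], gated))]
    return [(1.0 - zi) * ni + zi * hi for zi, ni, hi in zip(z, n, h)]


def propagate(states: List[Vec], edges: List[Tuple[int, int]],
              p: Dict[str, Mat]) -> List[Vec]:
    """One propagation step; messages read only the previous-step states."""
    new_states = []
    for v, h_v in enumerate(states):
        incoming = [mlp_message(states[u], h_v, p)
                    for (u, dst) in edges if dst == v]
        if not incoming:  # node with no incoming edges keeps its state
            new_states.append(list(h_v))
            continue
        new_states.append(gru_cell(h_v, max_pool(incoming), p))
    return new_states


if __name__ == "__main__":
    rng = random.Random(0)
    params = init_params(dim=4, hidden=8, rng=rng)
    nodes = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(3)]
    fully_connected = [(u, v) for u in range(3) for v in range(3) if u != v]
    print(propagate(nodes, fully_connected, params))
```

Max-pooling makes the aggregation independent of the order and number of neighbors, which suits scenes where the number of interacting actors varies.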

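Claims 14 and 15 describe a training loss that includes a KL divergence between the latent encoder's distribution (the first latent distribution) and the latent prior's distribution (the second). The claims do not fix the distribution family or the direction of the divergence, so the following is only a sketch under two assumptions: both distributions are diagonal Gaussians, and the divergence is taken from the encoder distribution to the prior. The function name `kl_diag_gaussians` is hypothetical.

```python
import math
from typing import Sequence


def kl_diag_gaussians(mu_q: Sequence[float], sigma_q: Sequence[float],
                      mu_p: Sequence[float], sigma_p: Sequence[float]) -> float:
    """KL(q || p) for diagonal Gaussians, summed over latent dimensions.

    Per dimension:
        log(sigma_p / sigma_q)
        + (sigma_q^2 + (mu_q - mu_p)^2) / (2 * sigma_p^2)
        - 1/2
    """
    total = 0.0
    for mq, sq, mp, sp in zip(mu_q, sigma_q, mu_p, sigma_p):
        total += (math.log(sp / sq)
                  + (sq ** 2 + (mq - mp) ** 2) / (2.0 * sp ** 2)
                  - 0.5)
    return total


if __name__ == "__main__":
    # Assumed toy values: q plays the encoder distribution, p the prior.
    encoder_mu, encoder_sigma = [0.2, -0.1], [0.5, 0.8]
    prior_mu, prior_sigma = [0.0, 0.0], [1.0, 1.0]
    print(kl_diag_gaussians(encoder_mu, encoder_sigma, prior_mu, prior_sigma))
```

The value is zero when the two distributions match and grows as the prior drifts away from the encoder distribution, which is what makes it usable as a loss for training the prior.
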
Assignees

Uber Technologies Inc

Inventors

Classifications

  • Classification techniques

  • G06V10/82 (Primary): using neural networks

  • using trajectory prediction for other traffic participants

  • based on criteria of topology preservation, e.g. multidimensional scaling or self-organising maps

  • Dispatching vehicles on the basis of a location, e.g. taxi dispatching

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2021276595A1 cover?
A computer-implemented method for determining scene-consistent motion forecasts from sensor data can include obtaining scene data including one or more actor features. The computer-implemented method can include providing the scene data to a latent prior model, the latent prior model configured to generate scene latent data in response to receipt of scene data, the scene latent data including o…
Who is the assignee on this patent?
Uber Technologies Inc
What technology area does this patent fall under?
Primary CPC classification G06V10/82. Mapped technology areas include Physics.
When was this patent published?
Publication date: Sep 9, 2021 (kind code A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).