Speech recognition systems and methods
US-2022157294-A1 · May 19, 2022 · US
| Field | Value |
|---|---|
| Publication number | US-12330689-B2 |
| Application number | US-202217727617-A |
| Country | US |
| Kind code | B2 |
| Filing date | Apr 22, 2022 |
| Priority date | Apr 23, 2021 |
| Publication date | Jun 17, 2025 |
| Grant date | Jun 17, 2025 |
Official abstract text for this publication.
Provided are methods for predicting agent trajectories, which can include generating a graph corresponding to a map of a scene by encoding map features and agent features as node encodings of the graph and determining a policy for application to outgoing edges of the nodes of the graph. Some methods described also include sampling paths for a target vehicle in the scene according to the policy and predicting a set of trajectories based on the sampled paths traversed by the policy and a sampled latent variable. Systems and computer program products are also provided.
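The path-sampling step in the abstract can be illustrated with a minimal sketch. It assumes a toy lane graph in which the policy is already given as a probability for each outgoing edge. The node names, the probabilities, and the helpers `sample_path` and `sample_paths` are hypothetical and do not come from the publication, which describes learning the policy rather than hard-coding it.

```python
import random

# Hypothetical toy lane graph: each key is a node (for example, a lane
# centerline segment) and each value maps a successor node to the policy
# probability of taking that outgoing edge. These numbers are made up.
POLICY = {
    "a": {"b": 0.7, "c": 0.3},
    "b": {"d": 1.0},
    "c": {"d": 0.5, "e": 0.5},
    "d": {},  # terminal node: no outgoing edges
    "e": {},  # terminal node: no outgoing edges
}


def sample_path(policy, start, max_steps, rng):
    """Roll out one path by sampling an outgoing edge at each visited node."""
    path = [start]
    node = start
    for _ in range(max_steps):
        edges = policy[node]
        if not edges:
            break  # no outgoing edges, so the traversal ends here
        successors = list(edges)
        weights = [edges[s] for s in successors]
        node = rng.choices(successors, weights=weights, k=1)[0]
        path.append(node)
    return path


def sample_paths(policy, start, num_samples, max_steps=10, seed=0):
    """Sample several paths from the same start node with a seeded generator."""
    rng = random.Random(seed)
    return [sample_path(policy, start, max_steps, rng) for _ in range(num_samples)]


if __name__ == "__main__":
    for path in sample_paths(POLICY, "a", num_samples=5):
        print(" -> ".join(path))
```

Each sampled path would then condition the trajectory prediction, so drawing several paths is one way to obtain a set of distinct candidate routes for the target vehicle.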
Opening claim text (preview).
What is claimed is:

1. A method comprising: generating, using at least one processor, a graph corresponding to a map of a scene by encoding map features and agent features as node encodings of the graph; determining, using the at least one processor, a policy for application to outgoing edges at nodes of the graph; sampling, using the at least one processor, paths for a target vehicle in the scene according to the policy; predicting, using the at least one processor, a set of trajectories based on the sampled paths traversed by the policy and a sampled latent variable; and operating, using the at least one processor, a vehicle based on the set of trajectories of the target vehicle, wherein predicting the set of trajectories comprises: outputting a context vector for the policy using a multi-head attention layer; and combining the context vector with motion encodings and the sampled latent variable to predict the set of trajectories.

2. The method of claim 1, wherein a respective node corresponds to a segment of a lane centerline of the map.

3. The method of claim 1, further comprising updating the node encodings with surrounding agent encodings by calculating scaled dot product attention weights.

4. The method of claim 1, comprising aggregating local context from neighboring nodes into the node encodings of the graph using a graph neural network.

5. The method of claim 1, wherein the policy for application to the outgoing edges is a discrete probability distribution over the outgoing edges at the nodes of the graph.

6. The method of claim 1, wherein the policy is predicted by training a multilayer perceptron (MLP) using behavior cloning.

7. The method of claim 1, comprising selectively aggregating context along the sampled paths, and predicting the set of trajectories based on the sampled paths traversed by the policy, the aggregated context, and the sampled latent variable.

8. The method of claim 7, wherein predicting the set of trajectories comprises: concatenating the aggregated context and the sampled latent variable with the motion encodings; and inputting the concatenated aggregated context and the sampled latent variable to a multilayer perceptron, wherein the set of trajectories indicates predicted locations at future time steps.

9. A system, comprising: a graph encoder to encode high definition maps and agent features into a graph for generating final node encodings, wherein the graph includes nodes and edges, the nodes representing segments of a lane centerline and edges representing transitions between nodes, wherein the graph is used to generate the final node encodings; a policy header to learn a policy for sampled graph traversals based on a motion of a target vehicle as well as local scene and agent context at neighboring nodes; and a trajectory decoder to predict trajectories based on node encodings along paths traversed by the policy and a sampled latent variable, wherein the trajectory decoder comprising a multi-head attention layer configured to output a context vector for the policy, wherein the context vector is combined with motion encodings and the sampled latent variable to predict the trajectories.

10. The system of claim 9, wherein the policy is a discrete probability distribution of transitions associated with a respective edge at a respective node.

11. The system of claim 9, wherein the graph encoder includes one or more gated recurrent units to encode target vehicle trajectories, surrounding vehicle trajectories, and node features.

12. The system of claim 9, wherein initial node encodings are updated with surrounding agent encodings by calculating scaled dot product attention weights to generate the final node encodings.

13. The system of claim 9, wherein the graph encoder is configured to aggregate local context from neighboring nodes into the final node encodings of the graph using a graph neural network.

14. At least one non-transitory storage medium storing instructions that, when executed by at least one processor, cause the at least one processor to: generate a graph corresponding to a map of a scene by encoding map features and agent features as node encodings of the graph; determine a policy for application to outgoing edges at nodes of the graph; sample paths for a target vehicle in the scene according to the policy; predict a set of trajectories based on the sampled paths traversed by the policy and a sampled latent variable; and operate a vehicle based on the set of trajectories of the target vehicle, wherein to predict the set of trajectories, the at least one processor is further caused to: output a context vector for the policy using a multi-head attention layer; and combine the context vector with motion encodings and the sampled latent variable to predict the set of trajectories.

15. The at least one non-transitory storage medium of claim 14, wherein a respective node corresponds to a segment of a lane centerline of the map.

16. The at least one non-transitory storage medium of claim 14, comprising updating the node encodings with surrounding agent encodings by calculating scaled dot product attention weights.

17. The at least one non-transitory storage medium of claim 14, comprising aggregating local context from neighboring nodes into the node encodings of the graph using a graph neural network.

18. The at least one non-transitory storage medium of claim 14, wherein the policy for application to the outgoing edges is a discrete probability distribution over the outgoing edges at nodes of the graph.

19. The at least one non-transitory storage medium of claim 14, wherein the policy is predicted by training a multilayer perceptron (MLP) using behavior cloning.
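The claims combine an attention-derived context vector with motion encodings and a sampled latent variable before decoding. The sketch below illustrates that combination under simplifying assumptions: it uses a single attention head rather than the claimed multi-head layer, omits the learned projection matrices, and draws the latent variable from a standard normal distribution. The function names `scaled_dot_product_attention` and `build_decoder_input` and all numeric values are hypothetical.

```python
import math
import random


def scaled_dot_product_attention(query, keys, values):
    """Single-head scaled dot-product attention over plain Python lists.

    Returns the context vector and the attention weights. A real model
    would first apply learned projections to query, keys, and values.
    """
    d = len(query)
    scores = [
        sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys
    ]
    # Numerically stable softmax over the scores.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(values[0])
    context = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return context, weights


def build_decoder_input(motion_encoding, path_node_encodings, latent_dim, rng):
    """Concatenate motion encoding, attention context, and a sampled latent.

    The motion encoding acts as the query; node encodings along one sampled
    path act as both keys and values (an assumption made for this sketch).
    """
    context, _ = scaled_dot_product_attention(
        motion_encoding, path_node_encodings, path_node_encodings
    )
    latent = [rng.gauss(0.0, 1.0) for _ in range(latent_dim)]
    return motion_encoding + context + latent


if __name__ == "__main__":
    rng = random.Random(0)
    motion = [1.0, 0.0]
    nodes_on_path = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
    print(build_decoder_input(motion, nodes_on_path, latent_dim=3, rng=rng))
```

In the claimed arrangement the concatenated vector would be passed to a multilayer perceptron that outputs predicted locations at future time steps. Resampling the latent variable for the same path is what would yield several different trajectories along one route.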
Supervised learning · CPC title
Convolutional networks [CNN, ConvNet] · CPC title
characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU] · CPC title
Learning methods · CPC title
High definition maps · CPC title