Method and system for vision-centric deep-learning-based road situation analysis
US-9760806-B1 · Sep 12, 2017 · US
US12033334B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12033334-B2 |
| Application number | US-202217839448-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jun 13, 2022 |
| Priority date | May 6, 2020 |
| Publication date | Jul 9, 2024 |
| Grant date | Jul 9, 2024 |
Official abstract text for this publication.
A sequence of images generated at respective times by one or more sensors configured to sense an environment through which objects are moving relative to the one or more sensors is received. A message passing graph having a multiplicity of layers associated with the sequence of images is constructed. A neural network supported by the message passing graph is trained. The training includes performing a pass through the message passing graph in a forward direction including by adding a new feature node based on a feature detection and a new edge node and performing a pass through the message passing graph in a backward direction, including by updating at least one edge node of the message passing graph. Multiple features are tracked through the sequence of images, including passing messages through the message passing graph.
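The layered graph described in the abstract can be pictured with a small data-structure sketch. This is an illustrative assumption, not the patented implementation: the names `FeatureNode`, `EdgeNode`, `MessagePassingGraph`, and `add_layer` are hypothetical, and connecting every detection to every detection in the previous layer is a simplification (the publication only requires edges between at least some feature nodes in adjacent layers). No neural network or message passing is modeled here.

```python
from dataclasses import dataclass, field
from typing import List, Sequence


@dataclass
class FeatureNode:
    layer: int            # index of the image (time step) the detection came from
    index: int            # position of the detection within its layer
    feature: List[float]  # feature vector for the detection (assumed format)


@dataclass
class EdgeNode:
    src: FeatureNode  # exactly one feature node in the earlier layer
    dst: FeatureNode  # exactly one feature node in the immediately following layer


@dataclass
class MessagePassingGraph:
    layers: List[List[FeatureNode]] = field(default_factory=list)
    edges: List[EdgeNode] = field(default_factory=list)

    def add_layer(self, detections: Sequence[Sequence[float]]) -> List[FeatureNode]:
        """Add one layer for a new image and link it to the previous layer."""
        t = len(self.layers)
        new_nodes = [FeatureNode(t, i, list(f)) for i, f in enumerate(detections)]
        if self.layers:
            # Assumption: fully connect adjacent layers; a real system
            # would likely keep only plausible associations.
            for prev in self.layers[-1]:
                for node in new_nodes:
                    self.edges.append(EdgeNode(prev, node))
        self.layers.append(new_nodes)
        return new_nodes


graph = MessagePassingGraph()
graph.add_layer([[0.0, 0.0], [5.0, 5.0]])               # image at time 0
graph.add_layer([[0.1, 0.0], [5.0, 5.2], [9.0, 9.0]])   # image at time 1
print(len(graph.layers), len(graph.edges))
```

Each `EdgeNode` here joins one feature node in a layer to one feature node in the next layer, which mirrors the edge-node arrangement recited in claim 2.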
Opening claim text (preview).
What is claimed is:
1. A method of multi-object tracking, the method comprising: receiving, by processing hardware, a sequence of images generated at respective times by one or more sensors configured to sense an environment through which objects are moving relative to the one or more sensors; constructing, by the processing hardware, a message passing graph having a multiplicity of layers associated with the sequence of images, the constructing including: generating a plurality of feature nodes to represent features detected in at least a portion of the sequence of images, and generating edges that interconnect at least some of the feature nodes across adjacent layers of the message passing graph to represent associations between the features; training a neural network supported by the message passing graph, the training including: performing a pass through the message passing graph in a forward direction including by adding a new feature node based on a feature detection and a new edge node, and performing a pass through the message passing graph in a backward direction, including by updating at least one edge node of the message passing graph; and tracking, by the processing hardware, multiple features through the sequence of images, including passing messages through the message passing graph.
2. The method of claim 1, wherein constructing the message passing graph further includes: generating corresponding edge nodes associated with the respective generated edges, each node of the corresponding edge nodes connected to exactly one corresponding feature node in a first layer and exactly one corresponding feature node in a second layer, the first layer immediately preceding the second layer.
3. The method of claim 2, further comprising passing at least one of the messages from at least one of the feature nodes to at least one of the edge nodes.
4. The method of claim 2, wherein the tracking includes: identifying a plurality of tracks, wherein at least one of the tracks is a sequence of connections between at least a portion of the edge nodes and at least a portion of the feature nodes representing associated features of a same object, through at least a portion of the multiple layers of the message passing graph.
5. The method of claim 2, wherein constructing the message passing graph further includes: generating, for the new feature node, at least one respective memory unit configured to output a probability that the feature detection is correct; and generating, for the new edge node, at least one respective memory unit configured to output a probability that a connection between two corresponding feature nodes connected to the new edge node is correct.
6. The method of claim 5, wherein the at least one respective memory unit is implemented as Long Short Term Memories (LSTMs).
7. The method of claim 5, wherein the at least one respective memory unit is implemented as Gated Recurrent Units (GRUs).
8. The method of claim 1, wherein the tracking includes: limiting the passing of the messages to only those layers of the message passing graph that are currently within a rolling window of a finite size.
9. The method of claim 8, wherein the tracking further includes: advancing the rolling window in the forward direction in response to generating a new layer of the message passing graph, based on a new image.
10. The method of claim 9, wherein the tracking further includes: in response to advancing the rolling window past a layer: (i) fixing parameters of the layer, and (ii) excluding any further change to the layer.
11. The method of claim 8, wherein the size of the rolling window is between 3 and 10, measured in a number of layers.
12. The method of claim 1, further comprising: calculating a total cross-entropy loss.
13. The method of claim 1, wherein performing the pass in the forward direction includes pruning at least one low-probability feature node of the plurality of feature nodes.
14. The method of claim 1, further comprising: generating an inference using the neural network supported by the message passing graph, the generating including: performing a pass through the message passing graph in the forward direction to generate probabilities, and producing one or more tracks through the message passing graph using the generated probabilities.
15. The method of claim 1, wherein the constructing further includes, for each feature node: generating a feature vector using an objector detector; initializing a hidden state of at least one feature node of the plurality of feature nodes using the feature vector; and performing an end-to-end training of the neural network supported by the message passing graph to jointly optimize object detection and object tracking.
16. The method of claim 1, wherein receiving the sequence of images includes receiving sensor data from at least one of a LIDAR sensor or a camera.
17. The method of claim 16, wherein at least a portion of the detected features represents a cluster of pixels.
18. A system, comprising: a processing hardware configured to: receive a sequence of images generated at respective times by one or more sensors configured to sense an environment through which objects are moving relative to the one or more sensors; constructing a message passing graph having a multiplicity of layers associated with the sequence of images, the constructing including: generate a plurality of feature nodes to represent features detected in at least a portion of the sequence of images, and generate edges that interconnect at least some of the feature nodes across adjacent layers of the message passing graph to represent associations between the features; train a neural network supported by the message passing graph including by being configured to perform a pass through the message passing graph in a forward direction including by adding a new feature node based on a feature detection and a new edge node, and perform a pass through the message passing graph in a backward direction, including by updating at least one edge node of the message passing graph; and track multiple features through the sequence of images, including passing messages through the message passing graph.
19. The system of claim 18, wherein the one or more sensors includes at least one of: a LIDAR sensor or a camera.
20. A computer program product, the computer program product being embodied in a non-transitory computer-readable medium and comprising computer instructions for: receiving, by processing hardware, a sequence of images generated at respective times by one or more sensors configured to sense an environment through which objects are moving relative to the one or more sensors; constructing, by the processing hardware, a message passing graph having a multiplicity of layers associated with the sequence of images, the constructing including: generating a plurality of feature nodes to represent features detected in at least a portion of the sequence of images, and generating edges that interconnect at least some of the feature nodes across adjacent layers of the message passing graph to represent associations between the features; training a neural network supported by the message passing graph, the training including: performing a pass through the message passing graph in a forward direction including by adding a new feature node based on a feature detection and a new edge node, and performing a pass through the message passing graph in a backwar
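The rolling-window behavior recited in claims 8 to 10 can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the patented implementation: the `RollingWindow` class, its `push` and `can_update` methods, and the use of integer layer identifiers are hypothetical, and the window size of 3 is simply one value inside the range given in claim 11.

```python
from collections import deque
from typing import Deque, List


class RollingWindow:
    """Tracks which graph layers may still exchange messages (hypothetical helper)."""

    def __init__(self, size: int = 3) -> None:
        if size < 1:
            raise ValueError("window size must be at least 1")
        self.size = size
        self.active: Deque[int] = deque()  # layers still open to updates
        self.frozen: List[int] = []        # layers the window has moved past

    def push(self, layer_id: int) -> None:
        """Advance the window when a new layer is generated from a new image."""
        self.active.append(layer_id)
        while len(self.active) > self.size:
            # A layer that leaves the window is fixed and no longer changed.
            self.frozen.append(self.active.popleft())

    def can_update(self, layer_id: int) -> bool:
        """Message passing is limited to layers currently inside the window."""
        return layer_id in self.active


window = RollingWindow(size=3)
for t in range(5):   # five images arrive, one new layer each
    window.push(t)

print(list(window.active), window.frozen)
```

Bounding the window keeps the cost of each message-passing step independent of how long the sequence has been running, at the price of never revisiting decisions about layers that have already been frozen.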
characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU] · CPC title
Supervised learning · CPC title
modifying the architecture, e.g. adding, deleting or silencing nodes or connections · CPC title
using neural networks · CPC title
Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads · CPC title