Object Association for Autonomous Vehicles
US-2019333232-A1 · Oct 31, 2019 · US
US12012127B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12012127-B2 |
| Application number | US-202016779576-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jan 31, 2020 |
| Priority date | Oct 26, 2019 |
| Publication date | Jun 18, 2024 |
| Grant date | Jun 18, 2024 |
Official abstract text for this publication.
Tracking a current and/or previous position, velocity, acceleration, and/or heading of an object using sensor data may comprise determining whether to associate a current object detection generated from recently received (e.g., current) sensor data with a previous object detection generated from formerly received sensor data. In other words, a track may identify that an object detected in former sensor data is the same object detected in current sensor data. However, multiple types of sensor data may be used to detect objects and some objects may not be detected by different sensor types or may be detected differently, which may confound attempts to track an object. An ML model may be trained to receive outputs associated with different sensor types and/or a track associated with an object, and determine a data structure comprising a region of interest, object classification, and/or a pose associated with the object.
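As a rough illustration of how outputs from two sensor types and a previous track might be handed to a single model, the sketch below rasterizes each input into its own channel of a small top-down grid, in the spirit of the multi-channel data structure recited in claim 5. The names `Box`, `paint`, and `encode_channels`, the 8x8 grid size, and the one-channel-per-input layout are all assumptions made for illustration; the patent does not prescribe them.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical names and layout, chosen for illustration only.
Grid = List[List[float]]


@dataclass(frozen=True)
class Box:
    """Axis-aligned region of interest in grid cells: [x0, x1) by [y0, y1)."""
    x0: int
    y0: int
    x1: int
    y1: int


def paint(grid: Grid, box: Box, value: float = 1.0) -> None:
    """Mark every cell covered by `box`, clipped to the grid bounds."""
    size = len(grid)
    for y in range(max(box.y0, 0), min(box.y1, size)):
        for x in range(max(box.x0, 0), min(box.x1, size)):
            grid[y][x] = value


def encode_channels(first: Box, second: Box, prev_track: Box,
                    size: int = 8) -> List[Grid]:
    """Build one occupancy channel per input, in a fixed order:
    first sensor type, second sensor type, previous track."""
    channels: List[Grid] = []
    for box in (first, second, prev_track):
        grid = [[0.0] * size for _ in range(size)]
        paint(grid, box)
        channels.append(grid)
    return channels


if __name__ == "__main__":
    chans = encode_channels(Box(1, 1, 3, 3), Box(2, 2, 4, 4), Box(0, 0, 2, 2))
    print(len(chans), "channels of", len(chans[0]), "x", len(chans[0][0]))
```

A real system would likely encode more attributes per channel (classification, velocity, occlusion) and use an array library; the fixed channel order here only shows how heterogeneous inputs can share one spatial frame.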
Opening claim text (preview).
What is claimed is:
1. A method comprising: receiving a first object detection associated with a first sensor type and a second object detection associated with a second sensor type, the first object detection and the second object detection identifying an object in an environment surrounding an autonomous vehicle; determining, based at least in part on previous sensor data, a previous track associated with the object, the previous track identifying at least one of an estimated previous position of the object, a previous region of interest, or a previous velocity of the object; inputting the first object detection, the second object detection, and at least part of the previous track into a machine learning (ML) model; receiving, from the ML model, a data structure comprising a region of interest, object classification, and a pose associated with the object, the pose indicating at least one of a position or a yaw associated with the object; determining, based at least in part on the data structure, a new track associated with the object wherein the new track indicates that the object detected in the previous sensor data is a same object detected in current sensor data; updating, based at least in part on the data structure, one or more previous tracks by retiring the one or more previous tracks, wherein retiring the one or more previous tracks comprises indicating that the object associated with the one or more previous tracks has been occluded for a threshold amount of time; and controlling the autonomous vehicle based at least in part on the new track.
2. The method of claim 1, wherein the data structure additionally comprises at least one of an indication that the object is stationary or dynamic, a top-down segmentation of the environment, a yaw rate, a velocity associated with the object, or an acceleration associated with the object.
3. The method of claim 1, wherein determining the new track comprises: determining a degree of alignment of the region of interest to the previous region of interest; and determining that the degree of alignment meets or exceeds a threshold degree of alignment.
4. The method of claim 1, further comprising: receiving a first prior object detection associated with a first time previous to a second time at which the first object detection was generated; receiving a second prior object detection associated with a third time previous to a fourth time at which the second object detection was generated; and inputting the first prior object detection and the second prior object detection to the ML model in addition to the first object detection, the second object detection, and the previous track.
5. The method of claim 1, wherein inputting the first object detection, the second object detection, and at least part of the previous track comprises: generating a multi-channel data structure based at least in part on the first object detection, the second object detection, and at least part of the previous track, wherein generating the multi-channel data structure comprises encoding attributes associated with the environment into channels of the multi-channel data structure based at least in part on the first object detection, the second object detection, and at least part of the previous track; and inputting the multi-channel data structure to the ML model.
6. The method of claim 1, wherein: the first object detection is based at least in part on sensor data that has a first perspective of the environment; the data structure indicates a top-down perspective of the environment; and the top-down perspective is different than the first perspective.
7. The method of claim 1, further comprising reducing, based at least in part on comparing object detections from each of multiple different sensor modalities to the previous track, jitter associated with the new track.
8. The method of claim 1, wherein updating the one or more previous tracks further comprises at least one of associating one or more of the first object detection or the second object detection with the one or more previous tracks or indicating that the one or more previous tracks associated with an object is partially or fully occluded.
9. A system comprising: one or more processors; and a memory storing processor-executable instructions that, when executed by the one or more processors, cause the one or more processors to perform operations comprising: receiving first sensor data and second sensor data; inputting the first sensor data to a first perception pipeline and inputting the second sensor data to a second perception pipeline; receiving a first output from the first perception pipeline based at least in part on the first sensor data and a second output from the second perception pipeline, the first output and the second output identifying an object in an environment; receiving a previous track associated with the object in the environment, the previous track identifying at least one of an estimated previous position of the object, a previous region of interest, or a previous velocity of the object; inputting the first output, the second output, and at least part of the previous track into a machine-learning (ML) model; receiving, from the ML model, a data structure comprising a region of interest, object classification, and a pose associated with the object, the pose indicating at least one of a position or a yaw associated with the object; determining an updated track associated with the object based at least in part on the data structure, a current position, and at least one of the region of interest or the yaw associated with the object; and updating, based at least in part on the data structure, one or more previous tracks by retiring the one or more previous tracks, wherein retiring the one or more previous tracks comprises indicating that the object associated with the one or more previous tracks has been occluded for a threshold amount of time.
10. The system of claim 9, wherein the data structure additionally comprises at least one of an indication that the object is stationary or dynamic, a top-down segmentation of the environment, a yaw rate, a velocity associated with the object, or an acceleration associated with the object.
11. The system of claim 9, wherein: a third output indicates that a second portion of the environment associated with the first output and the second output is unoccupied; and the third output is provided as input to the ML model in addition to the first output and the second output.
12. The system of claim 9, wherein determining the updated track comprises: determining a degree of alignment of the region of interest to the previous region of interest; and determining that the degree of alignment meets or exceeds a threshold degree of alignment.
13. The system of claim 9, wherein at least one of the first output or the second output comprises at least one of: a first representation of the environment from a top-down perspective; an indication that a second portion of the environment is occupied; a second representation of an occluded portion of the environment; a second region of interest associated with the object; a classification associated with the object; a sensor data segmentation; a three-dimensional discretized representation of sensor data; a yaw associated with the object; a yaw rate associated with the object; a ground height estimation; a set of extents associated with the object; a velocity associated with the object; or an acceleration associated with the object.
14. The system of claim 9, wherein the operations further compr
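The track-update logic recited in claims 1, 3, 9, and 12 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: it assumes axis-aligned regions of interest, uses intersection-over-union as the "degree of alignment" (the claims do not name a metric), and treats any frame without an aligned region of interest as occluded time. The names `Track`, `iou`, and `update_track`, and the default thresholds of 0.5 and 2.0 seconds, are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical representation: (x_min, y_min, x_max, y_max).
Roi = Tuple[float, float, float, float]


def iou(a: Roi, b: Roi) -> float:
    """Intersection-over-union of two axis-aligned boxes (assumed metric)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


@dataclass
class Track:
    roi: Roi
    occluded_for: float = 0.0  # seconds since the last aligned detection
    retired: bool = False


def update_track(track: Track, new_roi: Optional[Roi], dt: float,
                 min_alignment: float = 0.5,
                 max_occlusion: float = 2.0) -> Track:
    """Associate `new_roi` with `track` if alignment meets the threshold;
    otherwise accumulate occluded time and retire once it reaches the limit."""
    if track.retired:
        return track
    if new_roi is not None and iou(track.roi, new_roi) >= min_alignment:
        track.roi = new_roi          # same object: carry the track forward
        track.occluded_for = 0.0
    else:
        track.occluded_for += dt     # simplification: unmatched == occluded
        if track.occluded_for >= max_occlusion:
            track.retired = True
    return track


if __name__ == "__main__":
    t = Track(roi=(0.0, 0.0, 2.0, 2.0))
    update_track(t, (0.5, 0.0, 2.5, 2.0), dt=0.1)  # IoU 0.6: associated
    update_track(t, None, dt=1.0)
    update_track(t, None, dt=1.0)                  # 2.0 s occluded: retired
    print(t)
```

In the claims, the region of interest comes from the ML model's output data structure rather than directly from a detector, and occlusion is indicated explicitly; the sketch collapses both into a single optional box per frame to keep the thresholding logic visible.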
using signals provided by artificial sources external to the vehicle, e.g. navigation beacons · CPC title
Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level (multimodal speaker identification or verification G10L17/10) · CPC title
Classification techniques · CPC title
Obtaining sets of training patterns; Bootstrap methods, e.g. bagging or boosting · CPC title
Combination of methods, e.g. classifiers, working on different input data, e.g. sensor fusion · CPC title