Training bounding box selection

US10936902B1 · US · B1

Patent metadata
Field: Value
Publication number: US-10936902-B1
Application number: US-201816201889-A
Country: US
Kind code: B1
Filing date: Nov 27, 2018
Priority date: Nov 27, 2018
Publication date: Mar 2, 2021
Grant date: Mar 2, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Techniques to train a model with machine learning and use the trained model to select a bounding box that represents an object are described. For example, a system may implement various techniques to generate multiple bounding boxes for an object in an environment. Each bounding box may be slightly different based on the technique and data used. To select a bounding box that most closely represents an object (or is best used for tracking the object), a model may be trained. The model may be trained by processing sensor data that has been annotated with bounding boxes that represent ground truth bounding boxes. The model may be implemented to select a most appropriate bounding box for a situation (e.g., a given velocity, acceleration, distance, location, etc.). The selected bounding box may be used to track an object, generate a trajectory, or otherwise control a vehicle.
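The training step described in the abstract can be illustrated with a minimal sketch. It assumes axis-aligned 2D boxes and uses intersection-over-union (IoU) as the measure of how closely a candidate box matches the ground truth; the abstract does not name a specific metric, and the `Box` type, the function names, and the technique names in the usage note are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned 2D box (an assumption; real systems may use oriented 3D boxes)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def area(self) -> float:
        # Clamp each side at zero so an inverted (non-overlapping) box has zero area.
        return max(0.0, self.x_max - self.x_min) * max(0.0, self.y_max - self.y_min)


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two boxes, in [0, 1]."""
    inter = Box(max(a.x_min, b.x_min), max(a.y_min, b.y_min),
                min(a.x_max, b.x_max), min(a.y_max, b.y_max)).area()
    union = a.area() + b.area() - inter
    return inter / union if union > 0 else 0.0


def best_box_type(candidates: dict[str, Box], ground_truth: Box) -> str:
    """Name of the candidate whose box overlaps the ground truth the most."""
    return max(candidates, key=lambda name: iou(candidates[name], ground_truth))


def make_training_example(characteristics: dict[str, float],
                          candidates: dict[str, Box],
                          ground_truth: Box) -> tuple[dict[str, float], str]:
    """Pair the situation's characteristics with the best-matching box type.

    A classifier trained on such pairs could later pick a box type from
    characteristics alone, when no ground truth is available.
    """
    return characteristics, best_box_type(candidates, ground_truth)
```

For example, with a ground truth of `Box(0, 0, 4, 2)` and candidates `{"top_down": Box(0, 0, 4, 2.5), "convex_hull": Box(1, 0, 4, 2)}`, the IoU values are 0.8 and 0.75, so the label for that sample is `"top_down"`.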

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: receiving sensor data from one or more sensors associated with an autonomous vehicle; receiving annotated data indicating a ground truth bounding box for an object represented in the sensor data; determining, by a perception system, a plurality of bounding boxes associated with the object; determining, by the perception system, a first object track based at least in part on the sensor data; associating the first object track with the plurality of bounding boxes; determining, by the perception system, a second object track based at least in part on the sensor data; determining a first score for an association between the first object track and the ground truth bounding box, the first score indicating how closely the ground truth bounding box matches the first object track; determining a second score for an association between the second object track and the ground truth bounding box, the second score indicating how closely the ground truth bounding box matches the second object track; selecting the first object track based at least in part on the first score and the second score; determining one or more characteristics associated with the sensor data; and providing the one or more characteristics and the plurality of bounding boxes associated with the first object track to a machine learned model to train the machine learned model to output an output bounding box.

2. The method of claim 1, wherein the one or more characteristics comprise at least one of: a velocity of the object when the sensor data was captured; a velocity of the autonomous vehicle when the sensor data was captured; a distance from the autonomous vehicle to the object when the sensor data was captured; a number of frames associated with the first object track; a geolocation; a confidence associated with a technique used to determine at least one of the plurality of bounding boxes; a proximity of the autonomous vehicle or the object to a road feature when the sensor data was captured; or a ratio of empty space to occupied space within at least one of the plurality of bounding boxes.

3. The method of claim 1, further comprising: training, based at least in part on the one or more characteristics and the plurality of bounding boxes, the machine learned model to select, as the output bounding box, a type of bounding box from among a plurality of types of bounding boxes associated with the plurality of bounding boxes, respectively, the type of bounding box indicating a technique used to determine the respective bounding box.

4. The method of claim 1, further comprising: training, based at least in part on the one or more characteristics and the plurality of bounding boxes, the machine learned model to determine the output bounding box.

5. The method of claim 1, wherein the sensor data comprises first sensor data that is associated with a first frame, and the method further comprises: receiving second sensor data from the one or more sensors, the second sensor data being associated with a second frame; receiving additional annotated data indicating an additional ground truth bounding box for the object; determining a third score for an association between the first object track and the additional ground truth bounding box; and aggregating the first score and the third score to generate an aggregated score for the first object track, wherein the selecting the first object track is based at least in part on the aggregated score for the first object track.

6. A system comprising: one or more processors; and memory storing executable instructions that, when executed by the one or more processors, cause the one or more processors to perform operations comprising: determining an object track based at least in part on sensor data received from one or more sensors, the object track comprising one or more of historical positions, historical velocities, historical orientations, or historical accelerations of an object; determining a similarity score between a ground truth bounding box and the object track, the object track associated with a plurality of bounding boxes; selecting, as a selected object track and based at least in part on the similarity score, the object track; determining one or more characteristics associated with the sensor data; and training, based at least in part on the one or more characteristics and the plurality of bounding boxes, a machine learned model to output an output bounding box.

7. The system of claim 6, wherein the one or more characteristics comprise at least one of: a velocity of the object when the sensor data was captured, a velocity of the system when the sensor data was captured, a geolocation of the system, or a distance to the object when the sensor data was captured.

8. The system of claim 6, wherein the one or more characteristics comprise at least one of: a number of frames associated with the object track, a confidence associated with a technique used to determine at least one of the plurality of bounding boxes, a proximity of the system or the object to a road feature when the sensor data was captured, or an amount of empty space within at least one of the plurality of bounding boxes.

9. The system of claim 6, wherein the operations further comprise: determining, with a first technique, a first bounding box of the plurality of bounding boxes; and determining, with a second technique, a second bounding box of the plurality of bounding boxes.

10. The system of claim 6, wherein the sensor data comprises first sensor data that is associated with a first frame, and the operations further comprise: receiving second sensor data from the one or more sensors, the second sensor data being associated with a second frame; receiving data indicating the ground truth bounding box for the object in the second frame; determining an additional similarity score between the object track and the ground truth bounding box; and aggregating the similarity score and the additional similarity score to generate an aggregated score for the object track, wherein the selecting the object track is based at least in part on the aggregated score for the object track.

11. The system of claim 10, wherein the operations further comprise: mapping, for the second frame, and based at least in part on the aggregated score, the ground truth bounding box with the object track; receiving third sensor data from the one or more sensors, the third sensor data being associated with a third frame; receiving data indicating the ground truth bounding box for the object in the third frame and indicating another ground truth bounding box for another object in the third frame; and mapping, for the third frame, the other ground truth bounding box with the other object track while refraining from considering the ground truth bounding box for the object in the third frame and the object track.

12. The system of claim 6, wherein the training the machine learned model comprises training, based at least in part on the ground truth bounding box, the one or more characteristics, and the plurality of bounding boxes, the machine learned model to output, as the output bounding box, a type of bounding box from among a plurality of types of bounding boxes associated with the plurality of bounding boxes, respectively.

13. The system of claim 6, wherein the training the machine learned model comprises training, based at least in part on the ground truth bounding box, the one or more characteristics, and the plurality of bounding boxes, the machine learned model to determine the output bounding box.
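The track-scoring and aggregation steps in claims 1, 5, and 10 can be sketched as follows. The claims do not define the score or the aggregation, so this sketch makes two assumptions: the per-frame score is derived from the distance between the track's position and the center of the ground truth box, and the aggregated score is the mean over frames. All names and data layouts are hypothetical.

```python
import math

Point = tuple[float, float]


def similarity(track_xy: Point, gt_center_xy: Point) -> float:
    """Score in (0, 1]; 1.0 means the track position equals the box center."""
    return 1.0 / (1.0 + math.dist(track_xy, gt_center_xy))


def select_track(tracks: dict[str, dict[int, Point]],
                 gt_centers: dict[int, Point]) -> tuple[str, dict[str, float]]:
    """Pick the track that best matches the ground truth across frames.

    tracks:     {track_id: {frame_index: (x, y)}}
    gt_centers: {frame_index: (x, y)} center of the annotated box per frame
    Returns the best track id and the aggregated score of every track.
    """
    if not tracks:
        raise ValueError("at least one track is required")
    aggregated: dict[str, float] = {}
    for track_id, positions in tracks.items():
        # Score only the frames where both a track position and an annotation exist.
        scores = [similarity(positions[frame], center)
                  for frame, center in gt_centers.items() if frame in positions]
        # Aggregate by mean (an assumption); a track with no shared frames scores 0.
        aggregated[track_id] = sum(scores) / len(scores) if scores else 0.0
    best = max(aggregated, key=aggregated.get)
    return best, aggregated
```

With ground truth centers `{0: (0.0, 0.0), 1: (1.0, 0.0)}`, a track that follows those centers exactly scores 1.0, while a track at `(3.0, 4.0)` and `(4.0, 4.0)` is 5 units away in both frames and scores 1/6, so the first track is selected. Its associated bounding boxes would then be the ones passed to training.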

Assignees

Zoox Inc

Inventors

Classifications

  • G06T17/00 (Primary)

    Three-dimensional [3D] modelling for computer graphics · CPC title

  • Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads · CPC title

  • relating to a temporal dimension, e.g. time-based feature extraction; Pattern tracking · CPC title

  • Detecting or recognising potential candidate objects based on visual cues, e.g. shapes · CPC title

  • using neural networks · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10936902B1 cover?
Techniques to train a model with machine learning and use the trained model to select a bounding box that represents an object are described. For example, a system may implement various techniques to generate multiple bounding boxes for an object in an environment. Each bounding box may be slightly different based on the technique and data used. To select a bounding box that most closely repres…
Who is the assignee on this patent?
Zoox Inc
What technology area does this patent fall under?
Primary CPC classification G06T17/00. Mapped technology areas include Physics.
When was this patent published?
Publication date: Mar 2, 2021 (kind code B1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).