Object classification using extra-regional context
US-2020202145-A1 · Jun 25, 2020 · US
US10963709B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10963709-B2 |
| Application number | US-201916238475-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jan 2, 2019 |
| Priority date | Jan 2, 2019 |
| Publication date | Mar 30, 2021 |
| Grant date | Mar 30, 2021 |
Official abstract text for this publication.
The techniques discussed herein may comprise refining a classification of an object detected as being represented in sensor data. For example, refining the classification may comprise determining a sub-classification of the object.
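The refinement step described in the abstract can be illustrated with a minimal sketch. It assumes a first stage has already produced a classification, a probability, a region of interest, and a feature map, and it shows how the part of the feature map inside that region might be passed to a sub-class head selected by the classification. Every name here (`Detection`, `crop`, `toy_pedestrian_head`, `SUB_CLASS_HEADS`, `refine`), the "child"/"adult" labels, and the toy decision rule are hypothetical stand-ins, not the patent's implementation or a trained network.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

FeatureMap = List[List[float]]       # toy 2-D feature map (rows x columns)
Roi = Tuple[int, int, int, int]      # (row0, col0, row1, col1), end-exclusive


@dataclass
class Detection:
    """Assumed shape of the first-stage output for one object."""
    classification: str
    probability: float
    roi: Roi


def crop(feature_map: FeatureMap, roi: Roi) -> FeatureMap:
    """Return the portion of the feature map inside the region of interest."""
    r0, c0, r1, c1 = roi
    return [row[c0:c1] for row in feature_map[r0:r1]]


def toy_pedestrian_head(features: FeatureMap) -> Tuple[str, float]:
    """Hypothetical sub-class head; a stand-in for a trained second network."""
    values = [v for row in features for v in row]
    if not values:
        return ("unknown", 0.0)
    mean = sum(values) / len(values)
    # Arbitrary rule and probabilities, for illustration only.
    return ("child", 0.8) if mean < 0.5 else ("adult", 0.9)


# Assumed mapping from a coarse classification to its sub-class head.
SUB_CLASS_HEADS: Dict[str, Callable[[FeatureMap], Tuple[str, float]]] = {
    "pedestrian": toy_pedestrian_head,
}


def refine(detection: Detection, feature_map: FeatureMap) -> Optional[Tuple[str, float]]:
    """Return (sub_classification, probability), or None if no head is registered."""
    head = SUB_CLASS_HEADS.get(detection.classification)
    if head is None:
        return None
    return head(crop(feature_map, detection.roi))


if __name__ == "__main__":
    fmap = [[0.1, 0.2, 0.9], [0.3, 0.4, 0.9], [0.9, 0.9, 0.9]]
    print(refine(Detection("pedestrian", 0.95, (0, 0, 2, 2)), fmap))
    print(refine(Detection("vehicle", 0.90, (0, 0, 2, 2)), fmap))
```

Keeping the heads in a dictionary keyed by classification is one simple way to send each detection to a head specific to its class; a real system would use learned models in place of the toy rule.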
Opening claim text (preview).
What is claimed is:
1. A method comprising: receiving, from a sensor of an autonomous vehicle, an image; providing, as input to a first neural network, the image; receiving, from the first neural network, a feature map, a region of interest, a classification, and a first probability associated with an object represented in the image; providing, as input to a second neural network, at least a portion of the feature map that corresponds to the region of interest; receiving, from the second neural network, a sub-classification of the classification and a second probability associated with the subclassification; and controlling operation of the autonomous vehicle based at least in part on at least one of the classification or the sub-classification.
2. The method of claim 1, further comprising: outputting the classification associated with the region of interest based at least in part on determining that the first probability meets or exceeds a first probability threshold; and outputting at least one of the classification or the sub-classification associated with the region of interest based at least in part on determining that the second probability meets or exceeds a second probability threshold.
3. The method of claim 1, the method further comprising: providing, as additional input to the second portion of the neural network, at least an additional feature map received from the first neural network, wherein the first neural network is trained to output at least one of an instance segmentation or a semantic segmentation.
4. The method of claim 1, further comprising: transmitting, to the second neural network, the portion of the feature map based at least in part on receiving the classification and the classification being associated with the second neural network; and transmitting, to a third neural network, a second portion of a second feature map, based at least in part on receiving a second classification and the second classification being associated with the third neural network.
5. The method of claim 4, the candidate classifications comprise at least two of: a pedestrian classification; a vehicle classification; a cyclist classification; a signage classification; an animal classification; and a traffic obstruction classification.
6. A system comprising: one or more processors; memory storing processor-executable instructions that, when executed by the one or more processors, cause the system to perform operations comprising: receiving sensor data; providing, as input to a first machine-learning (ML) model, the sensor data; receiving, from the first ML model, a classification associated with a representation of an object in the sensor data, a first probability associated with the classification, a feature map, and a region of interest of the sensor data associated with the representation of the object; and receiving, from a sub-class ML model, a sub-classification of the classification and a second probability associated with the sub-classification.
7. The system of claim 6, the operations further comprising: inputting, into the sub-class ML model, at least a first portion of a first feature map received from a first portion of the first ML model and at least a second portion of a second feature map received from a second portion of the first ML model.
8. The system of claim 7, wherein the first portion and the second portion are based at least in part on the region of interest.
9. The system of claim 7, wherein the second feature map comprises at least one of a semantic segmentation feature map, an instance segmentation feature map, a dense depth feature map, or an object orientation feature map.
10. The system of claim 6, wherein the operations further comprise: outputting the classification associated with the object based at least in part on determining that the first probability meets or exceeds a first probability threshold; outputting the sub-classification in association with the object based at least in part on determining that the second probability meets or exceeds a second probability threshold; and controlling an autonomous vehicle based at least in part on at least one of the classification or the sub-classification.
11. The system of claim 6, wherein the operations further comprise: providing to at least one of the first ML model or the sub-class ML model a ground truth sensor data, the ground truth sensor data associated with a ground truth classification label and a ground truth sub-classification label; determining a first loss based at least in part on a difference between a first output of the first ML model and a ground truth classification label; determining a second loss based at least in part on a difference between a second output of the sub-class ML model and the ground truth sub-classification label; and altering at least one of one or more first parameters of the first ML model or one or more second parameters of the sub-class ML model to minimize at least one of the first loss or the second loss.
12. The system of claim 6, wherein: the second portion of the ML model is associated with a first classification, a third portion of the ML model is associated with a second classification, and the first classification and the second classification are candidate classifications associated with the first portion of the ML model.
13. The system of claim 12, wherein the first classification and the second classification are two of a plurality of classifications, wherein the plurality of classifications comprises at least two of: a pedestrian classification; a vehicle classification; a cyclist classification; a signage classification; an animal classification; and a traffic obstruction classification.
14. The system of claim 6, wherein: the first ML model comprises a first neural network comprising a plurality of first layers; and the sub-class ML model comprises a second neural network comprising a plurality of second layers.
15. The system of claim 6, wherein: the first ML model comprises a first portion of a neural network comprising a plurality of first layers; and the sub-class ML model comprises a second portion of the neural network comprising a plurality of second layers.
16. A non-transitory computer-readable medium storing processor-executable instructions that, when executed by one or more processors, cause the one or more processors to perform operations comprising: receiving sensor data; providing, as input to a first machine-learning (ML) model, the sensor data; receiving, from the first ML model, a first output comprising a classification associated with a representation of an object in the sensor data, and a first probability associated with the classification; and receiving, from a sub-class ML model, a sub-classification of the classification and a second probability associated with the sub-classification.
17. The non-transitory computer-readable medium of claim 16, wherein the operations further comprise: receiving, from a first portion of the first ML model, a first feature map; receiving, from a second portion of the first ML model, a second feature map; inputting, into the sub-class ML model, at least a portion of the first feature map and at least a portion of the second feature map, wherein at least one of the first feature map or the second feature maps are associated with a region of interest associated with the representation of the object in the sensor data.
18. The non-transitory computer-readable medium
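The probability-threshold conditions recited in claims 2 and 10 can be read as a simple gating step on what gets output. The sketch below is one possible interpretation, not the claimed implementation: the function name `select_outputs`, the default threshold values, and the choice to consider the sub-classification only after the classification itself passes its threshold are all assumptions made for illustration.

```python
from typing import List, Optional


def select_outputs(
    classification: str,
    first_probability: float,
    sub_classification: Optional[str],
    second_probability: float,
    first_threshold: float = 0.5,    # assumed value, not from the patent
    second_threshold: float = 0.5,   # assumed value, not from the patent
) -> List[str]:
    """Return the labels whose probabilities meet or exceed their thresholds."""
    outputs: List[str] = []
    # "Meets or exceeds" is implemented as >=.
    if first_probability >= first_threshold:
        outputs.append(classification)
        # Assumption: a sub-classification is only reported alongside an
        # accepted parent classification.
        if sub_classification is not None and second_probability >= second_threshold:
            outputs.append(sub_classification)
    return outputs


if __name__ == "__main__":
    print(select_outputs("pedestrian", 0.9, "child", 0.7, 0.5, 0.6))
    print(select_outputs("pedestrian", 0.9, "child", 0.4, 0.5, 0.6))
    print(select_outputs("pedestrian", 0.3, "child", 0.9))
```

Under this reading, a low-confidence sub-classification falls back to the coarse classification alone, so downstream control logic still receives a usable label.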
Detecting or recognising potential candidate objects based on visual cues, e.g. shapes · CPC title
using neural networks · CPC title
Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads · CPC title
Classification techniques · CPC title
Supervised learning · CPC title