Incremental learning framework for object detection in videos
US-9805264-B2 · Oct 31, 2017 · US
US10679099B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10679099-B2 |
| Application number | US-201815974069-A |
| Country | US |
| Kind code | B2 |
| Filing date | May 8, 2018 |
| Priority date | May 8, 2018 |
| Publication date | Jun 9, 2020 |
| Grant date | Jun 9, 2020 |
Official abstract text for this publication.
An autonomous vehicle vision system for estimating a category of a detected object in an object pose unknown to the system includes a neural network to apply a mapping process to a region of interest in an image including the detected object in the object pose to obtain a point in a 3D manifold space. The system includes an object detector to estimate the category of the detected object in the object pose in the region of interest based on a relationship between the point representing the detected object in the object pose and a plurality of separate object clusters in the 3D manifold space. The system further includes a planner to select an improved route based on a predicted behavior of the category of the detected object in the object pose. The system also includes a controller to control operation of an autonomous vehicle according to the improved route.
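The category estimate described in the abstract can be illustrated with a minimal sketch. It assumes each object cluster is summarized by a single 3D center point and that the "relationship" is Euclidean distance compared against the predetermined threshold recited in claim 1. The function name `estimate_category`, the centroid representation, and the sample labels are hypothetical and are not taken from the patent.

```python
import math
from typing import Dict, Optional, Tuple

Point3D = Tuple[float, float, float]


def estimate_category(
    point: Point3D,
    cluster_centers: Dict[str, Point3D],
    threshold: float,
) -> Optional[str]:
    """Return the label of the nearest cluster if it lies within `threshold`.

    Assumption: each cluster is represented by one center point, and
    distance is Euclidean. Returns None when no cluster is close enough,
    which a caller could report to the planner as an unidentified object.
    """
    best_label: Optional[str] = None
    best_distance = math.inf
    for label, center in cluster_centers.items():
        distance = math.dist(point, center)
        if distance < best_distance:
            best_label, best_distance = label, distance
    # "Within a predetermined threshold distance" is read here as <=.
    if best_distance <= threshold:
        return best_label
    return None


# Hypothetical cluster centers in the 3D manifold space.
centers = {"car": (0.0, 0.0, 0.0), "pedestrian": (10.0, 0.0, 0.0)}
near_car = estimate_category((0.5, 0.0, 0.0), centers, threshold=1.0)
unknown = estimate_category((5.0, 0.0, 0.0), centers, threshold=1.0)
```

Returning `None` rather than forcing the nearest label keeps the "no cluster within the threshold" case explicit, which matches the unidentified-object notification recited in claim 7.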
Opening claim text (preview).
What is claimed is:
1. A method for estimating a category of a detected object in an object pose unknown to an autonomous vehicle vision system, comprising: applying a mapping process to a region of interest in an image including the detected object in the object pose to obtain a point in a 3D manifold space; estimating the category of the detected object in the object pose in the region of interest based on a relationship between the point representing the detected object in the object pose and a plurality of separate 3D object clusters in the 3D manifold space by: analyzing a position of the point relative to the positions of the plurality of separate 3D object clusters in the 3D manifold space, and identifying the category of the detected object in the object pose as the category of one of the plurality of separate 3D object clusters when the position of the point is within a predetermined threshold distance of the position of the one of the plurality of separate 3D object clusters; supplying the category of the detected object in the object pose to a planner of the autonomous vehicle vision system; and controlling operation of an autonomous vehicle according to an improved route selected by the planner based on a predicted behavior of the detected object in the object pose.
2. The method of claim 1, in which applying the mapping process comprises: receiving the image having the region of interest including the detected object in the object pose from one or more sensors of the autonomous vehicle vision system; generating a feature vector to represent a 3D view space of the detected object with information to identify the detected object in the object pose; and mapping the feature vector representing the detected object in the object pose to the point in the 3D manifold space, in which the point is not on any of the plurality of separate 3D object clusters in the 3D manifold space.
3. The method of claim 1, further comprising: training the 3D manifold space to separately map objects of different categories into the plurality of separate 3D object clusters separated by at least a predetermined distance in the 3D manifold space; and aggregating each of the plurality of separate 3D object clusters according to poses of an object represented by the respective, separate 3D object clusters.
4. The method of claim 1, in which aggregating comprises improving a continual point of view of the object category and the object pose of each of the plurality of separate 3D object clusters in the 3D manifold space.
5. The method of claim 1, in which applying the mapping process comprises: training a convolutional neural network to map the region of interest in the image including the detected object in the object pose to obtain the point in the 3D manifold space.
6. The method of claim 1, in which estimating the category of the detected object comprises: determining a position of the point relative to the positions of the plurality of separate 3D object clusters in the 3D manifold space; computing a range between the position of the point and the positions of the plurality of separate 3D object clusters; detecting an object cluster within a predetermined threshold distance from the position of the point representing the detected object in the object pose; and assigning the category of the detected object cluster as the category of the detected object in the object pose.
7. The method of claim 6, further comprising: notifying the planner of an unidentified object when a distance between the position of the point and the positions of each of the plurality of separate 3D object clusters is greater than the predetermined threshold distance.
8. The method of claim 1, further comprising: detecting the object in the object pose in the region of interest in the image; and porting a 3D viewpoint of the detected object in the object pose to the point in the 3D manifold space.
9. A non-transitory computer-readable medium having program code recorded thereon for estimating a category of a detected object in an object pose unknown to an autonomous vehicle vision system, the program code being executed by a processor and comprising: program code to apply a mapping process to a region of interest in an image including the detected object in the object pose to obtain a point in a 3D manifold space; program code to estimate the category of the detected object in the object pose in the region of interest based on a relationship between the point representing the detected object in the object pose and a plurality of separate 3D object clusters in the 3D manifold space by: program code to analyze a position of the point relative to the positions of the plurality of separate 3D object clusters in the 3D manifold space, and program code to identify the category of the detected object in the object pose as the category of one of the plurality of separate 3D object clusters when the position of the point is within a predetermined threshold distance of the position of the one of the plurality of separate 3D object clusters; program code to supply the category of the detected object in the object pose to a planner of the autonomous vehicle vision system; and program code to control operation of an autonomous vehicle according to an improved route selected by the planner based on a predicted behavior of the detected object in the object pose.
10. The non-transitory computer-readable medium of claim 9, in which program code to apply the mapping process comprises: program code to receive the image having the region of interest including the detected object in the object pose from one or more sensors of the autonomous vehicle vision system; program code to generate a feature vector to represent a 3D view space of the detected object with information to identify the detected object in the object pose; and program code to map the feature vector representing the detected object in the object pose to the point in the 3D manifold space, in which the point is not on any of the plurality of separate 3D object clusters in the 3D manifold space.
11. The non-transitory computer-readable medium of claim 9, further comprising: program code to train the 3D manifold space to separately map objects of different categories into the plurality of separate 3D object clusters separated by at least a predetermined distance in the 3D manifold space; and program code to aggregate each of the plurality of separate 3D object clusters according to poses of an object represented by the respective, separate 3D object clusters.
12. The non-transitory computer-readable medium of claim 9, in which program code to aggregate further comprises: program code to improve a continual point of view of the object category and the object pose of each of the plurality of separate 3D object clusters in the 3D manifold space.
13. The non-transitory computer-readable medium of claim 9, in which program code to apply the mapping process further comprises: program code to train a convolutional neural network to map the region of interest in the image including the detected object in the object pose to obtain the point in the 3D manifold space.
14. The non-transitory computer-readable medium of claim 9, in which program code to estimate the category of the detected object further comprises: program code to determine a position of the point relative to the positions of the plurality of separate 3D object clusters in the 3D manifold space; program code to compute a range between the position of the point and the positions of the plurality of separate 3D object clusters;
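Claims 3 and 11 recite clusters that are separated by at least a predetermined distance and aggregated according to object poses. The sketch below shows one way to check that property after training. It assumes each cluster is a list of 3D points (one per observed pose), that aggregation means averaging those points into a centroid, and that separation is measured between centroids. The helper names `centroid` and `clusters_are_separated` and the sample data are hypothetical; the claims do not specify how the separation is measured or enforced.

```python
import math
from itertools import combinations
from typing import Dict, List, Tuple

Point3D = Tuple[float, float, float]


def centroid(points: List[Point3D]) -> Point3D:
    """Aggregate the pose points of one cluster into their mean position."""
    n = len(points)
    return (
        sum(p[0] for p in points) / n,
        sum(p[1] for p in points) / n,
        sum(p[2] for p in points) / n,
    )


def clusters_are_separated(
    clusters: Dict[str, List[Point3D]],
    min_distance: float,
) -> bool:
    """Check that every pair of cluster centroids is at least `min_distance` apart.

    Assumption: centroid-to-centroid Euclidean distance stands in for the
    "predetermined distance" between clusters.
    """
    centers = {label: centroid(points) for label, points in clusters.items()}
    return all(
        math.dist(centers[a], centers[b]) >= min_distance
        for a, b in combinations(centers, 2)
    )


# Hypothetical clusters: each list holds points for different poses of one category.
pose_clusters = {
    "car": [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)],
    "bicycle": [(10.0, 0.0, 0.0), (12.0, 0.0, 0.0)],
}
well_separated = clusters_are_separated(pose_clusters, min_distance=5.0)
```

A check like this only verifies the outcome of training. It says nothing about the training objective itself, which this excerpt does not describe.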
Physics · mapped topic
Learning methods · CPC title
Architecture, e.g. interconnection topology · CPC title