US11763574B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11763574-B2 |
| Application number | US-202117515277-A |
| Country | US |
| Kind code | B2 |
| Filing date | Oct 29, 2021 |
| Priority date | Feb 27, 2020 |
| Publication date | Sep 19, 2023 |
| Grant date | Sep 19, 2023 |
Official abstract text for this publication.
A vehicle includes one or more cameras that capture a plurality of two-dimensional images of a three-dimensional object. A light detector and/or a semantic classifier search within those images for lights of the three-dimensional object. A vehicle signal detection module fuses information from the light detector and/or the semantic classifier to produce a semantic meaning for the lights. The vehicle can be controlled based on the semantic meaning. Further, the vehicle can include a depth sensor and an object projector. The object projector can determine regions of interest within the two-dimensional images, based on the depth sensor. The light detector and/or the semantic classifier can use these regions of interest to efficiently perform the search for the lights.
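The object projector step described in the abstract can be illustrated with a minimal sketch. It assumes a pinhole camera described by a 3x4 projection matrix and a box face given as 3D corner points in the camera frame (x right, y down, z forward). The names `project_point` and `region_of_interest`, the matrix values, and the corner coordinates are all hypothetical and chosen for illustration; the patent does not specify them.

```python
import math
from typing import Sequence, Tuple

Point3D = Tuple[float, float, float]
Rect = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


def project_point(p: Point3D, P: Sequence[Sequence[float]]) -> Tuple[float, float]:
    """Apply a 3x4 projection matrix to a 3D point and divide by depth."""
    x, y, z = p
    u, v, w = (row[0] * x + row[1] * y + row[2] * z + row[3] for row in P)
    if w <= 0:
        raise ValueError("point is behind the camera")
    return u / w, v / w


def region_of_interest(corners: Sequence[Point3D],
                       P: Sequence[Sequence[float]],
                       width: int, height: int) -> Rect:
    """Bounding rectangle of the projected corners, clamped to the image."""
    pts = [project_point(c, P) for c in corners]
    us = [u for u, _ in pts]
    vs = [v for _, v in pts]
    left = max(0, math.floor(min(us)))
    top = max(0, math.floor(min(vs)))
    right = min(width, math.ceil(max(us)))
    bottom = min(height, math.ceil(max(vs)))
    return left, top, right, bottom


# Assumed intrinsics: focal length 1000 px, principal point (640, 360).
P = [
    [1000.0, 0.0, 640.0, 0.0],
    [0.0, 1000.0, 360.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
]

# Assumed rear face of a tracked box, 10 m ahead of the camera.
rear_face = [(-1.0, -0.5, 10.0), (1.0, -0.5, 10.0),
             (1.0, 1.0, 10.0), (-1.0, 1.0, 10.0)]

roi = region_of_interest(rear_face, P, width=1280, height=720)
print(roi)
```

Restricting the later light search to this rectangle is what lets the detector skip most of the image. The sketch does not handle a face that projects entirely outside the image; a real system would need to reject that case.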
Opening claim text (preview).
What is claimed is:

1. A method, comprising: receiving a two-dimensional image of a vehicle from a camera; receiving, from an object tracker, three-dimensional coordinates of a three-dimensional box corresponding to sides of the vehicle, wherein the object tracker performs object detection and tracking based on information from a depth sensor; projecting the three-dimensional coordinates of sides of the three-dimensional box individually onto the two-dimensional image; determining regions of interest in the two-dimensional image based on the projection; searching for one or more vehicle signal lights of the vehicle within the regions of interest; and associating one or more vehicle signal lights found within a particular region of interest to a particular side of the three-dimensional box; and performing geometric and semantic interpretations on the one or more vehicle signal lights associated with the sides of three-dimensional box.

2. The method of claim 1, wherein projecting the three-dimensional coordinates onto the two-dimensional image comprises applying a geometric matrix transformation to the three-dimensional coordinates.

3. The method of claim 1, further comprising: controlling an autonomous vehicle, based on the geometric and semantic interpretations, wherein the controlling is accelerating, braking, or steering the autonomous vehicle.

4. The method of claim 1, further comprising: determining, by the object tracker, a portion of the vehicle is occluded in the two-dimensional image; and in response to determining the portion of the vehicle is occluded in the two-dimensional image, applying a mask to the two-dimensional image to remove pixels not associated with regions where vehicle signal lights would be present.

5. The method of claim 1, wherein projecting the three-dimensional coordinates comprises: projecting the three-dimensional coordinates corresponding to a front face of the vehicle onto the two-dimensional image.

6. The method of claim 1, wherein projecting the three-dimensional coordinates comprises: projecting the three-dimensional coordinates corresponding to a rear face of the vehicle onto the two-dimensional image.

7. The method of claim 1, further comprising: cropping the two-dimensional image to the regions of interest to produce cropped images, wherein the searching is performed on the cropped images.

8. One or more non-transitory, computer-readable media encoded with instructions that, when executed by one or more processing units, perform a method comprising: receiving a two-dimensional image of a vehicle from a camera; receiving, from an object tracker encoded in the instructions to perform object detection and tracking based on information from a depth sensor, three-dimensional coordinates of a three-dimensional box corresponding to sides of the vehicle; projecting three-dimensional coordinates of a first side of the three-dimensional box onto the two-dimensional image; determining a first region of interest in the two-dimensional image based on the projection; and searching for one or more vehicle signal lights of the vehicle within the first region of interest; associating the one or more vehicle signal lights in the first region of interest to the first side; and performing geometric and semantic interpretations on the one or more vehicle signal lights associated with the first side.

9. The one or more non-transitory, computer-readable media of claim 8, wherein performing geometric and semantic interpretation comprises: projecting the one or more vehicle signal lights onto a relative space of the three-dimensional box; and determining one or more semantic labels of the one or more vehicle signal lights associated with the first side.

10. The one or more non-transitory, computer-readable media of claim 8, wherein projecting the three-dimensional coordinates of the first side onto the two-dimensional image comprises applying a geometric matrix transformation to the three-dimensional coordinates.

11. The one or more non-transitory, computer-readable media of claim 8, wherein the method further comprises: determining, by the object tracker, a portion of the vehicle is occluded in the two-dimensional image, and in response to determining the portion of the vehicle is occluded in the two-dimensional image, applying a mask to the two-dimensional image to remove pixels not associated with vehicle.

12. The one or more non-transitory, computer-readable media of claim 8, further comprising: projecting three-dimensional coordinates of a second side of the three-dimensional box onto the two-dimensional image; determining a second region of interest in the two-dimensional image based on the projection; and searching for one or more further vehicle signal lights of the vehicle within the second region of interest.

13. The one or more non-transitory, computer-readable media of claim 12, further comprising: associating the one or more further vehicle signal light in the second region of interest to the second side; and performing geometric and semantic interpretations on the one or more further vehicle signal lights associated with the second side.

14. The one or more non-transitory, computer-readable media of claim 8, wherein the first side corresponds to a front face or rear face of the vehicle.

15. A system, comprising: one or more memories including instructions; one or more processors to execute the instructions; a camera; a depth sensor; a vehicle control system; and an object tracker, encoded in the instructions, to perform object detection and tracking based on information from the depth sensor; an object projector, encoded in the instructions, to: receive a two-dimensional image of a vehicle from the camera; receive, from an object tracker encoded in the instructions to, three-dimensional coordinates of a three-dimensional box corresponding to different sides of a vehicle; project three-dimensional coordinates of a selected side of the three-dimensional box onto the two-dimensional image; and determine a region of interest in the two-dimensional image based on the projection; and a vehicle signal detector, encoded in the instructions, to: search for one or more vehicle signal lights of the vehicle within the region of interest; associate the one or more vehicle signal lights found within the region of interest to the selected side; and perform geometric and semantic interpretations on the one or more vehicle signal lights associated with the selected side.

16. The system of claim 15, wherein the system is an autonomous vehicle.

17. The system of claim 15, wherein the selected side corresponds to a front side of the vehicle.

18. The system of claim 15, wherein the selected side corresponds to a rear side of the vehicle.

19. The system of claim 16, wherein the object projector is further to: project three-dimensional coordinates of a further selected side of the three-dimensional box onto the two-dimensional image; and determine a further region of interest in the two-dimensional image based on the projection.

20. The system of claim 19, wherein the vehicle signal detector is further to: search for one or more further vehicle signal lights of the vehicle within the further region of interest; and associate the one or more further vehicle signal lights found within the region of interest to the further selected side; and perform geometric and semantic interpretations on the one or more fu
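The crop, search, and associate steps recited in claims 1 and 7 can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented detector: the image is a grayscale grid of intensities, each region of interest is a `(left, top, right, bottom)` rectangle already computed per box side, and `find_bright_pixels` is a hypothetical threshold-based stand-in for whatever light detector the system actually uses.

```python
from typing import Dict, List, Tuple

Rect = Tuple[int, int, int, int]   # (left, top, right, bottom), right/bottom exclusive
Pixel = Tuple[int, int]            # (column, row) in full-image coordinates
Image = List[List[int]]            # grayscale intensities, indexed [row][column]


def crop(image: Image, roi: Rect) -> Image:
    """Cut the region of interest out of the full image."""
    left, top, right, bottom = roi
    return [row[left:right] for row in image[top:bottom]]


def find_bright_pixels(patch: Image, threshold: int = 200) -> List[Pixel]:
    """Hypothetical light detector: report pixels at or above a brightness threshold."""
    return [(c, r)
            for r, row in enumerate(patch)
            for c, value in enumerate(row)
            if value >= threshold]


def lights_by_side(image: Image, rois: Dict[str, Rect],
                   threshold: int = 200) -> Dict[str, List[Pixel]]:
    """Search each side's cropped region and associate the hits with that side."""
    result: Dict[str, List[Pixel]] = {}
    for side, roi in rois.items():
        left, top, _, _ = roi
        patch = crop(image, roi)
        # Shift patch coordinates back into full-image coordinates.
        result[side] = [(left + c, top + r)
                        for c, r in find_bright_pixels(patch, threshold)]
    return result


# Assumed toy input: an 8x6 dark image with two bright pixels.
image = [[0] * 8 for _ in range(6)]
image[2][1] = 255   # row 2, column 1
image[3][6] = 255   # row 3, column 6

rois = {"rear": (0, 1, 4, 5), "left": (4, 1, 8, 5)}
print(lights_by_side(image, rois))
```

Because each side is searched in its own crop, a detection carries its side label for free. If two regions overlap, this sketch reports the same light under both sides; how such ambiguity is resolved is not something the claim text shown here spells out.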
of vehicle lights or traffic lights · CPC title
of extracted features · CPC title
the classifiers operating on different input data, e.g. multi-modal recognition · CPC title
of results relating to different input data, e.g. multimodal recognition · CPC title
of extracted features · CPC title