Detecting Augmented-Reality Targets
US-2020143238-A1 · May 7, 2020 · US
US11703566B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11703566-B2 |
| Application number | US-202117373550-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jul 12, 2021 |
| Priority date | Apr 16, 2019 |
| Publication date | Jul 18, 2023 |
| Grant date | Jul 18, 2023 |
Official abstract text for this publication.
A machine-learning architecture may be trained to determine point cloud data associated with different types of sensors with an object detected in an image and/or generate a three-dimensional region of interest (ROI) associated with the object. In some examples, the point cloud data may be associated with sensors such as, for example, a lidar device, radar device, etc.
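One simple way to picture associating point cloud data with an object detected in an image is to project the points into the camera's image space and keep those that fall inside the detection's 2D extents. The following is a minimal sketch of that geometric step only, assuming a pinhole camera model. The function name `points_in_detection`, the intrinsics `K`, the extrinsics `R` and `t`, and the `(u_min, v_min, u_max, v_max)` box layout are hypothetical and do not come from the patent, which describes a trained architecture rather than this fixed filter.

```python
import numpy as np

def points_in_detection(points_xyz, K, R, t, box):
    """Boolean mask of 3D points whose projection lies inside a 2D box.

    points_xyz: (N, 3) points in the depth sensor's frame (assumed layout).
    K: (3, 3) camera intrinsics; R (3, 3) and t (3,) map sensor frame to camera frame.
    box: (u_min, v_min, u_max, v_max) detection extents in pixels (assumed layout).
    """
    cam = points_xyz @ R.T + t               # transform points into the camera frame
    in_front = cam[:, 2] > 0                 # only points ahead of the camera can project
    z = np.where(in_front, cam[:, 2], 1.0)   # placeholder depth for masked-out points
    uvw = cam @ K.T                          # apply intrinsics
    u = uvw[:, 0] / z                        # perspective divide
    v = uvw[:, 1] / z
    u_min, v_min, u_max, v_max = box
    inside = (u >= u_min) & (u <= u_max) & (v >= v_min) & (v <= v_max)
    return in_front & inside

# Illustrative values only: identity extrinsics and a toy intrinsics matrix.
K = np.array([[100.0, 0.0, 50.0],
              [0.0, 100.0, 50.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)
t = np.zeros(3)
points = np.array([[0.0, 0.0, 10.0],    # projects to pixel (50, 50)
                   [5.0, 0.0, 10.0],    # projects to pixel (100, 50)
                   [0.0, 0.0, -10.0]])  # behind the camera
mask = points_in_detection(points, K, R, t, box=(40, 40, 60, 60))
```

Here only the first point lands inside the box. In the architecture the abstract describes, a selection like this would at most pick candidate points; deciding which points actually belong to the object is left to the trained network.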
Opening claim text (preview).
What is claimed is:
1. A method comprising: receiving first sensor data associated with an image sensor, the first sensor data representing a first portion of an environment surrounding a vehicle; receiving second sensor data associated with a depth sensor, the second sensor data representing a second portion of the environment surrounding the vehicle, wherein the first portion of the environment surrounding the vehicle and the second portion of the environment surrounding the vehicle at least partially overlap; inputting the first sensor data into a first subnetwork; receiving a first output from the first subnetwork; determining, based at least in part on the first output, an object detection that identifies an object in one or more images of the first sensor data; determining, based at least in part on the second sensor data, depth information corresponding to the environment; inputting at least a portion of the first output and the depth information into a second subnetwork; receiving a second output from the second subnetwork; determining, based at least in part on the second output, a three-dimensional region of interest corresponding to the object; combining, as a combined output, the first output and the second output; inputting a first portion of the combined output into a third subnetwork and second portion of the combined output into a fourth subnetwork; and receiving a first map from the third subnetwork and second map from the fourth subnetwork.
2. The method of claim 1, wherein: the first map indicates at least a first probability that a first point of the first sensor data is associated with the object, and the second map indicates at least a second probability that a second point of the second sensor data is associated with the object.
3. The method of claim 1, wherein combining the first output and the second output comprises: down-sampling, as first global data and using one or more first network layers, the first output; down-sampling, as second global data and using one or more second network layers, the second output; concatenating, as concatenated data, the first global data with the second global data; inputting the concatenated data into a first fully-connected layer and a second fully-connected layer; receiving first transformed concatenated data from the first fully-connected layer and second transformed concatenated data from the second fully-connected layer; adding, as first summed data, the first output with the first transformed concatenated data; and adding, as second summed data, the second output with the second transformed concatenated data, wherein the first portion of combined data comprises an output from the first fully-connected layer, and wherein the second portion of combined data comprises an output from the second fully-connected layer.
4. The method of claim 1, wherein determining the three-dimensional region of interest comprises: inputting, to a machine-learned model, at least one of the first map or the second map and at least a portion of the object detection; and receiving, from the machine-learned model, the three-dimensional region of interest.
5. The method of claim 1, further comprising controlling the vehicle based at least in part on the three-dimensional region of interest.
6. The method of claim 1, wherein: inputting the first sensor data into a first subnetwork comprises determining a first subset of sensor data and providing the first subset of sensor data to the first subnetwork; inputting the second sensor data into a second subnetwork comprises determining a second subset of sensor data and providing the second subset of sensor data to the second subnetwork; determining the first subset and the second subset is based at least in part on: projecting, as first projected data, the first sensor data into an image space associated with an image sensor that captured at least one of the one or more images, wherein the projecting is based at least in part on an orientation of the image sensor; projecting, as second projected data and based at least in part on the orientation, the second sensor data into the image space; determining first points of the first sensor data associated with a first portion of the first projected data that lies within extents of the object detection; determining second points of the second sensor data associated with a second portion of the second projected data that lies within extents of the object detection; determining, as the first subset, the first points from a first coordinate space associated with a first type of sensor to a coordinate space defined as having an origin located at a position of the image sensor and a longitudinal axis extending through a center of an ROI associated with the object detection; and determining, as the second subset, the second points from a second coordinate space associated with a second type of sensor to the coordinate space defined as having an origin located at a position of the image sensor.
7. A system comprising: one or more processors; a memory storing processor-executable instructions that, when executed by the one or more processors, cause the system to perform operations comprising: receiving first sensor data associated with an image sensor; receiving second sensor data associated with a depth sensor, the first sensor data and the second sensor data being associated with an environment surrounding a vehicle; inputting the first sensor data into a first subnetwork; receiving a first output from the first subnetwork; determining, based at least in part on the first output, an object detection that identifies an object in one or more images of the first sensor data; determining, based at least in part on the second sensor data, depth information corresponding to the environment; determining a portion of the depth information based at least in part on the first output; inputting the portion of depth information into a second subnetwork; receiving a second output from the second subnetwork; determining, based at least in part on the second output, a three-dimensional region of interest corresponding to the object; combining, as a combined output, the first output and the second output; inputting a first portion of the combined output into a third subnetwork and a second portion of the combined output into a fourth subnetwork; and receiving a first map from the third subnetwork and a second map from the fourth subnetwork.
8. The system of claim 7, wherein: the first map indicates at least a first probability that a first point of the first sensor data is associated with the object, and the second map indicates at least a second probability that a second point of the second sensor data is associated with the object.
9. The system of claim 7, wherein combining the first output and the second output comprises: down-sampling, as first global data and using one or more first network layers, the first output; down-sampling, as second global data and using one or more second network layers, the second output; concatenating, as concatenated data, the first global data with the second global data; inputting the concatenated data into a first fully-connected layer and a second fully-connected layer; receiving first transformed concatenated data from the first fully-connected layer and second transformed concatenated data from the second fully-connected layer; adding, as first summed data, the first output with the first transformed concatenated data; and adding, as second summed data, the second output with the second transformed concatenated data, wherein the first portion of combined data comprises an output from the first
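
The combining step recited in the claims (down-sample each output to global data, concatenate, pass through two fully-connected layers, add each result back to its own output) can be sketched in NumPy as follows. This is an illustration of the data flow only, under stated assumptions. Max pooling stands in for the "one or more network layers" that perform the down-sampling. The function name `fuse`, the feature shapes, and the random weights are hypothetical placeholders, not trained parameters or anything disclosed in the patent.

```python
import numpy as np

def fuse(first_out, second_out, w1, b1, w2, b2):
    """Sketch of the claimed combining step with placeholder layers.

    first_out: (N1, C1) features from the first subnetwork (assumed shape).
    second_out: (N2, C2) features from the second subnetwork (assumed shape).
    w1: (C1 + C2, C1), b1: (C1,) weights of the first fully-connected layer.
    w2: (C1 + C2, C2), b2: (C2,) weights of the second fully-connected layer.
    """
    # Down-sample each output to a global vector (max pooling is an assumption).
    g1 = first_out.max(axis=0)
    g2 = second_out.max(axis=0)
    # Concatenate the two global vectors.
    cat = np.concatenate([g1, g2])
    # Two fully-connected layers, one per branch.
    t1 = cat @ w1 + b1
    t2 = cat @ w2 + b2
    # Add each transformed vector back to its branch (broadcast over rows).
    return first_out + t1, second_out + t2

# Illustrative shapes and random weights only.
rng = np.random.default_rng(0)
first_out = rng.normal(size=(5, 4))
second_out = rng.normal(size=(7, 3))
w1, b1 = rng.normal(size=(7, 4)), np.zeros(4)
w2, b2 = rng.normal(size=(7, 3)), np.zeros(3)
summed1, summed2 = fuse(first_out, second_out, w1, b1, w2, b2)
```

Because each transformed vector is computed from the concatenation of both global vectors, every row of `summed1` receives the same offset that depends on both branches, and likewise for `summed2`. That is how information from one sensor's features reaches the other branch in this sketch.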
Feedforward networks · CPC title
Supervised learning · CPC title
involving the use of neural networks · CPC title
for mapping or imaging · CPC title
Learning methods · CPC title