What technology area does this patent fall under?

Primary CPC classification G06N3/084. Mapped technology areas include Physics.

When was this patent published?

Publication date: Oct 18, 2022 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 9 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Systems and methods for object detection, tracking, and motion prediction

US11475351B2 · US · B2

Patent metadata
Field	Value
Publication number	US-11475351-B2
Application number	US-201816124966-A
Country	US
Kind code	B2
Filing date	Sep 7, 2018
Priority date	Nov 15, 2017
Publication date	Oct 18, 2022
Grant date	Oct 18, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title: What the patent document calls the invention.
Abstract: A short plain-language summary of the technical disclosure.
Assignees and inventors: Who owns or filed the patent and who is credited as inventor.
Key dates: Filing, priority, publication, and grant dates set the timeline.
First independent claim: The legal scope of protection; read this for what is actually claimed.
CPC / IPC classifications: Technology tags used to group this patent with similar filings.
Citations and related patents: Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Systems, methods, tangible non-transitory computer-readable media, and devices for object detection, tracking, and motion prediction are provided. For example, the disclosed technology can include receiving sensor data including information based on sensor outputs associated with detection of objects in an environment over one or more time intervals by one or more sensors. The operations can include generating, based on the sensor data, an input representation of the objects. The input representation can include a temporal dimension and spatial dimensions. The operations can include determining, based on the input representation and a machine-learned model, detected object classes of the objects, locations of the objects over the one or more time intervals, or predicted paths of the objects. Furthermore, the operations can include generating, based on the input representation and the machine-learned model, an output including bounding shapes corresponding to the objects.
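As an illustration of the input representation the abstract describes, the following is a minimal sketch, not the patented implementation. It assumes each time interval yields a list of 3D points, and it bins them into a binary occupancy grid indexed as [time][height][row][col], so that height can serve as the channel axis as claim 2 suggests. The function name `voxelize_sweeps`, the grid ranges, and the cell size are hypothetical choices made only for this example.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]  # (x, y, z); metric units are an assumption


def voxelize_sweeps(
    sweeps: Sequence[Sequence[Point]],
    x_range: Tuple[float, float] = (0.0, 4.0),
    y_range: Tuple[float, float] = (0.0, 4.0),
    z_range: Tuple[float, float] = (0.0, 2.0),
    cell: float = 1.0,
) -> List[List[List[List[int]]]]:
    """Return a binary occupancy grid indexed as [time][height][row][col].

    Hypothetical helper: one sweep of points per time interval. Points
    outside the configured ranges are dropped.
    """
    nx = int(round((x_range[1] - x_range[0]) / cell))
    ny = int(round((y_range[1] - y_range[0]) / cell))
    nz = int(round((z_range[1] - z_range[0]) / cell))

    tensor = []
    for points in sweeps:
        # One grid per time interval; height slices act as input channels.
        grid = [[[0] * nx for _ in range(ny)] for _ in range(nz)]
        for x, y, z in points:
            # Floor division keeps points below the range at negative indices.
            ix = int((x - x_range[0]) // cell)
            iy = int((y - y_range[0]) // cell)
            iz = int((z - z_range[0]) // cell)
            if 0 <= ix < nx and 0 <= iy < ny and 0 <= iz < nz:
                grid[iz][iy][ix] = 1
        tensor.append(grid)
    return tensor


if __name__ == "__main__":
    sweeps = [
        [(0.5, 0.5, 0.5)],                    # earlier time interval
        [(1.5, 0.5, 0.5), (9.0, 9.0, 9.0)],   # later interval, one point out of range
    ]
    tensor = voxelize_sweeps(sweeps)
    print(len(tensor), len(tensor[0]), len(tensor[0][0]), len(tensor[0][0][0]))
```

The resulting nested list has one temporal axis and two spatial axes per height slice, which matches the "temporal dimension and spatial dimensions" wording only in a loose, illustrative sense.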

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method of object detection, the computer-implemented method comprising: receiving sensor data comprising information based at least in part on one or more sensor outputs associated with detection of an environment over one or more time intervals by one or more sensors, wherein the environment comprises one or more objects; generating based at least in part on the sensor data, an input representation of the one or more objects, wherein the input representation comprises a temporal dimension and one or more spatial dimensions; determining, based at least in part on one or more fusion criteria provided in data associated with the temporal dimension of the input representation, whether to aggregate temporal information associated with the temporal dimension at a first convolution layer of a plurality of convolution layers of a machine-learned model or to aggregate the temporal information associated with the temporal dimension over two or more convolution layers of the plurality of convolution layers of the machine-learned model; determining based at least in part on the input representation and the machine-learned model, at least one of: (i) one or more detected object classes of the one or more objects, (ii) one or more locations of the one or more objects over the one or more time intervals, or (iii) one or more predicted paths of the one or more objects, wherein the machine-learned model aggregates the temporal information associated with the temporal dimension at the first convolution layer or aggregates the temporal information associated with the temporal dimension over two or more convolution layers of the machine-learned model, wherein aggregating the temporal information comprises reducing the one or more time intervals of the temporal dimension to one time interval in a manner that is determined based at least in part on the one or more fusion criteria; and generating, based at least in part on the input representation and the machine-learned model, output data comprising one or more bounding shapes corresponding to the one or more objects.

2. The computer-implemented method of claim 1, further comprising: generating, based at least in part on the sensor data, a plurality of voxels corresponding to the environment comprising the one or more objects, wherein a height dimension of the plurality of voxels is used as an input channel of the input representation, and wherein the input representation is based at least in part on the plurality of voxels corresponding to one or more portions of the environment occupied by the one or more objects.

3. The computer-implemented method of claim 1, wherein the input representation comprises a tensor associated with a plurality of dimensions comprising the temporal dimension and the one or more spatial dimensions, the temporal dimension of the tensor associated with the one or more time intervals, and the one or more spatial dimensions of the tensor comprising a width dimension, a depth dimension, or a height dimension that is used as an input channel for the machine-learned model.

4. The computer-implemented method of claim 3, wherein the input representation is input to the first convolution layer of the plurality of convolution layers of the machine-learned model, and wherein weights of a plurality of feature maps for the plurality of convolution layers are shared between the plurality of convolution layers.

5. The computer-implemented method of claim 4, further comprising: aggregating the temporal information to the tensor subsequent to aggregating spatial information associated with the one or more spatial dimensions to the tensor, wherein the temporal information is aggregated as the input representation is processed by the plurality of convolution layers, and wherein the temporal information is associated with the temporal dimension of the tensor.

6. The computer-implemented method of claim 1, wherein the fusion criteria provided in data associated with the temporal dimension of the input representation comprises a flag signaling whether to aggregate temporal information associated with the temporal dimension at the first convolution layer of the plurality of convolution layers of the machine-learned model or to aggregate the temporal information associated with the temporal dimension over the two or more convolution layers of the plurality of convolution layers of the machine-learned model.

7. The computer-implemented method of claim 6, wherein aggregating the temporal information comprises: reducing the one or more time intervals of the temporal dimension to one time interval by performing a one-dimensional convolution on the temporal information associated with the temporal dimension.

8. The computer-implemented method of claim 1, wherein aggregating the temporal information comprises: reducing the one or more time intervals of the temporal dimension to one time interval by performing a two-dimensional convolution on the temporal information associated with the temporal dimension.

9. The computer-implemented method of claim 1, further comprising: activating, based at least in part on the output data, one or more systems comprising mechanical systems, one or more electromechanical systems, or one or more electronic systems, associated with operation of a manually operated vehicle, an autonomous vehicle, or one or more robotic systems.

10. The computer-implemented method of claim 1, further comprising: determining one or more travelled paths of the one or more objects based at least in part on one or more locations of the one or more objects over a sequence of the one or more time intervals comprising a last time interval associated with a current time and the one or more time intervals prior to the current time, wherein the one or more predicted paths of the one or more objects is based at least in part on the one or more travelled paths.

11. The computer-implemented method of claim 10, further comprising: detecting an object of the one or more objects that is at least partly occluded; and determining, based at least in part on the one or more travelled paths of the one or more objects, a time associated with the object of the one or more objects that is at least partly occluded being detected.

12. The computer-implemented method of claim 1, wherein the one or more sensor outputs comprise one or more three-dimensional points corresponding to a plurality of surfaces of the one or more objects detected by the one or more sensors.

13. The computer-implemented method of claim 1, wherein the sensor data is associated with a birds eye view vantage point, the one or more sensors comprising one or more light detection and ranging devices (LIDAR), one or more cameras, one or more radar devices, one or more sonar devices, or one or more thermal sensors.

14. One or more tangible non-transitory computer-readable media storing computer-readable instructions that are executable by one or more processors to cause the one or more processors to perform operations, the operations comprising: receiving sensor data comprising information based at least in part on one or more sensor outputs associated with detection of an environment over one or more time intervals by one or more sensors, wherein the environment comprises one or more objects; generating, based at least in part on the sensor data, an input representation of the one or more objects, wherein the input representation comprises a temporal dimension and one or more spatial dimensions; determining, based at least in part on one or more fusion criteria provided in data associated w
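Claims 1, 6, and 7 describe a flag that selects between collapsing the temporal dimension at the first convolution layer and collapsing it over two or more layers, in each case ending with a single time interval. The sketch below illustrates that choice under strong simplifying assumptions: each time step is a flat list of feature values, the kernels are fixed averaging weights standing in for learned ones, and the spatial convolutions a real model would interleave are omitted. The names `temporal_conv` and `aggregate_time` are hypothetical and do not come from the patent.

```python
from typing import List

Frames = List[List[float]]  # [time][feature]; one flat feature vector per interval


def temporal_conv(frames: Frames, kernel: List[float]) -> Frames:
    """1D convolution along the time axis with no padding.

    As is conventional for neural networks, this is a sliding dot product
    (cross-correlation). Output length is T - len(kernel) + 1.
    """
    k = len(kernel)
    n_features = len(frames[0])
    return [
        [sum(kernel[j] * frames[t + j][i] for j in range(k)) for i in range(n_features)]
        for t in range(len(frames) - k + 1)
    ]


def aggregate_time(frames: Frames, early_fusion: bool) -> Frames:
    """Reduce T time intervals to one, guided by a fusion flag (assumed boolean)."""
    if not frames:
        raise ValueError("at least one time interval is required")
    if early_fusion:
        # Early fusion: a single layer whose kernel spans every time step.
        t = len(frames)
        return temporal_conv(frames, [1.0 / t] * t)
    # Later fusion: several layers, each with a size-2 kernel, so the
    # temporal dimension shrinks by one step per layer.
    out = frames
    while len(out) > 1:
        out = temporal_conv(out, [0.5, 0.5])
    return out


if __name__ == "__main__":
    frames = [[1.0, 0.0], [3.0, 0.0], [5.0, 4.0]]
    print(aggregate_time(frames, early_fusion=True))
    print(aggregate_time(frames, early_fusion=False))
```

Both paths return a list with exactly one time step. With these fixed weights the two paths generally produce different values, because the layered path weights the middle intervals more heavily than a uniform average does.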

Assignees

UATC, LLC

Inventors

Classifications

G06N3/084 · Primary
Backpropagation, e.g. using gradient descent · CPC title
G06V20/56
exterior to a vehicle by using sensors mounted on the vehicle · CPC title
G06V10/82
using neural networks · CPC title
G06V10/764
using classification, e.g. of video objects · CPC title
G06N20/00 · Primary
Machine learning · CPC title

Patent family

Related publications grouped by family.

View patent family 66432258

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11475351B2 cover?: Systems, methods, tangible non-transitory computer-readable media, and devices for object detection, tracking, and motion prediction are provided. For example, the disclosed technology can include receiving sensor data including information based on sensor outputs associated with detection of objects in an environment over one or more time intervals by one or more sensors. The operations can in…
Who is the assignee on this patent?: UATC, LLC