Neuromorphic vision with frame-rate imaging for target detection and tracking
US-2021105421-A1 · Apr 8, 2021 · US
US12401885B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12401885-B2 |
| Application number | US-202318240526-A |
| Country | US |
| Kind code | B2 |
| Filing date | Aug 31, 2023 |
| Priority date | Apr 7, 2022 |
| Publication date | Aug 26, 2025 |
| Grant date | Aug 26, 2025 |
Official abstract text for this publication.
A target tracking method and a target tracking system of a spiking neural network based on an event camera are provided. The method includes: acquiring a data stream of asynchronous events in a high dynamic scene of a target by an event camera as input data; dividing the data stream of the asynchronous events into synchronous event frames with millisecond time resolution; training a twin network based on a spiking neural network by a gradient substitution algorithm with a target image as a template image and a complete image as a searched image; and tracking the target by a trained twin network with interpolating a result of feature mapping to up-sample and obtaining the position of the target in an original image. The twin network includes a feature extractor and a cross-correlation calculator.
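The event-to-frame step in the abstract, whose binarization rule is spelled out in claim 2, can be illustrated with a minimal sketch. The event tuple layout `(timestamp_ms, x, y)`, the function name `events_to_binary_frames`, and the parameters `step_ms`, `num_steps`, and `t0_ms` are assumptions made for illustration; the patent text does not specify a data layout or API.

```python
from typing import List, Sequence, Tuple

# Hypothetical event layout: (timestamp_ms, x, y). Polarity is ignored here.
Event = Tuple[float, int, int]


def events_to_binary_frames(
    events: Sequence[Event],
    width: int,
    height: int,
    step_ms: float,
    num_steps: int,
    t0_ms: float = 0.0,
) -> List[List[List[int]]]:
    """Accumulate asynchronous events into `num_steps` binary frames.

    A pixel is set to 1 if at least one event occurred at that coordinate
    within the time step, and stays 0 otherwise.
    """
    frames = [[[0] * width for _ in range(height)] for _ in range(num_steps)]
    for t, x, y in events:
        step = int((t - t0_ms) // step_ms)
        # Drop events outside the covered time window or the sensor area.
        if 0 <= step < num_steps and 0 <= x < width and 0 <= y < height:
            frames[step][y][x] = 1
    return frames


# Two events hit pixel (1, 0) in the first millisecond, one event hits
# pixel (2, 1) in the second, and the last event falls outside the window.
events = [(0.2, 1, 0), (0.7, 1, 0), (1.5, 2, 1), (9.0, 0, 0)]
frames = events_to_binary_frames(events, width=3, height=2, step_ms=1.0, num_steps=2)
```

Because the rule only checks whether the event count is greater than 0, repeated events at one pixel within a time step collapse to a single 1, which is what makes each frame a binary, spike-like image.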
Opening claim text (preview).
What is claimed is:

1. A target tracking method of a spiking neural network based on an event camera, the method comprising: acquiring a data stream of asynchronous events in a high dynamic scene of a target by an event camera as input data; dividing the data stream of the asynchronous events into synchronous event frames with millisecond time resolution by asynchronous event accumulation, wherein the synchronous event frames are binary images similar to a spiking; training a twin network based on a spiking neural network by a gradient substitution algorithm with a target image as a template image denoted as z and a complete image as a searched image denoted as x, wherein the twin network comprises a feature extractor with weight sharing and a cross-correlation calculator for calculating a position of the target, the target image is an image of the target in the synchronous event frames, and the complete image comprises any one of all the synchronous event frames; and tracking the target by a trained twin network with interpolating a result of feature mapping to up-sample and obtaining the position of the target in an original image.

2. The target tracking method of the spiking neural network based on the event camera of claim 1, wherein the synchronous event frames are generated by dividing the asynchronous events according to a set size and number of time steps, accumulating the data stream of the asynchronous events within each time step, setting a pixel of a coordinate to 1 as long as the number of the asynchronous events generated at the coordinate within the same time step is greater than 0, otherwise setting the pixel of the coordinate to 0, and ultimately generating event frame images divided by the time steps.

3. The target tracking method of the spiking neural network based on the event camera of claim 1, wherein the feature extractor is generated by adopting a spiking convolutional neural network as the feature extractor, wherein a network structure of the spiking convolutional neural network is 96C5-2S-256C3-2S-384C3-384C3-256C3, wherein 96C5 represents a spiking convolutional layer with a convolutional kernel size of 5 and an output channel of 96, 2S represents a pooling layer with a down-sampling of 2 times, and the rest network structure is in a similar manner; a convolutional step of a first convolutional layer is 2, convolutional steps of the rest convolutional layers are 1, and all of convolutional layers of the feature extractor are followed by a spiking neuron.

4. The target tracking method of the spiking neural network based on the event camera of claim 3, wherein the spiking neuron is a Leaky integrate and fire neuron model, i.e., $\tau_m \frac{dV}{dt} = V_{rest} - V + R_m I$, wherein $\tau_m$ represents a membrane time constant, $V$ represents a membrane potential, $t$ represents a spiking time, $V_{rest}$ represents a resting potential, and $R_m$ and $I$ represent impedance and input current of a cell membrane, respectively; the feature extractor is denoted as φ, a size of the template image z is 255*255*3, a size of the searched image x is 127*127*3, an output after an operation of the feature extractor is φ(z) with a size of 6*6*256 and φ(x) with a size of 22*22*256.

5. The target tracking method of the spiking neural network based on the event camera of claim 1, wherein an operation of the cross-correlation calculator comprises: configuring a feature mapping denoted as φ(z) after extracting features from the template image z to be a convolutional kernel, configuring a feature mapping φ(x) after extracting features from the searched image x to be a feature map to be convolved, and performing a convolution operation on the convolutional kernel and the feature map to be convolved, wherein a result produced after the convolution operation of the current convolutional layer is a similarity heatmap that represents a prediction probability of a predicted center position of the target, and a position of a maximum spiking issuance rate is the predicted center position of the target.

6. The target tracking method of the spiking neural network based on the event camera of claim 1, wherein the twin network is generated by: adopting a brain-inspired computing development framework, and putting a padded template image and the searched image into the same batch sequentially based on batch training, so that the number of neurons in an input layer for the padded template image is the same as that for the searched image, and the padded template image and the searched image share the same network connection; after operation of the feature extractor denoted as φ, cropping an output of an odd-numbered sample that is an output of a z-branch denoted as φ(z) to delete edge-padding of φ(z), and obtaining the feature mapping with a due size of 6*6*256.

7. The target tracking method of the spiking neural network based on the event camera of claim 1, further comprising: performing no update on the target image that is the template image, performing an operation φ(z) of the feature extractor for an initial target once, configuring the searched image to be an image equivalent to 4 times a size of the template image, wherein the searched image is centered on the position of the target and cropped from a previous synchronous event frame, and a search area is narrowed to improve real-time performance; adopting bicubic interpolation to up-sample and revert a size of the similarity heatmap, determining a predicted position of the target, adopting three scales to search, that is, scaling the similarity heatmap to $1.03^{\{-1,0,1\}}$, respectively; and selecting a position of a maximum spiking issuance rate from a scaling output as a final result, wherein the maximum spiking issuance rate is a maximum similarity.

8. An electronic device, comprising a processor and a memory, wherein the memory stores a computer program executable by the processor to implement the steps of the target tracking method of the spiking neural network based on the event camera of claim 1.

9. The electronic device of claim 8, wherein the synchronous event frames are generated by dividing the asynchronous events according to a set size and number of time steps, accumulating the data stream of the asynchronous events within each time step, setting a pixel of a coordinate to 1 as long as the number of the asynchronous events generated at the coordinate within the same time step is greater than 0, otherwise setting the pixel of the coordinate to 0, and ultimately generating event frame images divided by the time steps.

10. The electronic device of claim 8, where
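Two building blocks named in the claims lend themselves to a small numeric sketch: the leaky integrate-and-fire neuron of claim 4 and the cross-correlation of claim 5. The code below is an illustration under stated assumptions, not the patented implementation. The firing threshold `v_th`, the hard reset to rest, the forward-Euler step `dt`, all default parameter values, and the single-channel pure-Python correlation are assumptions; the claims give only the membrane equation and the kernel-over-feature-map operation.

```python
from typing import List, Sequence, Tuple


def lif_spikes(
    currents: Sequence[float],
    tau_m: float = 10.0,  # membrane time constant (assumed value)
    v_rest: float = 0.0,  # resting potential (assumed value)
    r_m: float = 1.0,     # membrane resistance (assumed value)
    v_th: float = 1.0,    # firing threshold: an assumption, not in the claim
    dt: float = 1.0,      # Euler step size (assumed value)
) -> List[int]:
    """Forward-Euler integration of tau_m * dV/dt = V_rest - V + R_m * I."""
    v = v_rest
    spikes = []
    for i in currents:
        v += (dt / tau_m) * (v_rest - v + r_m * i)
        if v >= v_th:
            spikes.append(1)
            v = v_rest  # hard reset to rest: also an assumption
        else:
            spikes.append(0)
    return spikes


def cross_correlate(
    search: Sequence[Sequence[float]], template: Sequence[Sequence[float]]
) -> List[List[float]]:
    """Slide the template over the search map (no padding, stride 1)."""
    sh, sw = len(search), len(search[0])
    th, tw = len(template), len(template[0])
    heatmap = []
    for r in range(sh - th + 1):
        row = []
        for c in range(sw - tw + 1):
            row.append(sum(
                search[r + i][c + j] * template[i][j]
                for i in range(th) for j in range(tw)
            ))
        heatmap.append(row)
    return heatmap


def peak_position(heatmap: Sequence[Sequence[float]]) -> Tuple[int, int]:
    """Return (row, col) of the first maximum in the heatmap."""
    best, best_pos = heatmap[0][0], (0, 0)
    for r, row in enumerate(heatmap):
        for c, value in enumerate(row):
            if value > best:
                best, best_pos = value, (r, c)
    return best_pos


# A constant input current drives the neuron to fire periodically.
spike_train = lif_spikes([5.0] * 6)

# A 2x2 template matched against a 4x4 search map containing one 2x2 blob.
template = [[1, 1], [1, 1]]
search = [[0, 0, 0, 0], [0, 0, 0, 0], [0, 1, 1, 0], [0, 1, 1, 0]]
heatmap = cross_correlate(search, template)
peak = peak_position(heatmap)
```

Under the same no-padding, stride-1 assumption, the feature-map sizes stated in claim 4 (6*6 for φ(z) and 22*22 for φ(x)) would give a 17*17 heatmap, since 22 - 6 + 1 = 17. Claim 7 then up-samples this heatmap with bicubic interpolation before selecting the peak, a step this sketch omits.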
by increasing the dynamic range of the image compared to the dynamic range of the electronic image sensors · CPC title
Pixels for event detection · CPC title
Image sensors with pixel address output; Event-driven image sensors; Selection of pixels to be read out based on image data · CPC title
using neural networks · CPC title
Learning methods · CPC title