A generic modular sparse three-dimensional (3d) convolution design utilizing sparse 3d group convolution
US-2022147791-A1 · May 12, 2022 · US
US12475364B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12475364-B2 |
| Application number | US-202016983717-A |
| Country | US |
| Kind code | B2 |
| Filing date | Aug 3, 2020 |
| Priority date | Sep 17, 2019 |
| Publication date | Nov 18, 2025 |
| Grant date | Nov 18, 2025 |
Official abstract text for this publication.
Device and method for training an artificial neural network, including providing a neural network layer for an equivariant feature mapping having a plurality of output channels, grouping channels of the output channels into a number of distinct groups, wherein the output channels of each individual distinct group are organized into an individual grid defining a spatial location of each of the output channels of the individual distinct group in the grid for the individual distinct group, providing for each of the output channels of each individual distinct group, a distinct normalization function which is defined depending on the spatial location of the output channel in the grid in that this output channel is organized and depending on tunable hyperparameters for the normalization function, determining an output of the artificial neural network depending on a result of each of the distinct normalization functions, training the hyperparameters of the artificial neural network.
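The grouping, grid layout, and neighborhood-restricted normalization described in the abstract can be illustrated with a minimal pure-Python sketch. Everything specific here is an assumption made for illustration and is not taken from the patent text: the function names, the 2D grid layout, the Chebyshev distance between grid positions, and the divisive form `x / (k + alpha * sum(x_j^2)) ** beta`. The patent treats the distance parameter and the normalization hyperparameters as trainable; in this sketch they are plain arguments, and each channel is reduced to a single scalar activation.

```python
def grid_position(index_in_group, grid_width):
    """Map a channel's index inside its group to a (row, col) grid cell."""
    return divmod(index_in_group, grid_width)


def neighborhood(channel, group_size, grid_width, distance):
    """Return the channels in the same group whose grid cell lies within
    `distance` (Chebyshev metric, an illustrative choice) of `channel`."""
    group, local = divmod(channel, group_size)
    row0, col0 = grid_position(local, grid_width)
    members = []
    for other in range(group_size):
        row, col = grid_position(other, grid_width)
        if max(abs(row - row0), abs(col - col0)) <= distance:
            members.append(group * group_size + other)
    return members


def normalize(activations, group_size, grid_width, distance,
              k=1.0, alpha=1.0, beta=0.5):
    """Divide each channel by a power of the squared-activation sum over its
    neighborhood; channels outside the neighborhood are disregarded."""
    if len(activations) % group_size != 0:
        raise ValueError("channel count must be a multiple of group_size")
    normalized = []
    for channel, value in enumerate(activations):
        energy = sum(
            activations[j] ** 2
            for j in neighborhood(channel, group_size, grid_width, distance)
        )
        normalized.append(value / (k + alpha * energy) ** beta)
    return normalized
```

For example, with eight channels split into two groups of four, each laid out on a 2x2 grid, `normalize(values, group_size=4, grid_width=2, distance=1)` normalizes every channel against its own group only, because a distance of 1 already covers the whole 2x2 grid.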
Opening claim text (preview).
What is claimed is:
1. A computer implemented method for training an artificial neural network, the artificial neural network including a convolutional neural network layer configured to determine an equivariant feature mapping having a plurality of output channels, the method comprising the following steps: operating the convolutional neural network layer to produce the plurality of output channels; operating an intermediate filter layer of the artificial neural network after the convolutional neural network layer, the intermediate filter layer receiving the plurality of output channels and performing steps of: grouping channels of the plurality of output channels into a plurality of distinct groups, wherein the output channels of each individual distinct group of the plurality of distinct groups are organized into an individual grid defining a spatial location of each of the output channels of the individual distinct group in the grid, wherein each output channel of each individual distinct group is associated with an index; executing, for each output channel of the plurality of output channels, a neighborhood function defining a set of output channels according to a distance parameter defining a spatial distance between output channels of the neural network layer, the spatial distance being a function of indices of the output channels; executing, for each of the output channels of each individual distinct group of the plurality of distinct groups, a distinct normalization function, wherein the distinct normalization function is defined depending on the spatial location of the output channel in the grid that the output channel is organized, depending on tunable hyperparameters for the normalization function, and depending on output channels in the set of output channels defined by the neighborhood function and disregarding other output channels; and determining an output of the artificial neural network depending on a result of each of the distinct normalization functions; and training the distance parameter and the hyperparameters by repeating the steps performed by the intermediate filter layer a plurality of times.
2. The method according to claim 1, further comprising: providing a dimension parameter, wherein the grouping step includes grouping the channels of the plurality of output channels of the neural network layer into a number of distinct groups each having a size, where the size is defined depending on a total number of output channels of the neural network layer, and organizing for each of the distinct groups its output channels into a grid having a dimension according to the dimension parameter; and training the dimension parameter.
3. The method according to claim 1, further comprising: determining a rescaling parameter for each of the output channels; and rescaling the output for each of the output channels after normalization depending on the rescaling parameter for the output channel.
4. The method according to claim 1, further comprising: providing sensor data characterizing a digital image; processing the sensor data depending on the trained artificial neural network to determine an output signal for classifying the sensor data; and (i) detecting or localizing or segmenting, object in the sensor data, and/or (ii) detecting anomalies in the sensor data.
5. The method according to claim 4, further comprising: (i) actuating a physical system depending on the output signal, or (ii) generating input data for a generative adversarial neural network or a variational autoencoder for data synthesis.
6. The method according to claim 1, wherein the artificial neural network is adapted for segmenting or classifying or detecting: pedestrians and/or road signs and/or vehicles, in digital images, the method further comprising: collecting a set of digital images from a database; applying one or more transformations to each digital image including mirroring, or rotating, or smoothing, or contrast reduction, to create a modified set of digital images; creating a first training set including the collected set of digital images, the modified set of digital images, and a set of digital images unrelated to pedestrians and/or road signs and/or vehicles; training the artificial neural network in a first stage using the first training set; creating a second training set for a second stage of training including the first training set and digital images that are incorrectly detected as images depicting pedestrians and/or road signs and/or vehicles, after the first stage of training; and training the artificial neural network in a second stage using the second training set.
7. A device for processing data, comprising: an input for data; an output for an output signal; at least one processor; and memory for an artificial neural network; wherein the device is configured to train the artificial neural network, the artificial neural network including a convolutional neural network layer configured to determine an equivariant feature mapping having a plurality of output channels, and the device is configured to: operate the convolutional neural network layer to produce the plurality of output channels; operate an intermediate filter layer of the artificial neural network after the convolutional neural network layer, the intermediate filter layer receiving the plurality of output channels and performing steps of: group channels of the plurality of output channels into a plurality of distinct groups, wherein the output channels of each individual distinct group of the plurality of distinct groups are organized into an individual grid defining a spatial location of each of the output channels of the individual distinct group in the grid, wherein each output channel of each individual distinct group is associated with an index; execute, for each output channel of the plurality of output channels, a neighborhood function defining a set of output channels according to a distance parameter defining a spatial distance between output channels of the neural network layer, the spatial distance being a function of indices of the output channels; execute, for each of the output channels of each individual distinct group of the plurality of distinct groups, a distinct normalization function, wherein the distinct normalization function is defined depending on the spatial location of the output channel in the grid that the output channel is organized, depending on tunable hyperparameters for the normalization function, and depending on output channels in the set of output channels defined by the neighborhood function and disregarding other output channels; and determine an output of the artificial neural network depending on a result of each of the distinct normalization functions; and train the distance parameter and the hyperparameters by repeating the steps performed by the intermediate filter layer a plurality of times.
8. A non-transitory computer-readable storage medium on which is stored a computer program including machine-readable instructions for training an artificial neural network, the artificial neural network including a convolutional neural network layer configured to determine an equivariant feature mapping having a plurality of output channels, the instructions, when executed by a computer, causing the computer to perform the following steps: operating the convolutional neural network layer to produce the plurality of output channels; operating an intermediate filter layer of the artificial neural network after the convolutional neural network layer, the intermediate filter layer receiving the plurality of output channels and performing steps of: grouping channels of the plurality of output cha
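The two-stage data preparation recited in claim 6 can be sketched as follows. This is an illustration under stated assumptions, not the patented procedure: images are modeled as nested lists of pixel values, the helper names, the binary labels (1 for relevant, 0 for unrelated), and the `detector` callable standing in for the stage-one network are all hypothetical, and smoothing is omitted for brevity.

```python
def mirror(image):
    """Flip an image left to right."""
    return [row[::-1] for row in image]


def rotate90(image):
    """Rotate an image by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def reduce_contrast(image, factor=0.5):
    """Pull every pixel toward the image mean by `factor`."""
    pixels = [p for row in image for p in row]
    mean = sum(pixels) / len(pixels)
    return [[mean + factor * (p - mean) for p in row] for row in image]


def build_first_training_set(collected, unrelated):
    """Collected images, their transformed copies, and unrelated images."""
    transforms = (mirror, rotate90, reduce_contrast)
    modified = [t(image) for image in collected for t in transforms]
    positives = [(image, 1) for image in collected + modified]
    negatives = [(image, 0) for image in unrelated]
    return positives + negatives


def build_second_training_set(first_set, candidates, detector):
    """Add images that the stage-one detector wrongly flags as relevant.

    `candidates` are images known NOT to show the target objects;
    `detector(image)` returns True when the network reports a detection.
    """
    false_positives = [(image, 0) for image in candidates if detector(image)]
    return first_set + false_positives
```

The second set reuses the whole first set and appends only the false positives, so the second training stage concentrates on the examples the first stage got wrong.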
Combinations of networks · CPC title
Correlation function computation {including computation of convolution operations (arithmetic circuits for sum of products per se, e.g. multiply-accumulators G06F7/5443; digital filters, e.g. FIR, IIR, adaptive filters H03H17/00)} · CPC title
Hyperparameter optimisation; Meta-learning; Learning-to-learn · CPC title
Supervised learning · CPC title
Convolutional networks [CNN, ConvNet] · CPC title