US11734545B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11734545-B2 |
| Application number | US-201815898566-A |
| Country | US |
| Kind code | B2 |
| Filing date | Feb 17, 2018 |
| Priority date | Nov 14, 2017 |
| Publication date | Aug 22, 2023 |
| Grant date | Aug 22, 2023 |
Official abstract text for this publication.
The present disclosure is directed to new, more efficient neural network architectures. As one example, in some implementations, the neural network architectures of the present disclosure can include a linear bottleneck layer positioned structurally prior to and/or after one or more convolutional layers, such as, for example, one or more depthwise separable convolutional layers. As another example, in some implementations, the neural network architectures of the present disclosure can include one or more inverted residual blocks where the input and output of the inverted residual block are thin bottleneck layers, while an intermediate layer is an expanded representation. For example, the expanded representation can include one or more convolutional layers, such as, for example, one or more depthwise separable convolutional layers. A residual shortcut connection can exist between the thin bottleneck layers that serve as the input and output of the inverted residual block.
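The data flow the abstract describes (thin input, expanded intermediate representation, linear bottleneck output, and a shortcut between the thin tensors) can be illustrated with a minimal pure-Python sketch. It is not the patent's implementation: the function names, the `[H][W][C]` nested-list layout, the 3x3 kernel size, zero padding, stride 1, and the ReLU nonlinearity on the expanded layers are all assumptions made for illustration. The only properties taken from the text are that the bottleneck projection is linear and that the shortcut joins the thin tensors.

```python
def pointwise(x, w):
    """1x1 convolution. x is [H][W][C_in]; w is [C_in][C_out]."""
    c_in, c_out = len(w), len(w[0])
    return [[[sum(px[i] * w[i][o] for i in range(c_in)) for o in range(c_out)]
             for px in row] for row in x]


def depthwise3x3(x, k):
    """3x3 depthwise convolution (stride 1, zero padding). k is [3][3][C].

    Each channel is filtered on its own; channels are never mixed here.
    """
    height, width, channels = len(x), len(x[0]), len(x[0][0])
    out = [[[0.0] * channels for _ in range(width)] for _ in range(height)]
    for r in range(height):
        for c in range(width):
            for dr in range(3):
                for dc in range(3):
                    rr, cc = r + dr - 1, c + dc - 1
                    if 0 <= rr < height and 0 <= cc < width:
                        for ch in range(channels):
                            out[r][c][ch] += x[rr][cc][ch] * k[dr][dc][ch]
    return out


def relu(x):
    """Elementwise ReLU (an assumed choice of nonlinearity)."""
    return [[[max(v, 0.0) for v in px] for px in row] for row in x]


def inverted_residual(x, w_expand, k_depthwise, w_project):
    """Thin input -> expanded representation -> linear bottleneck -> add shortcut."""
    expanded = relu(pointwise(x, w_expand))               # more channels than the input
    filtered = relu(depthwise3x3(expanded, k_depthwise))  # spatial filtering per channel
    bottleneck = pointwise(filtered, w_project)           # linear: no activation applied
    # Residual shortcut between the thin tensors (requires matching shapes).
    return [[[a + b for a, b in zip(px_in, px_out)]
             for px_in, px_out in zip(row_in, row_out)]
            for row_in, row_out in zip(x, bottleneck)]


if __name__ == "__main__":
    x = [[[1.0, 2.0], [3.0, 4.0]], [[5.0, 6.0], [7.0, 8.0]]]   # 2x2 image, 2 channels
    w_expand = [[0.5, -0.5, 1.0, 0.25], [1.0, 0.5, -1.0, 0.75]]  # 2 -> 4 channels
    k_dw = [[[0.1] * 4 for _ in range(3)] for _ in range(3)]     # one 3x3 filter per channel
    w_project = [[0.2, -0.1], [0.0, 0.3], [0.1, 0.1], [-0.2, 0.4]]  # 4 -> 2 channels
    y = inverted_residual(x, w_expand, k_dw, w_project)
    print(len(y), len(y[0]), len(y[0][0]))
```

Because the shortcut adds the block input to the bottleneck output, the two must have the same spatial size and channel count in this sketch; setting every projection weight to zero makes the block an identity map, which is a quick sanity check.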
Opening claim text (preview).
What is claimed is:
1. A computing system, comprising: one or more processors; and one or more non-transitory computer-readable media that store a convolutional neural network implemented by the one or more processors, the convolutional neural network comprising: one or more inverted residual blocks, each of the one or more inverted residual blocks comprising: one or more expansion convolutional layers configured to produce, from an input with an initial number of channels, an expanded representation with an expanded number of channels, the expanded number of channels being larger than the initial number of channels; a linear convolutional bottleneck layer downstream of the one or more expansion convolutional layers, the linear convolutional bottleneck layer configured to produce a bottleneck tensor with a reduced number of channels lower than the expanded number of channels; and a residual shortcut connection between the linear convolutional bottleneck layer and one or more of: a subsequent linear convolutional bottleneck layer of a subsequent inverted residual block of the one or more inverted residual blocks, or a previous linear convolutional bottleneck layer of a previous inverted residual block of the one or more inverted residual blocks.
2. The computing system of claim 1, wherein the one or more expansion convolutional layers comprise a spatial convolutional layer that filters the expanded representation, and wherein the linear convolutional bottleneck layer projects the filtered expanded representation to the reduced number of channels.
3. The computing system of claim 2, wherein the one or more expansion convolutional layers comprise one or more separable convolutional layers.
4. The computing system of claim 3, wherein each of the one or more separable convolutional layers is configured to separately apply both a depthwise convolution and a pointwise convolution during processing of an input to such separable convolutional layer to generate a layer output.
5. The computing system of claim 1, wherein the one or more inverted residual blocks comprise a plurality of inverted residual blocks arranged in a stack one after the other.
6. The computing system of claim 1, wherein, for each of the one or more inverted residual blocks, the linear convolutional bottleneck layer is structurally subsequent to the one or more convolutional layers.
7. The computing system of claim 1, wherein the computing system is configured to receive an input image and to generate an output for the input image.
8. The computing system of claim 1, wherein the inverted residual block comprises a residual shortcut connection between the linear convolutional bottleneck layer and the subsequent linear convolutional bottleneck layer of the subsequent inverted residual block.
9. The computing system of claim 1, wherein the inverted residual block comprises a residual shortcut connection between the linear convolutional bottleneck layer and the previous linear convolutional bottleneck layer of the previous inverted residual block.
10. The computing system of claim 8, wherein the subsequent inverted residual block is an immediately subsequent inverted residual block of the one or more inverted residual blocks.
11. The computing system of claim 9, wherein the previous inverted residual block is an immediately previous inverted residual block of the one or more inverted residual blocks.
12. One or more non-transitory computer-readable media that store a convolutional neural network configured to be implemented by one or more processors, the convolutional neural network comprising: one or more inverted residual blocks, each of the one or more inverted residual blocks comprising: one or more expansion convolutional layers configured to produce, from an input with an initial number of channels, an expanded representation with an expanded number of channels, the expanded number of channels being larger than the initial number of channels; a linear convolutional bottleneck layer downstream of the one or more expansion convolutional layers, the linear convolutional bottleneck layer configured to produce a bottleneck tensor with a reduced number of channels lower than the expanded number of channels; and a residual shortcut connection between the linear convolutional bottleneck layer and one or more of: a next linear convolutional bottleneck layer of a subsequent inverted residual block of the one or more inverted residual blocks, or a previous linear convolutional bottleneck layer of a previous inverted residual block of the one or more inverted residual blocks.
13. The one or more non-transitory computer-readable media of claim 10, wherein the one or more expansion convolutional layers comprise a spatial convolutional layer that filters the expanded representation, and wherein the linear convolutional bottleneck layer projects the filtered expanded representation to the reduced number of channels.
14. The one or more non-transitory computer-readable media of claim 11, wherein the one or more expansion convolutional layers comprise one or more separable convolutional layers.
15. The one or more non-transitory computer-readable media of claim 12, wherein each of the one or more separable convolutional layers is configured to separately apply both a depthwise convolution and a pointwise convolution during processing of an input to such separable convolutional layer to generate a layer output.
16. The one or more non-transitory computer-readable media of claim 10, wherein, for each of the one or more inverted residual blocks, the linear convolutional bottleneck layer is structurally subsequent to the one or more convolutional layers.
17. A method, comprising: processing, by a computing system comprising one or more processors, image data using a neural network, the neural network comprising one or more inverted residual blocks, each of the one or more inverted residual blocks comprising: one or more expansion convolutional layers configured to produce, from an input with an initial number of channels, an expanded representation with an expanded number of channels, the expanded number of channels being larger than the initial number of channels; a linear convolutional bottleneck layer downstream of the one or more expansion convolutional layers, the linear convolutional bottleneck layer configured to produce a bottleneck tensor with a reduced number of channels lower than the expanded number of channels; and a residual shortcut connection between the linear convolutional bottleneck layer and one or more of: a next linear convolutional bottleneck layer of a subsequent inverted residual block of the one or more inverted residual blocks, or a previous linear convolutional bottleneck layer of a previous inverted residual block of the one or more inverted residual blocks.
18. The method of claim 17, wherein the one or more expansion convolutional layers comprise a spatial convolutional layer that filters the expanded representation, and wherein the linear convolutional bottleneck layer projects the filtered expanded representation to the reduced number of channels.
19. The method of claim 18, wherein the one or more expansion convolutional layers comprise one or more separable convolutional layers.
20. The method of claim 19, wherein each of the one or more separable convolutional layers is configured to separately apply both a depthwise convolution and a pointwise convolution during processing of an input to such separable
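To make the channel bookkeeping in the claims concrete, the following sketch counts convolution weights for a standard convolution, for a depthwise plus pointwise (separable) convolution, and for one expand, filter, project block. The formulas are the standard weight counts for these layer types; the function names, the `expansion` multiplier, and the default 3x3 kernel are hypothetical choices for illustration and are not stated in the claims. Biases and normalization parameters are ignored.

```python
def standard_conv_weights(k, c_in, c_out):
    """Weights in a k x k convolution that mixes all input channels."""
    return k * k * c_in * c_out


def separable_conv_weights(k, c_in, c_out):
    """Depthwise k x k filter per channel, then a 1x1 pointwise mix."""
    depthwise = k * k * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise


def inverted_residual_weights(c_in, expansion, c_out, k=3):
    """1x1 expansion, k x k depthwise on the expanded channels, 1x1 linear projection.

    `expansion` is a hypothetical multiplier: expanded channels = c_in * expansion.
    """
    c_expanded = c_in * expansion
    expand = c_in * c_expanded
    depthwise = k * k * c_expanded
    project = c_expanded * c_out
    return expand + depthwise + project


if __name__ == "__main__":
    print(standard_conv_weights(3, 32, 64))
    print(separable_conv_weights(3, 32, 64))
    print(inverted_residual_weights(24, 6, 24))
```

The residual shortcut itself adds no weights; it only requires that the tensors being added have the same shape.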
Convolutional networks [CNN, ConvNet] · CPC title
Supervised learning · CPC title
Quantised networks; Sparse networks; Compressed networks · CPC title
using neural networks · CPC title
Learning methods · CPC title