Highly Efficient Convolutional Neural Networks
US-2024119256-A1 · Apr 11, 2024 · US
US12547878B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12547878-B2 |
| Application number | US-202318486534-A |
| Country | US |
| Kind code | B2 |
| Filing date | Oct 13, 2023 |
| Priority date | Nov 14, 2017 |
| Publication date | Feb 10, 2026 |
| Grant date | Feb 10, 2026 |
Official abstract text for this publication.
The present disclosure is directed to new, more efficient neural network architectures. As one example, in some implementations, the neural network architectures of the present disclosure can include a linear bottleneck layer positioned structurally prior to and/or after one or more convolutional layers, such as, for example, one or more depthwise separable convolutional layers. As another example, in some implementations, the neural network architectures of the present disclosure can include one or more inverted residual blocks where the input and output of the inverted residual block are thin bottleneck layers, while an intermediate layer is an expanded representation. For example, the expanded representation can include one or more convolutional layers, such as, for example, one or more depthwise separable convolutional layers. A residual shortcut connection can exist between the thin bottleneck layers that serve as the input and output of the inverted residual block.
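The inverted residual block described in the abstract can be illustrated with a minimal pure-Python sketch. It is not the patented implementation: the function names (`pointwise`, `depthwise3x3`, `inverted_residual`), the 3x3 kernel size, the zero padding, and the random weights are all assumptions made for illustration. The expansion factor of six and the ReLU6 non-linearity are borrowed from dependent claims 5 and 6, and applying the shortcut only when the input and output shapes match is a further assumption.

```python
import random

def relu6(x):
    # ReLU6: clamp the value to the range [0, 6].
    return min(max(x, 0.0), 6.0)

def pointwise(fmap, weights, activation=None):
    """1x1 convolution. fmap is [H][W][C_in]; weights is [C_in][C_out]."""
    c_out = len(weights[0])
    out = []
    for row in fmap:
        out_row = []
        for pixel in row:
            vec = [sum(pixel[i] * weights[i][o] for i in range(len(pixel)))
                   for o in range(c_out)]
            if activation is not None:
                vec = [activation(v) for v in vec]
            out_row.append(vec)
        out.append(out_row)
    return out

def depthwise3x3(fmap, kernels, stride=1, activation=None):
    """3x3 depthwise convolution with zero padding of 1.
    Each channel is filtered by its own kernel; kernels is [C][3][3]."""
    h, w, c = len(fmap), len(fmap[0]), len(fmap[0][0])
    out = []
    for y in range(0, h, stride):
        out_row = []
        for x in range(0, w, stride):
            vec = []
            for ch in range(c):
                acc = 0.0
                for dy in range(3):
                    for dx in range(3):
                        yy, xx = y + dy - 1, x + dx - 1
                        if 0 <= yy < h and 0 <= xx < w:
                            acc += fmap[yy][xx][ch] * kernels[ch][dy][dx]
                vec.append(activation(acc) if activation else acc)
            out_row.append(vec)
        out.append(out_row)
    return out

def inverted_residual(fmap, w_expand, k_depthwise, w_project, stride=1):
    """Thin input -> expanded representation -> thin, linear output."""
    c_in, c_out = len(fmap[0][0]), len(w_project[0])
    expanded = pointwise(fmap, w_expand, activation=relu6)
    filtered = depthwise3x3(expanded, k_depthwise, stride, activation=relu6)
    projected = pointwise(filtered, w_project)  # linear bottleneck: no activation
    if stride == 1 and c_in == c_out:
        # Residual shortcut between the two thin bottleneck layers.
        return [[[a + b for a, b in zip(p_in, p_out)]
                 for p_in, p_out in zip(r_in, r_out)]
                for r_in, r_out in zip(fmap, projected)]
    return projected

def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

# Hypothetical sizes: a 5x5 feature map with 4 channels, expansion factor 6.
rng = random.Random(0)
c_in, expansion = 4, 6
c_mid = c_in * expansion
fmap = [[[rng.uniform(-1, 1) for _ in range(c_in)] for _ in range(5)]
        for _ in range(5)]
w_expand = random_matrix(c_in, c_mid, rng)
k_dw = [random_matrix(3, 3, rng) for _ in range(c_mid)]
w_project = random_matrix(c_mid, c_in, rng)

out = inverted_residual(fmap, w_expand, k_dw, w_project)
print(len(out), len(out[0]), len(out[0][0]))
```

In this sketch the shortcut adds the thin input to the thin output, so only the low-dimensional tensors need to be kept around between blocks, while the wide expanded representation exists only inside the block.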
Opening claim text (preview).
What is claimed is:
1. A computing system comprising: one or more processors; and one or more non-transitory computer-readable media that store: a convolutional neural network configured to process an input image to extract features from the input image, the convolutional neural network comprising: a convolutional block comprising: a first expansion layer configured to expand a first input feature representation with a first input number of channels to a first intermediate number of channels higher than the first input number of channels; a first depthwise convolutional layer downstream of the first expansion layer that is configured to apply a first depthwise convolution to generate a first intermediate feature representation with the first intermediate number of channels; a first pointwise convolutional layer downstream of the first depthwise convolutional layer that is configured to apply a first pointwise convolution to generate a first output feature representation, wherein the first output feature representation comprises features that are linearly projected from the first intermediate number of channels to a first output number of channels lower than the first intermediate number of channels; and an inverted residual bottleneck block downstream of the convolutional block, a residual shortcut connection connecting an input and an output of the inverted residual bottleneck block, the inverted residual bottleneck block comprising: a second expansion layer configured to expand a second input feature representation with a second input number of channels to a second intermediate number of channels higher than the second input number of channels; a second depthwise convolutional layer downstream of the second expansion layer that is configured to apply a second depthwise convolution to generate a second intermediate feature representation with the second intermediate number of channels; and a second pointwise convolutional layer downstream of the second depthwise convolutional layer that is configured to apply a second pointwise convolution to generate a second output feature representation, wherein the second output feature representation comprises features that are linearly projected from the second intermediate number of channels to a second output number of channels lower than the second intermediate number of channels.
2. The computing system of claim 1, wherein the inverted residual bottleneck block is ordered immediately subsequent to the convolutional block.
3. The computing system of claim 2, wherein the convolutional block comprises a residual shortcut connection connecting an input and an output of the convolutional block.
4. The computing system of claim 1, wherein the second output number of channels of the inverted residual bottleneck block comprises one-half as many channels as an output number of channels of a downstream inverted residual bottleneck block.
5. The computing system of claim 1, wherein the second intermediate number of channels is six times as many channels as the second input number of channels.
6. The computing system of claim 1, wherein the second depthwise convolutional layer is configured to apply a relu6 operator.
7. The computing system of claim 1, wherein the one or more non-transitory computer-readable media store: a machine-learned model configured to process a feature map generated by the convolutional neural network.
8. The computing system of claim 7, wherein the machine-learned model is configured to perform an image processing task using the feature map.
9. The computing system of claim 8, wherein the image processing task comprises recognizing objects in the input image.
10. The computing system of claim 1, wherein the one or more non-transitory computer-readable media store instructions that are executable by the one or more processors to cause the computing system to perform operations comprising: generating a feature map by processing an image using the convolutional neural network; and processing the feature map using a machine-learned model to recognize objects in the image.
11. The computing system of claim 10, wherein the operations comprise: receiving the image from a client computing device; and returning an output generated using the convolutional neural network as part of a web service.
12. The computing system of claim 10, wherein the one or more non-transitory computer-readable media are included in a mobile computing device.
13. The computing system of claim 1, wherein the one or more non-transitory computer-readable media store instructions that are executable by the one or more processors to cause the computing system to perform operations comprising: generating a feature map by processing an image using the convolutional neural network; and processing the feature map to perform an image processing task.
14. The computing system of claim 1, wherein the first depthwise convolutional layer is configured to perform depthwise convolution with a stride of two.
15. The computing system of claim 1, wherein the input and the output of the inverted residual bottleneck block correspond to linear bottleneck layers.
16. A computer-implemented method, comprising: providing, by a computing system comprising one or more processors, an image as input to a convolutional neural network, the convolutional neural network comprising: a convolutional block comprising: a first expansion layer configured to expand a first input feature representation with a first input number of channels to a first intermediate number of channels higher than the first input number of channels; a first depthwise convolutional layer downstream of the first expansion layer that is configured to apply a first depthwise convolution to generate a first intermediate feature representation with the first intermediate number of channels; a first pointwise convolutional layer downstream of the first depthwise convolutional layer that is configured to apply a first pointwise convolution to generate a first output feature representation, wherein the first output feature representation comprises features that are linearly projected from the first intermediate number of channels to a first output number of channels lower than the first intermediate number of channels; and an inverted residual bottleneck block downstream of the convolutional block, a residual shortcut connection connecting an input and an output of the inverted residual bottleneck block, the inverted residual bottleneck block comprising: a second expansion layer configured to expand a second input feature representation with a second input number of channels to a second intermediate number of channels higher than the second input number of channels; a second depthwise convolutional layer downstream of the second expansion layer that is configured to apply a second depthwise convolution to generate a second intermediate feature representation with the second intermediate number of channels; and a second pointwise convolutional layer downstream of the second depthwise convolutional layer that is configured to apply a second pointwise convolution to generate a second output feature representation, wherein the second output feature representation comprises features that are linearly projected from the second intermediate number of channels to a second output number of channels lower than the second intermediate number of channels; and generating, by the computing system and using the convolutional neural network, a feature map from
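To make the channel bookkeeping in claim 1 concrete, the following sketch tracks tensor shapes and rough costs through a convolutional block followed by an inverted residual bottleneck block. Everything numeric here is hypothetical: the 56x56x24 input, the 32-channel outputs, the 3x3 depthwise kernel, and the "same"-style padding are illustrative assumptions, and `BlockSpec` and `block_cost` are made-up names. The expansion factor of six and the stride of two come from dependent claims 5 and 14. The cost counts ignore biases and normalization layers, and the sketch enables a shortcut only when the input and output shapes match, since an element-wise sum requires identical shapes.

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class BlockSpec:
    c_in: int            # input number of channels
    c_out: int           # output number of channels (thin bottleneck)
    expansion: int = 6   # intermediate channels = expansion * c_in
    stride: int = 1      # stride of the depthwise convolution
    kernel: int = 3      # assumed depthwise kernel size

    @property
    def c_mid(self):
        return self.c_in * self.expansion

    @property
    def has_shortcut(self):
        # Assumption: shortcut only when input and output shapes match.
        return self.stride == 1 and self.c_in == self.c_out

def block_cost(spec, h, w):
    """Return (output shape, multiply-adds, weight count) for one block."""
    h_out = math.ceil(h / spec.stride)
    w_out = math.ceil(w / spec.stride)
    macs = (
        h * w * spec.c_in * spec.c_mid                      # 1x1 expansion
        + h_out * w_out * spec.c_mid * spec.kernel ** 2     # depthwise
        + h_out * w_out * spec.c_mid * spec.c_out           # 1x1 linear projection
    )
    params = (
        spec.c_in * spec.c_mid
        + spec.c_mid * spec.kernel ** 2
        + spec.c_mid * spec.c_out
    )
    return (h_out, w_out, spec.c_out), macs, params

# Hypothetical two-block stack mirroring the order recited in claim 1.
blocks = [
    ("convolutional block", BlockSpec(c_in=24, c_out=32, stride=2)),
    ("inverted residual bottleneck block", BlockSpec(c_in=32, c_out=32)),
]

h, w = 56, 56
report = []
for name, spec in blocks:
    (h, w, c), macs, params = block_cost(spec, h, w)
    report.append((name, (h, w, c), macs, params, spec.has_shortcut))
    print(f"{name}: out={h}x{w}x{c} mid_channels={spec.c_mid} "
          f"macs={macs} params={params} shortcut={spec.has_shortcut}")
```

Under these assumptions the first block halves the spatial resolution and changes the channel count, so it carries no shortcut, while the second block preserves the shape and can add its input to its output.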