Highly efficient convolutional neural networks

US12547878B2 · US · B2

Patent metadata
Field               Value
Publication number  US-12547878-B2
Application number  US-202318486534-A
Country             US
Kind code           B2
Filing date         Oct 13, 2023
Priority date       Nov 14, 2017
Publication date    Feb 10, 2026
Grant date          Feb 10, 2026

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The present disclosure is directed to new, more efficient neural network architectures. As one example, in some implementations, the neural network architectures of the present disclosure can include a linear bottleneck layer positioned structurally prior to and/or after one or more convolutional layers, such as, for example, one or more depthwise separable convolutional layers. As another example, in some implementations, the neural network architectures of the present disclosure can include one or more inverted residual blocks where the input and output of the inverted residual block are thin bottleneck layers, while an intermediate layer is an expanded representation. For example, the expanded representation can include one or more convolutional layers, such as, for example, one or more depthwise separable convolutional layers. A residual shortcut connection can exist between the thin bottleneck layers that play a role of an input and output of the inverted residual block.
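The block structure described in the abstract can be illustrated with a minimal sketch in plain Python. It is not the patented implementation. The function names, the height x width x channels list layout, the 3x3 kernel, stride 1, zero padding, and the placement of ReLU6 after the expansion and depthwise steps are all assumptions made for illustration, and normalization layers are omitted. The only properties taken from the text are the order expand, depthwise, project, the absence of a non-linearity on the projection (the "linear bottleneck"), and the shortcut between the thin input and output.

```python
from typing import List

Tensor = List[List[List[float]]]  # assumed layout: height x width x channels


def relu6(value: float) -> float:
    """Clamp a value to the range [0, 6]."""
    return min(max(value, 0.0), 6.0)


def pointwise(x: Tensor, weights: List[List[float]], activate: bool) -> Tensor:
    """1x1 convolution: weights[out][in] mixes channels at every pixel."""
    out = []
    for row in x:
        out_row = []
        for pixel in row:
            values = [sum(w * v for w, v in zip(w_out, pixel)) for w_out in weights]
            out_row.append([relu6(v) for v in values] if activate else values)
        out.append(out_row)
    return out


def depthwise3x3(x: Tensor, kernels: List[List[List[float]]]) -> Tensor:
    """3x3 depthwise convolution (stride 1, zero padding) followed by ReLU6.

    kernels[c][dy][dx] filters channel c only; channels are never mixed here.
    """
    h, w, c = len(x), len(x[0]), len(x[0][0])
    out = [[[0.0] * c for _ in range(w)] for _ in range(h)]
    for i in range(h):
        for j in range(w):
            for ch in range(c):
                acc = 0.0
                for dy in range(3):
                    for dx in range(3):
                        y, xx = i + dy - 1, j + dx - 1
                        if 0 <= y < h and 0 <= xx < w:
                            acc += kernels[ch][dy][dx] * x[y][xx][ch]
                out[i][j][ch] = relu6(acc)
    return out


def inverted_residual(x: Tensor, expand_w, dw_kernels, project_w) -> Tensor:
    """Thin input -> expanded representation -> thin output, plus shortcut."""
    hidden = pointwise(x, expand_w, activate=True)       # expansion
    hidden = depthwise3x3(hidden, dw_kernels)            # per-channel filtering
    out = pointwise(hidden, project_w, activate=False)   # linear bottleneck
    if len(out[0][0]) == len(x[0][0]):                   # shapes match: add shortcut
        out = [[[a + b for a, b in zip(po, pi)] for po, pi in zip(ro, ri)]
               for ro, ri in zip(out, x)]
    return out


# Toy weights chosen so the result is easy to check by hand (illustrative only).
x = [[[1.0, 2.0], [0.0, 3.0]], [[2.0, 1.0], [1.0, 1.0]]]          # 2x2 image, 2 channels
expand_w = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]        # 2 -> 4 channels
identity_kernel = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
dw_kernels = [identity_kernel] * 4
project_w = [[0.5, 0.0, 0.5, 0.0], [0.0, 0.5, 0.0, 0.5]]           # 4 -> 2 channels
y = inverted_residual(x, expand_w, dw_kernels, project_w)
print(y)
```

With these toy weights the expand, depthwise, and project steps reproduce the input, so the shortcut doubles every value. A trained network would of course use learned weights.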

First claim

Opening claim text (preview).

What is claimed is:

1. A computing system comprising: one or more processors; and one or more non-transitory computer-readable media that store: a convolutional neural network configured to process an input image to extract features from the input image, the convolutional neural network comprising: a convolutional block comprising: a first expansion layer configured to expand a first input feature representation with a first input number of channels to a first intermediate number of channels higher than the first input number of channels; a first depthwise convolutional layer downstream of the first expansion layer that is configured to apply a first depthwise convolution to generate a first intermediate feature representation with the first intermediate number of channels; a first pointwise convolutional layer downstream of the first depthwise convolutional layer that is configured to apply a first pointwise convolution to generate a first output feature representation, wherein the first output feature representation comprises features that are linearly projected from the first intermediate number of channels to a first output number of channels lower than the first intermediate number of channels; and an inverted residual bottleneck block downstream of the convolutional block, a residual shortcut connection connecting an input and an output of the inverted residual bottleneck block, the inverted residual bottleneck block comprising: a second expansion layer configured to expand a second input feature representation with a second input number of channels to a second intermediate number of channels higher than the second input number of channels; a second depthwise convolutional layer downstream of the second expansion layer that is configured to apply a second depthwise convolution to generate a second intermediate feature representation with the second intermediate number of channels; and a second pointwise convolutional layer downstream of the second depthwise convolutional layer that is configured to apply a second pointwise convolution to generate a second output feature representation, wherein the second output feature representation comprises features that are linearly projected from the second intermediate number of channels to a second output number of channels lower than the second intermediate number of channels.

2. The computing system of claim 1, wherein the inverted residual bottleneck block is ordered immediately subsequent to the convolutional block.

3. The computing system of claim 2, wherein the convolutional block comprises a residual shortcut connection connecting an input and an output of the convolutional block.

4. The computing system of claim 1, wherein the second output number of channels of the inverted residual bottleneck block comprises one-half as many channels as an output number of channels of a downstream inverted residual bottleneck block.

5. The computing system of claim 1, wherein the second intermediate number of channels is six times as many channels as the second input number of channels.

6. The computing system of claim 1, wherein the second depthwise convolutional layer is configured to apply a relu6 operator.

7. The computing system of claim 1, wherein the one or more non-transitory computer-readable media store: a machine-learned model configured to process a feature map generated by the convolutional neural network.

8. The computing system of claim 7, wherein the machine-learned model is configured to perform an image processing task using the feature map.

9. The computing system of claim 8, wherein the image processing task comprises recognizing objects in the input image.

10. The computing system of claim 1, wherein the one or more non-transitory computer-readable media store instructions that are executable by the one or more processors to cause the computing system to perform operations comprising: generating a feature map by processing an image using the convolutional neural network; and processing the feature map using a machine-learned model to recognize objects in the image.

11. The computing system of claim 10, wherein the operations comprise: receiving the image from a client computing device; and returning an output generated using the convolutional neural network as part of a web service.

12. The computing system of claim 10, wherein the one or more non-transitory computer-readable media are included in a mobile computing device.

13. The computing system of claim 1, wherein the one or more non-transitory computer-readable media store instructions that are executable by the one or more processors to cause the computing system to perform operations comprising: generating a feature map by processing an image using the convolutional neural network; and processing the feature map to perform an image processing task.

14. The computing system of claim 1, wherein the first depthwise convolutional layer is configured to perform depthwise convolution with a stride of two.

15. The computing system of claim 1, wherein the input and the output of the inverted residual bottleneck block correspond to linear bottleneck layers.

16. A computer-implemented method, comprising: providing, by a computing system comprising one or more processors, an image as input to a convolutional neural network, the convolutional neural network comprising: a convolutional block comprising: a first expansion layer configured to expand a first input feature representation with a first input number of channels to a first intermediate number of channels higher than the first input number of channels; a first depthwise convolutional layer downstream of the first expansion layer that is configured to apply a first depthwise convolution to generate a first intermediate feature representation with the first intermediate number of channels; a first pointwise convolutional layer downstream of the first depthwise convolutional layer that is configured to apply a first pointwise convolution to generate a first output feature representation, wherein the first output feature representation comprises features that are linearly projected from the first intermediate number of channels to a first output number of channels lower than the first intermediate number of channels; and an inverted residual bottleneck block downstream of the convolutional block, a residual shortcut connection connecting an input and an output of the inverted residual bottleneck block, the inverted residual bottleneck block comprising: a second expansion layer configured to expand a second input feature representation with a second input number of channels to a second intermediate number of channels higher than the second input number of channels; a second depthwise convolutional layer downstream of the second expansion layer that is configured to apply a second depthwise convolution to generate a second intermediate feature representation with the second intermediate number of channels; and a second pointwise convolutional layer downstream of the second depthwise convolutional layer that is configured to apply a second pointwise convolution to generate a second output feature representation, wherein the second output feature representation comprises features that are linearly projected from the second intermediate number of channels to a second output number of channels lower than the second intermediate number of channels; and generating, by the computing system and using the convolutional neural network, a feature map from
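The channel relationships recited in the claims (expansion to a higher intermediate channel count, a depthwise step that keeps that count, and a projection to a lower output count) can be traced with a small bookkeeping sketch. Everything named here (`BlockSpec`, `trace_block`, the field names) is hypothetical. The default expansion factor of six follows claim 5 and the stride-two option follows claim 14, while the 3x3 kernel, "same"-style padding, and the multiply-accumulate formulas are standard counting conventions assumed for illustration rather than taken from the claims.

```python
from dataclasses import dataclass


@dataclass
class BlockSpec:
    """Hypothetical description of one expand -> depthwise -> project block."""
    in_channels: int
    out_channels: int
    expansion: int = 6   # claim 5: intermediate channels = 6 x input channels
    stride: int = 1      # claim 14 mentions a stride of two for one depthwise layer
    kernel: int = 3      # assumption: 3x3 depthwise kernel


def trace_block(spec: BlockSpec, height: int, width: int) -> dict:
    """Return channel counts, output size, and multiply-accumulate estimates."""
    mid = spec.in_channels * spec.expansion
    if not spec.in_channels < mid:
        raise ValueError("expansion must raise the channel count")
    if not spec.out_channels < mid:
        raise ValueError("projection must lower the channel count")

    # Assumption: padding keeps size at stride 1 and rounds up at stride 2.
    out_h = -(-height // spec.stride)
    out_w = -(-width // spec.stride)

    expand_macs = height * width * spec.in_channels * mid
    depthwise_macs = out_h * out_w * mid * spec.kernel * spec.kernel
    project_macs = out_h * out_w * mid * spec.out_channels

    return {
        "intermediate_channels": mid,
        "output_shape": (out_h, out_w, spec.out_channels),
        "macs": expand_macs + depthwise_macs + project_macs,
        # A shortcut needs matching input and output shapes.
        "shortcut_possible": spec.stride == 1
        and spec.in_channels == spec.out_channels,
    }


# Illustrative sizes only; they do not come from the patent.
same_shape = trace_block(BlockSpec(in_channels=4, out_channels=4), 8, 8)
downsample = trace_block(BlockSpec(in_channels=4, out_channels=8, stride=2), 8, 8)
print(same_shape)
print(downsample)
```

The second call doubles the output channel count while halving the spatial size, which mirrors the one-half channel relationship between successive blocks in claim 4. The shortcut is reported as unavailable there because the input and output shapes differ.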

Assignees

Google LLC

Inventors

Classifications

  • G06N3/08 (Primary)

    Learning methods · CPC title

  • G06N3/04 (Primary)

    Architecture, e.g. interconnection topology · CPC title

  • Combinations of networks · CPC title

  • Activation functions · CPC title

  • Supervised learning · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12547878B2 cover?
The present disclosure is directed to new, more efficient neural network architectures. As one example, in some implementations, the neural network architectures of the present disclosure can include a linear bottleneck layer positioned structurally prior to and/or after one or more convolutional layers, such as, for example, one or more depthwise separable convolutional layers. As anothe…
Who is the assignee on this patent?
Google LLC
What technology area does this patent fall under?
Primary CPC classification G06N3/08. Mapped technology areas include Physics.
When was this patent published?
Publication date: Feb 10, 2026 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).