Highly efficient convolutional neural networks

US11734545B2 · US · B2

Patent metadata
Field               Value
Publication number  US-11734545-B2
Application number  US-201815898566-A
Country             US
Kind code           B2
Filing date         Feb 17, 2018
Priority date       Nov 14, 2017
Publication date    Aug 22, 2023
Grant date          Aug 22, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The present disclosure provides directed to new, more efficient neural network architectures. As one example, in some implementations, the neural network architectures of the present disclosure can include a linear bottleneck layer positioned structurally prior to and/or after one or more convolutional layers, such as, for example, one or more depthwise separable convolutional layers. As another example, in some implementations, the neural network architectures of the present disclosure can include one or more inverted residual blocks where the input and output of the inverted residual block are thin bottleneck layers, while an intermediate layer is an expanded representation. For example, the expanded representation can include one or more convolutional layers, such as, for example, one or more depthwise separable convolutional layers. A residual shortcut connection can exist between the thin bottleneck layers that play a role of an input and output of the inverted residual block.
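The block structure described in the abstract can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the patented implementation: the ReLU6 activation, the 3x3 kernel size, stride 1, zero padding, the nested-list tensor layout, and all function and variable names (`relu6`, `pointwise_conv`, `depthwise_conv3x3`, `inverted_residual_block`, `w_expand`, `k_depthwise`, `w_project`) are assumptions made for the example and do not appear on this page. What it tries to show is the thin-wide-thin shape: a 1x1 expansion, a depthwise spatial filter on the expanded representation, a linear (activation-free) 1x1 projection back to a thin bottleneck, and a shortcut that adds the thin input to the thin output.

```python
def relu6(v):
    """Clamp to [0, 6]; the choice of ReLU6 is an assumption of this sketch."""
    return min(max(v, 0.0), 6.0)


def pointwise_conv(x, weights, activation=None):
    """1x1 convolution over a [channels][height][width] nested list.

    weights[o][c] is the weight from input channel c to output channel o.
    """
    height, width = len(x[0]), len(x[0][0])
    out = []
    for row_weights in weights:
        plane = [[sum(row_weights[c] * x[c][i][j] for c in range(len(x)))
                  for j in range(width)] for i in range(height)]
        if activation is not None:
            plane = [[activation(v) for v in row] for row in plane]
        out.append(plane)
    return out


def depthwise_conv3x3(x, kernels, activation=None):
    """3x3 depthwise convolution (stride 1, zero padding), one kernel per channel."""
    height, width = len(x[0]), len(x[0][0])
    out = []
    for channel, kernel in zip(x, kernels):
        plane = [[0.0] * width for _ in range(height)]
        for i in range(height):
            for j in range(width):
                acc = 0.0
                for di in (-1, 0, 1):
                    for dj in (-1, 0, 1):
                        ii, jj = i + di, j + dj
                        if 0 <= ii < height and 0 <= jj < width:
                            acc += kernel[di + 1][dj + 1] * channel[ii][jj]
                plane[i][j] = activation(acc) if activation else acc
        out.append(plane)
    return out


def inverted_residual_block(x, w_expand, k_depthwise, w_project):
    """Thin input -> expanded representation -> thin linear bottleneck + shortcut."""
    expanded = pointwise_conv(x, w_expand, relu6)               # more channels
    filtered = depthwise_conv3x3(expanded, k_depthwise, relu6)  # spatial filtering
    bottleneck = pointwise_conv(filtered, w_project)            # linear: no activation
    # Residual shortcut between the thin input and the thin output.
    return [[[a + b for a, b in zip(row_x, row_b)]
             for row_x, row_b in zip(plane_x, plane_b)]
            for plane_x, plane_b in zip(x, bottleneck)]


# Toy input: 2 channels of 2x2 pixels, expanded to 4 channels inside the block.
x = [[[1.0, 2.0], [3.0, 4.0]], [[0.5, -1.0], [2.0, 0.0]]]
w_expand = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]]
identity = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
k_depthwise = [identity] * 4
w_project = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]

y = inverted_residual_block(x, w_expand, k_depthwise, w_project)
print(len(y), len(y[0]), len(y[0][0]))  # channels, height, width: 2 2 2
```

The output has the same thin shape as the input, which is what allows the shortcut to be a plain element-wise addition. With an all-zero `w_project`, the block reduces to the identity through the shortcut.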

First claim

Opening claim text (preview).

What is claimed is:

1. A computing system, comprising: one or more processors; and one or more non-transitory computer-readable media that store a convolutional neural network implemented by the one or more processors, the convolutional neural network comprising: one or more inverted residual blocks, each of the one or more inverted residual blocks comprising: one or more expansion convolutional layers configured to produce, from an input with an initial number of channels, an expanded representation with an expanded number of channels, the expanded number of channels being larger than the initial number of channels; a linear convolutional bottleneck layer downstream of the one or more expansion convolutional layers, the linear convolutional bottleneck layer configured to produce a bottleneck tensor with a reduced number of channels lower than the expanded number of channels; and a residual shortcut connection between the linear convolutional bottleneck layer and one or more of: a subsequent linear convolutional bottleneck layer of a subsequent inverted residual block of the one or more inverted residual blocks, or a previous linear convolutional bottleneck layer of a previous inverted residual block of the one or more inverted residual blocks.

2. The computing system of claim 1, wherein the one or more expansion convolutional layers comprise a spatial convolutional layer that filters the expanded representation, and wherein the linear convolutional bottleneck layer projects the filtered expanded representation to the reduced number of channels.

3. The computing system of claim 2, wherein the one or more expansion convolutional layers comprise one or more separable convolutional layers.

4. The computing system of claim 3, wherein each of the one or more separable convolutional layers is configured to separately apply both a depthwise convolution and a pointwise convolution during processing of an input to such separable convolutional layer to generate a layer output.

5. The computing system of claim 1, wherein the one or more inverted residual blocks comprise a plurality of inverted residual blocks arranged in a stack one after the other.

6. The computing system of claim 1, wherein, for each of the one or more inverted residual blocks, the linear convolutional bottleneck layer is structurally subsequent to the one or more convolutional layers.

7. The computing system of claim 1, wherein the computing system is configured to receive an input image and to generate an output for the input image.

8. The computing system of claim 1, wherein the inverted residual block comprises a residual shortcut connection between the linear convolutional bottleneck layer and the subsequent linear convolutional bottleneck layer of the subsequent inverted residual block.

9. The computing system of claim 1, wherein the inverted residual block comprises a residual shortcut connection between the linear convolutional bottleneck layer and the previous linear convolutional bottleneck layer of the previous inverted residual block.

10. The computing system of claim 8, wherein the subsequent inverted residual block is an immediately subsequent inverted residual block of the one or more inverted residual blocks.

11. The computing system of claim 9, wherein the previous inverted residual block is an immediately previous inverted residual block of the one or more inverted residual blocks.

12. One or more non-transitory computer-readable media that store a convolutional neural network configured to be implemented by one or more processors, the convolutional neural network comprising: one or more inverted residual blocks, each of the one or more inverted residual blocks comprising: one or more expansion convolutional layers configured to produce, from an input with an initial number of channels, an expanded representation with an expanded number of channels, the expanded number of channels being larger than the initial number of channels; a linear convolutional bottleneck layer downstream of the one or more expansion convolutional layers, the linear convolutional bottleneck layer configured to produce a bottleneck tensor with a reduced number of channels lower than the expanded number of channels; and a residual shortcut connection between the linear convolutional bottleneck layer and one or more of: a next linear convolutional bottleneck layer of a subsequent inverted residual block of the one or more inverted residual blocks, or a previous linear convolutional bottleneck layer of a previous inverted residual block of the one or more inverted residual blocks.

13. The one or more non-transitory computer-readable media of claim 10, wherein the one or more expansion convolutional layers comprise a spatial convolutional layer that filters the expanded representation, and wherein the linear convolutional bottleneck layer projects the filtered expanded representation to the reduced number of channels.

14. The one or more non-transitory computer-readable media of claim 11, wherein the one or more expansion convolutional layers comprise one or more separable convolutional layers.

15. The one or more non-transitory computer-readable media of claim 12, wherein each of the one or more separable convolutional layers is configured to separately apply both a depthwise convolution and a pointwise convolution during processing of an input to such separable convolutional layer to generate a layer output.

16. The one or more non-transitory computer-readable media of claim 10, wherein, for each of the one or more inverted residual blocks, the linear convolutional bottleneck layer is structurally subsequent to the one or more convolutional layers.

17. A method, comprising: processing, by a computing system comprising one or more processors, image data using a neural network, the neural network comprising one or more inverted residual blocks, each of the one or more inverted residual blocks comprising: one or more expansion convolutional layers configured to produce, from an input with an initial number of channels, an expanded representation with an expanded number of channels, the expanded number of channels being larger than the initial number of channels; a linear convolutional bottleneck layer downstream of the one or more expansion convolutional layers, the linear convolutional bottleneck layer configured to produce a bottleneck tensor with a reduced number of channels lower than the expanded number of channels; and a residual shortcut connection between the linear convolutional bottleneck layer and one or more of: a next linear convolutional bottleneck layer of a subsequent inverted residual block of the one or more inverted residual blocks, or a previous linear convolutional bottleneck layer of a previous inverted residual block of the one or more inverted residual blocks.

18. The method of claim 17, wherein the one or more expansion convolutional layers comprise a spatial convolutional layer that filters the expanded representation, and wherein the linear convolutional bottleneck layer projects the filtered expanded representation to the reduced number of channels.

19. The method of claim 18, wherein the one or more expansion convolutional layers comprise one or more separable convolutional layers.

20. The method of claim 19, wherein each of the one or more separable convolutional layers is configured to separately apply both a depthwise convolution and a pointwise convolution during processing of an input to such separable
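Several dependent claims refer to separable convolutional layers that apply a depthwise convolution and a pointwise convolution separately. A rough way to see why such layers are often described as efficient is to count weights. The sketch below assumes a square k x k kernel and ignores bias terms; the function names and the example sizes (3x3 kernel, 64 input channels, 128 output channels) are hypothetical and are not taken from the patent.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def separable_conv_params(k, c_in, c_out):
    """Weights in a depthwise k x k convolution followed by a 1x1 pointwise one."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1x1 convolution that mixes channels
    return depthwise + pointwise


standard = standard_conv_params(3, 64, 128)
separable = separable_conv_params(3, 64, 128)
print(standard, separable)  # 73728 8768
```

For these example sizes the separable form uses roughly one eighth of the weights of the standard convolution; the exact ratio depends on the kernel size and channel counts chosen.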

Assignees

Google LLC

Inventors

Classifications

  • G06N3/0464 (Primary)

    Convolutional networks [CNN, ConvNet] · CPC title

  • Supervised learning · CPC title

  • Quantised networks; Sparse networks; Compressed networks · CPC title

  • using neural networks · CPC title

  • G06N3/08 (Primary)

    Learning methods · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11734545B2 cover?
The present disclosure provides directed to new, more efficient neural network architectures. As one example, in some implementations, the neural network architectures of the present disclosure can include a linear bottleneck layer positioned structurally prior to and/or after one or more convolutional layers, such as, for example, one or more depthwise separable convolutional layers. As anothe…
Who is the assignee on this patent?
Google LLC
What technology area does this patent fall under?
Primary CPC classification G06N3/0464. Mapped technology areas include Physics.
When was this patent published?
Publication date: Aug 22, 2023 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).