What technology area does this patent fall under?

Primary CPC classification G06N3/0495. Mapped technology areas include Physics.

When was this patent published?

Publication date: Mar 5, 2019 (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Model compression and fine-tuning

US10223635B2 · US · B2

Patent metadata
Field	Value
Publication number	US-10223635-B2
Application number	US-201514846579-A
Country	US
Kind code	B2
Filing date	Sep 4, 2015
Priority date	Jan 22, 2015
Publication date	Mar 5, 2019
Grant date	Mar 5, 2019

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Compressing a machine learning network, such as a neural network, includes replacing one layer in the neural network with compressed layers to produce the compressed network. The compressed network may be fine-tuned by updating weight values in the compressed layer(s).
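To make the memory-footprint idea in the abstract concrete, the following is a minimal sketch, not the patented procedure. It assumes a fully connected layer replaced by two narrower layers with a rectifier between them; the function names, the 1024-wide layer, and the hidden width of 64 are all hypothetical choices for illustration.

```python
def dense_params(n_in, n_out, bias=True):
    """Number of stored values in one fully connected layer."""
    return n_in * n_out + (n_out if bias else 0)


def compressed_params(n_in, n_out, hidden, bias=True):
    """Stored values when the layer is replaced by two narrower layers
    (n_in -> hidden -> n_out). The two-layer split is an assumption."""
    return dense_params(n_in, hidden, bias) + dense_params(hidden, n_out, bias)


def relu(v):
    """Rectifier, one of the activation functions named in the claims."""
    return [max(0.0, x) for x in v]


def linear(weights, biases, v):
    """Apply one layer; weights holds one row per output neuron."""
    return [sum(w * x for w, x in zip(row, v)) + b
            for row, b in zip(weights, biases)]


def compressed_forward(w1, b1, w2, b2, v):
    """Forward pass through two compressed layers with a nonlinearity
    inserted between them."""
    return linear(w2, b2, relu(linear(w1, b1, v)))


original = dense_params(1024, 1024)
compressed = compressed_params(1024, 1024, hidden=64)
print(original, compressed)
```

In this sketch the compressed pair stores fewer values than the original layer whenever the hidden width is small relative to the layer's input and output widths. The weights of such layers would then be fine-tuned; that training step is not shown here.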

First claim

Opening claim text (preview).

What is claimed is:
1. A method of compressing a neural network, comprising: replacing at least one layer in the neural network with a plurality of compressed layers to produce a compressed neural network, a dimension of each compressed layer being less than a dimension of the at least one layer, and a size of a memory footprint of the compressed neural network being less than a memory footprint of the neural network; inserting nonlinearity between adjacent compressed layers of the compressed neural network by applying a nonlinear activation function to neurons of the plurality of compressed layers, such that the plurality of compressed layers are nonlinear layers; and fine-tuning the compressed neural network by updating weight values in at least one of the plurality of compressed layers.
2. The method of claim 1, in which the nonlinear activation function is a rectifier, absolute value function, hyperbolic tangent function or a sigmoid function.
3. The method of claim 1, in which the fine-tuning is performed by updating the weight values in the compressed neural network.
4. The method of claim 3, in which the fine-tuning comprises updating weight values in at least one of a subset of the plurality of compressed layers or in a subset of uncompressed layers.
5. The method of claim 3, in which the fine-tuning is performed using training examples, the training examples comprising at least one of a first set of examples used to train an uncompressed neural network or a new set of examples.
6. The method of claim 1, further comprising: initializing the neural network by repeatedly applying compression, insertion of nonlinear layers, and the fine-tuning as a method for initializing deeper neural networks.
7. A method of compressing a neural network, comprising: replacing at least one layer in the neural network with multiple compressed layers to produce a compressed neural network such that a receptive field size of the multiple compressed layers combined matches a receptive field size of uncompressed layers, a dimension of each of the multiple compressed layers being less than a dimension of the at least one layer, and a size of a memory footprint of the compressed neural network being less than a memory footprint of the neural network; inserting nonlinearity between adjacent compressed layers of the compressed neural network by applying a nonlinear activation function to neurons of the plurality of compressed layers, such that the plurality of compressed layers are nonlinear layers; and fine-tuning the compressed neural network by updating weight values in at least one compressed layer.
8. The method of claim 7, in which a kernel size of the uncompressed layers is equal to the receptive field size.
9. The method of claim 7, in which the replacing comprises replacing at least one layer in the neural network having a kernel size $k_x \times k_y$ with the multiple compressed layers of a same type with the kernel sizes $k_{1x} \times k_{1y}$, $k_{2x} \times k_{2y}$, ..., $k_{Lx} \times k_{Ly}$ to produce the compressed network in which properties $(k_{1x}-1)+(k_{2x}-1)+\cdots+(k_{Lx}-1)=(k_x-1)$ and $(k_{1y}-1)+(k_{2y}-1)+\cdots+(k_{Ly}-1)=(k_y-1)$ are satisfied, and in which L is an Lth layer.
10. The method of claim 9, in which a convolutional layer with the kernel size $k_x \times k_y$ is replaced with three convolutional layers with the kernel sizes $1 \times 1$, $k_x \times k_y$ and $1 \times 1$, respectively.
11. A method of compressing a neural network, comprising: replacing at least one layer in the neural network with a plurality of compressed layers to produce a compressed neural network, a dimension of each compressed layer being less than a dimension of the at least one layer, and a size of a memory footprint of the compressed neural network being less than a memory footprint of the neural network; determining weight matrices of the plurality of compressed layers by applying an alternating minimization process; and inserting nonlinearity between adjacent compressed layers of the compressed neural network by applying a nonlinear activation function to neurons of the plurality of compressed layers, such that the plurality of compressed layers are nonlinear layers.
12. The method of claim 11, further comprising fine-tuning the compressed neural network by updating weight values in at least one of the plurality of compressed layers.
13. The method of claim 12, in which the fine-tuning includes updating weight values in at least one of a subset of the plurality of compressed layers, or a subset of uncompressed layers.
14. The method of claim 12, in which the fine-tuning is performed in multiple stages, in which in a first stage the fine-tuning is performed on a subset of the plurality of compressed layers, and in a second stage the fine-tuning is performed on a subset of the plurality of compressed layers and uncompressed layers.
15. An apparatus for compressing a neural network, comprising: a memory; and at least one processor coupled to the memory, the at least one processor configured: to replace at least one layer in the neural network with a plurality of compressed layers to produce a compressed neural network, a dimension of each compressed layer being less than a dimension of the at least one layer, and a size of a memory footprint of the compressed neural network being less than a memory footprint of the neural network; to insert nonlinearity between adjacent compressed layers of the compressed neural network by applying a nonlinear activation function to neurons of the plurality of compressed layers, such that the plurality of compressed layers are nonlinear layers; and to fine-tune the compressed neural network by updating weight values in at least one compressed layer.
16. The apparatus of claim 15, in which the nonlinear activation function is a rectifier, absolute value function, hyperbolic tangent function or a sigmoid function.
17. The apparatus of claim 15, in which the at least one processor is further configured to perform the fine-tune by updating the weight values in the compressed neural network.
18. The apparatus of claim 17, in which the at least one processor is further configured to perform the fine-tuning by updating weight values in at least one of a subset of compressed layers or in a subset of uncompressed layers.
19. The apparatus of claim 17, in which the at least one processor is further configured to perform the fine-tuning by using training examples, the training examples comprising at least one of a first set of examples used to train an uncompressed neural network or a new set of examples.
20. The apparatus of claim 15, in which the at least one processor is further configured to initialize the neural network by repeatedly applying compression, insertion of nonlinear layers, and the fine-tuning as a method for initializing deeper neural networks.
21. An apparatus for compressing a neural network, comprising: a memory; and at least one processor coupled to the memory, the at least one processor configured: to replace at least one layer in the neural network with multiple compressed layers to produce a compressed neural network such that a receptive field size of the multiple compressed layers combined matches a receptive field size of uncompressed layers, a dimension of each of the multiple compressed layers being less than a dimension of the at least one layer, and a size of a memory footprint of the compressed neural network being less than a memory footprint of the neural network; to insert nonlinearity betw
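
The kernel-size condition recited in claim 9 is simple arithmetic and can be checked mechanically. The sketch below is illustrative only; the function name and the example kernel sizes are hypothetical, and it tests only the receptive-field equalities, not the other limitations of the claim.

```python
def preserves_receptive_field(original, replacements):
    """Return True when the stacked kernels satisfy
    sum(k_ix - 1) == k_x - 1 and sum(k_iy - 1) == k_y - 1.

    original     -- (k_x, k_y) kernel size of the layer being replaced
    replacements -- list of (k_ix, k_iy) kernel sizes of the new layers
    """
    k_x, k_y = original
    sum_x = sum(k_ix - 1 for k_ix, _ in replacements)
    sum_y = sum(k_iy - 1 for _, k_iy in replacements)
    return sum_x == k_x - 1 and sum_y == k_y - 1


# Two 3x3 layers in place of one 5x5 layer: (3-1) + (3-1) == 5-1
print(preserves_receptive_field((5, 5), [(3, 3), (3, 3)]))          # True

# The 1x1, k_x x k_y, 1x1 arrangement of claim 10, here with k = 5
print(preserves_receptive_field((5, 5), [(1, 1), (5, 5), (1, 1)]))  # True

# Two 3x3 layers do not cover a 7x7 kernel: 4 != 6
print(preserves_receptive_field((7, 7), [(3, 3), (3, 3)]))          # False
```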

Assignees

Qualcomm Inc

Inventors

Classifications

G06N3/0495 · Primary
Quantised networks; Sparse networks; Compressed networks · CPC title
G06N3/08
Learning methods · CPC title
G06N3/04 · Primary
Architecture, e.g. interconnection topology · CPC title
G06N3/082 · Primary
modifying the architecture, e.g. adding, deleting or silencing nodes or connections · CPC title
G06N3/0464
Convolutional networks [CNN, ConvNet] · CPC title

Patent family

Related publications grouped by family.

View patent family 55085908

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10223635B2 cover?

Compressing a machine learning network, such as a neural network, includes replacing one layer in the neural network with compressed layers to produce the compressed network. The compressed network may be fine-tuned by updating weight values in the compressed layer(s).

Who is the assignee on this patent?

Qualcomm Inc