Compression of fully connected / recurrent layers of deep network(s) through enforcing spatial locality to weight matrices and effecting frequency compression

US11977974B2 · US · B2

Patent metadata
Field               Value
Publication number  US-11977974-B2
Application number  US-201715827465-A
Country             US
Kind code           B2
Filing date         Nov 30, 2017
Priority date       Nov 30, 2017
Publication date    May 7, 2024
Grant date          May 7, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A system, having a memory that stores computer executable components, and a processor that executes the computer executable components, reduces data size in connection with training a neural network by exploiting spatial locality to weight matrices and effecting frequency transformation and compression. A receiving component receives neural network data in the form of a compressed frequency-domain weight matrix. A segmentation component segments the initial weight matrix into original sub-components, wherein respective original sub-components have spatial weights. A sampling component applies a generalized weight distribution to the respective original sub-components to generate respective normalized sub-components. A transform component applies a transform to the respective normalized sub-components. A cropping component crops high-frequency weights of the respective transformed normalized sub-components to yield a set of low-frequency normalized sub-components to generate a compressed representation of the original sub-components.
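The segment, transform, crop, and reconstruct steps summarized in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the patented implementation: the function names, the square tile size `block`, the number of retained coefficients `keep`, and the choice of an orthonormal DCT-II are all hypothetical choices made for this example.

```python
import numpy as np


def dct_matrix(n):
    """Orthonormal DCT-II basis matrix C, so that C @ x is the DCT of x."""
    k = np.arange(n).reshape(-1, 1)  # frequency index (rows)
    i = np.arange(n).reshape(1, -1)  # spatial index (columns)
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)       # rescale the DC row so that C is orthonormal
    return c


def compress_block(block, keep):
    """Apply a 2-D DCT to a square block and keep the low-frequency corner."""
    c = dct_matrix(block.shape[0])
    freq = c @ block @ c.T           # low frequencies end up in the top-left corner
    return freq[:keep, :keep].copy() # crop away the high-frequency coefficients


def decompress_block(low, size):
    """Zero-pad the retained coefficients to size x size and invert the DCT."""
    padded = np.zeros((size, size))
    padded[:low.shape[0], :low.shape[1]] = low
    c = dct_matrix(size)
    return c.T @ padded @ c          # inverse of an orthonormal transform


def compress_matrix(weights, block, keep):
    """Segment a weight matrix into block x block tiles and compress each tile."""
    rows, cols = weights.shape
    if rows % block or cols % block:
        raise ValueError("matrix dimensions must be multiples of the block size")
    return {
        (r, c): compress_block(weights[r:r + block, c:c + block], keep)
        for r in range(0, rows, block)
        for c in range(0, cols, block)
    }


def decompress_matrix(tiles, shape, block):
    """Rebuild an approximate spatial weight matrix from the compressed tiles."""
    out = np.zeros(shape)
    for (r, c), low in tiles.items():
        out[r:r + block, c:c + block] = decompress_block(low, block)
    return out


# Usage: a smooth (spatially correlated) 16 x 16 matrix, 8 x 8 tiles, 4 x 4 kept.
smooth = np.add.outer(np.linspace(0, 1, 16), np.linspace(0, 1, 16))
tiles = compress_matrix(smooth, block=8, keep=4)
approx = decompress_matrix(tiles, smooth.shape, block=8)
print("stored coefficients:", sum(t.size for t in tiles.values()), "of", smooth.size)
print("max abs error:", float(np.abs(approx - smooth).max()))
```

In this sketch each 8 x 8 tile is stored as 16 coefficients instead of 64. Because the transform is orthonormal, the reconstruction error equals the energy of the discarded coefficients, which is why weights that vary smoothly within a tile tolerate more aggressive cropping.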

First claim

Opening claim text (preview).

What is claimed is:

1. A system for compressing data during neural network training, comprising: a memory that stores computer executable components and neural network data; a processor that executes computer executable components stored in the memory, wherein the computer executable components: generate a weight matrix, wherein the weight matrix comprises respective weights to be applied to a neural network, and wherein the weight matrix is initialized based on a degree of spatial correlation at the beginning of the training of the neural network, and wherein generation comprises: providing corner weights of sub-blocks of respective original sub-components from a distribution of random numbers; and employing bilinear interpolation to fill up remaining values; and wherein the computer-executable components comprise: a receiving component that receives neural network data in the form of the weight matrix; a segmentation component that: segments the weight matrix into a plurality of original sub-components, wherein respective original sub-components in the plurality of original sub-components comprise a subset of the respective weights in the weight matrix, and wherein selected weights of the respective original sub-components within a defined region have values indicative of a degree of spatial correlation with one another; a transform component that applies a transform to the respective original sub-components resulting in data concentrated in a first frequency segment; and a cropping component that crops a second frequency segment comprising high-frequency weights from respective transformed sub-components to generate compressed representations of original sub-components during training of the neural network, wherein the system for compressing data during neural network training results in greater compression ratio at a defined accuracy; an inverse transform component that performs an inverse transform on the data in the first frequency segment and a remaining area that is padded with zeros, wherein the first frequency segment comprises low frequency data and the inverse transform yields a data block of spatial weights that are an approximate representation of the respective original sub-component, and wherein the system: trains the neural network employing the data block of spatial weights that are an approximate representation of the respective original sub-component; determines whether a same training accuracy is achieved as a training task wherein compression is not applied and: continue neural network training if a same training accuracy is achieved; or vary the degree of spatial correlation and re-initialize the weight matrix based on the degree of spatial correlation if a same training accuracy is not achieved as a training task wherein compression is not applied.

2. The system of claim 1, wherein low-frequency weights are located in a first region of the transformed original sub-components and high-frequency weights are located in a second region of the transformed original sub-components, wherein the first region is located in a corner of the respective transformed original sub-components.

3. The system of claim 1, wherein the inverse transform component applies an inverse discrete cosine transform function to transform the data in the first frequency segment and a remaining area that is padded with zeros to a spatial domain.

4. The system of claim 1, further comprising a communication component that transmits the compressed representations of original sub-components.

5. A computer-implemented method, comprising employing a processor and memory to execute computer executable components to perform the following acts comprising: generating a weight matrix, wherein the weight matrix comprises respective weights to be applied to a neural network, and wherein the weight matrix is initialized based on a degree of spatial correlation at the beginning of training of the neural network, and wherein the generating comprises: providing corner weights of sub-blocks of respective original sub-components from a distribution of random numbers; and employing bilinear interpolation to fill up remaining values; receiving neural network data in the form of the weight matrix, and wherein spatial locality is present in ones of the respective weights in a defined region; segmenting the weight matrix into a plurality of original sub-components, wherein respective original sub-components comprise a subset of the weights in the weight matrix, and wherein selected weights of the respective original sub-components within a defined region have values indicative of a degree of spatial correlation with one another; applying a transform to respective original sub-components generating a matrix of frequency domain weights determined based on the spatial weights in the respective original sub-components; and cropping high-frequency weights of the respective transformed original sub-components while retaining low frequency weights generating a set of sub-components comprising the low-frequency weights padded with zeros, wherein the computer-implemented method compresses data during neural network training resulting in lower memory and bandwidth usage by a system employing the computer-implemented method; performing an inverse transform on the set of sub-components, wherein the set of sub-components comprises low frequency data and the inverse transform yields a data block of spatial weights that are an approximate representation of the respective original sub-component, and wherein the method further comprises: training the neural network employing the data block of spatial weights that are an approximate representation of the respective original sub-component; determining whether a same training accuracy is achieved as a training task wherein compression is not applied and: continuing neural network training if a same training accuracy is achieved; or varying spatial locality and re-initializing the weight matrix based on a degree of spatial correlation if a same training accuracy is not achieved as a training task wherein compression is not applied.

6. The method of claim 5, wherein the applying a transform comprises applying a discrete cosine transform.

7. The method of claim 5, wherein applying an inverse transform comprises applying an inverse discrete cosine transform function to transform the set of sub-components to a spatial domain.

8. The method of claim 5, wherein the set of sub-components are a compressed representation of the weight matrix.

9. A computer program product for compressing training data, the computer program product comprising a computer readable storage medium having program instructions embodied therewith, the program instructions executable by a processor to cause the processor to: generate a weight matrix, wherein the weight matrix comprises respective weights to be applied to a neural network, and wherein the weight matrix is initialized based on a degree of spatial correlation at the beginning of the training of the neural network, and wherein generation comprises: determining corner weights of sub-blocks of respective original sub-components from a distribution of random numbers; and employing bilinear interpolation to fill up remaining values; receive neural network data in the form of the weight matrix, and wherein spatial locality is present in ones of the respective weights near one another in a defined region; segment the weight matrix into original sub-components, wherein respective original sub-components comprise a subset of weights in the weight matrix, and wherein selected weights of the respective original sub-components within a defined region have values indicative of a
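The weight-matrix generation step recited in the claims (corner weights drawn from a distribution of random numbers, remaining values filled in by bilinear interpolation) might look roughly like the sketch below. Several details are assumptions made for illustration and are not taken from the patent text: the function name `bilinear_init`, the use of a normal distribution with standard deviation `scale`, the sharing of corners between neighbouring sub-blocks, and the use of the sub-block size `sub_block` as the knob that sets the degree of spatial correlation.

```python
import numpy as np


def bilinear_init(rows, cols, sub_block, rng, scale=0.1):
    """Spatially correlated weights: random corners, bilinear fill in between.

    Corners sit every `sub_block` entries (an assumption of this sketch).
    Larger sub-blocks give smoother, more strongly correlated weights, and
    sub_block=1 degenerates to ordinary independent random initialization.
    """
    # Number of corner points needed to cover each axis.
    n_r = (rows + sub_block - 2) // sub_block + 1
    n_c = (cols + sub_block - 2) // sub_block + 1
    corners = rng.normal(0.0, scale, size=(n_r, n_c))

    # Position of every weight in corner-grid coordinates.
    y = np.arange(rows) / sub_block
    x = np.arange(cols) / sub_block
    y0 = np.floor(y).astype(int)
    x0 = np.floor(x).astype(int)
    y1 = np.minimum(y0 + 1, n_r - 1)
    x1 = np.minimum(x0 + 1, n_c - 1)
    fy = (y - y0)[:, None]  # fractional offset from the upper corner
    fx = (x - x0)[None, :]  # fractional offset from the left corner

    # Interpolate along columns first, then along rows.
    top = corners[np.ix_(y0, x0)] * (1 - fx) + corners[np.ix_(y0, x1)] * fx
    bottom = corners[np.ix_(y1, x0)] * (1 - fx) + corners[np.ix_(y1, x1)] * fx
    return top * (1 - fy) + bottom * fy


# Usage: compare how quickly neighbouring weights change for two sub-block sizes.
for s in (1, 8):
    w = bilinear_init(64, 64, s, np.random.default_rng(0))
    step = float(np.abs(np.diff(w, axis=1)).mean())
    print(f"sub_block={s}: mean |difference between horizontal neighbours| = {step:.4f}")
```

A training loop in the spirit of the claims would compare the accuracy reached with compressed weights against an uncompressed baseline and, if the two do not match, re-initialize with a different degree of spatial correlation. How that degree is varied is not specified in the text shown here, so it is left out of the sketch.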

Assignees

Inventors

Classifications

  • G06N3/098 · Primary

    Distributed learning, e.g. federated learning · CPC title

  • Convolutional networks [CNN, ConvNet] · CPC title

  • Quantised networks; Sparse networks; Compressed networks · CPC title

  • G06N3/08 · Primary

    Learning methods · CPC title

  • Discrete orthonormal transforms, e.g. discrete cosine transform, discrete sine transform, and variations therefrom, e.g. modified discrete cosine transform, integer transforms approximating the discrete cosine transform (G06F17/145 takes precedence) · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11977974B2 cover?
A system, having a memory that stores computer executable components, and a processor that executes the computer executable components, reduces data size in connection with training a neural network by exploiting spatial locality to weight matrices and effecting frequency transformation and compression. A receiving component receives neural network data in the form of a compressed frequency-dom…
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06N3/098. Mapped technology areas include Physics.
When was this patent published?
Publication date: May 7, 2024 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).