Compression for deep learning in case of sparse values mapped to non-zero value

US2019197420A1 · US · A1

Patent metadata
Publication number: US-2019197420-A1
Application number: US-201715853457-A
Country: US
Kind code: A1
Filing date: Dec 22, 2017
Priority date: Dec 22, 2017
Publication date: Jun 27, 2019
Grant date: (none listed)

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Embodiments described herein provide a processing apparatus comprising compute logic to generate neural network data for a convolutional neural network (CNN) and write the neural network data to a memory buffer. The compute logic additionally includes a direct memory access (DMA) controller including a hardware codec having an encode unit and a decode unit, the DMA controller to read the neural network data from the memory buffer, encode the neural network data via the encode unit, write encoded neural network data to a memory device coupled with the processing apparatus, write metadata for the encoded neural network data to the memory device coupled with the processing apparatus, and decode encoded neural network data via the decode unit in response to a request from the compute logic.
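The data flow in the abstract (encode on write, keep metadata apart from the payload, decode on read) can be illustrated with a minimal Python sketch. This is not the patented hardware design: the `Memory` and `DmaCodec` classes, the dictionary-backed stores, and the choice of a non-zero bitmap as metadata are all assumptions made purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Memory:
    """Hypothetical memory device: payload and metadata live in separate stores."""
    data: dict = field(default_factory=dict)
    metadata: dict = field(default_factory=dict)


class DmaCodec:
    """Toy stand-in for a DMA controller with an encode unit and a decode unit."""

    def __init__(self, memory):
        self.memory = memory

    def write(self, address, buffer):
        # Encode unit (assumed scheme): keep only the non-zero values, in order.
        payload = [v for v in buffer if v != 0]
        # Metadata (assumed layout): a bitmap marking which positions were non-zero.
        bitmap = [int(v != 0) for v in buffer]
        self.memory.data[address] = payload
        self.memory.metadata[address] = bitmap

    def read(self, address):
        # Fetch the metadata first, then decode the payload against it.
        bitmap = self.memory.metadata[address]
        values = iter(self.memory.data[address])
        return [next(values) if bit else 0 for bit in bitmap]


# Usage: the compute logic writes a sparse feature map and reads it back.
mem = Memory()
dma = DmaCodec(mem)
feature_map = [0, 0, 3, 0, 7, 0, 0, 1]
dma.write(0x100, feature_map)
restored = dma.read(0x100)
```

In this sketch the compute side never sees the encoded form: it hands a plain buffer to `write` and receives a plain buffer from `read`, which mirrors the transparent role the abstract assigns to the DMA controller.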

First claim

Opening claim text (preview).

What is claimed is:

1. A processing apparatus comprising: compute logic to generate neural network data for a convolutional neural network (CNN) and write the neural network data to a memory buffer; and a direct memory access (DMA) controller including a hardware codec having an encode unit and a decode unit, the DMA controller to read the neural network data from the memory buffer, encode the neural network data via the encode unit, write encoded neural network data to a memory device coupled with the processing apparatus, write metadata for the encoded neural network data to the memory device coupled with the processing apparatus, and decode encoded neural network data via the decode unit in response to a request from the compute logic.

2. The processing apparatus as in claim 1, wherein: the compute logic to request the DMA controller to read the encoded neural network data from the memory device; in response to the request from the compute logic the DMA controller to pre-fetch the metadata for the encoded neural network data; and the DMA controller to decode the encoded neural network data based on the pre-fetched metadata.

3. The processing apparatus as in claim 1, wherein the neural network data includes feature map data and kernel data.

4. The processing apparatus as in claim 3, wherein the hardware codec is to encode the feature map data using an encode mode selected from a set of multiple encode modes.

5. The processing apparatus as in claim 4, wherein the set of multiple encode modes include an encode mode to the neural network data in a reduced-bit representation via encode of two or more of unique absolute values, non-zero values, and residual values.

6. The processing apparatus as in claim 5, wherein the set of multiple encode modes additionally include an encode mode to encode an arithmetic sequence of values in a reduced bit representation.

7. The processing apparatus as in claim 6, wherein the set of multiple encode modes additionally include an encode mode to encode a neural network data having a high frequency value in a reduced bit representation.

8. The processing apparatus as in claim 1, wherein the compute logic to generate the neural network data is compute logic within a general-purpose graphics processing unit.

9. A method of performing processing operations to enable a convolutional neural network (CNN) the method comprising: decoding encoded kernel data for the CNN while reading the encoded kernel data from memory; generating feature map data for a layer of the CNN via compute logic within a general-purpose graphics processing unit using decoded kernel data; encoding the feature map data for the layer of the CNN via hardware encode logic within a direct memory access (DMA) controller during a write to memory; decoding encoded feature map data while reading the encoded feature map data from memory; and processing the feature map data as input feature map data for a next layer of the CNN, wherein decoding the encoded kernel data includes pre-fetching metadata associated with the encoded kernel data, the metadata associated with the encoded kernel data stored separately from the encoded kernel data.

10. The method as in claim 9, wherein decoding the encoded feature map data includes pre-fetching metadata associated with the encoded feature map data, the metadata associated with the encoded feature map data stored separately from the encoded feature map data.

11. The method as in claim 9, additionally comprising decoding the feature map data via hardware decode logic within the DMA controller.

12. The method as in claim 9, additionally comprising encoding the feature map data via hardware encode logic using one or more encode modes selected from a set of multiple encode modes, wherein the set of multiple encode modes includes include encode modes to store kernel data or feature map data in a reduced-bit representation via encode of two or more of unique absolute values, non-zero values, and residual values and wherein the set of multiple encode modes additionally include an encode mode to encode an arithmetic sequence of values in a reduced-bit representation.

13. The method as in claim 9, additionally comprising encoding kernel data for the CNN via the hardware encode logic within the DMA controller.

14. The method as in claim 13, wherein encoding the kernel data for the CNN includes: analyzing the kernel data to encode; determining that a high-frequency value of the kernel data is mapped to a zero value; storing the high-frequency value to a block of encoded kernel data; encoding the kernel data into the block of encoded kernel data, the encoded kernel data including the map of the high-frequency value; and writing encoded kernel data and the metadata associated with the encoded kernel data to memory.

15. A data processing system configured to perform operations to enable a convolutional neural network (CNN), the data processing system comprising: a memory device to store feature map data for the CNN; a non-volatile storage device to persistently store kernel data for the CNN; a processor including a general-purpose graphics processor compute block and a DMA controller; wherein the general-purpose graphics processor compute block is to generate output feature map data for the CNN using the kernel data and write the output feature map data to a memory buffer within the processor; and wherein the DMA controller includes a hardware codec including an encode unit to automatically encode the output feature map data during a write of the output feature map data to the memory device.

16. The data processing system as in claim 15, wherein the DMA controller, via the encode unit of the hardware codec, is to write the feature map data to the memory device in an encoded format including one or more encode modes selected from a set of multiple encode modes, the set of multiple encode modes including encode modes based on two or more of unique absolute value encoding, arithmetic sequence encoding, significance map encoding, unique value coordinate encoding, and mean encoding value encoding.

17. The data processing system as in claim 16, wherein the DMA controller includes a hardware codec including a decode unit to automatically decode output feature map to be read from the memory device.

18. The data processing system as in claim 17, the kernel data for the CNN stored on the non-volatile storage device as encoded kernel data, the encoded kernel data to be loaded to the memory device, and the DMA controller to automatically decode the encoded kernel data during a read of the encoded kernel data from the memory device.

19. The data processing system as in claim 18, the DMA controller to automatically encode kernel data during a write to the memory device.

20. The data processing system as in claim 19, wherein to automatically encode the kernel data, the DMA controller is to: analyze the kernel data to encode; determine that a high-frequency value of the kernel data is mapped to a zero value; generate a map of each instance of the high-frequency value in the kernel data; encode the kernel data into encoded kernel data, the encoded kernel data including the map of each instance of the high frequency value; and write the encoded kernel data and metadata associated with the encoded kernel data to memory, the metadata written to a different block of memory than the encoded kernel data.
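Two of the encode modes named in the claims lend themselves to a short illustration: mapping out a high-frequency value (claims 14 and 20) and storing an arithmetic sequence compactly (claim 6). The Python sketch below is an assumption-laden reading of those claims, not the encoding the patent specifies: the function names, the dictionary block layout, and the `(start, step, count)` triple are all hypothetical.

```python
from collections import Counter


def encode_high_frequency(values):
    """Store the most frequent value once, plus a map of where it occurs."""
    if not values:
        return {"common": None, "map": [], "rest": []}
    common, _ = Counter(values).most_common(1)[0]
    # Map of each instance of the high-frequency value (1 = instance).
    value_map = [int(v == common) for v in values]
    # Only the remaining values need to be stored individually.
    rest = [v for v in values if v != common]
    return {"common": common, "map": value_map, "rest": rest}


def decode_high_frequency(block):
    """Rebuild the original values from the map and the stored remainder."""
    rest = iter(block["rest"])
    return [block["common"] if bit else next(rest) for bit in block["map"]]


def encode_arithmetic(values):
    """Return (start, step, count) if values form an arithmetic sequence, else None."""
    if len(values) < 2:
        return None
    step = values[1] - values[0]
    if all(b - a == step for a, b in zip(values, values[1:])):
        return (values[0], step, len(values))
    return None


def decode_arithmetic(triple):
    """Expand a (start, step, count) triple back into the full sequence."""
    start, step, count = triple
    return [start + i * step for i in range(count)]


# Usage: a kernel block dominated by the non-zero value 5.
kernel = [5, 5, 2, 5, -1, 5, 5, 3]
block = encode_high_frequency(kernel)
sequence = encode_arithmetic([4, 7, 10, 13])
```

The high-frequency value here is deliberately non-zero, matching the case named in the patent title: the map takes over the role that a plain zero bitmap would play for ordinary sparse data. An encoder in this style would presumably try each mode and keep whichever yields the smallest block, but that selection logic is outside what this sketch shows.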

Assignees

Intel Corp

Inventors

Classifications

  • G06N3/045 (Primary)

    Combinations of networks · CPC title

  • Processor architectures; Processor configuration, e.g. pipelining · CPC title

  • G06F13/28 (Primary)

    using burst mode transfer, e.g. direct memory access {DMA}, cycle steal (G06F13/32 takes precedence) · CPC title

  • Learning methods · CPC title

  • modifying the architecture, e.g. adding, deleting or silencing nodes or connections · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2019197420A1 cover?
Embodiments described herein provide a processing apparatus comprising compute logic to generate neural network data for a convolutional neural network (CNN) and write the neural network data to a memory buffer. The compute logic additionally includes a direct memory access (DMA) controller including a hardware codec having an encode unit and a decode unit, the DMA controller to read the neural…
Who is the assignee on this patent?
Intel Corp
What technology area does this patent fall under?
Primary CPC classification G06N3/045. Mapped technology areas include Physics.
When was this patent published?
Publication date Jun 27, 2019 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).