Basic wavelet filtering for accelerated deep learning

US12169771B2 · US · B2

Patent metadata
Field               Value
Publication number  US-12169771-B2
Application number  US-202017764691-A
Country             US
Kind code           B2
Filing date         Oct 15, 2020
Priority date       Oct 16, 2019
Publication date    Dec 17, 2024
Grant date          Dec 17, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Techniques in wavelet filtering for advanced deep learning provide improvements in one or more of accuracy, performance, and energy efficiency. An array of processing elements comprising a portion of a neural network accelerator performs flow-based computations on wavelets of data. Each processing element comprises a compute element to execute programmed instructions using the data and a router to route the wavelets in accordance with virtual channel specifiers. Each processing element is enabled to perform local filtering of wavelets received at the processing element, selectively, conditionally, and/or optionally discarding zero or more of the received wavelets, thereby preventing further processing of the discarded wavelets. The wavelet filtering is performed by one or more configurable wavelet filters operable in various modes, such as counter, sparse, and range modes.
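The flow the abstract describes (route a wavelet by its virtual channel, filter it locally, then let the compute element consume whatever survives) can be sketched roughly as follows. This is an illustrative model only, not the patented hardware: the names `Wavelet`, `ProcessingElement`, `DiscardRule`, and `route` are hypothetical, the discard rule used in the example is arbitrary, and the counter, sparse, and range modes are not modeled because the abstract does not define their behavior.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Deque, Dict, Iterable, List


@dataclass(frozen=True)
class Wavelet:
    channel: int  # stands in for the virtual channel specifier (assumed to be an int)
    data: float   # payload consumed by the compute element


# Hypothetical filter interface: return True when the wavelet should be discarded.
DiscardRule = Callable[[Wavelet], bool]


@dataclass
class ProcessingElement:
    filters: List[DiscardRule] = field(default_factory=list)
    input_queue: Deque[Wavelet] = field(default_factory=deque)
    discarded: int = 0

    def receive(self, wavelet: Wavelet) -> bool:
        """Apply local filtering; a discarded wavelet never reaches the queue."""
        if any(rule(wavelet) for rule in self.filters):
            self.discarded += 1
            return False
        self.input_queue.append(wavelet)
        return True

    def compute(self) -> float:
        """Toy 'compute element': sum the payloads of all queued wavelets."""
        total = 0.0
        while self.input_queue:
            total += self.input_queue.popleft().data
        return total


def route(wavelets: Iterable[Wavelet], routes: Dict[int, ProcessingElement]) -> None:
    """Toy 'router': deliver each wavelet to the element mapped to its channel."""
    for wavelet in wavelets:
        target = routes.get(wavelet.channel)
        if target is not None:
            target.receive(wavelet)


if __name__ == "__main__":
    # Arbitrary example rule: discard everything arriving on channel 2.
    pe = ProcessingElement(filters=[lambda w: w.channel == 2])
    route([Wavelet(1, 1.5), Wavelet(2, 10.0), Wavelet(1, 2.5)], {1: pe, 2: pe})
    print(pe.discarded, pe.compute())
```

The point of the sketch is that filtering happens before queueing, so a discarded wavelet costs the compute element nothing.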

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: communicating packets between a plurality of processing elements coupled via a fabric, wherein each of the communicated packets comprises a respective color field; receiving, from the fabric, a particular one of the communicated packets in a particular one of the processing elements, wherein the color field of the particular packet is a particular color value; managing a wavelet filter of the particular processing element, wherein the wavelet filter is enabled to retain a state comprising a filtered color; and queueing information of the particular packet to enable execution, by the particular processing element, of an instruction that uses the queued information, wherein the queueing is conditional based on one or more queueing criteria and a first one of the queueing criteria is that the particular color value and the filtered color are different.

2. A method comprising: communicating packets between a plurality of processing elements coupled via a fabric, wherein each of the communicated packets comprises a respective color field; receiving, from the fabric, a particular one of the communicated packets in a particular one of the processing elements, wherein the color field of the particular packet is a particular color value; managing a wavelet filter of the particular processing element, wherein the wavelet filter is enabled to retain a state comprising a filtered color; and updating a counter, wherein: the updating is conditional based on one or more updating criteria, a first one of the updating criteria is that the particular color value and the filtered color are identical, the particular packet is associated, based on the particular color value, with one or more input queues, each of the input queues is configurable to operate in one of a plurality of mutually exclusive input queue operating modes, a second one of the updating criteria is that at least one of the input queues is configured to operate in a particular one of the input queue operating modes, the particular input queue operating mode is indicated by an indicator, the updating is responsive to the receiving, and the state further comprises the counter and the indicator.

3. A method comprising: communicating packets between a plurality of processing elements coupled via a fabric, wherein each of the communicated packets comprises a respective color field; receiving, from the fabric, a particular one of the communicated packets in a particular one of the processing elements, wherein the color field of the particular packet is a particular color value; managing a wavelet filter of the particular processing element, wherein the wavelet filter is enabled to retain a state comprising a filtered color; queueing information of the particular packet to enable execution, by the particular processing element, of an instruction that uses the queued information, wherein the queueing is conditional based on one or more queueing criteria and a first one of the queueing criteria is that the particular color value and the filtered color are different; and updating a counter, wherein: the updating is conditional based on one or more updating criteria and a first one of the updating criteria is that the particular color value and the filtered color are identical, the updating is responsive to the receiving, and the state further comprises the counter.

4. The method of claim 3, wherein the state further comprises a filter mode indicator enabled to indicate a mutually exclusive one of a plurality of filter operating modes and a second one of the queueing criteria is that the filter mode indicator indicates a particular one of the filter operating modes.

5. The method of claim 3, wherein the state further comprises a filter mode indicator enabled to indicate a mutually exclusive one of a plurality of filter operating modes and a second one of the updating criteria is that the filter mode indicator indicates a particular one of the filter operating modes.

6. The method of claim 3, wherein the state further comprises a filter mode indicator enabled to indicate a mutually exclusive one of a plurality of filter operating modes, a second one of the queueing criteria is that the filter mode indicator indicates a particular one of the filter operating modes, and a second one of the updating criteria is that the filter mode indicator indicates the particular filter operating mode.

7. The method of claim 4, claim 5, or claim 6, wherein the filter operating modes comprise a counter mode, a sparse mode, and a range mode.

8. The method of claim 3, wherein the state further comprises a limit and a second one of the queueing criteria is that the counter is one of less than the limit and equal to the limit.

9. The method of claim 8, wherein the counter is conditionally resettable based on a comparison to the limit.

10. The method of claim 9, wherein the limit is a first limit, the state further comprises a second limit, and the first limit is conditionally loadable with the second limit.

11. The method of claim 2 or claim 3, wherein the managing is a first managing, the wavelet filter is a first wavelet filter, the state is a first state, the filtered color is a first filtered color, the updating is a first updating, the counter is a first counter, and the updating criteria is a first updating criteria; and further comprising: a second managing a second wavelet filter of the particular processing element, wherein the second wavelet filter is enabled to retain a second state comprising a second filtered color; and a second updating a second counter, wherein the second updating is conditional based on one or more second updating criteria and a first one of the second updating criteria is that the particular color value and the second filtered color are identical, and the second updating is responsive to the receiving.

12. The method of claim 1, claim 2, or claim 3, wherein one or more configuration registers of the particular processing element comprise one or more portions of the state.

13. The method of claim 12, wherein at least one of the configuration registers is modifiable by a load control register instruction that is executable by the particular processing element.

14. The method of claim 12, wherein at least one of the configuration registers is memory-mapped and modifiable by at least one memory store instruction that is executable by the particular processing element.

15. The method of claim 12, wherein at least one of the configuration registers is modifiable via a system interface of the particular processing element.

16. The method of claim 1 or claim 3, wherein the execution by the particular processing element is implemented at least in part by a compute element that the particular processing element comprises.

17. The method of claim 3, further comprising, transmitting to the fabric a copy of the particular packet, as one of the communicated packets, to another one of the processing elements, wherein the transmitting is irrespective of the queueing criteria and irrespective of the updating criteria.

18. The method of claim 1 or claim 3, wherein the queued information comprises an integer data value and the instruction comprises an integer arithmetic instruction.

19. The method of claim 1 or claim 3, wherein the queued information comprises a floating-point data value and the instruction comprises a floating-point arithmetic instruction.

20. The method of claim 1 or claim 3 , wher
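One possible reading of the first queueing and updating criteria in claim 3 can be sketched as a small state machine. This is a simplified illustration under stated assumptions, not the claimed implementation: the claims do not say how multiple criteria combine, so the sketch applies only the first criterion of each kind. It also assumes that "updating" means incrementing by one, that a packet whose color matches the filtered color is not queued, and that the counter resets when it reaches the limit (one guess at the "conditionally resettable" language of claim 9). The names `Packet`, `WaveletFilterState`, and `on_receive` are hypothetical.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque


@dataclass(frozen=True)
class Packet:
    color: int    # the packet's color field
    data: float   # information that may be queued for a later instruction


@dataclass
class WaveletFilterState:
    filtered_color: int  # state retained by the filter (claims 1 to 3)
    counter: int = 0     # counter held in the state (claims 2 and 3)
    limit: int = 3       # hypothetical value; claim 8 only says a limit exists

    def on_receive(self, packet: Packet, input_queue: Deque[float]) -> bool:
        """Process one received packet; return True if its data was queued."""
        if packet.color != self.filtered_color:
            # First queueing criterion: the colors are different.
            input_queue.append(packet.data)
            return True
        # First updating criterion: the colors are identical.
        self.counter += 1  # assumption: update means +1
        if self.counter >= self.limit:
            self.counter = 0  # assumption: reset once the limit is reached
        return False


if __name__ == "__main__":
    queue: Deque[float] = deque()
    state = WaveletFilterState(filtered_color=5)
    for pkt in [Packet(1, 0.5), Packet(5, 9.0), Packet(5, 9.0), Packet(2, 1.5)]:
        state.on_receive(pkt, queue)
    print(list(queue), state.counter)
```

Under these assumptions, the two packets whose color differs from the filtered color are queued, and the two matching packets only advance the counter.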

Assignees

Inventors

Classifications

  • Convolutional networks [CNN, ConvNet] · CPC title

  • Quantised networks; Sparse networks; Compressed networks · CPC title

  • Supervised learning · CPC title

  • controlled by a single instruction for multiple threads [SIMT] in parallel · CPC title

  • Bus transfer protocol, e.g. handshake; Synchronisation · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12169771B2 cover?
Techniques in wavelet filtering for advanced deep learning provide improvements in one or more of accuracy, performance, and energy efficiency. An array of processing elements comprising a portion of a neural network accelerator performs flow-based computations on wavelets of data. Each processing element comprises a compute element to execute programmed instructions using the data and a router…
Who is the assignee on this patent?
Cerebras Systems Inc
What technology area does this patent fall under?
Primary CPC classification G06F17/148. Mapped technology areas include Physics.
When was this patent published?
Publication date Dec 17, 2024 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).