Systems and methods for deep learning processor
US-2017316312-A1 · Nov 2, 2017 · US
US11442889B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11442889-B2 |
| Application number | US-201816146886-A |
| Country | US |
| Kind code | B2 |
| Filing date | Sep 28, 2018 |
| Priority date | Sep 28, 2018 |
| Publication date | Sep 13, 2022 |
| Grant date | Sep 13, 2022 |
Official abstract text for this publication.
Methods and systems for dynamically reconfiguring a deep learning processor by operating the deep learning processor using a first configuration. The deep learning processor then tracks one or more parameters of a deep learning program executed using the deep learning processor in the first configuration. The deep learning processor then reconfigures the deep learning processor to a second configuration to enhance efficiency of the deep learning processor executing the deep learning program based at least in part on the one or more parameters.
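The operate, track, and reconfigure sequence in the abstract can be pictured as a simple control loop. The following is a minimal sketch only, not the patented implementation: the function names, the `zero_fraction` parameter, and the configuration labels "A" and "B" are all hypothetical, since the abstract does not name an API or a specific parameter.

```python
from typing import Callable, Dict, List

# Hypothetical type: tracked parameters keyed by name.
Params = Dict[str, float]


def run_with_reconfiguration(
    batches: List[List[float]],
    execute: Callable[[str, List[float]], Params],
    choose: Callable[[str, Params], str],
    first_config: str,
) -> List[str]:
    """Operate, track, reconfigure. Returns the configuration used for each batch."""
    config = first_config
    used = []
    for batch in batches:
        used.append(config)
        params = execute(config, batch)  # operate and track parameters
        config = choose(config, params)  # possibly switch to another configuration
    return used


# Toy stand-ins (assumptions for illustration only).
def toy_execute(config: str, batch: List[float]) -> Params:
    zeros = sum(1 for x in batch if x == 0)
    return {"zero_fraction": zeros / len(batch)}


def toy_choose(config: str, params: Params) -> str:
    # Switch to configuration "B" when more than half the data is zero.
    return "B" if params["zero_fraction"] > 0.5 else "A"


used = run_with_reconfiguration(
    [[0, 0, 1], [1, 2, 3], [4, 5, 6]], toy_execute, toy_choose, "A"
)
print(used)
```

In this toy run the first batch is mostly zeros, so the second batch is processed under configuration "B" before the loop falls back to "A".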
Opening claim text (preview).
What is claimed is:

1. An integrated circuit device comprising: a deep learning processor comprising: a matrix execution circuitry that couples to and configures one or more processing elements of a programmable fabric, wherein the matrix execution circuitry comprises configuration memory to store configurations and to load the configurations as configuring the one or more processing elements; and an instruction-based controller configured to: monitor one or more parameters of a deep learning algorithm implemented in the deep learning processor, wherein the one or more parameters comprises sparsity of data analyzed by the deep learning processor; and based at least in part on the one or more monitored parameters, reconfigure the one or more processing elements to increase efficiency of the deep learning processor.

2. The integrated circuit device of claim 1, wherein the deep learning processor is implemented in a programmable logic device.

3. The integrated circuit device of claim 2, wherein the programmable logic device comprises a programmable fabric that includes the one or more processing elements.

4. The integrated circuit device of claim 3, wherein reconfiguring the one or more processing elements comprises partially reconfiguring the programmable fabric at runtime without reconfiguring the entire programmable fabric.

5. The integrated circuit device of claim 2, wherein the programmable logic device comprises a field-programmable gate array.

6. The integrated circuit device of claim 1, wherein the deep learning processor comprises an external data management sub-system that controls an interface with an external device external to the integrated circuit device using a programmable fabric.

7. The integrated circuit device of claim 6, wherein the instruction-based controller configures the interface via the external data management sub-system to include to compress transmissions to the external device.

8. The integrated circuit device of claim 6, wherein the instruction-based controller configures the interface via the external data management sub-system to include to cryptographically secure transmissions to the external device.

9. The integrated circuit device of claim 1, wherein reconfiguring the one or more processing elements comprises reconfiguring connections between the one or more processing elements.

10. The integrated circuit device of claim 1, wherein reconfiguring the one or more processing elements comprises internal portions of the processing elements.

11. The integrated circuit device of claim 10, wherein reconfiguring the one or more processing elements comprises selecting an engine between a standard dot product engine and a standardized dot product engine and configuring the one or more processing elements with the selected engine.

12. The integrated circuit device of claim 11, wherein when the data analyzed comprises a sparse matrix, reconfiguring the one or more processing elements includes sparse support circuitry of the one or more processing elements that compresses the sparse matrix for processing by one or more processing elements.

13. A method comprising: operating deep learning processor using a first configuration; tracking one or more parameters of a deep learning program using the deep learning processor in the first configuration, wherein the one or more parameters comprises sparsity of data analyzed by the deep learning processor; and reconfiguring one or more processing elements of a programmable fabric of the deep learning processor to a second configuration to enhance efficiency of the deep learning processor executing the deep learning program based at least in part on the one or more parameters.

14. The method of claim 13, wherein tracking the one or more parameters comprises latency of the deep learning processor.

15. The method of claim 14, wherein reconfiguring the deep learning processor comprises reconfiguring the deep learning processor to decrease latency when the latency has exceeded a threshold by reconfiguring the deep learning processor to arrange a broadcast configuration of one or more processing elements in a parallel configuration.

16. The method of claim 14, wherein the one or more parameters comprises a stall in execution of the deep learning program, and reconfiguring the deep learning processor comprises reconfiguring the deep learning processor to increase throughput when throughput has dropped below a threshold by reconfiguring the deep learning processor to arrange a two-dimensional systolic configuration of one or more processing elements.

17. Tangible, non-transitory, and computer-readable medium having instructions stored thereon instructions, that when executed, are configured to cause a deep learning processor to: configure the deep learning processor in a first configuration; operate the deep learning processor using the first configuration; track one or more parameters of a deep learning program using the deep learning processor in the first configuration, wherein the one or more parameters comprises sparsity of data analyzed by the deep learning processor; and based on the one or more tracked parameters, reconfigure one or more processing elements of a programmable fabric to a second configuration to enhance efficiency of the deep learning processor executing the deep learning program.

18. The tangible, non-transitory, and computer-readable medium of claim 17, wherein the deep learning processor comprises a multi-function sub-system that controls configuration of multiple functions in the one or more processing elements, wherein a first configuration comprises a series different functions implemented in the one or more processing elements, and a second configuration comprises parallel execution of a single function in the one or more processing elements.
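The claims name sparsity, latency, and throughput as monitored parameters and tie them to particular reconfigurations (sparse-matrix compression in claim 12, a parallel broadcast arrangement in claim 15, a two-dimensional systolic arrangement in claim 16). The sketch below illustrates one way such a policy could be expressed in software. It is an assumption-laden illustration, not the claimed circuitry: the threshold values, the configuration labels, the priority order of the checks, and the coordinate-list compression format are all hypothetical and do not come from the patent text.

```python
from typing import List, Tuple

Matrix = List[List[float]]

# Hypothetical thresholds; the claims refer to thresholds but give no values.
SPARSITY_THRESHOLD = 0.5
LATENCY_LIMIT_MS = 10.0
THROUGHPUT_FLOOR = 100.0


def sparsity(matrix: Matrix) -> float:
    """Fraction of zero-valued entries (0.0 = fully dense, 1.0 = all zeros)."""
    total = sum(len(row) for row in matrix)
    if total == 0:
        return 0.0
    zeros = sum(1 for row in matrix for value in row if value == 0)
    return zeros / total


def compress(matrix: Matrix) -> List[Tuple[int, int, float]]:
    """Coordinate-list compression: keep (row, col, value) for non-zero entries only."""
    return [
        (r, c, value)
        for r, row in enumerate(matrix)
        for c, value in enumerate(row)
        if value != 0
    ]


def select_configuration(
    current: str, sparsity_value: float, latency_ms: float, throughput: float
) -> str:
    """Pick a configuration label; the order of the checks is an assumption."""
    if latency_ms > LATENCY_LIMIT_MS:
        return "parallel_broadcast"  # cf. claim 15: reduce latency
    if throughput < THROUGHPUT_FLOOR:
        return "systolic_2d"  # cf. claim 16: raise throughput
    if sparsity_value > SPARSITY_THRESHOLD:
        return "sparse_support"  # cf. claim 12: compress sparse data
    return current


weights = [[0, 0], [0, 1]]
s = sparsity(weights)
print(s, compress(weights), select_configuration("dense", s, 5.0, 200.0))
```

A hardware implementation would make this decision in the instruction-based controller and apply it by loading a stored configuration into the processing elements. The Python version only shows the decision logic.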
the bridge chips being embedded in the package substrates, interposers or redistribution layers · CPC title
Package configurations · CPC title
between a chip and a stacked insulating package substrate, interposer or RDL · CPC title
Dispositions of multiple bumps · CPC title
changes in dispositions · CPC title