Accelerator for processing data

US10509846B2 · US · B2

Patent metadata
Field: Value
Publication number: US-10509846-B2
Application number: US-201715840552-A
Country: US
Kind code: B2
Filing date: Dec 13, 2017
Priority date: Dec 13, 2017
Publication date: Dec 17, 2019
Grant date: Dec 17, 2019

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An accelerator for increasing the processing speed of a processor. The accelerator operates in two distinct modes. In a first mode for dense layer processing, row data sets and column data sets are sent to a multiplier for multiplication. In a second mode for sparse layer processing compressed row data sets are received by a row multiplexer and compressed column data sets are received by a column multiplexer. Each multiplexer is configured to compare the indexes of data sets with one another to determine matching indexes. When indexes match, the matching data sets are selected and sent to the multiplier for multiplication. When indexes do not match, data sets are stored in memory devices for subsequent cycles.
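The two modes in the abstract can be illustrated with a small software analogy. This is only a sketch under stated assumptions, not the patented hardware: the function names (`dense_dot`, `compress`, `sparse_dot`) are hypothetical, compressed data sets are modeled as `(index, value)` pairs sorted by index, and the rule of advancing past the smaller index is an assumption made for the example.

```python
from typing import List, Sequence, Tuple

# Hypothetical model of a compressed data set: (index, value) pairs,
# sorted by index, with zero entries left out.
Compressed = List[Tuple[int, float]]


def dense_dot(row: Sequence[float], col: Sequence[float]) -> float:
    """First mode (dense): every row/column pair goes to the multiplier."""
    return sum(r * c for r, c in zip(row, col))


def compress(vec: Sequence[float]) -> Compressed:
    """Keep only non-zero entries, tagged with their original index."""
    return [(i, v) for i, v in enumerate(vec) if v != 0]


def sparse_dot(row: Compressed, col: Compressed) -> float:
    """Second mode (sparse): multiply only when the indexes match."""
    acc = 0.0
    i = j = 0
    while i < len(row) and j < len(col):
        row_idx, row_val = row[i]
        col_idx, col_val = col[j]
        if row_idx == col_idx:
            # Matching indexes: both values are selected and multiplied.
            acc += row_val * col_val
            i += 1
            j += 1
        elif row_idx < col_idx:
            # Assumption: the entry with the smaller index cannot match
            # later, so move on; the other entry is held for the next step.
            i += 1
        else:
            j += 1
    return acc


row = [0, 2, 0, 3]
col = [4, 5, 0, 6]
print(dense_dot(row, col))                        # dense mode result
print(sparse_dot(compress(row), compress(col)))   # sparse mode result
```

Both calls produce the same dot product; the sparse path simply skips positions where either operand is zero, which is the saving the second mode is aiming for.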

First claim

Opening claim text (preview).

What is claimed:

1. An apparatus for accelerating processing of one or more processors, the apparatus comprising: at least one processing element having a multiplier that receives row data sets and column data sets while operating in a first mode; at least one row multiplexer and at least one row memory device coupled to the at least one processing element for selecting a row data set received by the multiplier based on matching a row data set index with a column data set index while operating in a second mode; and at least one column multiplexer and at least one column memory device coupled to the at least one processing element for selecting a column data set received by the multiplier based on matching a column data set index with a row data set index while operating in the second mode.

2. The apparatus of claim 1, wherein the at least one row memory device receives the row data set, and after receiving the row data set, the at least one row memory device bypasses the row data set through the at least one row memory device while operating in the second mode.

3. The apparatus of claim 1, wherein the at least one processing element receives the row data set, and after receiving the row data set, the at least one processing element prevents transmission of additional row data sets to the at least one row memory device while operating in the second mode.

4. The apparatus of claim 1, wherein the at least one row memory device receives the row data set, and after receiving the row data set, the at least one row memory device stores the row data set in the at least one row memory device while operating in the second mode.

5. The apparatus of claim 1, wherein the at least one row multiplexer comprises a first row multiplexer coupled to the at least one processing element and a second row multiplexer coupled to the at least one processing element.

6. The apparatus of claim 5, wherein the at least one row memory device comprises a first row memory device coupled to the first row multiplexer and the at least one processing element, and a second row memory device coupled to the second row multiplexer and the at least one processing element.

7. The apparatus of claim 6, wherein the at least one column multiplexer comprises a first column multiplexer coupled to the at least one processing element, and a second column multiplexer coupled to the at least one processing element.

8. The apparatus of claim 7, wherein the at least one column memory device comprises a first column memory device coupled to the first column multiplexer and the at least one processing element, and a second column memory device coupled to the second column multiplexer and the at least one processing element.

9. The apparatus of claim 1, wherein prior to the multiplier receiving the selected row data set and selected column data set, the at least one processing element compares the row data set index of the selected row data set to the column data set index of the selected column data set while operating in the second mode.

10. The apparatus of claim 9, wherein when the compared row data set index and column data set index do not match, the at least one processing element discards one of the selected row data set or selected column data set while operating in the second mode.

11. The apparatus of claim 10, wherein the at least one processing element discards one of the selected row data set or selected column data set based on the row data set index or the column data set index.

12. The apparatus of claim 1, wherein the at least one processing element is one of a plurality of processing elements.

13. The apparatus of claim 1, wherein the at least one processor is a processor of a neural network.

14. The apparatus of claim 1 wherein the at least one processing element has an accumulator that is coupled to the multiplier.

15. The apparatus of claim 1 wherein the at least one row memory device is a first in and first out memory device.

16. At least one non-transitory machine readable medium including instructions for increasing processing speed, the instructions, when executed by a machine, cause the machine to perform operations comprising: operate a processing element in a first mode, wherein operating in the first mode comprises: receiving row data sets and transmitting the row data sets to a multiplier of the processing element; and receiving column data sets and transmitting the column data sets to the multiplier; and operate the processing element in a second mode, wherein operating in the second mode comprises: using at least one row multiplexer and at least one row memory coupled to the processing element to select a row data set received by the multiplier based on matching a row data set index with a column data set index; and using at least one row multiplexer and at least one row memory coupled to the processing element to select a column data set received by the multiplier based on matching a column data set index with a row data set index.

17. The at least one non-transitory machine readable medium of claim 16, wherein the processing element in the second mode further operates to: receive the row data set at a row memory device coupled to the processing element; and bypass the row data set through the row memory device.

18. The at least one non-transitory machine readable medium of claim 16, wherein the processing element in the second mode further operates to: receive the row data set at a row memory device coupled to the processing element; and prevent transmission of additional row data sets to the row memory device.

19. The at least one non-transitory machine readable medium of claim 16, wherein the processing element in the second mode further operates to: receive the row data set at a row memory device coupled to the processing element; and store the row data set in the row memory device.

20. The at least one non-transitory machine readable medium of claim 16, wherein the processing element in the second mode further operates to: compare a row data set index to a column data set index that does not match; and store the row data set in a row memory device coupled to the processing element or store the column data set in a column memory device coupled to the processing element based on the comparison of the row data set index to the column data set index.

21. The at least one non-transitory machine readable medium of claim 16, wherein while operating in the second mode, selecting the row data set and selecting the column data set are further based on a read pointer reference.

22. The at least one non-transitory machine readable medium of claim 16, wherein the processing element is one of a plurality of processing elements operated by the non-transitory, machine readable medium.

23. A method of operating a processing element of a hardware accelerator, the method comprising: operating a processing element of an apparatus in a first mode, wherein operating in the first mode comprises: receiving row data sets and transmitting the row data sets to a multiplier of the processing element; and receiving column data sets and transmitting the column data sets to the multiplier; and operating in a second mode, wherein operating in the second mode comprises: using at least one row multiplexer and at least one row memory, of the apparatus, coupled to the processing element to select a row data set received by the multiplier based on matching a row da
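As a rough illustration of how the second-mode claims fit together (index comparison before multiplication, holding unmatched data in memory, and a first-in first-out row memory), the following is a cycle-by-cycle software model. It is a sketch only: `run_sparse` is a hypothetical name, the column memory is assumed to be a FIFO like the row memory, indexes are assumed to arrive in ascending order, and discarding the entry with the smaller index is one possible reading of "based on the row data set index or the column data set index".

```python
from collections import deque
from itertools import zip_longest
from typing import Iterable, Tuple

Entry = Tuple[int, float]  # hypothetical (index, value) data set


def run_sparse(rows: Iterable[Entry], cols: Iterable[Entry]) -> Tuple[float, int]:
    """Model one processing element in the second (sparse) mode.

    Each loop iteration stands for one cycle: at most one new row entry and
    one new column entry arrive, then the oldest stored entries are compared.
    Returns the accumulated sum of products and the number of cycles used.
    """
    row_fifo: deque = deque()   # models the row memory device (FIFO)
    col_fifo: deque = deque()   # models the column memory device (assumed FIFO)
    acc = 0.0                   # models the accumulator coupled to the multiplier
    cycles = 0
    stream = zip_longest(rows, cols)

    while True:
        item = next(stream, None)
        if item is not None:
            row_in, col_in = item
            if row_in is not None:
                row_fifo.append(row_in)
            if col_in is not None:
                col_fifo.append(col_in)
        elif not (row_fifo and col_fifo):
            break  # no new input and nothing left to compare

        cycles += 1
        if row_fifo and col_fifo:
            (row_idx, row_val), (col_idx, col_val) = row_fifo[0], col_fifo[0]
            if row_idx == col_idx:
                # Indexes match: send both values to the multiplier.
                acc += row_val * col_val
                row_fifo.popleft()
                col_fifo.popleft()
            elif row_idx < col_idx:
                # Assumption: the smaller index can no longer match, so it
                # is discarded; the other entry stays stored for a later cycle.
                row_fifo.popleft()
            else:
                col_fifo.popleft()
    return acc, cycles


rows = [(1, 2.0), (3, 3.0)]
cols = [(0, 4.0), (1, 5.0), (3, 6.0)]
total, used = run_sparse(rows, cols)
print(total, used)
```

Using `collections.deque` keeps the stored entries in arrival order, which is the property the first-in first-out memory claim relies on; a real design would bound the FIFO depth and apply back-pressure, which this model leaves out.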

Assignees

Intel Corp

Inventors

Classifications

  • Sum of products (for applications thereof, see the relevant places, e.g. G06F17/10, H03H17/00) · CPC title

  • using electronic means · CPC title

  • G06F17/16 (Primary)

    Matrix or vector computation {, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization (matrix transposition G06F7/78)} · CPC title

  • to perform operations on data operands · CPC title

  • Neural networks · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10509846B2 cover?
An accelerator for increasing the processing speed of a processor. The accelerator operates in two distinct modes. In a first mode for dense layer processing, row data sets and column data sets are sent to a multiplier for multiplication. In a second mode for sparse layer processing compressed row data sets are received by a row multiplexer and compressed column data sets are received by a colu…
Who is the assignee on this patent?
Intel Corp
What technology area does this patent fall under?
Primary CPC classification G06F17/16. Mapped technology areas include Physics.
When was this patent published?
Publication date Dec 17, 2019 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).