Processor and control method for processor

US2020150958A1 · US · A1

Patent metadata
Publication number: US-2020150958-A1
Application number: US-201916671428-A
Country: US
Kind code: A1
Filing date: Nov 1, 2019
Priority date: Nov 9, 2018
Publication date: May 14, 2020
Grant date: (none listed)

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A processor having a systolic array that can perform operations efficiently is provided. The processor includes multiple processing cores aligned in a matrix, and each of the processing cores includes an arithmetic unit array including multiple arithmetic units that can form a systolic array. Each of the processing cores includes a first memory that stores first data, a second memory that stores second data, a first multiplexer that connects a first input for receiving the first data at the arithmetic unit array to an output of the first memory in the processing core or an output of the arithmetic unit array in an adjacent processing core, and a second multiplexer that connects a second input for receiving the second data at the arithmetic unit array to an output of the second memory in the processing core or an output of the arithmetic unit array in an adjacent processing core.
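
The multiplexer selection described in the abstract can be illustrated with a minimal Python sketch. It assumes first data flows left to right and second data flows top to bottom across the core matrix, and that cores are grouped into equally sized rectangular blocks; neither detail is stated in the abstract. The names `Source`, `MuxSetting`, and `configure_muxes` are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Source(Enum):
    LOCAL_MEMORY = "local_memory"    # output of the memory inside this core
    ADJACENT_CORE = "adjacent_core"  # output of the neighbouring core's arithmetic unit array


@dataclass(frozen=True)
class MuxSetting:
    first: Source   # selection made by the first multiplexer (first data)
    second: Source  # selection made by the second multiplexer (second data)


def configure_muxes(rows, cols, group_rows, group_cols):
    """Return a rows x cols grid of MuxSetting values.

    The core matrix is tiled into groups of group_rows x group_cols cores;
    each group is wired to behave as one larger systolic array.
    """
    if rows % group_rows or cols % group_cols:
        raise ValueError("group size must divide the core matrix evenly")
    grid = []
    for r in range(rows):
        row = []
        for c in range(cols):
            # Assumption: only the leftmost core of a group reads its own
            # first memory; the others take data from the core to their left.
            first = Source.LOCAL_MEMORY if c % group_cols == 0 else Source.ADJACENT_CORE
            # Assumption: only the topmost core of a group reads its own
            # second memory; the others take data from the core above.
            second = Source.LOCAL_MEMORY if r % group_rows == 0 else Source.ADJACENT_CORE
            row.append(MuxSetting(first, second))
        grid.append(row)
    return grid
```

Under these assumptions, `configure_muxes(2, 2, 2, 2)` chains four cores into one array, while `configure_muxes(2, 2, 1, 1)` leaves every core reading only its own memories.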

First claim

Opening claim text (preview).

What is claimed is:

1. A processor, comprising: a plurality of processing cores, each of the processing cores including an arithmetic unit array including a plurality of arithmetic units, wherein each of the processing cores includes: a first memory that stores first data; a second memory that stores second data; a first multiplexer that connects a first input for receiving the first data at the arithmetic unit array to an output of the first memory in the processing core or an output of the arithmetic unit array in an adjacent processing core; and a second multiplexer that connects a second input for receiving the second data at the arithmetic unit array to an output of the second memory in the processing core or an output of the arithmetic unit array in an adjacent processing core.

2. The processor as claimed in claim 1, wherein each of the processing cores includes: a first address generator that generates a first address indicative of a destination storage of the first data output from the first memory; and a second address generator that generates a second address indicative of a destination storage of the second data output from the second memory.

3. The processor as claimed in claim 1, wherein each of the processing cores includes: an instruction memory that stores an instruction; and a third multiplexer that connects a third input for receiving an instruction at the arithmetic unit array to an output of the instruction memory in the processing core or a transmission path of an instruction in the arithmetic unit array in an adjacent processing core, and each of the arithmetic units includes: an instruction decoder that decodes an instruction; a plurality of types of arithmetic elements that perform operations based on the decoded instruction; and a register that stores data for use in operations or an operational result.

4. The processor as claimed in claim 3, wherein instructions are stored in the instruction memory in one of the plurality of processing cores, and an instruction output from the instruction memory is provided to the arithmetic unit in the processing core via the third multiplexer and to the arithmetic unit in a different processing core via the third multiplexer in the different processing core.

5. The processor as claimed in claim 3, wherein the plurality of types of arithmetic elements include a product sum operator and an arithmetic operator.

6. The processor as claimed in claim 1, wherein each of the plurality of processing cores has a resultant memory to store operational results of the plurality of arithmetic units in the processing core.

7. The processor as claimed in claim 1, wherein one of the first data and the second data is input data for use in a convolutional operation, and the other is weight data for use in the convolutional operation.

8. The processor as claimed in claim 1, further comprising: a network that interconnects the plurality of processing cores; and a controller that controls transmissions of the first data to the first memory and of the second data to the second memory and operations of the first multiplexer and the second multiplexer.

9. The processor as claimed in claim 8, wherein the controller controls to transmit the first data stored in the first memory in any of the plurality of processing cores to the first memory in a different processing core via the network during execution of an operation at the arithmetic unit array, and the controller controls to transmit the second data stored in the second memory in any of the plurality of processing cores to the second memory in a different processing core via the network during execution of an operation at the arithmetic unit array.

10. The processor as claimed in claim 8, wherein the plurality of processing cores are arranged in a matrix, and the controller: interconnects the plurality of arithmetic unit arrays aligned in a first direction via the first multiplexer and the plurality of arithmetic unit arrays aligned in a second direction orthogonal to the first direction via the second multiplexer to form a systolic array including a predetermined number of arithmetic units; connects an arithmetic unit located at an end of the plurality of arithmetic unit arrays aligned in the first direction to an output of the first memory; connects an arithmetic unit located at an end of the plurality of arithmetic unit arrays aligned in the second direction to an output of the second memory; and causes the first memory and the second memory to output the first data and the second data, respectively, and causes an arithmetic unit in the systolic array to perform operations.

11. A control method for a processor, wherein the processor has a plurality of processing cores, each including an arithmetic unit array including a plurality of arithmetic units, a first memory storing first data, a second memory storing second data, a first multiplexer connecting a first input for receiving the first data at the arithmetic unit array to an output of the first memory in the processing core or an output of an arithmetic unit array in an adjacent processing core, and a second multiplexer connecting a second input for receiving the second data at the arithmetic unit array to an output of the second memory in the processing core or an output of an arithmetic unit array in an adjacent processing core, the method comprising: connecting an output of the first memory to one of a first predetermined number of arithmetic units aligned in a first direction, connecting an output of the second memory to one of a second predetermined number of arithmetic units aligned in a direction different from the first direction, using the first multiplexer to form a path to transmit the first data output from the first memory to the first predetermined number of arithmetic units sequentially and using the second multiplexer to form a path to transmit the second data output from the second memory to the second predetermined number of arithmetic units sequentially to form a systolic array including a predetermined number of arithmetic units; and transferring the first data output from the first memory to the systolic array and transferring the second data output from the second memory to the systolic array to cause an arithmetic unit in the systolic array to perform operations.

12. The control method as claimed in claim 11, further comprising: generating a first address to be supplied to the first memory and outputting the first data corresponding to the first address from the first memory; and generating a second address to be supplied to the second memory and outputting the second data corresponding to the second address from the second memory.

13. The control method as claimed in claim 11, wherein each of the plurality of processing cores has an instruction memory storing instructions and a third multiplexer connecting a third input for receiving an instruction at the arithmetic unit array to an output of the instruction memory or a transmission path of an instruction in an arithmetic unit array in an adjacent processing core, and the method further comprises storing an instruction in the instruction memory in a processing core corresponding to a corner part of the systolic array, transferring instructions output from the instruction memory to arithmetic units in the systolic array sequentially and causing the respective arithmetic units to perform operations corresponding to the instructions.

14. The control method as claimed in claim 11, wherein the plurality of processing cores are interconnected via a
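
Claims 10 and 11 describe edge arithmetic units fed from the first and second memories, with the remaining units fed sequentially from their neighbours. The following Python sketch simulates that dataflow cycle by cycle. It assumes matrix multiplication as the operation and an output-stationary arrangement in which each unit accumulates its own result; the claims shown here specify neither. The function name `systolic_matmul` is hypothetical.

```python
def systolic_matmul(a, b):
    """Simulate an n x m grid of arithmetic units computing the product a @ b.

    a is n x k (first data, entering from the left edge) and b is k x m
    (second data, entering from the top edge). Inputs are plain lists of lists.
    """
    n, k, m = len(a), len(a[0]), len(b[0])
    if len(b) != k:
        raise ValueError("inner dimensions of a and b must match")

    acc = [[0] * m for _ in range(n)]    # running sum held in each unit
    a_reg = [[0] * m for _ in range(n)]  # first-data value latched in each unit
    b_reg = [[0] * m for _ in range(n)]  # second-data value latched in each unit

    # The last useful product reaches the far corner after n + m + k - 2 cycles.
    for t in range(n + m + k - 2):
        # Visit units from the far corner backwards so that every unit reads
        # the value its neighbour latched in the previous cycle.
        for i in reversed(range(n)):
            for j in reversed(range(m)):
                if j == 0:
                    # Edge unit: fed by the first memory, skewed by row index.
                    s = t - i
                    a_in = a[i][s] if 0 <= s < k else 0
                else:
                    # Interior unit: fed by the unit to its left.
                    a_in = a_reg[i][j - 1]
                if i == 0:
                    # Edge unit: fed by the second memory, skewed by column index.
                    s = t - j
                    b_in = b[s][j] if 0 <= s < k else 0
                else:
                    # Interior unit: fed by the unit above it.
                    b_in = b_reg[i - 1][j]
                a_reg[i][j], b_reg[i][j] = a_in, b_in
                acc[i][j] += a_in * b_in
    return acc
```

In this sketch, unit (i, j) sees `a[i][s]` and `b[s][j]` together at cycle `t = s + i + j`, so each accumulator ends up holding one entry of the matrix product.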

Assignees

Preferred Networks Inc

Inventors

Classifications

  • Systolic arrays · CPC title

  • Learning methods · CPC title

  • G06F9/3001 · Primary

    Arithmetic instructions · CPC title

  • Instruction analysis, e.g. decoding, instruction word fields · CPC title

  • Convolutional networks [CNN, ConvNet] · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2020150958A1 cover?
A processor having a systolic array that can perform operations efficiently is provided. The processor includes multiple processing cores aligned in a matrix, and each of the processing cores includes an arithmetic unit array including multiple arithmetic units that can form a systolic array. Each of the processing cores includes a first memory that stores first data, a second memory that store…
Who is the assignee on this patent?
Preferred Networks Inc
What technology area does this patent fall under?
Primary CPC classification G06F9/3001. Mapped technology areas include Physics.
When was this patent published?
Publication date: May 14, 2020 (kind code A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).