Processor with memory array operable as either cache memory or neural network unit memory

US10664751B2 · US · B2

Patent metadata
Publication number: US-10664751-B2
Application number: US-201615366027-A
Country: US
Kind code: B2
Filing date: Dec 1, 2016
Priority date: Dec 1, 2016
Publication date: May 26, 2020
Grant date: May 26, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A processor comprising a mode indicator, a plurality of processing cores, and a neural network unit (NNU), comprising a memory array, an array of neural processing units (NPU), cache control logic, and selection logic that selectively couples the plurality of NPUs and the cache control logic to the memory array. When the mode indicator indicates a first mode, the selection logic enables the plurality of NPUs to read neural network weights from the memory array to perform computations using the weights. When the mode indicator indicates a second mode, the selection logic enables the plurality of processing cores to access the memory array through the cache control logic as a cache memory.
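The mode-dependent coupling described in the abstract can be pictured with a small software model. This is only an illustrative sketch, not the patented hardware: the class names `Mode`, `MemoryArray`, and `SelectionLogic`, and the use of exceptions to stand for "not coupled", are assumptions made for the example.

```python
from enum import Enum


class Mode(Enum):
    NNU = 1    # "first mode": NPUs read weights from the memory array
    CACHE = 2  # "second mode": cores use the array as cache memory


class MemoryArray:
    """Hypothetical stand-in for the shared memory array (one value per row)."""

    def __init__(self, rows):
        self.rows = list(rows)


class SelectionLogic:
    """Couples either the NPU array or the cache control logic to the array,
    depending on the mode indicator (modeled here as the `mode` attribute)."""

    def __init__(self, array, mode):
        self.array = array
        self.mode = mode

    def npu_read_weights(self, row):
        # Path used by the NPUs; only enabled in the first mode.
        if self.mode is not Mode.NNU:
            raise PermissionError("NPUs are not coupled to the array in cache mode")
        return self.array.rows[row]

    def cache_access(self, row, value=None):
        # Path used by the cores through the cache control logic;
        # only enabled in the second mode. A non-None value is a write.
        if self.mode is not Mode.CACHE:
            raise PermissionError("cache control logic is not coupled in NNU mode")
        if value is not None:
            self.array.rows[row] = value
        return self.array.rows[row]
```

In this model, setting `mode` to `Mode.NNU` lets `npu_read_weights` succeed and makes `cache_access` fail, and setting it to `Mode.CACHE` reverses that, which mirrors the either/or coupling the abstract describes.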

First claim

Opening claim text (preview).

The invention claimed is:

1. A processor, comprising: a mode indicator; a plurality of processing cores; and a neural network unit (NNU), comprising: a memory array; an array of neural processing units (NPU); cache control logic; and selection logic, configured to selectively couple the plurality of NPUs and the cache control logic to the memory array; wherein when the mode indicator indicates a first mode, the selection logic enables the plurality of NPUs to read neural network weights from the memory array to perform computations using the weights; and wherein when the mode indicator indicates a second mode, the selection logic enables the plurality of processing cores to access the memory array through the cache control logic as a cache memory; wherein the NNU is coupled to the plurality of processing cores; and wherein when the mode indicator indicates the first mode, the NNU is controllable by the plurality of processing cores to accelerate neural network computations for the plurality of processing cores; wherein the processor further comprises: a ring bus that couples the NNU and the plurality of processing cores; and a plurality of last level cache slices coupled to the ring bus; wherein when the mode indicator indicates the second mode, the memory array and cache control logic operate in conjunction with the plurality of last level cache slices as a last level cache memory that the plurality of processing cores access via the ring bus; wherein the ring bus consists of six ring stops connected to one another in a bi-directional fashion; wherein two different hash algorithms are employed: one that excludes the memory array as a last-level cache, LLC, slice and one that includes the memory array as a LLC slice; wherein the two hash algorithms are designed to support a selective write-back-invalidate operation.

2. The processor of claim 1, further comprising: wherein when the mode indicator indicates the second mode, the plurality of processing cores access the memory array through the cache control logic as a slice of a last level cache memory of the processor.

3. The processor of claim 1, further comprising: wherein to transition from the second mode to the first mode, the cache control logic write-back-invalidates the memory array.

4. The processor of claim 1, further comprising: wherein the plurality of processing cores are x86 instruction set architecture processing cores.

5. A method for operating a processor having a mode indicator, a plurality of processing cores, and a neural network unit (NNU) comprising a memory array, an array of neural processing units (NPU), cache control logic, and selection logic, configured to selectively couple the plurality of NPUs and the cache control logic to the memory array, the method comprising: enabling, by the selection logic in response to setting the mode indicator to indicate a first mode, the plurality of NPUs to read neural network weights from the memory array to perform computations using the weights; enabling, by the selection logic in response to setting the mode indicator to indicate a second mode, the plurality of processing cores to access the memory array through the cache control logic as a cache memory; wherein the NNU is coupled to the plurality of processing cores; controlling, by the plurality of processing cores when the mode indicator indicates the first mode, the NNU to accelerate neural network computations for the plurality of processing cores; wherein the processor further comprises a ring bus that couples the NNU and the plurality of processing cores; wherein the processor further comprises a plurality of last level cache slices coupled to the ring bus; and operating, by the memory array and cache control logic when the mode indicator indicates the second mode, in conjunction with the plurality of last level cache slices as a last level cache memory that the plurality of processing cores access via the ring bus; wherein the ring bus consists of six ring stops connected to one another in a bi-directional fashion; wherein two different hash algorithms are employed: one that excludes the memory array as a last-level cache, LLC, slice and one that includes the memory array as a LLC slice; wherein the two hash algorithms are designed to support a selective write-back-invalidate operation.

6. The method of claim 5, further comprising: accessing, by the plurality of processing cores when the mode indicator indicates the second mode, the memory array through the cache control logic as a slice of a last level cache memory of the processor.

7. The method of claim 5, further comprising: write-back-invalidating, by the cache control logic, the memory array to transition from the second mode to the first mode.

8. The method of claim 5, further comprising: wherein the plurality of processing cores are x86 instruction set architecture processing cores.

9. A computer program product encoded in at least one non-transitory computer usable medium for use with a computing device, the computer program product comprising: computer usable program code embodied in said medium, for specifying a processor, the computer usable program code comprising: first program code for specifying a mode indicator; second program code for specifying a plurality of processing cores; third program code for specifying a neural network unit (NNU), comprising: a memory array; an array of neural processing units (NPU); cache control logic; and selection logic, configured to selectively couple the plurality of NPUs and the cache control logic to the memory array; wherein when the mode indicator indicates a first mode, the selection logic enables the plurality of NPUs to read neural network weights from the memory array to perform computations using the weights; and wherein when the mode indicator indicates a second mode, the selection logic enables the plurality of processing cores to access the memory array through the cache control logic as a cache memory; wherein the NNU is coupled to the plurality of processing cores; and wherein when the mode indicator indicates the first mode, the NNU is controllable by the plurality of processing cores to accelerate neural network computations for the plurality of processing cores; fourth program code for specifying a ring bus that couples the NNU and the plurality of processing cores; and fifth program code for specifying a plurality of last level cache slices coupled to the ring bus; wherein when the mode indicator indicates the second mode, the memory array and cache control logic operate in conjunction with the plurality of last level cache slices as a last level cache memory that the plurality of processing cores access via the ring bus; wherein the ring bus consists of six ring stops connected to one another in a bi-directional fashion; wherein two different hash algorithms are employed: one that excludes the memory array as a last-level cache, LLC, slice and one that includes the memory array as a LLC slice; wherein the two hash algorithms are designed to support a selective write-back-invalidate operation.

10. The computer program product of claim 9, wherein the at least one non-transitory computer usable medium is selected from the set of a disk, tape, or other magnetic, optical, or electronic storage medium.
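The claims recite two hash algorithms (one excluding and one including the memory array as an LLC slice) and a selective write-back-invalidate, but they do not give the hash functions themselves. The sketch below shows one way such a pair could be arranged so that only some lines need flushing on a mode change. Everything concrete in it is an assumption for illustration: the slice count of 4, the slice id used for the memory array, the 1-in-5 address share, and the function names.

```python
NUM_LLC_SLICES = 4          # assumed; the claims do not state a slice count
NNU_SLICE = NUM_LLC_SLICES  # hypothetical slice id for the NNU memory array
NNU_SHARE = 5               # assumed: 1 in 5 address groups maps to the NNU slice


def hash_exclusive(line_addr):
    """Home slice when the memory array is NOT an LLC slice (first mode)."""
    return line_addr % NUM_LLC_SLICES


def hash_inclusive(line_addr):
    """Home slice when the memory array IS an LLC slice (second mode).

    Addresses not redirected to the NNU slice keep the same home slice as
    under hash_exclusive, so they never need to move on a mode change.
    """
    if (line_addr // NUM_LLC_SLICES) % NNU_SHARE == 0:
        return NNU_SLICE
    return hash_exclusive(line_addr)


def lines_to_flush(cached, entering_cache_mode):
    """Return the (slice, line_addr) pairs to write-back-invalidate.

    cached maps a slice id to the set of line addresses it currently holds.
    """
    if entering_cache_mode:
        # Switching to the inclusive hash: flush only the lines whose home
        # slice changes, i.e. those that now belong to the NNU slice.
        return {(s, a) for s, addrs in cached.items()
                for a in addrs if hash_inclusive(a) != s}
    # Switching back to the exclusive hash: the memory array stops being a
    # slice, so everything it holds is written back and invalidated.
    return {(NNU_SLICE, a) for a in cached.get(NNU_SLICE, set())}
```

With these assumed hashes, entering the second mode flushes only the lines that migrate to the memory array, and leaving it flushes only the memory array, which is consistent with claims 3 and 7. The actual hash design in the patent may differ.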

Assignees

Via Alliance Semiconductor Co Ltd

Inventors

Classifications

  • being configurable for different purposes, e.g. as cache or non-cache memory · CPC title

  • Data buffering arrangements · CPC title

  • G06N3/063 (Primary)

    using electronic means · CPC title

  • with a shared cache · CPC title

  • Performance improvement · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10664751B2 cover?
A processor comprising a mode indicator, a plurality of processing cores, and a neural network unit (NNU), comprising a memory array, an array of neural processing units (NPU), cache control logic, and selection logic that selectively couples the plurality of NPUs and the cache control logic to the memory array. When the mode indicator indicates a first mode, the selection logic enables the plu…
Who is the assignee on this patent?
Via Alliance Semiconductor Co Ltd
What technology area does this patent fall under?
Primary CPC classification G06N3/063. Mapped technology areas include Physics.
When was this patent published?
This patent was published on May 26, 2020 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).