Memory access for a vector processor

US9798550B2 · US · B2

Patent metadata
Field: Value
Publication number: US-9798550-B2
Application number: US-201313737290-A
Country: US
Kind code: B2
Filing date: Jan 9, 2013
Priority date: Jan 9, 2013
Publication date: Oct 24, 2017
Grant date: Oct 24, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method and device for memory access in processors is provided. A processor, comprising a plurality of computational units, is capable of executing a single instruction on multiple pieces of data simultaneously (SIMD). A read operation is initiated to load data from memory into the plurality of computational units (CUs) arranged into a plurality of CU groups. The memory is arranged into a plurality of memory macro-blocks each associated with a respective CU group of the plurality of CU groups. For each CU group a respective first memory address is determined and for each CU group, the data in the associated memory macro-block is accessed at the respective first memory address.
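
As a rough illustration of the access pattern the abstract describes, the following Python sketch models each memory macro-block as a list of rows and reads one row per CU group at that group's own address. The names `simd_group_read`, `macro_blocks`, and `group_addresses`, and the list-of-rows layout, are assumptions made for this example; they do not come from the patent.

```python
from typing import List, Sequence


def simd_group_read(
    macro_blocks: Sequence[Sequence[Sequence[int]]],
    group_addresses: Sequence[int],
) -> List[List[int]]:
    """Read one row from each macro-block at its group's own address.

    macro_blocks[g] is the (hypothetical) macro-block tied to CU group g;
    macro_blocks[g][a] is the row of m words stored at address a, one word
    per CU in the group.
    """
    if len(macro_blocks) != len(group_addresses):
        raise ValueError("one address is needed per CU group")
    return [list(block[addr]) for block, addr in zip(macro_blocks, group_addresses)]


# Assumed sizes for illustration: 2 CU groups, 4 CUs per group, 3 rows per block.
macro_blocks = [
    [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]],
    [[100, 101, 102, 103], [104, 105, 106, 107], [108, 109, 110, 111]],
]

# Group 0 reads its block at address 2, group 1 reads its block at address 0.
loaded = simd_group_read(macro_blocks, [2, 0])
print(loaded)  # [[8, 9, 10, 11], [100, 101, 102, 103]]
```

The point of the sketch is only that the address is chosen per CU group rather than once for the whole vector, so different groups can read different rows in the same operation.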

First claim

Opening claim text (preview).

What is claimed is:

1. A device comprising: a vector memory space divided into a plurality of memory macro-blocks for storing data; a vector processor comprising a plurality of computational units (CUs) for executing instructions, the plurality of CUs arranged into a plurality of CU groups, each CU group comprising two or more CUs of the plurality of CUs, the plurality of CUs providing execution of a single instruction on multiple pieces of data (SIMD); and a plurality of memory macro-block access units, each of the plurality of memory macro-block access units: couples a respective CU group to a respective associated memory macro-block, is directly connected to each CU of the respective CU group, controls access of the CUs of the respective CU group to the associated memory macro-block, controls memory access based on an associated CU address input to the associated memory macro-block to retrieve data from, or place data to, per read/write cycle, and operates in a first mode when the associated address for each of the CUs in the respective CU group is a same address, and in a second mode when the associated addresses for at least two of the CUs in the respective CU group are different.

2. The device of claim 1, wherein each of the memory macroblock access units determines the address individually for each CU in the associated CU group in subsequent cycles.

3. The device of claim 1, wherein each of the memory macroblock access units determines the address for two or more CUs in the associated CU group in a single cycle.

4. The device of claim 1, wherein there are z CU groups, each with m CUs, each of the CUs has an n-bit interface to the associated memory macro-block, wherein each of the memory macro-blocks can provide n×m bits of data to the associated CU group in a memory access operation, wherein the n×m bits of data for a respective CU group are addressed by a single memory macro-block address.

5. The device of claim 4, wherein each of the memory macro-block access units controls data provided to, or received from, each of the CUs in the respective CU group based on a CU mask indicating a portion of the n×m bits of data from the associated memory macro-block the respective CU is to receive.

6. The device of claim 1, wherein each of the memory macro-block access units can access data from neighboring memory macro-blocks during a portion of a memory access operation.

7. The device of claim 6, wherein each of the memory macro-block access units determines the address from the respective neighboring memory macro-block for two or more CUs in the associated CU group in a single cycle.

8. The device of claim 1, wherein each of the memory macro-block access units can access data from a plurality of neighboring memory macro-blocks during a portion of a memory access operation.

9. The device of claim 8, wherein each memory macro-block access unit has a plurality of neighbors.

10. The device of claim 1 wherein the CU address consists of a base address plus a locally derived offset value.

11. The device of claim 1 wherein if two or more CU's have the same address then they can access the memory concurrently.

12. A method comprising: initiating a read operation for loading data from memory into a plurality of computational units (CUs) arranged into a plurality of CU groups, the memory arranged into a plurality of memory macro-blocks each associated with a respective CU group of the plurality of CU groups; for each CU group, determining a respective first memory address; and for each CU group, accessing the data in the associated memory macro-block at the respective first memory address comprising reading data from the respective memory macro-block or providing data to the respective memory macro-block, wherein accessing to each of the memory macro-blocks is controlled by a respective memory access unit which controls memory access based on an associated CU address input to the associated memory macro-block to retrieve data from, or place data to, per read/write cycle, each respective memory access unit is directly connected to each CU in a respective CU group, there are z CU groups, each with m CUs, each of the CUs has an n-bit interface to the associated memory macro-block, each of the memory macro-blocks can provide n×m bits of data to the associated CU group in a memory access operation, and the n×m bits of data for a respective CU group are addressed by a single memory macro-block address.

13. The method of claim 12, further comprising: reading data from the respective memory macro-block to a first CU of the respective CU group or providing data to the respective memory macro-block from the first CU of the respective CU group, wherein the first memory address is associated with the first CU; for each CU group, determining a respective second memory address associated with a respective second CU in the CU group; and reading data from the respective memory macro-block to the second CU of the respective CU group or providing data to the respective memory macro-block from the second CU of the respective CU group.

14. The method of claim 13, wherein the first and second memory addresses are individually determined for each of the first and second CUs in the associated CU group in subsequent cycles.

15. The method of claim 12, wherein the first memory address is determined for two or more CUs in the associated CU group in a single cycle.

16. The method of claim 12, further comprising controlling data provided to, or received from, each of the CUs in the respective CU group based on a CU mask indicating a portion of the n×m bits of data from the associated memory macro-block the respective CU is to receive.

17. The method of claim 12, further comprising accessing data from a respective neighboring memory macro-block during a portion of a memory access operation.

18. The method of claim 17, determining an address from the respective neighboring memory macro-block for two or more CUs in the associated CU group in a single cycle.

19. The method of claim 12, further comprising accessing data from one of a plurality of neighboring memory macro-blocks during a portion of a memory access operation.
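
To make the two operating modes of claim 1 and the CU mask of claim 5 more concrete, here is a minimal Python sketch of a single macro-block access unit serving the CUs of its group. It is an interpretation, not the claimed hardware: the function name `access_group`, the modelling of the CU mask as a per-CU lane index (`lane_of_cu`), and the rule of one modelled cycle per distinct address (loosely inspired by claims 2, 3, and 11) are all assumptions introduced for this example.

```python
from typing import List, Optional, Sequence, Tuple


def access_group(
    block: Sequence[Sequence[int]],
    cu_addresses: Sequence[int],
    lane_of_cu: Optional[Sequence[int]] = None,
) -> Tuple[str, int, List[int]]:
    """Model one read by a group of m CUs from their macro-block.

    block[a] is the row of m words at macro-block address a (n x m bits
    in the claims). lane_of_cu[i] is a hypothetical stand-in for the CU
    mask: it names which word of the row CU i receives (default: word i).
    Returns (mode, modelled_cycles, data_per_cu).
    """
    m = len(cu_addresses)
    if lane_of_cu is None:
        lane_of_cu = list(range(m))
    distinct = sorted(set(cu_addresses))
    # First mode: every CU presents the same address. Second mode: at
    # least two CUs present different addresses.
    mode = "first" if len(distinct) == 1 else "second"
    data: List[int] = [0] * m
    # Assumption: one modelled cycle per distinct address; CUs that share
    # an address are served together in that cycle.
    for addr in distinct:
        row = block[addr]
        for cu, cu_addr in enumerate(cu_addresses):
            if cu_addr == addr:
                data[cu] = row[lane_of_cu[cu]]
    return mode, len(distinct), data


# Assumed macro-block with 3 rows of m = 4 words.
block = [[0, 1, 2, 3], [10, 11, 12, 13], [20, 21, 22, 23]]

same = access_group(block, [1, 1, 1, 1])
mixed = access_group(block, [0, 2, 0, 2])
masked = access_group(block, [1, 1, 1, 1], lane_of_cu=[3, 2, 1, 0])

print(same)    # ('first', 1, [10, 11, 12, 13])
print(mixed)   # ('second', 2, [0, 21, 2, 23])
print(masked)  # ('first', 1, [13, 12, 11, 10])
```

In this model the first mode costs a single row read because one macro-block address covers the whole group, while the second mode costs more reads as the addresses diverge. The actual cycle behaviour of the claimed device may differ from this simplification.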

Assignees

NXP Canada Inc, NXP USA Inc

Classifications

  • G06F9/3891 · Primary

    organised in groups of units sharing resources, e.g. clusters · CPC title

  • Operand accessing · CPC title

  • controlled by a single instruction for multiple data lanes [SIMD] · CPC title

  • Organisation of register space, e.g. banked or distributed register file · CPC title

  • Details on data memory access · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9798550B2 cover?
A method and device for memory access in processors is provided. A processor, comprising a plurality of computational units, is capable of executing a single instruction on multiple pieces of data simultaneously (SIMD). A read operation is initiated to load data from memory into the plurality of computational units (CUs) arranged into a plurality of CU groups. The memory is arranged into a plur…
Who is the assignee on this patent?
NXP Canada Inc, NXP USA Inc
What technology area does this patent fall under?
Primary CPC classification G06F9/3891. Mapped technology areas include Physics.
When was this patent published?
Publication date: Oct 24, 2017 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).