Automatic communication and optimization of multi-dimensional arrays for many-core coprocessor using static compiler analysis

US9535826B2 · US · B2

Patent metadata
Field: Value
Publication number: US-9535826-B2
Application number: US-201414293667-A
Country: US
Kind code: B2
Filing date: Jun 2, 2014
Priority date: Aug 30, 2013
Publication date: Jan 3, 2017
Grant date: Jan 3, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

There are provided source-to-source transformation methods for a multi-dimensional array and/or a multi-level pointer for a computer program. A method includes minimizing a number of holes for variable length elements for a given dimension of the array and/or pointer using at least two stride values included in stride buckets. The minimizing step includes modifying memory allocation sites, for the array and/or pointer, to allocate memory based on the stride values. The minimizing step further includes modifying a multi-dimensional memory access, for accessing the array and/or pointer, into a single dimensional memory access using the stride values. The minimizing step also includes inserting offload pragma for a data transfer of the array and/or pointer prior as at least one of a single-dimensional array and a single-level pointer. The data transfer is from a central processing unit to a coprocessor over peripheral component interconnect express.
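
The stride-bucket idea in the abstract can be illustrated with a minimal C sketch. It is not the patent's implementation: the helper names (`bucket_stride`, `build_offsets`, `count_holes`, `flat_index`), the per-row offset table, and the choice to round the half stride up are all assumptions made for illustration. Each variable-length row is padded to either the maximum row length or half of it, all rows are laid out in one flat buffer, and an access `a[i][j]` becomes a single-dimensional access into that buffer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: choose one of two strides for a row,
 * either the maximum row length or half of it. */
static size_t bucket_stride(size_t len, size_t max_len)
{
    size_t half = (max_len + 1) / 2;  /* assumption: round the half stride up */
    return (len <= half) ? half : max_len;
}

/* Fill offsets[0..rows] with the start of each row in the flat buffer
 * and return the total number of elements to allocate. */
static size_t build_offsets(const size_t *len, size_t rows, size_t *offsets)
{
    size_t max_len = 0, total = 0;
    for (size_t i = 0; i < rows; i++)
        if (len[i] > max_len)
            max_len = len[i];
    for (size_t i = 0; i < rows; i++) {
        offsets[i] = total;
        total += bucket_stride(len[i], max_len);
    }
    offsets[rows] = total;
    return total;
}

/* Holes are the padded slots that no row element occupies. */
static size_t count_holes(const size_t *len, size_t rows, const size_t *offsets)
{
    size_t holes = 0;
    for (size_t i = 0; i < rows; i++)
        holes += (offsets[i + 1] - offsets[i]) - len[i];
    return holes;
}

/* a[i][j] in the original program becomes flat[flat_index(offsets, len, i, j)]. */
static size_t flat_index(const size_t *offsets, const size_t *len,
                         size_t i, size_t j)
{
    assert(j < len[i]);  /* stay inside the row's real elements */
    return offsets[i] + j;
}

/* With Intel-style offload extensions, the flat buffer could then be sent in
 * one transfer. The line below is shown as an assumption and is not compiled:
 *   #pragma offload target(mic) in(flat : length(total))
 */
```

For row lengths 8, 3, 4 and 7, this layout needs 24 slots with 2 holes, whereas padding every row to the single maximum stride of 8 would need 32 slots with 10 holes.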

First claim

Opening claim text (preview).

What is claimed is:

1. A source-to-source transformation method for at least one of a multi-dimensional array and a multi-level pointer for a computer program, comprising a transcompiler using static compiler analysis, comprising: minimizing a number of holes for variable length elements for a given dimension of the at least one of a multi-dimensional array and a multi-level pointer using stride values included in stride buckets, where said minimizing step comprises: modifying memory allocation sites, for the at least one of the multi-dimensional array and the multi-level pointer, to allocate memory based on the stride values, the stride values including at least two stride values; modifying a multi-dimensional memory access into a single dimensional memory access using the stride values, the multi-dimensional memory access for accessing the at least one of the multi-dimensional array and the multi-level pointer; and inserting offload pragma for a data transfer of the at least one of the multi-dimensional array and the multi-level pointer prior as at least one of a single-dimensional array and a single-level pointer, the data transfer being from a central processing unit to a coprocessor over peripheral component interconnect express.

2. The method of claim 1, wherein the offload pragma is inserted at a point in the computer program prior to an off-loadable code region of the program that includes the at least one of the single-dimensional array and the single-level pointer.

3. The method of claim 1, further comprising transferring one chunk of memory for the at least one of a multi-dimensional array and a multi-level pointer from the central processing unit to the coprocessor over peripheral component interconnect express, the one chunk of memory including all of the variable length elements for all dimensions.

4. The method of claim 3, wherein said transferring step is performed to avoid separately transferring components of the at least one of a multi-dimensional array and a multi-dimensional pointer.

5. The method of claim 1, further comprising: parsing and analyzing the memory allocation sites for the given dimension of the at least one of the multi-dimensional array and the multi-dimensional pointer to obtain a memory size of each of elements of the at least one of the multi-dimensional array and the multi-dimensional pointer in the given dimension; creating the stride values as only two stride values, a first one of the two stride values being equal to a maximum memory size, and a second one of the two stride values being equal to half of the maximum memory size; and labeling each of the elements in the given dimension having the memory size equal to less than half of the maximum memory size as half of the maximum memory size and other ones of the elements as the maximum memory size.

6. The method of claim 5, further comprising, for each respective remaining dimension of the at least one of the multi-dimensional array and the multi-dimensional pointer, using a respective maximum memory size of the respective remaining dimension as a stride value for that respective remaining dimension.

7. The method of claim 1, wherein the first one of the two stride values is stored in a first one of the stride buckets and the second one of the two stride values is stored in a second one of the stride buckets.

8. A source-to-source transformation method for at least one of a multi-dimensional array and a multi-level pointer for a computer program, comprising a transcompiler using static compiler analysis, comprising: replacing an original set of memory allocation statements for the at least one of the multi-dimensional array and the multi-level pointer by a single memory allocation statement that allocates a memory region of a given size based on length information parsed from the original set of memory allocation statements; resetting pointers, for both a central processing unit and a coprocessor, that retain original memory accesses to the at least one of the multi-dimensional array and the multi-level pointer based on the length information; generating pragma offload statements for a data transfer from the processor to the coprocessor over peripheral component interconnect express; and transferring an amount of memory for the at least one of the multi-dimensional array and the multi-level pointer, wherein said transferring step is performed to collectively transfer all components of the at least one of a multi-dimensional array and a multi-dimensional pointer.

9. The method of claim 8, wherein said transferring step is performed to avoid separately transferring the components of the at least one of a multi-dimensional array and a multi-dimensional pointer.

10. The method of claim 8, further comprising generating a nested loop to determine a total length of all of the components of the at least one of the multi-dimensional pointer and the multi-level array, wherein the length information comprises the total length.

11. The method of claim 10, wherein each iteration of the nested loop determines a respective length of a respective one of the components for summing to obtain the total length.

12. The method of claim 10, optimizing the data transfer by hoisting the offload statements outside the nested loop to enable data reuse and minimization of data transfer overhead.

13. The method of claim 10, wherein said resetting step comprises generating another nested loop, each iteration of the other nested loop assigning a respective pointer to the memory region.

14. The method of claim 8, further comprising generating a copy of the single memory allocation statement and forwarding the copy to the coprocessor using a pragma offload.

15. The method of claim 8, further comprising generating a _shared clause to support a use of a virtual shared memory for a data structure identified as being incapable of being managed using previous steps of the method.

16. The method of claim 15, further comprising modifying a coherence mechanism of the virtual shared memory to skip recording writes to the virtual shared memory.

17. The method of claim 15, wherein the data structure is one of a graph and a tree.
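
The approach of claims 8, 10 and 13 (one allocation in place of many, followed by pointer resetting) can be sketched in C as follows. This is an illustrative sketch under stated assumptions, not the patent's generated code: the function names `total_length` and `alloc_flat` are hypothetical, the element type is fixed to `double`, and a two-level pointer is used, so the "nested loop" of the claims collapses to a single loop over rows.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Sum the lengths of all rows; this is the length information that
 * sizes the single allocation. */
static size_t total_length(const size_t *len, size_t rows)
{
    size_t total = 0;
    for (size_t i = 0; i < rows; i++)
        total += len[i];
    return total;
}

/* Replace one malloc per row with a single malloc, then reset each row
 * pointer into the block so that existing a[i][j] accesses keep working.
 * Returns the block (free it once), or NULL on allocation failure. */
static double *alloc_flat(double **row_ptr, const size_t *len, size_t rows)
{
    size_t total = total_length(len, rows);
    /* Allocate at least one element so an empty array is not mistaken
     * for an allocation failure. */
    double *block = malloc((total ? total : 1) * sizeof *block);
    if (block == NULL)
        return NULL;

    size_t off = 0;
    for (size_t i = 0; i < rows; i++) {
        row_ptr[i] = block + off;  /* pointer reset for row i */
        off += len[i];
    }
    assert(off == total);
    return block;
}
```

Because every row now lives in one contiguous region, the whole structure can be moved to the coprocessor in one transfer rather than one transfer per row, and the same pointer-reset loop can be repeated on the coprocessor side against its copy of the block.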

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9535826B2 cover?
There are provided source-to-source transformation methods for a multi-dimensional array and/or a multi-level pointer for a computer program. A method includes minimizing a number of holes for variable length elements for a given dimension of the array and/or pointer using at least two stride values included in stride buckets. The minimizing step includes modifying memory allocation sites, for …
Who is the assignee on this patent?
NEC Lab America Inc, NEC Corp
What technology area does this patent fall under?
Primary CPC classification G06F12/0207. Mapped technology areas include Physics.
When was this patent published?
Publication date Jan 3, 2017 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).