Channel sizing for inter-kernel communication
US-2016378441-A1 · Dec 29, 2016 · US
US10430919B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10430919-B2 |
| Application number | US-201715594512-A |
| Country | US |
| Kind code | B2 |
| Filing date | May 12, 2017 |
| Priority date | May 12, 2017 |
| Publication date | Oct 1, 2019 |
| Grant date | Oct 1, 2019 |
Official abstract text for this publication.
A method is described. The method includes simulating execution of an image processing application software program. The simulating includes intercepting kernel-to-kernel communications with simulated line buffer memories that store and forward lines of image data communicated from models of producing kernels to models of consuming kernels. The simulating further includes tracking respective amounts of image data stored in the respective line buffer memories over a simulation runtime. The method also includes determining respective hardware memory allocations for corresponding hardware line buffer memories from the tracked respective amounts of image data. The method also includes generating configuration information for an image processor to execute the image processing application software program. The configuration information describes the hardware memory allocations for the hardware line buffer memories of the image processor.
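The tracking step in the abstract can be illustrated with a minimal sketch. It assumes an unbounded simulated line buffer that counts whole lines, a hand-written event trace in place of real kernel models, and a fixed line width in bytes. The names `SimulatedLineBuffer`, `size_line_buffers`, `k1_to_k2`, and `bytes_per_line` are hypothetical and do not appear in the patent.

```python
from dataclasses import dataclass


@dataclass
class SimulatedLineBuffer:
    """Unbounded simulated line buffer that records its peak occupancy (hypothetical model)."""
    name: str
    write_ptr: int = 0  # lines written so far by the producing kernel model
    read_ptr: int = 0   # lines consumed so far by the consuming kernel model
    max_diff: int = 0   # largest (write_ptr - read_ptr) observed during the run

    def store(self, lines: int = 1) -> None:
        # A simulated store advances the write pointer and refreshes the high-water mark.
        self.write_ptr += lines
        self.max_diff = max(self.max_diff, self.write_ptr - self.read_ptr)

    def load(self, lines: int = 1) -> bool:
        # A simulated load advances the read pointer; it stalls if the data is not there yet.
        if self.read_ptr + lines > self.write_ptr:
            return False
        self.read_ptr += lines
        return True


def size_line_buffers(buffers, bytes_per_line):
    """Turn tracked peak occupancies into per-buffer memory allocations in bytes."""
    return {b.name: b.max_diff * bytes_per_line for b in buffers}


# Assumed trace: the producer runs three lines ahead before the consumer starts reading.
lb = SimulatedLineBuffer("k1_to_k2")
for event in ["store", "store", "store", "load", "store", "load", "load", "load"]:
    if event == "store":
        lb.store()
    else:
        lb.load()

# Configuration information describing the allocation for the hardware line buffer.
config = size_line_buffers([lb], bytes_per_line=1024)
print(config)
```

The allocation here is simply the peak pointer difference multiplied by the line size; a real flow might add margin or round to a hardware allocation granularity, which the abstract does not specify.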
Opening claim text (preview).
What is claimed is:
1. A non-transitory machine readable storage medium containing program code that when processed by a computing system causes the computing system to perform operations comprising: simulating execution of an image processing application software program having a plurality of kernels, each kernel comprising load instructions that read from a line buffer storing data produced by another kernel, store instructions that write to a line buffer storing data to be consumed by another kernel, or both, wherein simulating the execution of the image processing application software program comprises simulating operations of a plurality of line buffers using a respective plurality of simulated line buffers, including performing operations comprising: simulating each load instruction occurring in the plurality of kernels including updating a respective read pointer for a respective simulated line buffer that simulates a line buffer referenced by the load instruction, simulating each write instruction occurring in the plurality of kernels including updating a respective write pointer for a respective simulated line buffer that simulates a line buffer referenced by the store instruction; computing, for each simulated line buffer, a respective maximum difference encountered during the simulation between the respective read pointer and the respective write pointer of the simulated line buffer; and generating a respective memory size to allocate to line buffers of an image processor based on the respective maximum differences computed for the simulated line buffers.
2. The non-transitory machine readable storage medium of claim 1, wherein the operations further comprise continually updating the maximum difference between each read pointer and each write pointer for each respective simulated line buffer during the simulation.
3. The non-transitory machine readable storage medium of claim 1 wherein simulating execution of the image processing application comprises imposing a write policy for a particular simulated line buffer that prevents a next unit of image data from being written into the particular simulated line buffer memory until one or more simulated load instructions are stalled.
4. The non-transitory machine readable storage medium of claim 1, further comprising stripping out one or more instructions that are not load or store instructions from one or more of the plurality of kernels.
5. The non-transitory machine readable storage medium of claim 4, wherein simulating execution of the image processing application comprises simulating respective delays for instructions that were stripped out of one or more of the plurality of kernels.
6. The non-transitory machine readable storage medium of claim 1 wherein the each simulated line buffer corresponds to a respective line buffer of an image processor having a plurality of line buffers configured to buffer data among a plurality of processing cores of the image processor.
7. The non-transitory machine readable storage medium of claim 6 wherein the image processing application software program is code that is compiled to be executed by a processing core having a two dimensional execution lane array and a two dimensional shift register array.
8. The non-transitory machine readable storage medium of claim 1, wherein each simulated line buffer comprises an unbounded portion of memory.
9. A system, comprising: one or more computers and one or more storage devices storing instructions that are operable, when executed by the one or more computers, to cause the one or more computers to perform operations comprising simulating execution of an image processing application software program having a plurality of kernels, each kernel comprising load instructions that read from a line buffer storing data produced by another kernel, store instructions that write to a line buffer storing data to be consumed by another kernel, or both, wherein simulating the execution of the image processing application software program comprises simulating operations of a plurality of line buffers using a respective plurality of simulated line buffers, including performing operations comprising: simulating each load instruction occurring in the plurality of kernels including updating a respective read pointer for a respective simulated line buffer that simulates a line buffer referenced by the load instruction, simulating each write instruction occurring in the plurality of kernels including updating a respective write pointer for a respective simulated line buffer that simulates a line buffer referenced by the store instruction; computing, for each simulated line buffer, a respective maximum difference encountered during the simulation between the respective read pointer and the respective write pointer of the simulated line buffer; and generating a respective memory size to allocate to line buffers of an image processor based on the respective maximum differences computed for the simulated line buffers.
10. The system of claim 9 wherein simulating execution of the image processing application comprises imposing a write policy for a particular simulated line buffer that prevents a next unit of image data from being written into the particular simulated line buffer memory until one or more simulated load instructions are stalled.
11. The system of claim 9, further comprising stripping out one or more instructions that are not load or store instructions from one or more of the plurality of kernels.
12. The system of claim 9, wherein simulating execution of the image processing application comprises simulating respective delays for instructions that were stripped out of one or more of the plurality of kernels.
13. A method, comprising: simulating execution of an image processing application software program having a plurality of kernels, each kernel comprising load instructions that read from a line buffer storing data produced by another kernel, store instructions that write to a line buffer storing data to be consumed by another kernel, or both, wherein simulating the execution of the image processing application software program comprises simulating operations of a plurality of line buffers using a respective plurality of simulated line buffers, including performing operations comprising: simulating each load instruction occurring in the plurality of kernels including updating a respective read pointer for a respective simulated line buffer that simulates a line buffer referenced by the load instruction, simulating each write instruction occurring in the plurality of kernels including updating a respective write pointer for a respective simulated line buffer that simulates a line buffer referenced by the store instruction; computing, for each simulated line buffer, a respective maximum difference encountered during the simulation between the respective read pointer and the respective write pointer of the simulated line buffer; and generating a respective memory size to allocate to line buffers of an image processor based on the respective maximum differences computed for the simulated line buffers.
14. The method of claim 13 wherein simulating execution of the image processing application comprises imposing a write policy for a particular simulated line buffer that prevents a next unit of image data from being written into the particular simulated line buffer memory until one or more simulated load instructions are stalled.
15. The method of claim 13, further comprising stripping out one or more instructions that are not load or store instructions from one or more of the plura
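The stripping and delay substitution recited in claims 4 and 5 can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented implementation: instructions are hypothetical `(opcode, operand)` tuples, every stripped instruction is assumed to cost one cycle, kernels are stepped in a simple round-robin cycle loop, and the functions `strip_kernel` and `simulate` are invented for this example.

```python
from itertools import groupby

MEMORY_OPS = {"load", "store"}


def strip_kernel(instructions, cycles_per_op=1):
    """Keep loads/stores; replace each run of other ops with one ("delay", cycles) entry."""
    stripped = []
    for is_mem, run in groupby(instructions, key=lambda ins: ins[0] in MEMORY_OPS):
        run = list(run)
        if is_mem:
            stripped.extend(run)
        else:
            stripped.append(("delay", cycles_per_op * len(run)))
    return stripped


def simulate(kernels, iterations):
    """Step stripped kernels cycle by cycle; return {buffer: max write/read pointer difference}.

    Assumes each buffer receives as many stores as loads overall, otherwise a
    stalled consumer would never finish.
    """
    write, read, peak = {}, {}, {}
    state = {k: {"pc": 0, "busy": 0, "left": iterations} for k in kernels}
    while any(s["left"] > 0 for s in state.values()):
        for name, prog in kernels.items():
            s = state[name]
            if s["left"] == 0:
                continue
            if s["busy"] > 0:          # still inside a simulated delay
                s["busy"] -= 1
                continue
            op, arg = prog[s["pc"]]
            if op == "delay":
                s["busy"] = arg - 1    # this cycle counts as the first delay cycle
            elif op == "store":
                write[arg] = write.get(arg, 0) + 1
                peak[arg] = max(peak.get(arg, 0), write[arg] - read.get(arg, 0))
            elif op == "load":
                if read.get(arg, 0) >= write.get(arg, 0):
                    continue           # stall: retry the same load next cycle
                read[arg] = read.get(arg, 0) + 1
            s["pc"] += 1
            if s["pc"] == len(prog):   # kernel body finished; start the next iteration
                s["pc"], s["left"] = 0, s["left"] - 1
    return peak


# Hypothetical kernels: a fast producer and a consumer that does more arithmetic per line.
producer = strip_kernel([("mul", None), ("store", "lb")])
consumer = strip_kernel([("load", "lb"), ("add", None), ("add", None), ("add", None)])
peaks = simulate({"producer": producer, "consumer": consumer}, iterations=4)
print(producer, consumer, peaks)
```

Because the simulated buffer is unbounded and the producer is faster than the consumer, the peak difference in this sketch grows with the number of iterations; a write policy such as the one in claims 3, 10, and 14 would be one way to bound it.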
HW-SW co-design, e.g. HW-SW partitioning · CPC title
Design verification, e.g. functional simulation or model checking · CPC title
Design optimisation, verification or simulation (optimisation, verification or simulation of circuit designs G06F30/30) · CPC title
at device level, e.g. emulation of a storage device or system · CPC title
Hybrid storage device · CPC title