Compiler managed memory for image processor

US10304156B2 · US · B2

Patent metadata
Publication number: US-10304156-B2
Application number: US-201715625972-A
Country: US
Kind code: B2
Filing date: Jun 16, 2017
Priority date: Feb 26, 2016
Publication date: May 28, 2019
Grant date: May 28, 2019

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method is described. The method includes repeatedly loading a next sheet of image data from a first location of a memory into a two dimensional shift register array. The memory is locally coupled to the two-dimensional shift register array and an execution lane array having a smaller dimension than the two-dimensional shift register array along at least one array axis. The loaded next sheet of image data keeps within an image area of the two-dimensional shift register array. The method also includes repeatedly determining output values for the next sheet of image data through execution of program code instructions along respective lanes of the execution lane array, wherein, a stencil size used in determining the output values encompasses only pixels that reside within the two-dimensional shift register array.
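The arrangement in the abstract can be pictured with a small simulation. The sketch below is only an illustration under stated assumptions, not the patented hardware: the function names `shift` and `stencil_sum_3x3`, the one-register halo on every side, the zero fill for vacated registers, and the 3x3 sum stencil are all hypothetical choices made for the example. It models a register array that is larger than the execution lane array, shifts the whole register array, and lets each lane read only the register dedicated to it.

```python
def shift(regs, dy, dx, fill=0):
    """Return a copy of the register array shifted by (dy, dx).

    Registers vacated by the shift receive `fill` (an assumption of this sketch).
    """
    rows, cols = len(regs), len(regs[0])
    out = [[fill] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            sr, sc = r - dy, c - dx
            if 0 <= sr < rows and 0 <= sc < cols:
                out[r][c] = regs[sr][sc]
    return out


def stencil_sum_3x3(sheet, lanes_h, lanes_w, halo=1):
    """Sum a 3x3 neighbourhood for every execution lane.

    `sheet` must have (lanes_h + 2*halo) rows and (lanes_w + 2*halo) columns,
    so the stencil never reaches outside the register array.
    """
    regs = [row[:] for row in sheet]          # load the sheet into the registers
    acc = [[0] * lanes_w for _ in range(lanes_h)]
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            shifted = shift(regs, dy, dx)
            for r in range(lanes_h):
                for c in range(lanes_w):
                    # Each lane reads only its own dedicated register.
                    acc[r][c] += shifted[r + halo][c + halo]
    return acc


if __name__ == "__main__":
    sheet = [[r * 4 + c for c in range(4)] for r in range(4)]
    print(stencil_sum_3x3(sheet, 2, 2))  # [[45, 54], [81, 90]]
```

Here a 4x4 register array serves a 2x2 lane array, so the register array is larger than the lane array along both axes and the 3x3 stencil touches only pixels held in the registers.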

First claim

Opening claim text (preview).

The invention claimed is:

1. A processor comprising:
  a two-dimensional array of processing elements;
  a two-dimensional shift-register array having a first portion of registers that are each dedicated to one of the processing elements in the two-dimensional array of processing elements and having a halo portion of registers that borders the first portion of registers on one or more sides of the first portion; and
  a sheet generator configured to load sheets of image data into the two-dimensional shift register array, wherein each sheet of image data has at least as many pixels as processing elements in the two-dimensional array of processing elements,
  wherein the processor is configured to execute instructions to load input data to perform a stencil function requiring data from multiple sheets of image data, wherein the instructions cause the processor to perform operations comprising:
    initially loading a first sheet of image data and a second sheet of image into a local random access memory (RAM) that is local to the processor,
    assigning a first pointer that references a first address of the first sheet of image data loaded into the local RAM,
    assigning a second pointer that references a second address of the second sheet of image data loaded into the local RAM,
    loading the first sheet of image data into the first portion of the two-dimensional shift-register array using the first pointer,
    loading a portion of the second sheet of image data into the halo portion of the two-dimensional shift-register array using the second pointer,
    performing a first iteration of the stencil function using the first sheet of image data loaded into the first portion of the two-dimensional shift-register array and using the portion of the second sheet of image data loaded into the halo portion of the two-dimensional shift-register array,
    after performing the first iteration of the stencil function, updating the first pointer to reference the second address of the second sheet of image data loaded into the local RAM,
    loading the second sheet of image data into the first portion of the two-dimensional shift-register array using the first pointer,
    loading a portion of a third sheet of image data into the halo portion of the two-dimensional shift-register array, and
    performing a second iteration of the stencil function using the second sheet of image data loaded into the first portion of the two-dimensional shift-register array and using the portion of the third sheet of image data loaded into the halo portion of the two-dimensional shift-register array.

2. The processor of claim 1, wherein loading the portion of a second sheet of image data into the halo portion of the two-dimensional shift-register array comprises loading the portion of the second sheet of image data from the local RAM.

3. The processor of claim 1, wherein the operations further comprise: loading, into the local RAM, the third sheet of image data at least partially concurrently with performing the first iteration of the stencil function.

4. The processor of claim 1, wherein loading the second sheet of image data from the local RAM into the first portion of the two-dimensional shift-register array comprises executing a load instruction that references the updated first pointer.

5. The processor of claim 1, wherein the instructions include an offset instruction that, that when executed by a particular processing element having a particular location in the two-dimensional array of processing elements, causes the processing element to compute, given the particular location, an offset representing a particular sheet of data in the local RAM from which to load data.

6. The processor of claim 1, wherein the operations further comprise providing an output sheet of data to the sheet generator to be provided by the sheet generator to one or more other components of the processor.

7. A computer program product, encoded on one or more non-transitory computer storage media, comprising instructions that when executed by a processor comprising:
  a two-dimensional array of processing elements;
  a two-dimensional shift-register array having a first portion of registers that are each dedicated to one of the processing elements in the two-dimensional array of processing elements and having a halo portion of registers that borders the first portion of registers on one or more sides of the first portion; and
  a sheet generator configured to load sheets of image data into the two-dimensional shift register array, wherein each sheet of image data has at least as many pixels as processing elements in the two-dimensional array of processing elements,
  wherein the processor is configured to execute instructions to load input data to perform a stencil function requiring data from multiple sheets of image data, wherein the instructions cause the processor to perform operations comprising:
    initially loading a first sheet of image data and a second sheet of image into a local random access memory (RAM) that is local to the processor,
    assigning a first pointer that references a first address of the first sheet of image data loaded into the local RAM,
    assigning a second pointer that references a second address of the second sheet of image data loaded into the local RAM,
    loading the first sheet of image data into the first portion of the two-dimensional shift-register array using the first pointer,
    loading a portion of the second sheet of image data into the halo portion of the two-dimensional shift-register array using the second pointer,
    performing a first iteration of the stencil function using the first sheet of image data loaded into the first portion of the two-dimensional shift-register array and using the portion of the second sheet of image data loaded into the halo portion of the two-dimensional shift-register array,
    after performing the first iteration of the stencil function, updating the first pointer to reference the second address of the second sheet of image data loaded into the local RAM,
    loading the second sheet of image data into the first portion of the two-dimensional shift-register array using the first pointer,
    loading a portion of a third sheet of image data into the halo portion of the two-dimensional shift-register array, and
    performing a second iteration of the stencil function using the second sheet of image data loaded into the first portion of the two-dimensional shift-register array and using the portion of the third sheet of image data loaded into the halo portion of the two-dimensional shift-register array.

8. The computer program product of claim 7, wherein loading the portion of a second sheet of image data into the halo portion of the two-dimensional shift-register array comprises loading the portion of the second sheet of image data from the local RAM.

9. The computer program product of claim 7, wherein the operations further comprise: loading, into the local RAM, the third sheet of image data at least partially concurrently with performing the first iteration of the stencil function.

10. The computer program product of claim 7, wherein loading the second sheet of image data from the local RAM into the first portion of the two-dimensional shift-register array comprises executing a load instruction that references the updated first pointer.

11. The computer program product of claim 7, wherein the instructions include an offset instruction that, that when executed by a particular processing element having a particular location in the two-dimensional array of processing elements, causes the processing element to compute, given the particular location, an offset representing a particular sheet of data in the local R
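
The pointer handling recited in the independent claims can be sketched in a few lines. This is a hedged illustration only: the function name `process_row_of_sheets`, the list standing in for the local RAM, the use of list indices as addresses, the right-hand halo, the zero-filled halo after the last sheet, and the horizontal sum stencil are all assumptions of the example. The claims state that the first pointer is updated to the second sheet's address; advancing the second pointer to the following sheet is an additional assumption, and all sheets are preloaded here rather than fetched concurrently.

```python
def process_row_of_sheets(local_ram, halo_w=1):
    """Run a horizontal sum stencil over sheets laid out left to right.

    `local_ram` is a list of equally sized 2D sheets; an index into the list
    plays the role of an address (a simplification for this sketch).
    """
    outputs = []
    first_ptr = 0    # sheet that fills the lane-backed (first) portion
    second_ptr = 1   # sheet that supplies the halo portion
    while first_ptr < len(local_ram):
        main = local_ram[first_ptr]
        if second_ptr < len(local_ram):
            # Only the leading columns of the next sheet are needed in the halo.
            halo = [row[:halo_w] for row in local_ram[second_ptr]]
        else:
            # No further sheet exists: zero fill (assumption of this sketch).
            halo = [[0] * halo_w for _ in main]
        regs = [m + h for m, h in zip(main, halo)]   # first portion plus halo
        width = len(main[0])
        out = [
            [sum(row[c:c + halo_w + 1]) for c in range(width)]
            for row in regs
        ]
        outputs.append(out)
        # After the iteration, the first pointer takes the second sheet's
        # address, so that sheet is reused without being copied.
        first_ptr = second_ptr
        second_ptr += 1
    return outputs


if __name__ == "__main__":
    ram = [[[1, 2], [3, 4]], [[5, 6], [7, 8]], [[9, 10], [11, 12]]]
    for sheet in process_row_of_sheets(ram):
        print(sheet)
```

The point of the design is that a sheet already resident in local memory first serves as halo data and then, after a pointer update, as the main sheet, without being reloaded from elsewhere.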

Assignees

Google LLC

Inventors

Classifications

  • G06T1/60 · Primary

    Memory management · CPC title

  • G06T1/20 · Primary

    Processor architectures; Processor configuration, e.g. pipelining · CPC title

  • Movement instructions, e.g. MOVE, SHIFT, ROTATE, SHUFFLE · CPC title

  • Extension of register space, e.g. register cache · CPC title

  • LOAD or STORE instructions; Clear instruction · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10304156B2 cover?
A method is described. The method includes repeatedly loading a next sheet of image data from a first location of a memory into a two dimensional shift register array. The memory is locally coupled to the two-dimensional shift register array and an execution lane array having a smaller dimension than the two-dimensional shift register array along at least one array axis. The loaded next sheet o…
Who is the assignee on this patent?
Google LLC
What technology area does this patent fall under?
Primary CPC classification G06T1/60. Mapped technology areas include Physics.
When was this patent published?
Publication date: May 28, 2019 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).