Apparatuses, methods, and systems for swizzle operations in a configurable spatial accelerator

US10817291B2 · US · B2

Patent metadata
Field: Value
Publication number: US-10817291-B2
Application number: US-201916370915-A
Country: US
Kind code: B2
Filing date: Mar 30, 2019
Priority date: Mar 30, 2019
Publication date: Oct 27, 2020
Grant date: Oct 27, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Systems, methods, and apparatuses relating to swizzle operations and disable operations in a configurable spatial accelerator (CSA) are described. Certain embodiments herein provide for an encoding system for a specific set of swizzle primitives across a plurality of packed data elements in a CSA. In one embodiment, a CSA includes a plurality of processing elements, a circuit switched interconnect network between the plurality of processing elements, and a configuration register within each processing element to store a configuration value having a first portion that, when set to a first value that indicates a first mode, causes the processing element to pass an input value to operation circuitry of the processing element without modifying the input value, and, when set to a second value that indicates a second mode, causes the processing element to perform a swizzle operation on the input value to form a swizzled input value before sending the swizzled input value to the operation circuitry of the processing element, and a second portion that causes the processing element to perform an operation indicated by the second portion of the configuration value on the input value in the first mode and the swizzled input value in the second mode with the operation circuitry.
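The two-portion configuration value described in the abstract can be illustrated with a minimal Python sketch. This is not the patented encoding: the field names (`swizzle_mode`, `opcode`), the 64-bit operand width, the opcode table, and the choice of a half-swap as the swizzle are all assumptions made for illustration.

```python
from dataclasses import dataclass

MASK64 = (1 << 64) - 1  # assumed operand width; the abstract does not fix one


def swap_halves(value: int) -> int:
    """Illustrative swizzle: exchange the upper and lower 32-bit halves."""
    lower = value & 0xFFFFFFFF
    upper = (value >> 32) & 0xFFFFFFFF
    return (lower << 32) | upper


# Hypothetical opcode table standing in for the "operation circuitry".
OPERATIONS = {
    0: lambda x: x,                 # pass the operand through
    1: lambda x: (x + 1) & MASK64,  # increment, wrapping at 64 bits
    2: lambda x: ~x & MASK64,       # bitwise NOT
}


@dataclass(frozen=True)
class ConfigValue:
    swizzle_mode: int  # first portion: 0 = first mode (unmodified), 1 = second mode (swizzle)
    opcode: int        # second portion: selects the operation to perform


def process(config: ConfigValue, input_value: int) -> int:
    """Apply the optional swizzle, then the configured operation."""
    operand = input_value & MASK64
    if config.swizzle_mode == 1:
        operand = swap_halves(operand)
    return OPERATIONS[config.opcode](operand)
```

With these assumptions, `process(ConfigValue(0, 0), x)` returns `x` unchanged, while `process(ConfigValue(1, 0), x)` returns `x` with its halves exchanged. The same opcode therefore sees either the raw or the swizzled operand depending only on the first portion.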

First claim

Opening claim text (preview).

What is claimed is:

1. An apparatus comprising: a plurality of processing elements; an interconnect network between the plurality of processing elements to transfer values between the plurality of processing elements; and a first processing element of the plurality of processing elements comprising: a plurality of input queues, a configuration register within the first processing element to store a configuration value having: a first portion that, when set to a first value that indicates a first mode, causes the first processing element to pass an input value to operation circuitry of the first processing element without modifying the input value, and, when set to a second value that indicates a second mode, causes the first processing element to perform a swizzle operation on the input value to form a swizzled input value before sending the swizzled input value to the operation circuitry of the first processing element, and a second portion that causes the first processing element to perform an operation indicated by the second portion of the configuration value on the input value in the first mode and the swizzled input value in the second mode with the operation circuitry, an input controller to control enqueue and dequeue of values into the plurality of input queues according to the configuration value, a plurality of output queues, and an output controller to control enqueue and dequeue of values into the plurality of output queues according to the configuration value.

2. The apparatus of claim 1, wherein, when at least one of the plurality of input queues stores the input value, the input controller is to send a not empty value to the operation circuitry of the first processing element to indicate the first processing element is to, when the first portion of the configuration value is set to the second value, perform the swizzle operation on the input value from the at least one of the plurality of input queues to form the swizzled input value, and then begin the operation on the swizzled input value.

3. The apparatus of claim 1, wherein, when at least one of the plurality of output queues is not full, the output controller is to send a not full value to the operation circuitry of the first processing element to indicate the first processing element is to, when the first portion of the configuration value is set to the second value, perform the swizzle operation on the input value stored in at least one of the plurality of input queues to form the swizzled input value, and then begin the operation on the swizzled input value.

4. The apparatus of claim 1, wherein, when at least one of the plurality of input queues stores the input value, the input controller is to send a not empty value to the operation circuitry of the first processing element and when at least one of the plurality of output queues is not full, the output controller is to send a not full value to the operation circuitry of the first processing element, and the operation circuitry of the first processing element is to, when the first portion of the configuration value is set to the second value, swizzle the input value from the at least one of the plurality of input queues to form the swizzled input value, and then begin the operation on the swizzled input value.

5. The apparatus of claim 1, wherein when the first portion of the configuration value is set to the second value, the swizzle operation replicates a lower portion of the input value into multiple locations in the swizzled input value.

6. The apparatus of claim 1, wherein when the first portion of the configuration value is set to the second value, the swizzle operation replicates an upper portion of the input value into multiple locations in the swizzled input value.

7. The apparatus of claim 1, wherein when the first portion of the configuration value is set to the second value, the swizzle operation swaps a lower portion and an upper portion of the input value in the swizzled input value.

8. The apparatus of claim 1, wherein the first portion of the configuration value comprises: at least a first bit corresponding to a first input queue of the plurality of input queues and that when set to a first value causes the first processing element to pass a first input value to the operation circuitry of the first processing element without modifying the first input value, and, when set to a second value, causes the first processing element to perform a first swizzle operation on the first input value to form a first swizzled input value before sending the first swizzled input value to the operation circuitry of the first processing element, and at least a second, separate bit corresponding to a second input queue of the plurality of input queues and that when set to a first value causes the first processing element to pass a second input value to the operation circuitry of the first processing element without modifying the second input value, and, when set to a second value, causes the first processing element to perform a second, different swizzle operation on the second input value to form a second swizzled input value before sending the second swizzled input value to the operation circuitry of the first processing element.

9. A method comprising: coupling a plurality of processing elements together by an interconnect network between the plurality of processing elements to transfer values between the plurality of processing elements; storing a configuration value in a configuration register within a first processing element of the plurality of processing elements, the configuration value comprising: a first portion that, when set to a first value that indicates a first mode, causes the first processing element to pass an input value to operation circuitry of the first processing element without modifying the input value, and, when set to a second value that indicates a second mode, causes the first processing element to perform a swizzle operation on the input value to form a swizzled input value before sending the swizzled input value to the operation circuitry of the first processing element, and a second portion that causes the first processing element to perform an operation indicated by the second portion of the configuration value on the input value in the first mode and the swizzled input value in the second mode with the operation circuitry; controlling enqueue and dequeue of values into a plurality of input queues of the first processing element according to the configuration value with an input controller in the first processing element; and controlling enqueue and dequeue of values into a plurality of output queues of the first processing element according to the configuration value with an output controller in the first processing element.

10. The method of claim 9, wherein, when at least one of the plurality of input queues stores the input value, the input controller sends a not empty value to the operation circuitry of the first processing element to indicate the first processing element is to, when the first portion of the configuration value is set to the second value, perform the swizzle operation on the input value from the at least one of the plurality of input queues to form the swizzled input value, and then begin the operation on the swizzled input value.

11. The method of claim 9, wherein, when at least one of the plurality of output queues is not full, the output controller sends a not full value to the operation circuitry of the first processing element to indicate the first processing element is to, when the first portion of the configuration value is set to the second value, perform the swizzle operation on the input value s
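The swizzle primitives of claims 5 to 7, the per-input-queue enable bits of claim 8, and the not-empty / not-full gating of claims 2 to 4 can be sketched together in a small Python model. Everything named here is hypothetical: the 64-bit value with 32-bit halves, the `ProcessingElement` class, the single output queue, and the rule that every operand queue must hold a value before firing are simplifying assumptions, not details taken from the claims.

```python
from collections import deque

HALF_BITS = 32                    # assumed width of the "lower" and "upper" portions
HALF_MASK = (1 << HALF_BITS) - 1


def replicate_lower(value: int) -> int:
    """Claim 5 style: copy the lower half into both halves."""
    lower = value & HALF_MASK
    return (lower << HALF_BITS) | lower


def replicate_upper(value: int) -> int:
    """Claim 6 style: copy the upper half into both halves."""
    upper = (value >> HALF_BITS) & HALF_MASK
    return (upper << HALF_BITS) | upper


def swap_halves(value: int) -> int:
    """Claim 7 style: exchange the lower and upper halves."""
    lower = value & HALF_MASK
    upper = (value >> HALF_BITS) & HALF_MASK
    return (lower << HALF_BITS) | upper


class ProcessingElement:
    """Toy model of one processing element with queued inputs and one output queue."""

    def __init__(self, swizzle_enable, swizzles, operation, output_capacity=2):
        self.swizzle_enable = swizzle_enable  # bit i enables the swizzle for input queue i
        self.swizzles = swizzles              # one swizzle function per input queue
        self.operation = operation            # stands in for the operation circuitry
        self.input_queues = [deque() for _ in swizzles]
        self.output_queue = deque()
        self.output_capacity = output_capacity

    def try_fire(self) -> bool:
        """Run one operation if inputs are available and the output has room."""
        not_empty = all(len(queue) > 0 for queue in self.input_queues)
        not_full = len(self.output_queue) < self.output_capacity
        if not (not_empty and not_full):
            return False
        operands = []
        for index, queue in enumerate(self.input_queues):
            value = queue.popleft()
            if (self.swizzle_enable >> index) & 1:
                value = self.swizzles[index](value)  # swizzle before the operation
            operands.append(value)
        self.output_queue.append(self.operation(*operands))
        return True
```

As a usage example, a two-input element built with `swizzle_enable=0b01`, `swizzles=[replicate_lower, swap_halves]`, and an XOR operation swizzles only the first operand. `try_fire()` returns `False` until both input queues hold a value and the output queue has a free slot, which mirrors the order the claims describe: check the queue status, swizzle, then begin the operation.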

Assignees

Intel Corp

Inventors

Classifications

  • G06F9/3005 · Primary

    to perform operations for flow control · CPC title

  • using a mask · CPC title

  • Instructions to perform operations on packed data, e.g. vector, tile or matrix operations · CPC title

  • Energy efficient computing, e.g. low power processors, power management or thermal management · CPC title

  • with adaptable data path · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10817291B2 cover?
Systems, methods, and apparatuses relating to swizzle operations and disable operations in a configurable spatial accelerator (CSA) are described. Certain embodiments herein provide for an encoding system for a specific set of swizzle primitives across a plurality of packed data elements in a CSA. In one embodiment, a CSA includes a plurality of processing elements, a circuit switched interconn…
Who is the assignee on this patent?
Intel Corp
What technology area does this patent fall under?
Primary CPC classification G06F9/3005. Mapped technology areas include Physics.
When was this patent published?
Publication date Oct 27, 2020 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).