Operation Accelerator, Processing Method, and Related Device
US-2021224125-A1 · Jul 22, 2021 · US
US12399743B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12399743-B2 |
| Application number | US-202217652109-A |
| Country | US |
| Kind code | B2 |
| Filing date | Feb 23, 2022 |
| Priority date | Feb 23, 2022 |
| Publication date | Aug 26, 2025 |
| Grant date | Aug 26, 2025 |
Official abstract text for this publication.
Processing input data for transmittal to a data consumer such as an artificial intelligence engine is performed by arranging the input data into a uniform structure made up of sticks of data combined to form pages of sticks. A stick is any well-sized set of input data elements whereby the size of the stick is fixed. A masking pattern is established for sticks of data having certain ranges of invalid data for consumption of partial sticks while maintaining validity of the input data being transferred. The mask pattern is derived based on set-active-mask-and-value (SAMV) instructions. The derived mask pattern is carried forward for subsequent load instructions to the data consumer.
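The layout described in the abstract can be illustrated with a minimal sketch: input is cut into fixed-size sticks, the unused tail of a partial stick is marked invalid, and masked positions receive a replacement value before the data reaches the consumer. The stick size of 8 and the names `stickify`, `pad_stick`, and `load_padded` are assumptions made for illustration and do not appear in the patent. The sketch also assumes that invalid elements only occur at the end of the last stick.

```python
from typing import List, Sequence, Tuple

# Assumption for illustration: the abstract only states that the stick size is fixed.
STICK_SIZE = 8


def stickify(values: Sequence[float],
             stick_size: int = STICK_SIZE) -> Tuple[List[list], List[List[bool]]]:
    """Split values into fixed-size sticks and return (sticks, validity masks)."""
    sticks, masks = [], []
    for start in range(0, len(values), stick_size):
        chunk = list(values[start:start + stick_size])
        valid = len(chunk)
        # The tail of a partial stick holds no real input, so it is marked invalid.
        chunk += [0.0] * (stick_size - valid)
        sticks.append(chunk)
        masks.append([i < valid for i in range(stick_size)])
    return sticks, masks


def pad_stick(stick: Sequence[float], mask: Sequence[bool],
              replacement: float) -> list:
    """Replace every masked (invalid) element with the replacement value."""
    return [value if ok else replacement for value, ok in zip(stick, mask)]


def load_padded(values: Sequence[float], replacement: float,
                stick_size: int = STICK_SIZE) -> List[list]:
    """Produce the padded sticks that would be handed to the data consumer."""
    sticks, masks = stickify(values, stick_size)
    return [pad_stick(s, m, replacement) for s, m in zip(sticks, masks)]
```

For example, `load_padded(range(1, 11), -1.0, stick_size=4)` yields two full sticks and one partial stick whose last two positions hold the replacement value.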
Opening claim text (preview).
What is claimed is:
1. A method for padding data while loading data from a memory to a data consumer, the method comprising: receiving a load instruction including a padding instruction and a read instruction to read input data arranged in a stick layout from a memory to a data consumer, the padding instruction including a replacement value for masked elements and padding parameters; deriving a mask pattern from parameters specified in the padding instruction; padding the input data by masking invalid elements of the stick layout; and generating a set of data sticks including the padded data for the data consumer.
2. The method of claim 1, wherein the parameters specified in the padding instructions include layout parameters and pad range parameters.
3. The method of claim 1, further comprising: responsive to the load instruction, transmitting the set of data sticks to the data consumer.
4. The method of claim 1, further comprising: decoding the padding instructions to determine the replacement value and the mask pattern.
5. The method of claim 1, wherein: the stick layout is separable into a predefined number of slices, and masking the invalid elements includes: identifying invalid slices in a stick, the invalid slices including an invalid element, and determining a starting slice having a portion of valid elements and a portion of invalid elements.
6. The method of claim 1, wherein the padding instruction provides a specified data format in a 6-bit instruction including data format, cross-slice dimension type, and a key dimension of a stick.
7. The method of claim 1, wherein the stick layout in which the input data is arranged is a pre-determined size matching an SIMD (single instruction, multiple data) capacity of an accelerator performing the deriving and padding steps.
8. A computer program product comprising a computer-readable storage medium having a set of instructions stored therein which, when executed by a processor, causes the processor to perform a method comprising: receiving a load instruction including a padding instruction, the load instruction including a read instruction to read input data arranged in a stick layout from a memory to a data consumer, the padding instruction including a replacement value for masked elements and padding parameters; deriving a mask pattern from parameters specified in the padding instruction; padding the input data by masking invalid elements of the stick layout; and generating a set of data sticks including the padded data for the data consumer.
9. The computer program product of claim 8, wherein the parameters specified in the padding instructions include layout parameters and pad range parameters.
10. The computer program product of claim 8, further causing the processor to perform a method comprising: responsive to the load instruction, transmitting the set of data sticks to the data consumer.
11. The computer program product of claim 8, further causing the processor to perform a method comprising: decoding the padding instructions to determine the replacement value and the mask pattern.
12. The computer program product of claim 8, wherein: the stick layout is separable into a predefined number of slices, and masking the invalid elements includes: identifying invalid slices in a stick, the invalid slices including an invalid element, and determining a starting slice having a portion of valid elements and a portion of invalid elements.
13. The computer program product of claim 8, wherein the padding instruction provides a specified data format in a 6-bit instruction including data format, cross-slice dimension type, and a key dimension of a stick.
14. A computer system for padding data while loading data from a memory to a data consumer, the computer system comprising: a processor set; and a computer readable storage medium; wherein: the processor set is structured, located, connected, and/or programmed to run program instructions stored on the computer readable storage medium; and the program instructions which, when executed by the processor set, cause the processor set to perform a method comprising: receiving a load instruction including a padding instruction, the load instruction including a read instruction to read input data arranged in a stick layout from a memory to a data consumer, the padding instruction including a replacement value for masked elements and padding parameters; deriving a mask pattern from parameters specified in the padding instruction; padding the input data by masking invalid elements of the stick layout; and generating a set of data sticks including the padded data for the data consumer.
15. The computer system of claim 14, wherein the parameters specified in the padding instructions include layout parameters and pad range parameters.
16. The computer system of claim 14, further causing the processor set to perform a method comprising: responsive to the load instruction, transmitting the set of data sticks to the data consumer.
17. The computer system of claim 14, further causing the processor set to perform a method comprising: decoding the padding instructions to determine the replacement value and the mask pattern.
18. The computer system of claim 14, wherein: the stick layout is separable into a predefined number of slices, and masking the invalid elements includes: identifying invalid slices in a stick, the invalid slices including an invalid element, and determining a starting slice having a portion of valid elements and a portion of invalid elements.
19. The computer system of claim 14, wherein the padding instruction provides a specified data format in a 6-bit instruction including data format, cross-slice dimension type, and a key dimension of a stick.
20. A computer-implemented method comprising: decoding a padding instruction to determine a replacement pad value and padding parameters of a mask pattern for a set of stickified input data, the padding instruction embedded in a first load instruction for the stickified input data; deriving a mask pattern from the padding parameters; and applying the derived mask pattern and the associated replacement pad value to a subsequent load instruction to load subsequent input data, the subsequent load instruction equivalent to the first load instruction, the subsequent input data is padded during transmission to a requesting consumer.
21. The method of claim 20, wherein the set of stickified input data is arranged as sticks, each stick being separable into pre-defined number of slices.
22. The method of claim 21, further comprising: padding the set of stickified input data to generate the set of data sticks by masking invalid elements of each stick, the masking invalid elements of a stick including: identifying invalid slices in the stick, the invalid slices including an invalid element, and determining a starting slice having a portion of valid elements and a portion of invalid elements.
23. The method of claim 20, wherein the replacement pad value is determined with reference to a pad value register file.
24. The method of claim 20, further comprising: generating a set of data sticks from the set of stickified input data for a requesting consumer, the set of data sticks padded according to the derived mask pattern including the replacement pad value.
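The slice-based masking of claim 5 and the carried-forward mask of claim 20 can be sketched together as follows. This is an illustrative reading rather than the patented implementation. It assumes that valid elements form a contiguous prefix of the stick and that slices are equal in size. The names `SliceMask`, `derive_mask`, and `PaddedLoader` are hypothetical, and `set_active_mask_and_value` merely borrows its name from the SAMV instruction mentioned in the abstract.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class SliceMask:
    element_mask: List[bool]       # True marks a valid element
    invalid_slices: List[int]      # slices containing at least one invalid element
    starting_slice: Optional[int]  # slice holding both valid and invalid elements


def derive_mask(stick_size: int, num_slices: int, valid_count: int) -> SliceMask:
    """Derive a mask for a stick whose first valid_count elements are valid."""
    if num_slices <= 0 or stick_size % num_slices != 0:
        raise ValueError("stick_size must be a positive multiple of num_slices")
    if not 0 <= valid_count <= stick_size:
        raise ValueError("valid_count must lie within the stick")
    per_slice = stick_size // num_slices
    element_mask = [i < valid_count for i in range(stick_size)]
    invalid_slices: List[int] = []
    starting_slice: Optional[int] = None
    for s in range(num_slices):
        window = element_mask[s * per_slice:(s + 1) * per_slice]
        if not all(window):
            invalid_slices.append(s)
            if any(window):
                # Partly valid, partly invalid: the slice where masking begins.
                starting_slice = s
    return SliceMask(element_mask, invalid_slices, starting_slice)


class PaddedLoader:
    """Keeps the active mask and pad value so later loads reuse them."""

    def __init__(self) -> None:
        self._mask: Optional[SliceMask] = None
        self._replacement: Optional[float] = None

    def set_active_mask_and_value(self, mask: SliceMask, replacement: float) -> None:
        self._mask = mask
        self._replacement = replacement

    def load(self, stick: Sequence[float]) -> list:
        """Return the stick with masked elements replaced by the pad value."""
        if self._mask is None:
            raise RuntimeError("no active mask has been set")
        if len(stick) != len(self._mask.element_mask):
            raise ValueError("stick length does not match the active mask")
        return [v if ok else self._replacement
                for v, ok in zip(stick, self._mask.element_mask)]
```

With an 8-element stick split into 4 slices and 3 valid elements, `derive_mask(8, 4, 3)` reports slices 1, 2, and 3 as invalid and slice 1 as the starting slice. Once the mask is set on a `PaddedLoader`, each subsequent `load` call applies the same pattern without deriving it again.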
controlled by a single instruction for multiple data lanes [SIMD] · CPC title
Grid computing · CPC title
the resource being the memory · CPC title
Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution · CPC title
Loop control instructions; iterative instructions, e.g. LOOP, REPEAT · CPC title