Method and apparatus for integral image computation instructions
US-9442723-B2 · Sep 13, 2016 · US
US11567765B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11567765-B2 |
| Application number | US-201716487766-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jul 1, 2017 |
| Priority date | Mar 20, 2017 |
| Publication date | Jan 31, 2023 |
| Grant date | Jan 31, 2023 |
Official abstract text for this publication.
Embodiments detailed herein relate to matrix operations, in particular the loading of a matrix (tile) from memory. For example, support for a loading instruction is described in the form of decode circuitry to decode an instruction having fields for an opcode, a destination matrix operand identifier, and source memory information, and execution circuitry to execute the decoded instruction to load groups of strided data elements from memory into configured rows of the identified destination matrix operand.
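The load described in the abstract can be pictured with a small software model. The following is only an illustrative sketch, not the patented circuitry: the function name `load_tile`, the byte-addressed `memory` buffer, and the choice of little-endian 32-bit elements are all assumptions made for the example.

```python
import struct


def load_tile(memory: bytes, base: int, stride: int, rows: int, cols: int) -> list[list[int]]:
    """Model a strided tile load (hypothetical helper, not from the patent).

    Each tile row is one group of `cols` little-endian 32-bit elements.
    Consecutive groups start `stride` bytes apart in `memory`.
    """
    tile = []
    for r in range(rows):
        row_start = base + r * stride  # byte address of this row's group
        tile.append(list(struct.unpack_from(f"<{cols}I", memory, row_start)))
    return tile


# A 4x4 row-major matrix of 32-bit values 0..15, so the row pitch is 16 bytes.
memory = struct.pack("<16I", *range(16))

# Load a 3x2 sub-block starting at element (0, 1), i.e. byte offset 4.
tile = load_tile(memory, base=4, stride=16, rows=3, cols=2)
print(tile)  # [[1, 2], [5, 6], [9, 10]]
```

Because the stride is independent of the tile width, the same call pattern can pull a narrow sub-block out of a wider matrix, which is the case shown here.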
Opening claim text (preview).
We claim:
1. A processor comprising: programmable configuration storage to store configuration information for a multi-dimensional matrix destination operand, the configuration information including a first value corresponding to a number of rows for the multi-dimensional matrix destination operand, a second value corresponding to a number of columns for the multi-dimensional matrix destination operand, and a start row value corresponding to a row of the multi-dimensional matrix destination operand at which to restart execution; decode circuitry to decode an instance of a single instruction having fields for an opcode, a multi-dimensional matrix destination operand identifier, and source memory information, wherein the opcode is to indicate execution circuitry is to load data elements from memory into configured rows of the identified multi-dimensional matrix destination operand; and execution circuitry to execute the decoded instance of the single instruction according to the opcode to load data elements from memory into configured rows of the identified multi-dimensional matrix destination operand.
2. The processor of claim 1, wherein the opcode defines a size of each data element of the destination multi-dimensional matrix destination operand.
3. The processor of claim 2, wherein the size of each data element of the destination multi-dimensional matrix destination operand is a doubleword.
4. The processor of claim 2, wherein the size of each data element of the destination multi-dimensional matrix destination operand is a word.
5. The processor of claim 1, wherein the execution circuitry is to store each configured row into the identified multi-dimensional matrix destination operand and update a counter value as each row is stored.
6. The processor of claim 1, wherein the identified multi-dimensional matrix destination operand is a plurality of registers configured to represent a matrix.
7. The processor of claim 1, wherein the source memory information includes a scale, an index, a base, and a displacement.
8. A method comprising: decoding an instance of a single instruction having fields for an opcode, a multi-dimensional matrix destination operand identifier, and source memory information, wherein the opcode is to indicate execution circuitry is to load groups of strided data elements from memory into configured rows of the identified multi-dimensional matrix destination operand and wherein a stride value is determined by shifting an index value provided by the instance of the single instruction by a scale value provided by the instance of the single instruction; and executing the decoded instance of the single instruction according to the opcode to load groups of strided data elements from memory into configured rows of the identified multi-dimensional matrix destination operand, wherein programmable configuration storage stores configuration information for the multi-dimensional matrix destination operand, the configuration information including a first value corresponding to a number of rows for the multi-dimensional matrix destination operand, a second value corresponding to a number of columns for the multi-dimensional matrix destination operand, and a start row value corresponding to a row of the multi-dimensional matrix destination operand at which to restart execution.
9. The method of claim 8, wherein the opcode defines a size of each data element of the multi-dimensional matrix destination operand.
10. The method of claim 9, wherein the size of each data element of the multi-dimensional destination matrix operand is a doubleword.
11. The method of claim 9, wherein the size of each data element of the multi-dimensional matrix destination operand is a word.
12. The method of claim 8, further comprising loading each configured row of the identified multi-dimensional matrix destination operand and update a counter value as each row is loaded.
13. The method of claim 8, wherein the identified multi-dimensional matrix destination operand is a plurality of registers configured to represent a matrix.
14. The method of claim 8, wherein the source memory information includes a scale, an index, a base, and a displacement.
15. A non-transitory machine-readable medium storing an instance of an instruction which causes a processor to perform a method, the method comprising: decoding the instance of a single instruction having fields for an opcode, a multi-dimensional matrix destination operand identifier, and source memory information, wherein the opcode is to indicate execution circuitry is to load groups of strided data elements from memory into configured rows of the identified multi-dimensional matrix destination operand and wherein a stride value is determined by shifting an index value provided by the instance of the single instruction by a scale value provided by the instance of the single instruction; and executing the decoded instance of the single instruction according to the opcode to load groups of strided data elements from memory into configured rows of the identified multi-dimensional matrix destination operand, wherein programmable configuration storage stores configuration information for the multi-dimensional matrix destination operand, the configuration information including a first value corresponding to a number of rows for the multi-dimensional matrix destination operand, a second value corresponding to a number of columns for the multi-dimensional matrix destination operand, and a start row value corresponding to a row of the multi-dimensional matrix destination operand at which to restart execution.
16. The non-transitory machine-readable medium of claim 15, wherein the opcode defines a size of each data element of the multi-dimensional matrix destination operand.
17. The non-transitory machine-readable medium of claim 16, wherein the size of each data element of the multi-dimensional matrix destination operand is a doubleword.
18. The non-transitory machine-readable medium of claim 16, wherein the size of each data element of the multi-dimensional matrix destination operand is a word.
19. The non-transitory machine-readable medium of claim 15, wherein the identified multi-dimensional matrix destination operand is a plurality of registers configured to represent a matrix.
20. A system comprising: a processor including: programmable configuration storage to store configuration information for a multi-dimensional matrix, the configuration information including a first value corresponding to a number of rows for the multi-dimensional matrix, a second value corresponding to a number of columns for the multi-dimensional matrix, and a start row value corresponding to a row of the multi-dimensional matrix at which to restart execution, and decode circuitry to decode an instance of a single instruction having fields for an opcode, a multi-dimensional matrix destination operand identifier, and source memory information, wherein the opcode is to indicate execution circuitry is to load data elements from memory into configured rows of the identified multi-dimensional matrix destination operand; and an accelerator coupled to the processor, the accelerator including: execution circuitry to execute the decoded instance of the single instruction according to the opcode to load data elements from memory into configured rows of the identified multi-dimensional matrix destination operand.
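Several claim elements interact: the configured row and column counts, the start row value used to restart execution, the per-row counter update, and the stride obtained by shifting the index by the scale. The sketch below models that interaction in software under stated assumptions. The names `TileConfig`, `PageFault`, `effective_stride`, and `tile_load`, the element-addressed `memory` list, and the `fault_at_row` parameter used to simulate an interruption are all hypothetical and do not come from the patent.

```python
from dataclasses import dataclass


@dataclass
class TileConfig:
    """Hypothetical stand-in for the programmable configuration storage."""
    rows: int           # first value: number of configured rows
    cols: int           # second value: number of configured columns
    start_row: int = 0  # row at which to restart execution


class PageFault(Exception):
    """Simulated interruption in the middle of a load (assumption)."""


def effective_stride(index: int, scale: int) -> int:
    # Stride is the index value shifted left by the scale value.
    return index << scale


def tile_load(cfg, tile, memory, base, index, scale, displacement=0, fault_at_row=None):
    """Load configured rows into `tile`, resuming from cfg.start_row.

    For simplicity, addresses here count elements of `memory`, not bytes.
    """
    stride = effective_stride(index, scale)
    for r in range(cfg.start_row, cfg.rows):
        if r == fault_at_row:
            raise PageFault(r)  # cfg.start_row already names the row to redo
        addr = base + displacement + r * stride
        tile[r] = memory[addr:addr + cfg.cols]
        cfg.start_row = r + 1   # counter updated as each row is stored
    cfg.start_row = 0           # load completed, nothing left to restart


memory = list(range(64))
cfg = TileConfig(rows=3, cols=4)
tile = [None] * cfg.rows

try:
    tile_load(cfg, tile, memory, base=0, index=2, scale=2, fault_at_row=1)
except PageFault:
    restart_row = cfg.start_row  # 1: row 0 was loaded before the fault

# Re-executing the same load resumes at the saved row instead of row 0.
tile_load(cfg, tile, memory, base=0, index=2, scale=2)
print(tile)  # [[0, 1, 2, 3], [8, 9, 10, 11], [16, 17, 18, 19]]
```

Keeping the restart position in the configuration state, rather than in the instruction, means the same instruction can simply be re-executed after an interruption without reloading rows that already completed.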
having multiple operands in a single register · CPC title
Movement instructions, e.g. MOVE, SHIFT, ROTATE, SHUFFLE · CPC title
according to one or more bits in the instruction, e.g. prefix, sub-opcode · CPC title
LOAD or STORE instructions; Clear instruction · CPC title
Adding; Subtracting {(G06F7/4833, G06F7/4836 take precedence)} · CPC title