Multiple register memory access instructions, processors, methods, and systems

US9786338B2 · US · B2

Patent metadata
Field               Value
Publication number  US-9786338-B2
Application number  US-201615238186-A
Country             US
Kind code           B2
Filing date         Aug 16, 2016
Priority date       Jun 28, 2013
Publication date    Oct 10, 2017
Grant date          Oct 10, 2017

Abstract

Official abstract text for this publication.

A processor includes N-bit registers and a decode unit to receive a multiple register memory access instruction. The multiple register memory access instruction is to indicate a memory location and a register. The processor includes a memory access unit coupled with the decode unit and with the N-bit registers. The memory access unit is to perform a multiple register memory access operation in response to the multiple register memory access instruction. The operation is to involve N-bit data, in each of the N-bit registers comprising the indicated register. The operation is also to involve different corresponding N-bit portions of an M×N-bit line of memory corresponding to the indicated memory location. A total number of bits of the N-bit data in the N-bit registers to be involved in the multiple register memory access operation is to amount to at least half of the M×N-bits of the line of memory.
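The operation described in the abstract can be sketched in software as follows. This is a minimal model under stated assumptions, not the patented hardware: the register file is a Python list of integers, the function name `multi_register_load` and its parameters are hypothetical, portions are read in little-endian byte order, and portion k is assumed to go to the k-th register after the indicated one. None of those details is stated in the abstract.

```python
def multi_register_load(registers, first, line, n_bits, count=None):
    """Load N-bit portions of an M x N-bit line into consecutive registers.

    registers: list of ints standing in for N-bit registers (assumption)
    first:     index of the register indicated by the instruction
    line:      bytes of the memory line (M * N bits in total)
    n_bits:    register width N
    count:     number of registers involved (defaults to M)
    Returns the total number of bits transferred.
    """
    n_bytes = n_bits // 8
    if n_bits <= 0 or n_bits % 8 or len(line) % n_bytes:
        raise ValueError("line must be a whole number of N-bit portions")
    m = len(line) // n_bytes              # M portions in the line
    if count is None:
        count = m
    # The abstract requires the registers to cover at least half the line.
    if not (count <= m <= 2 * count):
        raise ValueError("registers must cover at least half of the line")
    if first < 0 or first + count > len(registers):
        raise ValueError("not enough registers after the indicated one")
    for k in range(count):
        portion = line[k * n_bytes:(k + 1) * n_bytes]
        # Byte order is an assumption of this sketch.
        registers[first + k] = int.from_bytes(portion, "little")
    return count * n_bits
```

For example, a 64-byte (512-bit) line loaded with `n_bits=128` fills four registers, and the same line with `n_bits=256` fills two.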

First claim

Opening claim text (preview).

What is claimed is:

1. A processor comprising: a plurality of N-bit packed data registers; a decode unit to receive a memory access instruction, the memory access instruction to indicate a memory location and to indicate a packed data register; and an execution engine unit coupled with the decode unit and coupled with the plurality of the N-bit packed data registers, the execution engine unit to perform a memory access operation in response to the memory access instruction, the memory access operation to involve N-bit data, in each of the plurality of the N-bit packed data registers that are to comprise the indicated packed data register, and different corresponding N-bit portions of an M×N-bit line of memory, that is to correspond to the indicated memory location, in which a total number of bits of the N-bit data in the plurality of the N-bit packed data registers to be involved in the memory access operation is to amount to all of the M×N-bits of the line of memory.

2. The processor of claim 1, in which the execution engine unit is to perform the memory access operation in which the total number of bits of the N-bit data in the plurality of the N-bit packed data registers to be involved in the memory access operation is to amount to at least 512-bits.

3. The processor of claim 1, in which the execution engine unit is to perform the memory access operation that is to involve the N-bit data in each of at least three N-bit packed data registers.

4. The processor of claim 3, in which the execution engine unit is to perform the memory access operation that is to involve the N-bit data in each of at least four N-bit packed data registers.

5. The processor of claim 1, in which the execution engine unit is to perform the memory access operation that is to involve 128-bit data, in each of at least four 128-bit packed data registers, and the different corresponding 128-bit portions of the line of memory that is to be at least 512-bits.

6. The processor of claim 1, in which the execution engine unit is to perform the memory access operation that is to involve 256-bit data, in each of at least two 256-bit packed data registers, and the different corresponding 256-bit portions of the line of memory that is to be at least 512-bits.

7. The processor of claim 1, in which the memory access instruction comprises a load from memory instruction, and in which the execution engine unit is to load the different N-bit portions of the M×N-bit line of memory in each of the plurality of the N-bit packed data registers, in response to the load from memory instruction, in which the total number of bits of the different N-bit portions to be loaded in the plurality of the N-bit packed data registers from the M×N-bit line of memory is to amount to at least 512-bits.

8. The processor of claim 7, in which the execution engine unit is to load different 128-bit portions of the line of memory in each of at least four 128-bit packed data registers.

9. The processor of claim 7, in which the execution engine unit is to load different 256-bit portions of the line of memory in each of at least two 256-bit packed data registers.

10. The processor of claim 1, in which the memory access instruction comprises a write to memory instruction, and in which the execution engine unit is to write the N-bit data, from each of the plurality of the N-bit packed data registers, to the different corresponding N-bit portions of the M×N-bit line of memory, in response to the write to memory instruction, in which the total number of bits of the N-bit data to be written from the plurality of the N-bit packed data registers to the M×N-bit line of memory is to amount to at least 512-bits.

11. The processor of claim 1, in which the memory access instruction is to explicitly specify each of the plurality of N-bit packed data registers.

12. The processor of claim 1, in which the memory access instruction is to specify a number of the plurality of N-bit packed data registers.

13. A processor comprising: four 128-bit packed data registers of a set of 128-bit packed data registers; a decode unit to receive a load from memory instruction, the load from memory instruction to indicate a memory location, and to indicate a destination packed data register; and an execution engine unit coupled with the decode unit, and coupled with the four 128-bit packed data registers, the execution engine unit to perform a load from memory operation, in response to the load from memory instruction, the load from memory operation to load four 128-bit portions of a 512-bit line of memory, that is to correspond to the indicated memory location, into the four 128-bit packed data registers that are to comprise the indicated destination packed data register, wherein the four 128-bit packed data registers are implicitly to be sequential 128-bit packed data registers.

14. The processor of claim 13, wherein the processor is capable of viewing each of the four 128-bit packed data registers as two 64-bit packed data registers.

15. The processor of claim 13, wherein the decode unit is to decode the load from memory instruction that is to have a number of registers specifier.

16. A processor comprising: eight 128-bit packed data registers of a set of 128-bit packed data registers; a decode unit to receive a load from memory instruction, the load from memory instruction to indicate a memory location, and to indicate a destination packed data register; and an execution engine unit coupled with the decode unit, and coupled with the eight 128-bit packed data registers, the execution engine unit to perform a load from memory operation, in response to the load from memory instruction, the load from memory operation to load eight 128-bit portions of at least one line of memory, that is to correspond to the indicated memory location, into the eight 128-bit packed data registers that are to comprise the indicated destination packed data register, wherein the eight 128-bit packed data registers are implicitly to be sequential 128-bit packed data registers.

17. The processor of claim 16, wherein the processor is capable of viewing each of the eight 128-bit packed data registers as two 64-bit packed data registers.

18. The processor of claim 16, wherein the decode unit is to decode the load from memory instruction that is to have a number of registers specifier.

19. A processor comprising: two 256-bit packed data registers of a set of 256-bit packed data registers; a decode unit to receive a load from memory instruction, the load from memory instruction to indicate a memory location, and to indicate a destination packed data register; and an execution engine unit coupled with the decode unit, and coupled with the two 256-bit packed data registers, the execution engine unit to perform a load from memory operation, in response to the load from memory instruction, the load from memory operation to load two 256-bit portions of a 512-bit line of memory, that is to correspond to the indicated memory location, into the two 256-bit packed data registers that are to comprise the indicated destination packed data register, wherein the two 256-bit packed data registers are implicitly to be sequential 256-bit packed data registers.

20. The processor of claim 19, wherein the processor is capable of viewing each of the 256-bit packed data registers as two 128-bit packed data registers.
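Three ideas from the claims lend themselves to a short illustration: registers that are implicitly sequential from one indicated register (claims 13, 16, 19), a 128-bit register viewed as two 64-bit registers (claims 14, 17), and a write of several registers to one line of memory (claim 10). The sketch below is illustrative only. The helper names, the list-of-integers register file, the little-endian layout, and the choice to reject a register range that runs past the end of the file are all assumptions; the claims do not specify them.

```python
MASK64 = (1 << 64) - 1


def implied_registers(dest, count, file_size):
    """Indices of `count` sequential registers starting at the indicated one.

    Running past the end of the register file is rejected here; wrap-around
    would be an equally plausible assumption.
    """
    if dest < 0 or dest + count > file_size:
        raise ValueError("register range exceeds the register file")
    return list(range(dest, dest + count))


def as_two_64bit(value128):
    """View one 128-bit register value as (low, high) 64-bit halves."""
    return value128 & MASK64, (value128 >> 64) & MASK64


def store_line(registers, dest, count, n_bits):
    """Write `count` N-bit registers to one contiguous line of memory."""
    n_bytes = n_bits // 8
    indices = implied_registers(dest, count, len(registers))
    # Little-endian byte order is an assumption of this sketch.
    return b"".join(registers[i].to_bytes(n_bytes, "little") for i in indices)
```

With a 16-entry file of 128-bit registers, `store_line(regs, 4, 4, 128)` would produce a 64-byte line from registers 4 through 7, with only register 4 named by the caller.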

Assignees

Intel Corp

Inventors

Classifications

  • LOAD or STORE instructions; Clear instruction · CPC title

  • G11C7/1036 · Primary

    using data shift registers · CPC title

  • having multiple operands in a single register · CPC title

  • with implied specifier, e.g. top of stack · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9786338B2 cover?
A processor includes N-bit registers and a decode unit to receive a multiple register memory access instruction. The multiple register memory access instruction is to indicate a memory location and a register. The processor includes a memory access unit coupled with the decode unit and with the N-bit registers. The memory access unit is to perform a multiple register memory access operation in …
Who is the assignee on this patent?
Intel Corp
What technology area does this patent fall under?
Primary CPC classification G06F9/30043. Mapped technology areas include Physics.
When was this patent published?
Publication date Oct 10, 2017 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).