Processor and method for tracking progress of gathering/scattering data element pairs in different cache memory banks

US10387151B2 · US · B2

Patent metadata
Publication number: US-10387151-B2
Application number: US-201113250223-A
Country: US
Kind code: B2
Filing date: Sep 30, 2011
Priority date: Dec 31, 2007
Publication date: Aug 20, 2019
Grant date: Aug 20, 2019

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Methods and apparatus are disclosed for accessing multiple data cache lines for scatter/gather operations. Embodiment of apparatus may comprise address generation logic to generate an address from an index of a set of indices for each of a set of corresponding mask elements having a first value. Line or bank match ordering logic matches addresses in the same cache line or different banks, and orders an access sequence to permit a group of addresses in multiple cache lines and different banks. Address selection logic directs the group of addresses to corresponding different banks in a cache to access data elements in multiple cache lines corresponding to the group of addresses in a single access cycle. A disassembly/reassembly buffer orders the data elements according to their respective bank/register positions, and a gather/scatter finite state machine changes the values of corresponding mask elements from the first value to a second value.
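
The flow the abstract describes (generate addresses for unfinished mask elements, group the ones that do not collide in a bank, access the group in one cycle, then update the mask) can be modeled in software. The following is a minimal sketch, not the patented hardware: the 64-byte line, the 8 banks, the `scale` parameter, the dictionary standing in for the cache, and the encoding of the "first value" as 1 and the "second value" as 0 are all assumptions made for illustration. It also assumes that two addresses in the same bank can share a cycle only when they fall in the same cache line.

```python
from typing import Dict, List, Optional, Tuple

LINE_BYTES = 64                        # assumed cache line size
NUM_BANKS = 8                          # assumed number of banks per line
BANK_BYTES = LINE_BYTES // NUM_BANKS   # each bank holds an exclusive slice of a line


def bank_of(addr: int) -> int:
    """Bank that holds this byte address (slice of the line it falls in)."""
    return (addr % LINE_BYTES) // BANK_BYTES


def line_of(addr: int) -> int:
    """Cache line number of this byte address."""
    return addr // LINE_BYTES


def generate_addresses(base: int, indices: List[int], mask: List[int],
                       scale: int) -> List[Tuple[int, int]]:
    """Return (position, address) for every mask element still at 1."""
    return [(pos, base + idx * scale)
            for pos, (idx, m) in enumerate(zip(indices, mask)) if m == 1]


def pick_group(pending: List[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """Greedily pick addresses that can share one access cycle.

    An address joins the group if its bank is unused so far, or if the
    bank is already being read for the same cache line.
    """
    line_in_bank: Dict[int, int] = {}
    group = []
    for pos, addr in pending:
        bank, line = bank_of(addr), line_of(addr)
        if line_in_bank.setdefault(bank, line) == line:
            group.append((pos, addr))
    return group


def gather(memory: Dict[int, int], base: int, indices: List[int],
           mask: List[int], scale: int = 4):
    """Model a masked gather; returns (destination, final mask, cycles)."""
    mask = list(mask)
    dest: List[Optional[int]] = [None] * len(indices)
    cycles = 0
    while True:
        pending = generate_addresses(base, indices, mask, scale)
        if not pending:
            break
        for pos, addr in pick_group(pending):
            dest[pos] = memory[addr]   # element lands in its register position
            mask[pos] = 0              # mark as done (assumed second value)
        cycles += 1
    return dest, mask, cycles
```

In this model, indices whose addresses map to the same bank in different lines are serialized across cycles, while addresses in different banks complete together. The greedy grouping is one simple policy chosen for the sketch; the source does not specify how the ordering logic picks its groups.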

First claim

Opening claim text (preview).

What is claimed is:

1. A processor comprising: a cache memory having a plurality of banks to store data in mutually exclusive portions of a cache line; a first register comprising a plurality of data fields, wherein the plurality of data fields in the first register corresponds to a plurality of data elements accessible using a plurality of corresponding indices in a second register, wherein for each data field in the first register, a first value indicates the corresponding data element has not been accessed and a second value indicates that the corresponding data element does not need to be, or has already been, accessed using a corresponding index from the second register; a decode stage to decode a first instruction; and one or more execution units, responsive to the decoded first instruction, to: read the values of each of the plurality of data fields in the first register; for two or more of the plurality of data fields in the first register having the first value, determine a first pair of corresponding data elements stored in different banks of the cache memory, and simultaneously access the first pair of corresponding data elements in said different banks using their corresponding indices; and change the values of a pair of data fields in the first register corresponding to said first pair of corresponding data elements from the first value to the second value.

2. The processor of claim 1 wherein said simultaneously accessing the first pair of corresponding data elements means gathering the first pair of corresponding data elements from said different banks in a single cache access.

3. The processor of claim 1 wherein said simultaneously accessing the first pair of corresponding data elements means scattering the first pair of corresponding data elements to said different banks in a single cache access.

4. A processor comprising: a cache memory having a plurality of banks to store data in mutually exclusive portions of a cache line; a first register comprising data fields, wherein each data field in the first register corresponds to a data element to be written into a second register, wherein for each data field in the first register, a first value is to indicate the corresponding data element has not been written into the second register and a second value is to indicate that the corresponding data element does not need to be, or has already been, written into the second register; a decode stage to decode a first instruction; and one or more execution units, responsive to the decoded first instruction, to: read the values of each of the data fields in the first register; for a plurality of data fields in the first register having the first value, determine a first pair of corresponding data elements stored in different banks of the cache memory, and access said different banks using a second pair of addresses, corresponding to said first pair of corresponding data elements, to gather the first pair of corresponding data elements and write the first pair of corresponding data elements into the second register; and change the values of a third pair of data fields in the first register, corresponding to said first pair of corresponding data elements, from the first value to the second value.

5. The processor of claim 4 further comprising: a disassembly/reassembly buffer, coupled with the cache memory and with the second register, to order the first pair of corresponding data elements according to the respective positions of the third pair of data fields in the first register to be merged into the second register.

6. The processor of claim 4 further comprising: line or bank match ordering circuitry to match the second pair of addresses corresponding to different banks to determine the first pair of corresponding data elements.

7. A method comprising: decoding a first instruction; and executing the decoded first instruction, to: read values of each of a plurality of data fields in a first register, wherein the plurality of data fields in the first register corresponds to a plurality of data elements accessible using a plurality of corresponding indices in a second register, wherein for each data field in the first register, a first value indicates the corresponding data element has not been accessed and a second value indicates that the corresponding data element does not need to be, or has already been, accessed using a corresponding index from the second register, for two or more of the plurality of data fields in the first register having the first value, determine a first pair of corresponding data elements stored in different banks of a cache memory having a plurality of banks to store data in mutually exclusive portions of a cache line, and simultaneously access the first pair of corresponding data elements in said different banks using their corresponding indices; and change the values of a pair of data fields in the first register corresponding to said first pair of corresponding data elements from the first value to the second value.

8. The method of claim 7 wherein said simultaneously accessing the first pair of corresponding data elements means gathering the first pair of corresponding data elements from said different banks in a single cache access.

9. The method of claim 7 wherein said simultaneously accessing the first pair of corresponding data elements means scattering the first pair of corresponding data elements to said different banks in a single cache access.

10. A method comprising: decoding a first instruction; and executing the decoded first instruction, to: read values of each data field in a first register, wherein each data field in the first register corresponds to a data element to be written into a second register, wherein for each data field in the first register, a first value indicates the corresponding data element has not been written into the second register and a second value indicates that the corresponding data element does not need to be, or has already been, written into the second register, for a plurality of data fields in the first register having the first value, determine a first pair of corresponding data elements stored in different banks of a cache memory having a plurality of banks to store data in mutually exclusive portions of a cache line, and access said different banks using a second pair of addresses, corresponding to said first pair of corresponding data elements, to gather the first pair of corresponding data elements and write the first pair of corresponding data element into the second register; and change the values of a third pair of data fields in the first register, corresponding to said first pair of corresponding data elements, from the first value to the second value.

11. The method of claim 10 further comprising: ordering the first pair of corresponding data elements according to the respective positions of the third pair of data fields in the first register to be merged into the second register.

12. The method of claim 10 further comprising: matching the second pair of addresses corresponding to different banks to determine the first pair of corresponding data elements.
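
The mask bookkeeping recited in the independent claims can be illustrated with a small software model. This is a hedged sketch only: the constants `NOT_DONE` and `DONE`, the bank geometry, the `step` function, and the fallback to a single element when no different-bank pair exists are all assumptions introduced here, not language from the claims. Each call to `step` models one cache access that finds a pair of pending elements in different banks, writes them into their positions in the destination, and flips their mask fields from the first value to the second.

```python
from itertools import combinations
from typing import Dict, List, Optional

NOT_DONE, DONE = 1, 0   # assumed encodings of the "first value" and "second value"
NUM_BANKS = 4           # assumed bank count
BANK_BYTES = 16         # assumed bytes per bank (64-byte line)


def bank_of(addr: int) -> int:
    """Bank that holds this byte address."""
    return (addr // BANK_BYTES) % NUM_BANKS


def step(cache: Dict[int, str], addrs: List[int], mask: List[int],
         dest: List[Optional[str]]) -> bool:
    """Model one access; return False once no mask field is NOT_DONE."""
    pending = [pos for pos, m in enumerate(mask) if m == NOT_DONE]
    if not pending:
        return False
    # Look for the first pair of pending elements stored in different banks.
    # If none exists, fall back to one element (assumption, not claimed).
    chosen = next(
        ((a, b) for a, b in combinations(pending, 2)
         if bank_of(addrs[a]) != bank_of(addrs[b])),
        (pending[0],),
    )
    for pos in chosen:
        dest[pos] = cache[addrs[pos]]   # merged in by data field position
        mask[pos] = DONE                # progress is recorded in the mask
    return True


# Usage: because progress lives in the mask, the loop could stop after any
# step and resume later without repeating completed accesses.
addrs = [0, 64, 16, 80]
cache = {a: f"elem@{a}" for a in addrs}
mask = [NOT_DONE] * 4
dest: List[Optional[str]] = [None] * 4
while step(cache, addrs, mask, dest):
    pass
```

Writing each element into `dest[pos]` plays the role the dependent claims assign to ordering by data field position: elements arrive in bank-pairing order but end up in register order.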

Classifications

  • of multiple operands or results {(addressing multiple banks G06F12/06)}

  • LOAD or STORE instructions; Clear instruction

  • in hierarchically structured memory systems, e.g. virtual memory systems

  • Special purpose registers

  • Addressing a physical block of locations, e.g. base addressing, module addressing, memory dedication (G06F12/08 takes precedence)

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10387151B2 cover?
Methods and apparatus are disclosed for accessing multiple data cache lines for scatter/gather operations. Embodiment of apparatus may comprise address generation logic to generate an address from an index of a set of indices for each of a set of corresponding mask elements having a first value. Line or bank match ordering logic matches addresses in the same cache line or different banks, and o…
Who is the assignee on this patent?
Hall Jonathan C, Kottapalli Sailesh, Forsyth Andrew T, and 1 more
What technology area does this patent fall under?
Primary CPC classification G06F9/30043. Mapped technology areas include Physics.
When was this patent published?
Publication date Aug 20, 2019 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).