Apparatus and method for transferring a plurality of data structures between memory and a plurality of vector registers

US9875214B2 · US · B2

Patent metadata
Field               Value
Publication number  US-9875214-B2
Application number  US-201514814590-A
Country             US
Kind code           B2
Filing date         Jul 31, 2015
Priority date       Jul 31, 2015
Publication date    Jan 23, 2018
Grant date          Jan 23, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An apparatus and method are provided for transferring a plurality of data structures between memory and a plurality of vector registers, each vector register being arranged to store a vector operand comprising a plurality of data elements. Access circuitry is used to perform access operations to move data elements of vector operands between the data structures in memory and specified vector registers, each data structure comprising multiple data elements stored at contiguous addresses in the memory. Decode circuitry is responsive to a single access instruction identifying a plurality of vector registers and a plurality of data structures that are located discontiguously with respect to each other in the memory, to generate control signals to control the access circuitry to perform a sequence of access operations to move the plurality of data structures between the memory and the plurality of vector registers such that the vector operand in each vector register holds a corresponding data element from each of the plurality of data structures. This provides a very efficient mechanism for performing complex access operations, resulting in an increase in execution speed, and potential reductions in power consumption.
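
The data movement summarized in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented circuitry: memory is modeled as a flat Python list, each vector register as a list with one slot per lane, and the names `load_structures` and `store_structures` are hypothetical helpers introduced here for illustration.

```python
from typing import List, Sequence


def load_structures(memory: Sequence[int],
                    struct_addrs: Sequence[int],
                    elems_per_struct: int) -> List[List[int]]:
    """Model of the load form of the single access instruction.

    Each structure occupies `elems_per_struct` contiguous addresses, but the
    structures themselves may start anywhere. Register j ends up holding
    element j of every structure, one structure per lane.
    """
    registers = [[0] * len(struct_addrs) for _ in range(elems_per_struct)]
    for lane, addr in enumerate(struct_addrs):
        for j in range(elems_per_struct):
            registers[j][lane] = memory[addr + j]
    return registers


def store_structures(memory: List[int],
                     struct_addrs: Sequence[int],
                     registers: Sequence[Sequence[int]]) -> None:
    """Model of the store form: the inverse rearrangement back to memory."""
    for lane, addr in enumerate(struct_addrs):
        for j, reg in enumerate(registers):
            memory[addr + j] = reg[lane]


# Four 3-element structures at discontiguous start addresses (assumed values).
memory = list(range(100, 132))   # memory[i] == 100 + i
addrs = [0, 9, 20, 27]
regs = load_structures(memory, addrs, 3)
```

In this model `regs[0]` collects the first element of every structure, `regs[1]` the second, and so on, which is the "corresponding data element from each of the plurality of data structures" arrangement the abstract describes.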

First claim

Opening claim text (preview).

We claim:

1. An apparatus comprising:
    a set of vector registers, each vector register arranged to store a vector operand comprising a plurality of data elements;
    access circuitry to perform access operations to move data elements of vector operands between data structures in memory and said set of vector registers, each data structure comprising multiple data elements stored at contiguous addresses in said memory;
    decode circuitry, responsive to a single access instruction identifying a plurality of vector registers from said set and a plurality of data structures that are located discontiguously with respect to each other in said memory, to generate control signals to control the access circuitry to perform a sequence of said access operations to move said plurality of data structures between said memory and said plurality of vector registers such that the vector operand in each vector register of said plurality holds a corresponding data element from each of said plurality of data structures.

2. An apparatus as claimed in claim 1, wherein the multiple data elements of one or more of the plurality of data structures are rearranged as they are moved between said memory and said plurality of vector registers.

3. An apparatus as claimed in claim 1, wherein said single access instruction is a load instruction, and the access circuitry is responsive to the control signals to perform said sequence of access operations in order to obtain the data elements of each identified data structure from said memory and to write into each identified vector register a vector operand comprising a corresponding data element from each of said plurality of data structures.

4. An apparatus as claimed in claim 3, wherein said sequence of access operations comprises a sequence of gather operations, each gather operation obtaining a corresponding data element from each of said plurality of data structures and writing the obtained data elements into a vector register associated with that gather operation.

5. An apparatus as claimed in claim 1, wherein said single access instruction is a store instruction, and the access circuitry is responsive to the control signals to perform said sequence of access operations in order to read from each identified vector register a vector operand comprising a corresponding data element from each of said plurality of data structures, and to rearrange the data elements as they are written to said memory so as to store each data structure at an address in said memory corresponding to its discontiguous location whilst ensuring that the data elements of each individual data structure are stored at contiguous addresses in said memory.

6. An apparatus as claimed in claim 5, wherein said sequence of access operations comprises a sequence of scatter operations, each scatter operation obtaining from a vector register associated with that scatter operation a vector operand comprising a corresponding data element from each of said plurality of data structures, and writing the data elements of that vector operand to addresses in said memory determined from the addresses of said plurality of data structures.

7. An apparatus as claimed in claim 1, wherein said single access instruction includes a data structure identifier field providing information used to determine the addresses of said plurality of data structures.

8. An apparatus as claimed in claim 7, further comprising:
    a set of scalar registers to store scalar data values;
    wherein said data structure identifier field comprises a scalar register identifier field identifying a scalar register from said set whose stored scalar data value is used to determine a base address in said memory, and a stride identifier field containing stride information used to derive the addresses of said plurality of data structures from said base address.

9. An apparatus as claimed in claim 8, wherein said stride information identifies a constant stride value.

10. An apparatus as claimed in claim 9, wherein said stride identifier field includes one of an immediate value and a scalar register identifier in order to identify said constant stride value.

11. An apparatus as claimed in claim 8, wherein said stride information identifies a series of stride values, each stride value being associated with at least one of said plurality of data structures.

12. An apparatus as claimed in claim 11, wherein said stride identifier field identifies a vector register within said set, and each data element in said vector register identifies a stride value to be used to determine from the base address the address of an associated one of said data structures.

13. An apparatus as claimed in claim 7, wherein said data structure identifier field identifies a vector register within said set, and each data element in said vector register provides pointer data used to determine the address of an associated one of said data structures.

14. An apparatus as claimed in claim 1, wherein said single access instruction includes a vector register identifier field providing information used to determine said plurality of vector registers to be accessed.

15. An apparatus as claimed in claim 14, wherein said vector register identifier field comprise a vector register identifier used to identify one vector register in said set and an integer value used to identify the number of vector registers in said plurality of vector registers to be accessed, the decode circuitry being arranged to apply a predetermined rule in order to determine each vector register in said plurality from the identified one vector register and said integer.

16. An apparatus as claimed in claim 15, wherein the decode circuitry is arranged to determine, as said plurality of vector registers to be accessed, a consecutive plurality of vector registers including the identified one vector register.

17. An apparatus as claimed in claim 1, wherein:
    the access circuitry operates on a plurality of lanes, each lane incorporating a corresponding data element position from each of said plurality of vector registers;
    said single access instruction includes a predicate identifier field providing predicate information used to determine which of said plurality of lanes are active lanes for the sequence of access operations; and
    the access circuitry being arranged to determine, as said plurality of data structures to be moved, those data structures associated with the active lanes.

18. An apparatus as claimed in claim 17, wherein the single access instruction is a load instruction, and the access circuitry is arranged to perform a compaction operation using said predicate information, the predicate information being used to identify the plurality of data structures to be loaded, and the access circuitry being arranged to store those data structures within a series of consecutive lanes within the plurality of vector registers.

19. An apparatus as claimed in claim 7, wherein the single access instruction includes an offset identifier field providing offset data to be applied in combination with the information in the data structure identifier field when determining the addresses of said plurality of data structures.

20. An apparatus as claimed in claim 1, wherein the access circuitry comprises a load/store unit and an associated buffer storage to allow data elements to be temporarily buffered during performance of said sequence of access operations.

21. An apparatus as claimed in claim 1, wherein the access circuitry comprises a load/store…
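
The addressing options in claims 8 to 13 and the predication and compaction behavior in claims 17 and 18 can be sketched in software as follows. This is an illustrative model only. The function names are hypothetical, the stride vector is assumed to hold one per-structure distance from the base address (the claims do not fix that interpretation), and inactive or unused lanes are assumed to read as zero, which the claim text does not specify.

```python
from typing import List, Sequence


def addrs_constant_stride(base: int, stride: int, count: int) -> List[int]:
    """Claims 8 to 10: base address from a scalar register plus one constant stride."""
    return [base + i * stride for i in range(count)]


def addrs_stride_vector(base: int, strides: Sequence[int]) -> List[int]:
    """Claims 11 and 12: one stride per structure (assumed relative to the base)."""
    return [base + s for s in strides]


def addrs_pointer_vector(pointers: Sequence[int]) -> List[int]:
    """Claim 13: each vector element supplies the structure's address directly."""
    return list(pointers)


def predicated_load(memory: Sequence[int],
                    struct_addrs: Sequence[int],
                    predicate: Sequence[bool],
                    elems_per_struct: int,
                    compact: bool = False) -> List[List[int]]:
    """Claims 17 and 18: move only the structures whose lanes are active.

    With compact=True the loaded structures are packed into consecutive
    lanes starting at lane 0; otherwise each stays in its own lane.
    """
    lanes = len(struct_addrs)
    registers = [[0] * lanes for _ in range(elems_per_struct)]
    next_free = 0
    for lane in range(lanes):
        if not predicate[lane]:
            continue  # inactive lane: its structure is never read
        target = next_free if compact else lane
        for j in range(elems_per_struct):
            registers[j][target] = memory[struct_addrs[lane] + j]
        next_free += 1
    return registers


memory = list(range(100, 140))                 # memory[i] == 100 + i
addrs = addrs_constant_stride(2, 8, 4)         # [2, 10, 18, 26]
pred = [True, False, True, True]
plain = predicated_load(memory, addrs, pred, 2)
packed = predicated_load(memory, addrs, pred, 2, compact=True)
```

Here `plain` leaves lane 1 untouched, while `packed` shifts the three active structures into lanes 0 to 2, mirroring the "series of consecutive lanes" wording of claim 18.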

Assignees

Advanced Risc Mach Ltd, Apple Inc

Inventors

Classifications

  • LOAD or STORE instructions; Clear instruction · CPC title

  • using stride · CPC title

  • having multiple operands in a single register · CPC title

  • to perform operations on memory · CPC title

  • of multiple operands or results {(addressing multiple banks G06F12/06)} · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9875214B2 cover?
An apparatus and method are provided for transferring a plurality of data structures between memory and a plurality of vector registers, each vector register being arranged to store a vector operand comprising a plurality of data elements. Access circuitry is used to perform access operations to move data elements of vector operands between the data structures in memory and specified vector reg…
Who is the assignee on this patent?
Advanced Risc Mach Ltd, Apple Inc
What technology area does this patent fall under?
Primary CPC classification G06F15/8076. Mapped technology areas include Physics.
When was this patent published?
Publication date Jan 23, 2018 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).