Multi-modal gather operation

US11842200B2 · US · B2

Patent metadata
Publication number: US-11842200-B2
Application number: US-201916586247-A
Country: US
Kind code: B2
Filing date: Sep 27, 2019
Priority date: Sep 27, 2019
Publication date: Dec 12, 2023
Grant date: Dec 12, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An apparatus includes a plurality of load buses and a load store unit that includes a plurality of load ports to access the plurality of load buses. The load store unit performs a gather operation to concurrently gather a plurality of subsets of data from a memory via the plurality of load buses in a first mode. The apparatus also includes a register that is partitioned into a plurality of portions to hold the plurality of subsets of data provided by the load store unit. The load store unit ignores exceptions or faults while performing the gather operation in the first mode and transitions to a second mode in response to an exception or fault. Two lanes are dispatched to concurrently perform the gather operation per clock cycle in the first mode and a single lane is dispatched to perform the gather operation per clock cycle in the second mode.
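The two modes in the abstract can be illustrated with a small software model. This is a minimal sketch, not the patented hardware: the names `dispatch_schedule` and `gather`, the dictionary used as memory, and the treatment of a missing address as a fault are all hypothetical. It also assumes that the second mode restarts from lane 0 and stops at the first fault, which the abstract does not specify.

```python
from typing import Dict, List, Optional, Tuple

FIRST_MODE_LANES_PER_CYCLE = 2   # two lanes dispatched per clock cycle
SECOND_MODE_LANES_PER_CYCLE = 1  # a single lane dispatched per clock cycle


def dispatch_schedule(num_lanes: int, lanes_per_cycle: int) -> List[Tuple[int, ...]]:
    """List the lanes dispatched in each clock cycle."""
    return [
        tuple(range(start, min(start + lanes_per_cycle, num_lanes)))
        for start in range(0, num_lanes, lanes_per_cycle)
    ]


def gather(
    memory: Dict[int, int], base: int, offsets: List[int]
) -> Tuple[List[Optional[int]], str, int]:
    """Return (register portions, mode that finished, cycles spent)."""
    num_lanes = len(offsets)
    register: List[Optional[int]] = [None] * num_lanes
    cycles = 0

    # First mode: a fault is not reported; it only triggers the fallback.
    fault_seen = False
    for lanes in dispatch_schedule(num_lanes, FIRST_MODE_LANES_PER_CYCLE):
        cycles += 1
        for lane in lanes:
            addr = base + offsets[lane]
            if addr in memory:
                register[lane] = memory[addr]
            else:
                fault_seen = True  # hypothetical fault: address not mapped
        if fault_seen:
            break
    if not fault_seen:
        return register, "first", cycles

    # Second mode (assumption: restart from lane 0, stop at the first fault).
    register = [None] * num_lanes
    for (lane,) in dispatch_schedule(num_lanes, SECOND_MODE_LANES_PER_CYCLE):
        cycles += 1
        addr = base + offsets[lane]
        if addr not in memory:
            break
        register[lane] = memory[addr]
    return register, "second", cycles
```

With eight lanes and no faults, the model finishes in four cycles in the first mode. If one address is missing, it falls back to the second mode and spends one cycle per lane from that point on.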

First claim

Opening claim text (preview).

What is claimed is:

1. An apparatus comprising: a plurality of load buses; a load store unit comprising a plurality of load ports to access the plurality of load buses, wherein the load store unit is configured to perform a gather operation to concurrently gather, based on a memory address, a plurality of subsets of data from a memory via the plurality of load buses in a first mode, wherein the gather operation iteratively loads data via each of the plurality of load buses in the first mode until the gather operation is complete in response to no exceptions or faults being generated during the gather operation; and a register that is partitioned into a plurality of portions to hold the plurality of subsets of data provided by the load store unit.

2. The apparatus of claim 1, wherein the load store unit comprises a plurality of lanes that is partitioned into lane subsets that are configured to concurrently execute the gather operation to gather the plurality of subsets of data.

3. The apparatus of claim 2, wherein the load store unit includes two load ports and two load buses, and wherein the plurality of lanes is partitioned into even lane subsets and odd lane subsets.

4. The apparatus of claim 3, wherein gather operations are dispatched to two lanes to concurrently perform the gather operation per clock cycle in the first mode.

5. The apparatus of claim 1, wherein the load store unit is configured to ignore exceptions or faults while performing the gather operation in the first mode.

6. The apparatus of claim 5, wherein the load store unit is configured to transition from the first mode to a second mode in response to an exception or fault occurring while performing the gather operation.

7. The apparatus of claim 6, wherein the load store unit is configured to perform the gather operation in order by a plurality of lanes based on a mask that indicates lanes that successfully gathered data and stored the successfully gathered data in the register in a previous iteration of the second mode.

8. The apparatus of claim 7, wherein the gather operation is dispatched to a single lane to perform the gather operation per clock cycle in the second mode.

9. A method comprising: concurrently gathering, at a load store unit in a floating-point unit (FPU) that is operating in a first mode, a plurality of subsets of data from a memory via a plurality of load buses implemented in the FPU, wherein gathering comprises performing a gather operation for iteratively loading data via each of the plurality of load buses based on a memory address until the gather operation is complete in response to no exceptions or faults being generated during the gathering; and storing the plurality of subsets of data in a register that is partitioned into a plurality of portions to hold the plurality of subsets of data provided by the load store unit.

10. The method of claim 9, wherein concurrently gathering the plurality of subsets of the data comprises concurrently executing the gather operation on a plurality of lanes that is partitioned into lane subsets that are configured to concurrently execute the gather operation to gather the plurality of subsets of data.

11. The method of claim 10, wherein the load store unit includes two load ports and two load buses, and wherein the plurality of lanes is partitioned into even lane subsets and odd lane subsets.

12. The method of claim 11, further comprising: dispatching gather operations to two lanes to concurrently perform the gather operation per clock cycle.

13. The method of claim 9, further comprising: ignoring exceptions or faults while performing the gather operation in the first mode.

14. The method of claim 13, further comprising: transitioning the load store unit from the first mode to a second mode in response to an exception or fault occurring while performing the gather operation.

15. The method of claim 14, further comprising: performing the gather operation in order by a plurality of lanes based on a mask that indicates lanes that successfully gathered data and stored the successfully gathered data in the register in a previous iteration of the second mode.

16. The method of claim 15, further comprising: dispatching the gather operation to a single lane to perform the gather operation per clock cycle.

17. An apparatus comprising: a plurality of load buses; a load store unit configured to perform a gather operation selectively in a first mode or a second mode, wherein the load store unit is configured to concurrently gather, based on a memory address, a plurality of subsets of data from a memory via the plurality of load buses in the first mode, wherein the gather operation iteratively loads data via each of the plurality of load buses until the gather operation is complete in response to no exceptions or faults being generated during the gather operation, and wherein the load store unit is configured to gather the plurality of subsets of data from the memory using partial updating in the second mode; and a register configured to store the plurality of subsets of data provided by the load store unit.

18. The apparatus of claim 17, wherein the load store unit ignores exceptions or faults while performing the gather operation in the first mode.

19. The apparatus of claim 18, wherein the load store unit transitions from the first mode to the second mode in response to an exception or fault occurring while performing the gather operation in the first mode.

20. The apparatus of claim 19, wherein two lanes are dispatched to concurrently perform the gather operation per clock cycle in the first mode and a single lane is dispatched to perform the gather operation per clock cycle in the second mode.
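Claims 7, 15, and 17 describe an in-order second mode that uses a mask of already completed lanes together with partial updating of the register. The following is a minimal sketch of one way such a pass could behave, offered as an illustration only. The function name `second_mode_pass`, the boolean list `done` standing in for the mask, and the dictionary standing in for memory are hypothetical, and the sketch assumes that a pass stops at the first faulting lane so that a handler can resolve the fault before the next pass.

```python
from typing import Dict, List, Optional


def second_mode_pass(
    memory: Dict[int, int],
    base: int,
    offsets: List[int],
    register: List[Optional[int]],
    done: List[bool],
) -> Optional[int]:
    """Run one in-order pass; return the faulting lane, or None when complete.

    `done` plays the role of the mask: a True entry marks a lane whose data
    was gathered and stored in `register` during a previous pass.
    """
    for lane, offset in enumerate(offsets):
        if done[lane]:
            continue  # already gathered in an earlier pass, skip it
        addr = base + offset
        if addr not in memory:
            return lane  # hypothetical fault; earlier lanes keep their data
        register[lane] = memory[addr]  # partial update of the register
        done[lane] = True
    return None


# Usage: lane 2 faults on the first pass and succeeds on the second.
memory = {100: 10, 101: 11, 103: 13}
offsets = [0, 1, 2, 3]
register: List[Optional[int]] = [None] * 4
done = [False] * 4

first_fault = second_mode_pass(memory, 100, offsets, register, done)
memory[102] = 12  # hypothetical: the fault handler makes the address valid
second_fault = second_mode_pass(memory, 100, offsets, register, done)
```

Because the mask records which lanes finished, the second pass repeats no completed load and resumes at the lane that faulted.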

Assignees

Advanced Micro Devices Inc

Classifications

  • using a mask

  • Instructions to perform operations on packed data, e.g. vector, tile or matrix operations

  • G06F9/3887 (Primary): controlled by a single instruction for multiple data lanes [SIMD]

  • according to data content, e.g. floating-point registers, address registers

  • Bit or string instructions

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11842200B2 cover?
An apparatus includes a plurality of load buses and a load store unit that includes a plurality of load ports to access the plurality of load buses. The load store unit performs a gather operation to concurrently gather a plurality of subsets of data from a memory via the plurality of load buses in a first mode. The apparatus also includes a register that is partitioned into a plurality of port…
Who is the assignee on this patent?
Advanced Micro Devices Inc
What technology area does this patent fall under?
Primary CPC classification G06F9/30036. Mapped technology areas include Physics.
When was this patent published?
Publication date Dec 12, 2023 (B2). Legal status and post-grant events are not shown on this page.