Error detection using vector processing circuitry
US-2019340054-A1 · Nov 7, 2019 · US
US11842200B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11842200-B2 |
| Application number | US-201916586247-A |
| Country | US |
| Kind code | B2 |
| Filing date | Sep 27, 2019 |
| Priority date | Sep 27, 2019 |
| Publication date | Dec 12, 2023 |
| Grant date | Dec 12, 2023 |
Official abstract text for this publication.
An apparatus includes a plurality of load buses and a load store unit that includes a plurality of load ports to access the plurality of load buses. The load store unit performs a gather operation to concurrently gather a plurality of subsets of data from a memory via the plurality of load buses in a first mode. The apparatus also includes a register that is partitioned into a plurality of portions to hold the plurality of subsets of data provided by the load store unit. The load store unit ignores exceptions or faults while performing the gather operation in the first mode and transitions to a second mode in response to an exception or fault. Two lanes are dispatched to concurrently perform the gather operation per clock cycle in the first mode and a single lane is dispatched to perform the gather operation per clock cycle in the second mode.
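The two modes in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented hardware: the `Fault` class, the `load` helper, the dict-based memory, and the choice to restart from lane 0 in the second mode are all hypothetical and introduced for illustration. The model counts one cycle per dispatch group, so the first mode advances two lanes per cycle and the second mode one lane per cycle.

```python
class Fault(Exception):
    """Hypothetical stand-in for a memory exception or fault."""


def load(memory, addr):
    # Hypothetical memory model: a dict in which unmapped addresses fault.
    if addr not in memory:
        raise Fault(addr)
    return memory[addr]


def gather(memory, addresses):
    """Return (register, cycles, mode, faulting_lane)."""
    n = len(addresses)

    # First mode: dispatch two lanes per cycle and do not report faults.
    register = [None] * n
    cycles = 0
    fault_seen = False
    for first_lane in range(0, n, 2):
        cycles += 1
        for lane in range(first_lane, min(first_lane + 2, n)):
            try:
                register[lane] = load(memory, addresses[lane])
            except Fault:
                fault_seen = True  # not reported; it only triggers the mode switch
        if fault_seen:
            break
    if not fault_seen:
        return register, cycles, "first", None

    # Second mode (assumed to restart from lane 0): one lane per cycle, in order.
    register = [None] * n
    for lane in range(n):
        cycles += 1
        try:
            register[lane] = load(memory, addresses[lane])
        except Fault:
            return register, cycles, "second", lane  # fault surfaces at this lane
    return register, cycles, "second", None
```

With four mapped addresses the model finishes in the first mode after two cycles. If one address is unmapped, it falls back to the second mode and returns the index of the lane that faulted, along with the lanes gathered before it.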
Opening claim text (preview).
What is claimed is:
1. An apparatus comprising: a plurality of load buses; a load store unit comprising a plurality of load ports to access the plurality of load buses, wherein the load store unit is configured to perform a gather operation to concurrently gather, based on a memory address, a plurality of subsets of data from a memory via the plurality of load buses in a first mode, wherein the gather operation iteratively loads data via each of the plurality of load buses in the first mode until the gather operation is complete in response to no exceptions or faults being generated during the gather operation; and a register that is partitioned into a plurality of portions to hold the plurality of subsets of data provided by the load store unit.
2. The apparatus of claim 1, wherein the load store unit comprises a plurality of lanes that is partitioned into lane subsets that are configured to concurrently execute the gather operation to gather the plurality of subsets of data.
3. The apparatus of claim 2, wherein the load store unit includes two load ports and two load buses, and wherein the plurality of lanes is partitioned into even lane subsets and odd lane subsets.
4. The apparatus of claim 3, wherein gather operations are dispatched to two lanes to concurrently perform the gather operation per clock cycle in the first mode.
5. The apparatus of claim 1, wherein the load store unit is configured to ignore exceptions or faults while performing the gather operation in the first mode.
6. The apparatus of claim 5, wherein the load store unit is configured to transition from the first mode to a second mode in response to an exception or fault occurring while performing the gather operation.
7. The apparatus of claim 6, wherein the load store unit is configured to perform the gather operation in order by a plurality of lanes based on a mask that indicates lanes that successfully gathered data and stored the successfully gathered data in the register in a previous iteration of the second mode.
8. The apparatus of claim 7, wherein the gather operation is dispatched to a single lane to perform the gather operation per clock cycle in the second mode.
9. A method comprising: concurrently gathering, at a load store unit in a floating-point unit (FPU) that is operating in a first mode, a plurality of subsets of data from a memory via a plurality of load buses implemented in the FPU, wherein gathering comprises performing a gather operation for iteratively loading data via each of the plurality of load buses based on a memory address until the gather operation is complete in response to no exceptions or faults being generated during the gathering; and storing the plurality of subsets of data in a register that is partitioned into a plurality of portions to hold the plurality of subsets of data provided by the load store unit.
10. The method of claim 9, wherein concurrently gathering the plurality of subsets of the data comprises concurrently executing the gather operation on a plurality of lanes that is partitioned into lane subsets that are configured to concurrently execute the gather operation to gather the plurality of subsets of data.
11. The method of claim 10, wherein the load store unit includes two load ports and two load buses, and wherein the plurality of lanes is partitioned into even lane subsets and odd lane subsets.
12. The method of claim 11, further comprising: dispatching gather operations to two lanes to concurrently perform the gather operation per clock cycle.
13. The method of claim 9, further comprising: ignoring exceptions or faults while performing the gather operation in the first mode.
14. The method of claim 13, further comprising: transitioning the load store unit from the first mode to a second mode in response to an exception or fault occurring while performing the gather operation.
15. The method of claim 14, further comprising: performing the gather operation in order by a plurality of lanes based on a mask that indicates lanes that successfully gathered data and stored the successfully gathered data in the register in a previous iteration of the second mode.
16. The method of claim 15, further comprising: dispatching the gather operation to a single lane to perform the gather operation per clock cycle.
17. An apparatus comprising: a plurality of load buses; a load store unit configured to perform a gather operation selectively in a first mode or a second mode, wherein the load store unit is configured to concurrently gather, based on a memory address, a plurality of subsets of data from a memory via the plurality of load buses in the first mode, wherein the gather operation iteratively loads data via each of the plurality of load buses until the gather operation is complete in response to no exceptions or faults being generated during the gather operation, and wherein the load store unit is configured to gather the plurality of subsets of data from the memory using partial updating in the second mode; and a register configured to store the plurality of subsets of data provided by the load store unit.
18. The apparatus of claim 17, wherein the load store unit ignores exceptions or faults while performing the gather operation in the first mode.
19. The apparatus of claim 18, wherein the load store unit transitions from the first mode to the second mode in response to an exception or fault occurring while performing the gather operation in the first mode.
20. The apparatus of claim 19, wherein two lanes are dispatched to concurrently perform the gather operation per clock cycle in the first mode and a single lane is dispatched to perform the gather operation per clock cycle in the second mode.
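Claims 7, 15, and 17 refer to a mask of lanes that already gathered their data and to partial updating of the register. One way to picture this, as a hedged sketch only, is a pass that walks the lanes in order, skips lanes the mask marks as done, and stops at the first fault so that a later pass can resume. The function name `second_mode_pass`, the dict-based memory, and the idea that a handler maps the missing address between passes are assumptions made for illustration and are not taken from the claims.

```python
def second_mode_pass(memory, addresses, register, done_mask):
    """Run one in-order pass; return the faulting lane, or None when complete."""
    for lane, addr in enumerate(addresses):
        if done_mask[lane]:
            continue  # gathered and stored in a previous iteration
        if addr not in memory:  # hypothetical fault condition: unmapped address
            return lane
        register[lane] = memory[addr]  # partial update of the destination register
        done_mask[lane] = True
    return None


# Usage: lane 2 faults on the first pass, is repaired, then the gather resumes.
memory = {0: 10, 8: 20, 24: 40}
addresses = [0, 8, 16, 24]
register = [0, 0, 0, 0]
done_mask = [False, False, False, False]

first = second_mode_pass(memory, addresses, register, done_mask)
memory[16] = 30  # assumed: a fault handler makes the address accessible
second = second_mode_pass(memory, addresses, register, done_mask)
```

Because the mask records completed lanes, the second pass does not reload lanes 0 and 1; it continues from the lane that faulted.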
using a mask · CPC title
Instructions to perform operations on packed data, e.g. vector, tile or matrix operations · CPC title
controlled by a single instruction for multiple data lanes [SIMD] · CPC title
according to data content, e.g. floating-point registers, address registers · CPC title
Bit or string instructions · CPC title