Chained split execution of fused compound arithmetic operations
US-2017097824-A1 · Apr 6, 2017 · US
US10782933B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10782933-B2 |
| Application number | US-202016779073-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jan 31, 2020 |
| Priority date | Apr 28, 2019 |
| Publication date | Sep 22, 2020 |
| Grant date | Sep 22, 2020 |
Official abstract text for this publication.
Implementations of this specification provide a method and apparatus for computer data processing for large number operations. An example method performed by a computing device includes splitting a multiplier and a multiplicand into respective four 64-bit numbers from most significant bits to least significant bits; reading the split multipliers and the split multiplicands into a register; and obtaining a multiplication processing result for the multiplier and the multiplicand by performing operations including: classifying the split multipliers and the split multiplicands into groups of data pairs, calculating multiplication results of the groups of data pairs one by one, performing accumulation on multiplication results of data pairs in each group, and storing an accumulation result corresponding to the data pairs in memory as the multiplication processing result for the multiplier and the multiplicand.
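The splitting and grouping steps in the abstract can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the names `split_limbs` and `group_pairs` are hypothetical and do not appear in the patent, Python integers stand in for registers, and limb index 0 is taken to be the least significant 64 bits.

```python
MASK64 = (1 << 64) - 1  # keeps the low 64 bits of a value


def split_limbs(x):
    """Split an unsigned 256-bit integer into four 64-bit limbs.

    Index 0 holds the least significant limb, index 3 the most significant.
    (Hypothetical helper; the name is not taken from the patent.)
    """
    if not 0 <= x < (1 << 256):
        raise ValueError("expected an unsigned 256-bit integer")
    return [(x >> (64 * i)) & MASK64 for i in range(4)]


def group_pairs():
    """Return seven groups of index pairs (i, j) such that i + j == k.

    Group k collects every product a[i] * b[j] that contributes to the
    same 64-bit column of the result, for k = 0 .. 6.
    """
    return [
        [(i, k - i) for i in range(3, -1, -1) if 0 <= k - i <= 3]
        for k in range(7)
    ]


if __name__ == "__main__":
    for number, group in enumerate(group_pairs(), start=1):
        print(number, ["a[%d]b[%d]" % pair for pair in group])
```

Grouping by `i + j` places products of equal weight together, which is why the group sizes run 1, 2, 3, 4, 3, 2, 1.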
Opening claim text (preview).
What is claimed is:
1. A computer-implemented method comprising: splitting, by a computing device, a multiplier into four 64-bit numbers from most significant bits to least significant bits to obtain split multipliers, the split multipliers comprising a[3], a[2], a[1], and a[0]; splitting, by the computing device, a multiplicand into four 64-bit numbers from most significant bits to least significant bits to obtain split multiplicands, the split multiplicands comprising b[3], b[2], b[1], and b[0]; reading, by the computing device, the split multipliers and the split multiplicands into a register; and obtaining, by the computing device, a multiplication processing result for the multiplier and the multiplicand by performing operations comprising: classifying the split multipliers and the split multiplicands into seven groups of data pairs, wherein a first group of data pairs comprises a[0]b[0], a second group of data pairs comprises a[1]b[0] and a[0]b[1], a third group of data pairs comprises a[2]b[0], a[1]b[1], and a[0]b[2], a fourth group of data pairs comprises a[3]b[0], a[2]b[1], a[1]b[2], and a[0]b[3], a fifth group of data pairs comprises a[3]b[1], a[2]b[2], and a[1]b[3], a sixth group of data pairs comprises a[3]b[2] and a[2]b[3], and a seventh group of data pairs comprises a[3]b[3]; calculating multiplication results of the first group of data pairs to the seventh group of data pairs one by one, and performing intra-group accumulation on multiplication results of data pairs in each group, the intra-group accumulation comprising: (i) in a same group of data pairs, accumulating a calculated multiplication result for each data pair with a multiplication result of a previous data pair, (ii) storing in memory, 64 least significant bits of a final accumulation result for data pairs in the same group of data pairs, (iii) obtaining a remaining accumulation result of the same group of data pairs, and (iv) releasing a corresponding register; and accumulating a multiplication result of a first data pair in each group of data pairs with the remaining accumulation result of a previous group of data pairs, accumulating an accumulation result with a multiplication result of a next data pair until accumulation of multiplication results of the data pairs in the seventh group of data pairs is completed, and storing the accumulation result corresponding to the data pairs in the seventh group of data pairs in memory as the multiplication processing result for the multiplier and the multiplicand.
2. The computer-implemented method according to claim 1, wherein the multiplier and the multiplicand are each 256-bit numbers.
3. The computer-implemented method according to claim 1, further comprising: releasing a register that stores the multiplication result of each data pair when accumulating the multiplication results of each group of data pairs, and storing the accumulation result in three registers.
4. The computer-implemented method according to claim 1, wherein the computing device uses a 64-bit computer operating system.
5. The computer-implemented method according to claim 4, further comprising: randomly selecting four registers from registers RBX, RBP, R12, R13, R14, and R15 of the 64-bit computer operating system, and storing values of the selected registers in memory; and obtaining the stored values of the selected registers from memory after the multiplication processing result for the multiplier and the multiplicand is obtained, and restoring the values of the selected registers.
6. The computer-implemented method according to claim 5, further comprising: selecting registers RAX, RCX, RDX, RSI, RDI, R8, R9, R10, and R11 from the 64-bit computer operating system, wherein the selected registers are all used for data storage in a data processing procedure.
7. A computer-implemented system, comprising: one or more computers; and one or more computer memory devices interoperably coupled with the one or more computers and having tangible, non-transitory, machine-readable media storing one or more instructions that, when executed by the one or more computers, perform operations comprising: splitting, by a computing device, a multiplier into four 64-bit numbers from most significant bits to least significant bits to obtain split multipliers, the split multipliers comprising a[3], a[2], a[1], and a[0]; splitting, by the computing device, a multiplicand into four 64-bit numbers from most significant bits to least significant bits to obtain split multiplicands, the split multiplicands comprising b[3], b[2], b[1], and b[0]; reading, by the computing device, the split multipliers and the split multiplicands into a register; and obtaining, by the computing device, a multiplication processing result for the multiplier and the multiplicand by performing operations comprising: classifying the split multipliers and the split multiplicands into seven groups of data pairs, wherein a first group of data pairs comprises a[0]b[0], a second group of data pairs comprises a[1]b[0] and a[0]b[1], a third group of data pairs comprises a[2]b[0], a[1]b[1], and a[0]b[2], a fourth group of data pairs comprises a[3]b[0], a[2]b[1], a[1]b[2], and a[0]b[3], a fifth group of data pairs comprises a[3]b[1], a[2]b[2], and a[1]b[3], a sixth group of data pairs comprises a[3]b[2] and a[2]b[3], and a seventh group of data pairs comprises a[3]b[3]; calculating multiplication results of the first group of data pairs to the seventh group of data pairs one by one, and performing intra-group accumulation on multiplication results of data pairs in each group, the intra-group accumulation comprising: (i) in a same group of data pairs, accumulating a calculated multiplication result for each data pair with a multiplication result of a previous data pair, (ii) storing in memory, 64 least significant bits of a final accumulation result for data pairs in the same group of data pairs, (iii) obtaining a remaining accumulation result of the same group of data pairs, and (iv) releasing a corresponding register; and accumulating a multiplication result of a first data pair in each group of data pairs with the remaining accumulation result of a previous group of data pairs, accumulating an accumulation result with a multiplication result of a next data pair until accumulation of multiplication results of the data pairs in the seventh group of data pairs is completed, and storing the accumulation result corresponding to the data pairs in the seventh group of data pairs in memory as the multiplication processing result for the multiplier and the multiplicand.
8. The computer-implemented system according to claim 7, wherein the multiplier and the multiplicand are each 256-bit numbers.
9. The computer-implemented system according to claim 7, the operations further comprising: releasing a register that stores the multiplication result of each data pair when accumulating the multiplication results of each group of data pairs, and storing the accumulation result in three registers.
10. The computer-implemented system according to claim 7, wherein the computing device uses a 64-bit computer operating system.
11. The computer-implemented system according to claim 10, the operations further comprising: randomly selecting four registers from registers RBX, RBP, R12, R13, R14, and R15 of the 64-bit computer operating system, and storing values of the selected registers in memory; and obtaining the stored values of the selected registers from memory after the multiplication processing result for the multiplier and the multiplicand is obtained, and restoring the values of the selected registers.
12. The computer-implemented s
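The group-by-group accumulation recited in the independent claims can be modeled roughly as follows. This is a minimal sketch, not the patented register-level implementation: the function name `mul256` is hypothetical, a single Python integer `acc` stands in for the three accumulator registers mentioned in the dependent claims, and saving or restoring CPU registers is not modeled at all.

```python
MASK64 = (1 << 64) - 1  # keeps the low 64 bits of a value


def mul256(a, b):
    """Multiply two unsigned 256-bit integers group by group.

    Hypothetical model of the claimed flow: each of the seven groups is
    accumulated, its low 64 bits are stored, and the remainder is carried
    into the next group.
    """
    al = [(a >> (64 * i)) & MASK64 for i in range(4)]  # a[0] .. a[3]
    bl = [(b >> (64 * i)) & MASK64 for i in range(4)]  # b[0] .. b[3]

    out = []  # 64-bit result words, least significant first ("memory")
    acc = 0   # assumption: models a three-register (192-bit) accumulator

    for k in range(7):                        # first to seventh group
        for i in range(min(k, 3), -1, -1):    # pairs a[i]b[j] with i + j == k
            j = k - i
            if j > 3:
                continue
            acc += al[i] * bl[j]              # intra-group accumulation
            assert acc < (1 << 192)           # never exceeds three 64-bit words
        out.append(acc & MASK64)              # store the 64 least significant bits
        acc >>= 64                            # remaining result feeds the next group

    out.append(acc)                           # top word left over after the seventh group
    return sum(word << (64 * n) for n, word in enumerate(out))


if __name__ == "__main__":
    x = (1 << 256) - 1
    y = 0x0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF
    print(mul256(x, y) == x * y)
```

Because at most four 128-bit products plus a carry are summed in any group, the running total stays below 2^192, which is consistent with holding the accumulation result in three 64-bit registers.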
Arithmetic instructions · CPC title
partitioned, i.e. using repetitively a smaller parallel parallel multiplier or using an array of such smaller multipliers · CPC title
Special purpose registers · CPC title
Multiplying; Dividing · CPC title
Reduction of the number of iteration steps or stages, e.g. using the Booth algorithm, log-sum, odd-even · CPC title