System, method, and recording medium for mirroring matrices for batched Cholesky decomposition on a graphic processing unit
US-11036829-B2 · Jun 15, 2021 · US
US11790035B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11790035-B2 |
| Application number | US-202117315710-A |
| Country | US |
| Kind code | B2 |
| Filing date | May 10, 2021 |
| Priority date | Jun 30, 2016 |
| Publication date | Oct 17, 2023 |
| Grant date | Oct 17, 2023 |
Official abstract text for this publication.
A batched Cholesky decomposition method, system, and non-transitory computer readable medium for a Graphics Processing Unit (GPU) include mirroring matrices to form paired matrices and solving the paired matrices simultaneously.
Opening claim text (preview).
What is claimed is:
1. A non-transitory computer-readable recording medium recording a batched Cholesky decomposition program for a Graphics Processing Unit (GPU) having a plurality of threads that perform work, the program causing a computer to perform: mirroring matrices; forming paired matrices by combining a mirrored matrix of the mirrored matrices with a non-mirrored matrix; and solving the paired matrices simultaneously by each thread of the plurality of threads of the GPU performing a same amount of work.
2. The non-transitory computer readable recording medium of claim 1, wherein the mirroring is performed by shifting the mirrored matrix of the mirrored matrices by N+1, where N is an integer.
3. The non-transitory computer readable recording medium of claim 1, wherein during the solving, each thread of the plurality of threads of the GPU performs an equal amount of work when running batched Cholesky decomposition program.
4. A batched Cholesky decomposition method for a Graphics Processing Unit (GPU) having a plurality of threads that perform work, the method comprising: mirroring matrices; forming paired matrices by combining a mirrored matrix of the mirrored matrices with a non-mirrored matrix; and solving the paired matrices simultaneously by each thread of the plurality of threads of the GPU performing a same amount of work.
5. A batched Cholesky decomposition system on a Graphics Processing Unit (GPU) having a plurality of threads that perform work, said system comprising: a processor; and a memory, the memory storing instructions to cause the processor to: mirroring matrices; forming paired matrices by combining a mirrored matrix of the mirrored matrices with a non-mirrored matrix; and solving the paired matrices simultaneously by each thread of the plurality of threads of the GPU performing a same amount of work.
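The pairing idea in the claims can be illustrated with a small sketch. This is one plausible reading, not the patent's actual implementation: it assumes that only the lower triangle of each N x N matrix is processed, that "mirroring" means flipping the second matrix along both axes, and that the mirrored triangle is offset by one column so the pair fills an N x (N+1) rectangle. The function names `pack_pair`, `unpack_pair`, and `entries_per_column` are hypothetical, and the sketch does not attempt to reproduce the "N+1" shift recited in claim 2 or the decomposition itself. It only shows why a paired layout can give every column, and so every thread assigned one column, the same number of entries.

```python
def pack_pair(a, b):
    """Pack the lower triangles of two N x N matrices into one N x (N+1) grid.

    Assumed layout (illustrative only): `a` keeps its position (i, j), while
    `b` is flipped along both axes and offset by one column.
    """
    n = len(a)
    packed = [[None] * (n + 1) for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            packed[i][j] = a[i][j]              # non-mirrored triangle
            packed[n - 1 - i][n - j] = b[i][j]  # mirrored triangle
    return packed


def unpack_pair(packed):
    """Recover both lower triangles from the paired layout."""
    n = len(packed)
    a = [[0] * n for _ in range(n)]
    b = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            a[i][j] = packed[i][j]
            b[i][j] = packed[n - 1 - i][n - j]
    return a, b


def entries_per_column(grid):
    """Count the filled entries in each column (a proxy for per-thread work)."""
    cols = len(grid[0])
    return [sum(row[c] is not None for row in grid) for c in range(cols)]


if __name__ == "__main__":
    a = [[1, 0, 0], [2, 3, 0], [4, 5, 6]]
    b = [[7, 0, 0], [8, 9, 0], [10, 11, 12]]

    # A lone triangle: columns hold 3, 2, 1 entries, so work is uneven.
    lone = [[a[i][j] if j <= i else None for j in range(3)] for i in range(3)]
    print(entries_per_column(lone))

    # The paired layout: every one of the N+1 columns holds N entries.
    paired = pack_pair(a, b)
    print(paired)
    print(entries_per_column(paired))
```

With a single triangular matrix, a thread assigned to the first column touches N entries while the thread on the last column touches one. In the paired layout under the assumptions above, each column holds exactly N entries, which is the kind of balance the phrase "each thread ... performing a same amount of work" describes.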
Matrix or vector computation {, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization (matrix transposition G06F7/78)} · CPC title
having at least two separately controlled shifting levels, e.g. using shifting matrices (G06F5/012 takes precedence) · CPC title
Simultaneous equations {, e.g. systems of linear equations} · CPC title
Processor architectures; Processor configuration, e.g. pipelining · CPC title
Memory management · CPC title