System, method, and recording medium for mirroring matrices for batched Cholesky decomposition on a graphic processing unit
US-10572569-B2 · Feb 25, 2020 · US
US11036829B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11036829-B2 |
| Application number | US-201916665313-A |
| Country | US |
| Kind code | B2 |
| Filing date | Oct 28, 2019 |
| Priority date | Jun 30, 2016 |
| Publication date | Jun 15, 2021 |
| Grant date | Jun 15, 2021 |
Official abstract text for this publication.
A batched Cholesky decomposition method, system, and non-transitory computer readable medium for a Graphics Processing Unit (GPU), include mirroring a second problem matrix of a second problem to a first problem matrix of a first problem as paired matrices and shifting the second problem matrix by N+1 and combining the first problem matrix and the mirrored second problem matrix into one matrix of (N+1)×N, where the first problem shared memory comprises regular intervals, where the second problem shared memory is continuous, and where the GPU performs batched dense Cholesky decomposition with the one matrix from the combining to accelerate the Cholesky decomposition.
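The (N+1)×N size follows from counting: a symmetric N×N matrix is fully described by its lower triangle of N(N+1)/2 entries, so two such triangles fill exactly (N+1)×N slots. The Python below is a CPU-side sketch of that idea, not the GPU kernel described in the patent. The function names `pack_pair` and `factor_packed` are hypothetical, and the index mapping used here (first problem shifted down one row, second problem transposed into the upper triangle) is an assumption made for illustration; the layout in the full description may differ.

```python
import math


def pack_pair(a, b):
    """Pack the lower triangles of two symmetric N x N matrices
    into a single (N+1) x N array (assumed illustrative layout)."""
    n = len(a)
    packed = [[0.0] * n for _ in range(n + 1)]
    for i in range(n):
        for j in range(i + 1):
            packed[i + 1][j] = a[i][j]  # first problem: lower triangle, one row down
            packed[j][i] = b[i][j]      # second problem: mirrored into the upper triangle
    return packed


def factor_packed(p):
    """Cholesky-factor both packed problems in place, column by column.
    Each factor overwrites the triangle its input matrix occupied."""
    n = len(p[0])
    for j in range(n):
        # Diagonal entries of both factors for column j.
        da = p[j + 1][j] - sum(p[j + 1][k] ** 2 for k in range(j))
        db = p[j][j] - sum(p[k][j] ** 2 for k in range(j))
        p[j + 1][j] = math.sqrt(da)
        p[j][j] = math.sqrt(db)
        # Entries below the diagonal in column j of each factor.
        for i in range(j + 1, n):
            sa = p[i + 1][j] - sum(p[i + 1][k] * p[j + 1][k] for k in range(j))
            sb = p[j][i] - sum(p[k][i] * p[k][j] for k in range(j))
            p[i + 1][j] = sa / p[j + 1][j]
            p[j][i] = sb / p[j][j]
    return p


if __name__ == "__main__":
    # Two small symmetric positive definite matrices (made-up example data).
    a = [[4.0, 2.0, 2.0], [2.0, 10.0, 7.0], [2.0, 7.0, 6.0]]
    b = [[1.0, 2.0, 3.0], [2.0, 5.0, 7.0], [3.0, 7.0, 14.0]]
    for row in factor_packed(pack_pair(a, b)):
        print(row)
```

In this sketch the two problems advance through the same column loop, which loosely mirrors the idea of solving both at the same time; on a GPU the per-entry work would be spread across threads rather than run sequentially.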
Opening claim text (preview).
What is claimed is:
1. A batched Cholesky decomposition method for a Graphics Processing Unit (GPU), the method comprising: mirroring a second problem matrix of a second problem to a first problem matrix of a first problem as paired matrices and shifting the second problem matrix by N+1; and combining the first problem matrix and a mirrored second problem matrix into one matrix that has a memory layout of an (N+1)×N matrix, wherein the first problem matrix and the second problem matrix are two different problems to be solved by Cholesky decomposition simultaneously at a same time to accelerate the Cholesky decomposition, the first problem matrix and the second problem matrix are symmetrical and positive definite matrices, and wherein a linear system is solved via the GPU using the one matrix.
2. The method of claim 1, wherein the combining solves the linear system by a computer-implemented process.
3. A non-transitory computer-readable recording medium recording a batched Cholesky decomposition program for a Graphics Processing Unit (GPU), the program causing a computer to perform: mirroring a second problem matrix of a second problem to a first problem matrix of a first problem as paired matrices and shifting the second problem matrix by N+1; and combining the first problem matrix and a mirrored second problem matrix into one matrix that has a memory layout of an (N+1)×N matrix, wherein the first problem matrix and the second problem matrix are two different problems to be solved by Cholesky decomposition simultaneously at a same time to accelerate the Cholesky decomposition, the first problem matrix and the second problem matrix are symmetrical and positive definite matrices, and wherein a linear system is solved via the GPU using the one matrix.
4. A batched Cholesky decomposition system on a Graphics Processing Unit (GPU), said system comprising: a processor; and a memory, the memory storing instructions to cause the processor to: mirroring a second problem matrix of a second problem to a first problem matrix of a first problem as paired matrices and shifting the second problem matrix by N+1; and combining the first problem matrix and the mirrored second problem matrix into one matrix that has a memory layout of an (N+1)×N matrix when at least two problems are present to be solved by a processor simultaneously at a same time, the first problem matrix and the second problem matrix are symmetrical and positive definite matrices wherein the GPU performs batched dense Cholesky decomposition with the one matrix from the combining to accelerate the Cholesky decomposition to solve a linear system with the GPU.
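The claims end with a linear system being solved from the one combined matrix. A minimal Python sketch of that last step follows, assuming the combined (N+1)×N array already holds both Cholesky factors under an illustrative mapping: entry (i, j) of the first factor at row i+1, column j, and entry (i, j) of the mirrored second factor at row j, column i. The name `solve_pair`, the helper `_substitute`, and this mapping are assumptions for illustration, not details taken from the patent text, and the code runs sequentially on a CPU rather than on a GPU.

```python
def _substitute(n, l, rhs):
    """Solve (L L^T) x = rhs given an accessor l(i, j) for the lower factor L."""
    # Forward substitution: L z = rhs.
    z = [0.0] * n
    for i in range(n):
        z[i] = (rhs[i] - sum(l(i, k) * z[k] for k in range(i))) / l(i, i)
    # Back substitution: L^T x = z.
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (z[i] - sum(l(k, i) * x[k] for k in range(i + 1, n))) / l(i, i)
    return x


def solve_pair(f, rhs_a, rhs_b):
    """Solve both systems by reading the two factors out of one (N+1) x N array."""
    n = len(f[0])

    def first(i, j):
        return f[i + 1][j]  # first factor: lower triangle, shifted down one row

    def second(i, j):
        return f[j][i]      # second factor: mirrored into the upper triangle

    return _substitute(n, first, rhs_a), _substitute(n, second, rhs_b)


if __name__ == "__main__":
    # Packed factors of A = [[4, 2], [2, 5]] and B = [[9, 3], [3, 5]] (made-up data).
    packed_factors = [[3.0, 1.0], [2.0, 2.0], [1.0, 2.0]]
    x, y = solve_pair(packed_factors, [6.0, 7.0], [15.0, 13.0])
    print(x, y)
```

Reading both factors through small accessor functions keeps the substitution code identical for the two problems; only the address calculation differs.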
Processor architectures; Processor configuration, e.g. pipelining · CPC title
Indexing scheme relating to group G06F5/00; Methods or arrangements for data conversion without changing the order or content of the data handled · CPC title
Matrix or vector computation {, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization (matrix transposition G06F7/78)} · CPC title
Memory management · CPC title
having at least two separately controlled shifting levels, e.g. using shifting matrices (G06F5/012 takes precedence) · CPC title