Mirroring matrices for batched Cholesky decomposition on a graphic processing unit

US12086207B2 · US · B2

Patent metadata
Publication number: US-12086207-B2
Application number: US-202318216926-A
Country: US
Kind code: B2
Filing date: Jun 30, 2023
Priority date: Jun 30, 2016
Publication date: Sep 10, 2024
Grant date: Sep 10, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A batched Cholesky decomposition method, system, and non-transitory computer-readable medium for a Graphics Processing Unit (GPU) include mirroring matrices to form paired matrices and solving the paired matrices simultaneously.
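
To make "solving the paired matrices simultaneously" concrete, here is a minimal sketch in plain Python that factors two symmetric positive-definite matrices in lockstep, performing the same Cholesky step on both problems in each iteration. The function name `cholesky_pair` is hypothetical, and the code only illustrates the pairing idea under the assumption that both matrices have the same size. It is not the patented GPU implementation and does not model threads or GPU memory.

```python
import math


def cholesky_pair(a, b):
    """Factor two same-size SPD matrices in lockstep (illustrative only).

    Returns lower-triangular factors (la, lb) with a = la * la^T and
    b = lb * lb^T. Each loop iteration updates the same (i, j) entry
    of both problems, mimicking work done on a pair at once.
    """
    n = len(a)
    la = [[0.0] * n for _ in range(n)]
    lb = [[0.0] * n for _ in range(n)]
    for j in range(n):
        for i in range(j, n):
            # Subtract contributions of the columns already factored.
            sa = a[i][j] - sum(la[i][k] * la[j][k] for k in range(j))
            sb = b[i][j] - sum(lb[i][k] * lb[j][k] for k in range(j))
            if i == j:
                # Diagonal entry: square root of the remaining pivot.
                la[j][j] = math.sqrt(sa)
                lb[j][j] = math.sqrt(sb)
            else:
                # Off-diagonal entry: divide by the column's pivot.
                la[i][j] = sa / la[j][j]
                lb[i][j] = sb / lb[j][j]
    return la, lb
```

For example, `cholesky_pair([[4.0, 2.0], [2.0, 5.0]], [[9.0, 3.0], [3.0, 5.0]])` returns the factors `[[2.0, 0.0], [1.0, 2.0]]` and `[[3.0, 0.0], [1.0, 2.0]]`.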

First claim

Opening claim text (preview).

What is claimed is:

1. A non-transitory computer-readable recording medium recording a program for a Graphics Processing Unit (GPU), the program causing a computer to perform a method comprising: generating a combined matrix as a rectangular matrix via merging a first problem matrix for a first problem and a second problem matrix for a second problem, the second problem matrix being folded with respect to the first problem matrix; storing a diagonal intersection portion of the combined matrix in global memory, the diagonal intersection portion occurring at an intersection of the first problem matrix and the folded second problem matrix in the combined matrix; storing a first problem portion of the combined matrix in a shared first memory; storing a second problem portion of the combined matrix in a shared second memory; and utilizing the combined matrix to accelerate batched dense Cholesky decomposition on the GPU, the utilizing comprising allocating, to a thread, to the first problem, and to the second problem, data read from the diagonal intersection portion from the global memory.

2. A batched Cholesky decomposition method for a Graphics Processing Unit (GPU), the method comprising: generating a combined matrix as a rectangular matrix via merging a first problem matrix for a first problem and a second problem matrix for a second problem, the second problem matrix being folded with respect to the first problem matrix; storing a diagonal intersection portion of the combined matrix in global memory, the diagonal intersection portion occurring at an intersection of the first problem matrix and the folded second problem matrix in the combined matrix; storing a first problem portion of the combined matrix in a shared first memory; storing a second problem portion of the combined matrix in a shared second memory; and utilizing the combined matrix to accelerate batched dense Cholesky decomposition on the GPU, the utilizing comprising allocating, to a thread, to the first problem, and to the second problem, data read from the diagonal intersection portion from the global memory.

3. A batched Cholesky decomposition system on a Graphics Processing Unit (GPU), said system comprising: a processor; and a memory, the memory storing instructions to cause the processor to perform computer operations comprising: generating a combined matrix as a rectangular matrix via merging a first problem matrix for a first problem and a second problem matrix for a second problem, the second problem matrix being folded with respect to the first problem matrix; storing a diagonal intersection portion of the combined matrix in global memory, the diagonal intersection portion occurring at an intersection of the first problem matrix and the folded second problem matrix in the combined matrix; storing a first problem portion of the combined matrix in a shared first memory; storing a second problem portion of the combined matrix in a shared second memory; and utilizing the combined matrix to accelerate batched dense Cholesky decomposition on the GPU, the utilizing comprising allocating, to a thread, to the first problem, and to the second problem, data read from the diagonal intersection portion from the global memory.
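
The claims describe merging two problem matrices into one rectangular matrix, with the second matrix folded relative to the first. One plausible way to picture this is sketched below: because Cholesky decomposition only needs one triangle of a symmetric matrix, the lower triangles of two n x n matrices fit exactly into an n x (n + 1) rectangle when the second triangle is rotated by 180 degrees. The layout, the index mapping, and the names `pack_pair` and `unpack_pair` are assumptions made for illustration; this page does not give the patent's exact layout, and the sketch does not model the global or shared memory placement recited in the claims.

```python
def pack_pair(a, b):
    """Pack the lower triangles of two n x n matrices into n x (n + 1).

    Assumed layout (illustrative, not taken from the patent):
      a[i][j] (j <= i) is stored at rect[i][j];
      b[i][j] (j <= i) is stored at rect[n - 1 - i][n - j],
    i.e. b's triangle is rotated 180 degrees to fill the space above
    a's diagonal. The two diagonals end up next to each other.
    """
    n = len(a)
    rect = [[0.0] * (n + 1) for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            rect[i][j] = a[i][j]
            rect[n - 1 - i][n - j] = b[i][j]
    return rect


def unpack_pair(rect):
    """Recover both lower triangles from the packed rectangle."""
    n = len(rect)
    a = [[0.0] * n for _ in range(n)]
    b = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            a[i][j] = rect[i][j]
            b[i][j] = rect[n - 1 - i][n - j]
    return a, b
```

With `a = [[1.0, 0.0], [2.0, 3.0]]` and `b = [[4.0, 0.0], [5.0, 6.0]]`, `pack_pair(a, b)` gives `[[1.0, 6.0, 5.0], [2.0, 3.0, 4.0]]`: every cell of the rectangle is used, and the diagonals of the two problems (1, 3 and 6, 4) sit side by side along the boundary between them.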

Assignees

Inventors

Classifications

  • Indexing scheme relating to group G06F5/00; Methods or arrangements for data conversion without changing the order or content of the data handled

  • Memory management

  • Processor architectures; Processor configuration, e.g. pipelining

  • Simultaneous equations {, e.g. systems of linear equations}

  • having at least two separately controlled shifting levels, e.g. using shifting matrices (G06F5/012 takes precedence)

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12086207B2 cover?
A batched Cholesky decomposition method, system, and non-transitory computer-readable medium for a Graphics Processing Unit (GPU) include mirroring matrices to form paired matrices and solving the paired matrices simultaneously.
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06F17/16. Mapped technology areas include Physics.
When was this patent published?
Publication date: Sep 10, 2024 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).