Who is the assignee on this patent?

Sony Interactive Entertainment LLC

What technology area does this patent fall under?

Primary CPC classification G06F9/4881. Mapped technology areas include Physics.

When was this patent published?

Published Sep 2, 2025 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Accessing local memory of a GPU executing a first kernel when executing a second kernel of another GPU

US12406324B2 · US · B2

Patent metadata
Field	Value
Publication number	US-12406324-B2
Application number	US-202318504068-A
Country	US
Kind code	B2
Filing date	Nov 7, 2023
Priority date	Apr 28, 2020
Publication date	Sep 2, 2025
Grant date	Sep 2, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Methods for graphics processing are provided. One example method includes executing a plurality of kernels using a plurality of graphics processing units (GPUs), wherein responsibility for executing a corresponding kernel is divided into one or more portions each of which being assigned to a corresponding GPU. The method includes generating a plurality of dependency data at a first kernel as each of a first plurality of portions of the first kernel completes processing. The method includes checking dependency data from one or more portions of the first kernel prior to execution of a portion of a second kernel. The method includes delaying execution of the portion of the second kernel as long as the corresponding dependency data of the first kernel has not been met.
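
The dependency check described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: Python threads stand in for GPUs, `threading.Event` objects stand in for the dependency data, and the function name `run_two_kernels`, the portion count, and the rule for which first-kernel portions each second-kernel portion needs are all hypothetical.

```python
import threading


def run_two_kernels(num_gpus=4):
    """Simulate two kernels, each split into one portion per (simulated) GPU."""
    # Dependency data: one flag per portion of the first kernel (assumed layout).
    dependency_data = [threading.Event() for _ in range(num_gpus)]
    kernel1_out = [None] * num_gpus
    kernel2_out = [None] * num_gpus

    def kernel1_portion(i):
        kernel1_out[i] = (i + 1) * 10      # stand-in for real kernel work
        dependency_data[i].set()           # generated as this portion completes

    def kernel2_portion(j, needs):
        # Check the dependency data first; execution of this portion is
        # delayed for as long as any required first-kernel portion is unfinished.
        for i in needs:
            dependency_data[i].wait()
        kernel2_out[j] = sum(kernel1_out[i] for i in needs)

    threads = []
    # Second-kernel portions are started first to show that they really wait.
    for j in range(num_gpus):
        needs = [j, (j + 1) % num_gpus]    # hypothetical dependency pattern
        threads.append(threading.Thread(target=kernel2_portion, args=(j, needs)))
    for i in range(num_gpus):
        threads.append(threading.Thread(target=kernel1_portion, args=(i,)))
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return kernel2_out
```

Because each second-kernel portion waits only on the portions it needs, it can start as soon as those finish, without waiting for the whole first kernel.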

First claim

Opening claim text (preview).

What is claimed is:
1. A method, comprising: executing, at a first graphics processing unit (GPU), a portion of a first kernel to completion to generate first kernel data; storing, by the first GPU, the first kernel data in local memory of the first GPU; generating, by the first GPU, first dependency data indicating that the first kernel data is stored in the local memory of the first GPU; storing, by the first GPU, the first dependency data at a first memory location of a dependency data store; checking, by a second GPU, status of the first dependency data at the first memory location of the dependency data store; when the status of the first dependency data indicates that the first kernel data is stored in the local memory of the first GPU, accessing, by the second GPU, the first kernel data from the local memory of the first GPU; and executing a portion of a second kernel using the first kernel data that has been accessed.
2. The method of claim 1, wherein the accessing the first kernel data from the local memory of the first GPU includes: reading the first kernel data from the local memory of the first GPU when performing the executing the portion of the second kernel.
3. The method of claim 1, wherein the accessing the first kernel data from the local memory of the first GPU includes: copying the first kernel data from the local memory of the first GPU to local memory of the second GPU; and reading, by the second GPU, the first kernel data from the local memory of the second GPU when performing the executing the portion of the second kernel.
4. The method of claim 3, comprising: using direct memory access (DMA) when performing the copying the first kernel data from the local memory of the first GPU to the local memory of the second GPU.
5. The method of claim 1, wherein the accessing the first kernel data from the local memory of the first GPU includes: reading at least a first portion of the first kernel data from the local memory of the first GPU; copying the first kernel data from the local memory of the first GPU to local memory of the second GPU; and executing the portion for the second kernel using the at least the first portion of the first kernel data.
6. The method of claim 5, further comprising: reading, by the second GPU, a remaining portion of the first kernel data from the local memory of the second GPU after the copying the first kernel data to the local memory of the second GPU has completed; and performing, by the second GPU, the executing the portion of the second kernel using the remaining portion of the first kernel data.
7. The method of claim 1, further comprising: dividing the first kernel into a plurality of portions during execution of an application, wherein the executing the portion of the second kernel begins before the plurality of portions of the first kernel has completed processing.
8. A computer system comprising: a processor; memory coupled to the processor and having stored therein instructions that, if executed by the computer system, cause the computer system to execute a method, comprising: executing, at a first graphics processing unit (GPU), a portion of a first kernel to completion to generate first kernel data; storing, by the first GPU, the first kernel data in local memory of the first GPU; generating, by the first GPU, first dependency data indicating that the first kernel data is stored in the local memory of the first GPU; storing, by the first GPU, the first dependency data at a first memory location of a dependency data store; checking, by a second GPU, status of the first dependency data at the first memory location of the dependency data store; when the status of the first dependency data indicates that the first kernel data is stored in the local memory of the first GPU, accessing by the second GPU the first kernel data from the local memory of the first GPU; and executing a portion of a second kernel using the first kernel data that has been accessed.
9. The computer system of claim 8, wherein in the method the accessing the first kernel data from the local memory of the first GPU includes: reading the first kernel data from the local memory of the first GPU when performing the executing the portion of the second kernel.
10. The computer system of claim 8, wherein in the method the accessing the first kernel data from the local memory of the first GPU includes: copying the first kernel data from the local memory of the first GPU to local memory of the second GPU; and reading, by the second GPU, the first kernel data from the local memory of the second GPU when performing the executing the portion of the second kernel.
11. The computer system of claim 10, the method further comprising: using direct memory access (DMA) when performing the copying the first kernel data from the local memory of the first GPU to the local memory of the second GPU.
12. The computer system of claim 8, wherein in the method the accessing the first kernel data from the local memory of the first GPU includes: reading at least a first portion of the first kernel data from the local memory of the first GPU; copying the first kernel data from the local memory of the first GPU to local memory of the second GPU; and executing the portion for the second kernel using the at least the first portion of the first kernel data.
13. The computer system of claim 12, the method further comprising: reading, by the second GPU, a remaining portion of the first kernel data from the local memory of the second GPU after the copying the first kernel data to the local memory of the second GPU has completed; and performing, by the second GPU, the executing the portion of the second kernel using the remaining portion of the first kernel data.
14. The computer system of claim 8, the method further comprising: dividing the first kernel into a plurality of portions during execution of an application, wherein the executing the portion of the second kernel begins before the plurality of portions of the first kernel has completed processing.
15. A non-transitory computer-readable medium storing a computer program for execution by a processor to perform a method, the non-transitory computer-readable medium comprising: program instructions for executing, at a first graphics processing unit (GPU), a portion of a first kernel to completion to generate first kernel data; program instructions for storing, by the first GPU, the first kernel data in local memory of the first GPU; program instructions for generating, by the first GPU, first dependency data indicating that the first kernel data is stored in the local memory of the first GPU; program instructions for storing, by the first GPU, the first dependency data at a first memory location of a dependency data store; program instructions for checking, by a second GPU, status of the first dependency data at the first memory location of the dependency data store; program instructions for accessing, by the second GPU, the first kernel data from the local memory of the first GPU when the status of the first dependency data indicates that the first kernel data is stored in the local memory of the first GPU; and program instructions for executing a portion of a second kernel using the first kernel data that has been accessed.
16. The non-transitory computer-readable medium of claim 15, wherein the program instructions for accessing the first kernel data from the local memory of the first GPU includes: program instructions for reading the first kernel data from the local me
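
For readers who find the claim language dense, the sequence in claim 1 and the two access variants in claims 2 and 3 can be sketched as follows. This is an illustrative simulation under stated assumptions, not the patent's implementation. The classes `Gpu` and `DependencyStore`, the location name `"k1_ready"`, the mode strings, and the blocking wait are all hypothetical. A plain list copy stands in for the DMA transfer of claim 4, and the hybrid access of claims 5 and 6 is not modelled.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class Gpu:
    """Hypothetical stand-in for a GPU with its own local memory."""
    def __init__(self, name):
        self.name = name
        self.local_memory = {}


class DependencyStore:
    """Hypothetical dependency data store: one status flag per location."""
    def __init__(self):
        self._flags = {}
        self._cond = threading.Condition()

    def store(self, location):
        with self._cond:
            self._flags[location] = True
            self._cond.notify_all()

    def check(self, location):
        with self._cond:
            return self._flags.get(location, False)

    def wait_until_set(self, location):
        # Assumption: the claims only say "checking"; blocking is one option.
        with self._cond:
            self._cond.wait_for(lambda: self._flags.get(location, False))


def first_kernel_portion(gpu1, deps):
    gpu1.local_memory["k1"] = [x * x for x in range(8)]  # first kernel data
    deps.store("k1_ready")                               # first dependency data


def second_kernel_portion(gpu2, gpu1, deps, mode):
    deps.wait_until_set("k1_ready")
    if mode == "remote_read":        # claim 2: read from GPU 1's local memory
        data = gpu1.local_memory["k1"]
    elif mode == "copy_then_read":   # claim 3: copy to GPU 2, then read locally
        gpu2.local_memory["k1"] = list(gpu1.local_memory["k1"])
        data = gpu2.local_memory["k1"]
    else:
        raise ValueError(f"unknown mode: {mode}")
    return sum(data)                 # stand-in for the second kernel's work


def run_example(mode):
    gpu1, gpu2, deps = Gpu("gpu1"), Gpu("gpu2"), DependencyStore()
    with ThreadPoolExecutor(max_workers=2) as pool:
        # The consumer is submitted first, so it must wait on the dependency data.
        result = pool.submit(second_kernel_portion, gpu2, gpu1, deps, mode)
        pool.submit(first_kernel_portion, gpu1, deps)
        return result.result()
```

Calling `run_example("remote_read")` and `run_example("copy_then_read")` yields the same value; the modes differ only in whose local memory the second kernel reads from.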

Assignees

Sony Interactive Entertainment LLC

Inventors

Classifications

G06T15/005
General purpose rendering architectures · CPC title
G06T1/60
Memory management · CPC title
G06F9/4881 · Primary
Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues · CPC title
G06F9/505
considering the load · CPC title
G06T1/20 · Primary
Processor architectures; Processor configuration, e.g. pipelining · CPC title

Patent family

Related publications grouped by family.

View patent family 76250424

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12406324B2 cover?: Methods for graphics processing are provided. One example method includes executing a plurality of kernels using a plurality of graphics processing units (GPUs), wherein responsibility for executing a corresponding kernel is divided into one or more portions each of which being assigned to a corresponding GPU. The method includes generating a plurality of dependency data at a first kernel as ea…