Data locality enhancement for graphics processing units

US12190118B2 · US · B2

Patent metadata
Field	Value
Publication number	US-12190118-B2
Application number	US-202318339454-A
Country	US
Kind code	B2
Filing date	Jun 22, 2023
Priority date	Nov 15, 2019
Publication date	Jan 7, 2025
Grant date	Jan 7, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Embodiments described herein provide an apparatus comprising a plurality of processing resources including a first processing resource and a second processing resource, a memory communicatively coupled to the first processing resource and the second processing resource, and a processor to receive data dependencies for one or more tasks comprising one or more producer tasks executing on the first processing resource and one or more consumer tasks executing on the second processing resource and move a data output from one or more producer tasks executing on the first processing resource to a cache memory communicatively coupled to the second processing resource. Other embodiments may be described and claimed.
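
The data movement summarized in the abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the patented implementation: `ProcessingResource`, `run_with_forwarding`, the dictionary-based cache, and the task names are all hypothetical, and a real GPU would do this in hardware or in a driver rather than in Python.

```python
from dataclasses import dataclass, field


@dataclass
class ProcessingResource:
    """Hypothetical stand-in for a processing resource with a local cache."""
    name: str
    cache: dict = field(default_factory=dict)


def run_with_forwarding(dependencies, placement, resources, compute):
    """Run producers and copy each output into its consumer's cache.

    dependencies: list of (producer_task, consumer_task) pairs
    placement:    task name -> resource name
    resources:    resource name -> ProcessingResource
    compute:      producer task name -> callable returning its output
    """
    shared_memory = {}
    for producer, consumer in dependencies:
        # Execute each producer once, even if it has several consumers.
        if producer not in shared_memory:
            shared_memory[producer] = compute[producer]()
        # Assumed behavior: place the output in the cache attached to the
        # resource on which the consumer task will execute.
        target = resources[placement[consumer]]
        target.cache[producer] = shared_memory[producer]
    return shared_memory


resources = {"r0": ProcessingResource("r0"), "r1": ProcessingResource("r1")}
placement = {"produce": "r0", "consume": "r1"}  # producer on r0, consumer on r1
dependencies = [("produce", "consume")]
compute = {"produce": lambda: [1, 2, 3]}

shared = run_with_forwarding(dependencies, placement, resources, compute)
print(resources["r1"].cache)
```

In this toy model the consumer on `r1` finds the producer's output in its own cache, while the cache of `r0` stays empty, which is the locality effect the abstract describes.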

First claim

Opening claim text (preview).

What is claimed is:
1. An apparatus comprising: processor circuitry coupled to a memory, the processor circuitry to: map one or more tasks to one or more processing resources; and forward one or more destination identifiers corresponding to the one or more tasks to the one or more processing resources, wherein the one or more tasks are represented in a task graph, and receive data dependencies associated with the one or more tasks including one or more producer tasks or one or more consumer tasks.
2. The apparatus of claim 1, wherein the one or more processing resources comprise one or more of a first processing resource or a second processing resource.
3. The apparatus of claim 1, wherein the one or more producer tasks execute on the first processing resource, and wherein the one or more consumer tasks execute on the second processing resource.
4. The apparatus of claim 3, wherein the processor circuitry is further to transport a data output from the one or more producer tasks executing on the first processing resource to a cache memory communicatively coupled to the second processing resource.
5. The apparatus of claim 3, wherein the processing circuitry is further to enqueue a kernel for execution by the one of the processing resources, wherein the cache memory comprises a L1 cache, and wherein the L1 cache is shared between the first and second processing resources.
6. The apparatus of claim 1, wherein the processor circuitry comprises graphics processor circuitry co-located with application processor circuitry on a semiconductor package.
7. A method comprising: mapping, by a processor of a computing device, one or more tasks to one or more processing resources; and forwarding one or more destination identifiers corresponding to the one or more tasks to the one or more processing resources, wherein the one or more tasks are represented in a task graph, and receiving data dependencies associated with the one or more tasks including one or more producer tasks or one or more consumer tasks.
8. The method of claim 7, wherein the one or more processing resources comprise one or more of a first processing resource or a second processing resource.
9. The method of claim 7, wherein the one or more producer tasks execute on the first processing resource, and wherein the one or more consumer tasks execute on the second processing resource.
10. The method of claim 9, further comprising transporting a data output from the one or more producer tasks executing on the first processing resource to a cache memory communicatively coupled to the second processing resource.
11. The method of claim 9, further comprising enqueuing a kernel for execution by one of the processing resources, wherein the cache memory comprises a L1 cache, and wherein the L1 cache is shared between the first and second processing resources.
12. The method of claim 7, wherein the processor is coupled to a memory, the processor comprises a graphics processor co-located with an application processor on a semiconductor package.
13. At least one computer-readable medium having stored thereon instructions which, when executed, cause a computing device to facilitate operations comprising: mapping one or more tasks to one or more processing resources; and forwarding one or more destination identifiers corresponding to the one or more tasks to the one or more processing resources, wherein the one or more tasks are represented in a task graph, and receiving data dependencies associated with the one or more tasks including one or more producer tasks or one or more consumer tasks.
14. The computer-readable medium of claim 13, wherein the one or more processing resources comprise one or more of a first processing resource or a second processing resource.
15. The computer-readable medium of claim 13, wherein the one or more producer tasks execute on the first processing resource, and wherein the one or more consumer tasks execute on the second processing resource.
16. The computer-readable medium of claim 15, wherein the operations further comprise transporting a data output from the one or more producer tasks executing on the first processing resource to a cache memory communicatively coupled to the second processing resource.
17. The computer-readable medium of claim 15, wherein the operations further comprise enqueuing a kernel for execution by one of the processing resources, wherein the cache memory comprises a L1 cache, and wherein the L1 cache is shared between the first and second processing resources.
18. The computer-readable medium of claim 13, wherein the computing device comprises one or more processors coupled to a memory, the one or more processors include one or more graphics processors co-located with one or more application processors on a semiconductor package.
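
The steps recited in the independent claims (representing tasks in a task graph, mapping tasks to processing resources, and forwarding destination identifiers) can be sketched roughly as follows. Several things here are assumptions made for illustration and are not taken from the patent: the round-robin mapping policy, the reading of a "destination identifier" as the name of the resource on which a task's consumers run, and the helper names `map_tasks` and `destination_ids`.

```python
from itertools import cycle


def map_tasks(task_graph, resources):
    """Assign each task to a resource round-robin (an assumed policy)."""
    pool = cycle(resources)
    return {task: next(pool) for task in task_graph}


def destination_ids(task_graph, assignment):
    """For each producer task, list the resources where its consumers run.

    Treating these resource names as the 'destination identifiers' is an
    interpretation chosen for this sketch.
    """
    return {
        task: sorted({assignment[c] for c in consumers})
        for task, consumers in task_graph.items()
        if consumers  # tasks without consumers have nothing to forward
    }


# Task graph: each key is a task; each value lists the tasks that consume
# its output, so the edges double as the data dependencies.
task_graph = {"load": ["filter"], "filter": ["reduce"], "reduce": []}

assignment = map_tasks(task_graph, ["r0", "r1"])
destinations = destination_ids(task_graph, assignment)
print(assignment)
print(destinations)
```

With the identifiers computed up front, each resource could know where a producer's output should be sent before the producer finishes, which is one plausible way the forwarding step supports the cache placement recited in the dependent claims.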

Assignees

Intel Corp

Inventors

Classifications

CPC code	CPC title
G06N3/09	Supervised learning
G06N3/0495	Quantised networks; Sparse networks; Compressed networks
G06N3/0464	Convolutional networks [CNN, ConvNet]
G06N3/0895	Weakly supervised learning, e.g. semi-supervised or self-supervised learning
G06N3/0442	characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]

Patent family

Related publications grouped by family.

View patent family 75683475

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12190118B2 cover?: Embodiments described herein provide an apparatus comprising a plurality of processing resources including a first processing resource and a second processing resource, a memory communicatively coupled to the first processing resource and the second processing resource, and a processor to receive data dependencies for one or more tasks comprising one or more producer tasks executing on the firs…
Who is the assignee on this patent?: Intel Corp
What technology area does this patent fall under?: Primary CPC classification G06F9/3891. Mapped technology areas include Physics.
When was this patent published?: Publication date Jan 7, 2025 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 7 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Related patents

Sparse convolutional neural network accelerator

Sparse convolutional neural network accelerator

Sparse convolutional neural network accelerator

Sparse convolutional neural network accelerator

Data Management for Multiple Processing Units Using Data Transfer Costs

Hardware apparatuses and methods to control access to a multiple bank data cache

Performing multi-convolution operations in a parallel processing system
