What technology area does this patent fall under?

Primary CPC classification G06F9/5038. Mapped technology areas include Physics.

When was this patent published?

Publication date Dec 16, 2025 (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Neural network scheduling mechanism

US12499347B2 · US · B2

Patent metadata
Field	Value
Publication number	US-12499347-B2
Application number	US-202318471843-A
Country	US
Kind code	B2
Filing date	Sep 21, 2023
Priority date	Apr 9, 2017
Publication date	Dec 16, 2025
Grant date	Dec 16, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An apparatus to facilitate workload scheduling is disclosed. The apparatus includes one or more clients, one or more processing units to process workloads received from the one or more clients, including hardware resources and scheduling logic to schedule direct access of the hardware resources to the one or more clients to process the workloads.
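
The abstract does not say how the scheduling logic orders competing workloads; claims 4 and 12 only state that scheduling is based on priority and a submission type. The following is a minimal software sketch of that idea, not the patented hardware design. The `Scheduler` class, the `SUBMISSION_RANK` table, and the "direct" and "batched" type names are all hypothetical and were chosen only for illustration.

```python
import heapq
import itertools

# Hypothetical ranking of submission types (not from the patent);
# a lower rank is served first when priorities are equal.
SUBMISSION_RANK = {"direct": 0, "batched": 1}


class Scheduler:
    """Toy queue that orders workloads by priority, then submission type."""

    def __init__(self):
        self._heap = []
        # Monotonic counter keeps submission order stable for full ties.
        self._seq = itertools.count()

    def submit(self, client, kernel_name, priority, submission_type):
        # Negate priority so that a larger number is popped first.
        key = (-priority, SUBMISSION_RANK[submission_type], next(self._seq))
        heapq.heappush(self._heap, (key, client, kernel_name))

    def next_workload(self):
        # Return (client, kernel_name) for the next workload, or None.
        if not self._heap:
            return None
        _, client, kernel_name = heapq.heappop(self._heap)
        return client, kernel_name


if __name__ == "__main__":
    sched = Scheduler()
    sched.submit("client_a", "conv1", priority=1, submission_type="batched")
    sched.submit("client_b", "conv2", priority=5, submission_type="batched")
    sched.submit("client_c", "fc", priority=5, submission_type="direct")
    while (item := sched.next_workload()) is not None:
        print(item)
```

In this sketch, priority is compared first and the submission type only breaks ties; the patent text does not specify how the two criteria are combined, so that ordering is an assumption.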

First claim

Opening claim text (preview).

What is claimed is:

1. A graphics processing unit comprising: a shared memory; a memory interface coupled with the shared memory; and a processing cluster including a plurality of graphics multiprocessors internal to the graphics processing unit, the processing cluster coupled with the shared memory via the memory interface, the plurality of graphics multiprocessors coupled via a data interconnect, the data interconnect to facilitate exchange of data between the plurality of graphics multiprocessors during cooperative execution of a workload, the plurality of graphics multiprocessors configured to process workloads received for execution, wherein a graphics multiprocessor of the plurality of graphics multiprocessors includes: a plurality of processing engines; a scheduler to schedule direct access to the plurality of processing engines to process the workloads, wherein the workloads are each associated with a precompiled neural network (NN) kernel; and a gather unit to bypass zero data values and gather non-zero data values associated with the workloads, the non-zero data values stored sparsely in memory.
2. The graphics processing unit of claim 1, wherein the non-zero data values are data values for a convolutional kernel to be multiplied by data elements of a feature map.
3. The graphics processing unit of claim 2, wherein convolutional kernel is an irregular convolutional kernel and the plurality of processing engines are configured to multiply the data values for the convolutional kernel by the data elements of the feature map.
4. The graphics processing unit of claim 2, wherein the scheduler is to schedule the workloads to the plurality of processing engines based on priority and a submission type associated with the workloads.
5. The graphics processing unit of claim 2, wherein the plurality of graphics multiprocessors are associated with driver logic to facilitate access to the plurality of graphics multiprocessors by one or more clients, the one or more clients registered to the driver logic, the one or more clients to bypass an operating system to access the plurality of graphics multiprocessors via a function pointer received from the driver logic.
6. The graphics processing unit of claim 5, wherein each of the one or more clients receives a function pointer to enable direct access to the plurality of processing engines.
7. The graphics processor processing unit of claim 5, wherein each of the one or more clients include an input interface to the plurality of graphics multiprocessors.
8. The graphics processing unit of claim 1, wherein the gather unit is to store a map to the non-zero data values and gather the non-zero data values based on the map.
9. A method to facilitate workload scheduling, comprising: receiving a request to access plurality of processing engines of a general purpose graphics processing unit, the general purpose graphics processing unit including a processing cluster having a plurality of graphics multiprocessors internal to the general purpose graphics processing unit, the plurality of graphics multiprocessors coupled via a data interconnect, the data interconnect to facilitate exchange of data between the plurality of graphics multiprocessors during cooperative execution of a workload, and a graphics multiprocessor of the plurality of graphics multiprocessors includes the plurality of processing engines; scheduling direct access to the plurality of processing engines to enable a client to process a workload provided by the client, wherein the workload is associated with a precompiled neural network (NN) kernel; and gathering, via a gather unit of the general purpose graphics processing unit, non-zero data values associated with the client while bypassing zero data values associated with the client, the non-zero data values stored sparsely in memory.
10. The method of claim 9, further comprising gathering the non-zero data values via a map to the non-zero data values.
11. The method of claim 9, further comprising bypassing an operating system and scheduling direct access to the plurality of processing engines via a Kernel Mode Driver (KMD) associated with the general purpose graphics processing unit, wherein the KMD provides a function pointer to enable direct access to the plurality of processing engines.
12. The method of claim 9, wherein access is provided to the client based on a priority and a submission client type.
13. The method of claim 12, further comprising registering the client with driver logic associated with the general purpose graphics processing unit.
14. The method as in claim 9, wherein the non-zero data values are data values for a convolutional kernel to be multiplied by data elements of a feature map.
15. The method as in claim 14, further comprising multiplying, via the plurality of processing engines of the general purpose graphics processing unit, the data values for the convolutional kernel by the data elements of the feature map.
16. A data processing system comprising: memory to store instructions; and one or more processors configured to execute the instructions, wherein the one or more processors include a graphics processing unit including a processing cluster having a plurality of graphics multiprocessors internal to the graphics processing unit, the plurality of graphics multiprocessors coupled via a data interconnect, the data interconnect to facilitate exchange of data between the plurality of graphics multiprocessors during cooperative execution of a workload, and the instructions configure the one or more processors to: receive a request to access plurality of processing engines of a general purpose graphics processing unit; schedule, via a scheduler, direct access to the plurality of processing engines to enable the graphics processing unit to process a workload provided by a client, wherein the workload is associated with a precompiled neural network (NN) kernel; and gather, via a gather unit of the general purpose graphics processing unit, non-zero data values associated with the client while bypassing zero data values associated with the client, the non-zero data values stored sparsely in memory.
17. The data processing system as in claim 16, wherein the non-zero data values are data values for a convolutional kernel to be multiplied by data elements of a feature map.
18. The data processing system of claim 17, wherein convolutional kernel is an irregular convolutional kernel and the plurality of processing engines are configured to multiply the data values for the convolutional kernel by the data elements of the feature map.
19. The data processing system of claim 17, wherein the scheduler is to provide access to the plurality of processing engines based on a priority and a submission client type.
20. The data processing system of claim 17, wherein the graphics processing unit is associated with driver logic to facilitate access to the plurality of processing engines and the client is registered to the driver logic, the client to bypass an operating system to access the plurality of graphics multiprocessors via a function pointer received from the driver logic.
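
The gather step recited in claims 1, 8, and 10, together with the kernel-by-feature-map multiplication of claims 2 and 15, can be illustrated in software. The sketch below is an assumption-laden illustration, not the claimed hardware: the function names `build_map`, `gather_nonzero`, and `sparse_dot` are hypothetical, the "map" is modeled as a plain list of indices, and the kernel and feature map are flattened to one dimension.

```python
from typing import List


def build_map(kernel: List[float]) -> List[int]:
    """Record the positions of the non-zero kernel values (the 'map')."""
    return [i for i, value in enumerate(kernel) if value != 0]


def gather_nonzero(kernel: List[float], index_map: List[int]) -> List[float]:
    """Gather only the values the map points to; zeros are never read."""
    return [kernel[i] for i in index_map]


def sparse_dot(kernel: List[float], feature_map: List[float]) -> float:
    """Multiply gathered kernel values by the matching feature-map elements.

    Zero kernel entries are bypassed entirely, so they cost no multiplies.
    """
    index_map = build_map(kernel)
    gathered = gather_nonzero(kernel, index_map)
    return sum(k * feature_map[i] for k, i in zip(gathered, index_map))


if __name__ == "__main__":
    kernel = [0.0, 2.0, 0.0, 0.0, -1.0]   # mostly zeros
    feature = [5.0, 3.0, 7.0, 1.0, 4.0]
    print(build_map(kernel))              # positions of non-zero values
    print(sparse_dot(kernel, feature))    # same result as a dense dot product
```

For the example inputs only two of the five kernel positions are visited, and the result matches the dense dot product because the skipped terms would have contributed zero.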

Assignees

Intel Corp

Inventors

Classifications

G06N3/084
Backpropagation, e.g. using gradient descent · CPC title
G06N3/063
using electronic means · CPC title
G06F9/5038 (Primary)
considering the execution order of a plurality of tasks, e.g. taking priority or time dependency constraints into consideration (scheduling strategies G06F9/4881 and subgroups) · CPC title
G06F2209/5021
Priority · CPC title
G06N3/0464
Convolutional networks [CNN, ConvNet] · CPC title

Patent family

Related publications grouped by family.

View patent family 61768045

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12499347B2 cover?: An apparatus to facilitate workload scheduling is disclosed. The apparatus includes one or more clients, one or more processing units to process workloads received from the one or more clients, including hardware resources and scheduling logic to schedule direct access of the hardware resources to the one or more clients to process the workloads.
Who is the assignee on this patent?: Intel Corp
What technology area does this patent fall under?: Primary CPC classification G06F9/5038. Mapped technology areas include Physics.
When was this patent published?: Publication date Dec 16, 2025 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).