Page faulting and selective preemption

US12067641B2 · US · B2

Patent metadata
Field	Value
Publication number	US-12067641-B2
Application number	US-202217749266-A
Country	US
Kind code	B2
Filing date	May 20, 2022
Priority date	Apr 9, 2017
Publication date	Aug 20, 2024
Grant date	Aug 20, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

One embodiment provides a parallel processor comprising a memory interface and a processing array coupled with the memory interface. The processing array is configured to address memory accessed via the memory interface via a virtual address mapping and includes circuitry to resolve a page fault for the virtual address mapping, wherein each of the multiple compute blocks is separately preemptable.
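The idea of separately preemptable blocks can be illustrated with a small software model. This is a minimal sketch, not the patented circuitry: `ComputeBlock`, `ParallelProcessor`, `save_regions`, and `preempt` are hypothetical names, and the save-then-replace order is taken from the claim text shown under First claim on this page.

```python
from dataclasses import dataclass, field


@dataclass
class ComputeBlock:
    """One separately preemptable block (hypothetical model)."""
    block_id: int
    context: str                       # context currently executing
    threads: list = field(default_factory=list)


class ParallelProcessor:
    def __init__(self, num_blocks: int, context: str, threads_per_block: int):
        self.blocks = [
            ComputeBlock(i, context, list(range(threads_per_block)))
            for i in range(num_blocks)
        ]
        # One save slot per block, standing in for a dedicated save region.
        self.save_regions = {i: None for i in range(num_blocks)}

    def preempt(self, block_id: int, new_context: str, new_threads: list) -> None:
        block = self.blocks[block_id]
        # Save the outgoing state, then swap in the incoming threads.
        self.save_regions[block_id] = (block.context, list(block.threads))
        block.context = new_context
        block.threads = list(new_threads)


proc = ParallelProcessor(num_blocks=2, context="A", threads_per_block=4)
proc.preempt(0, new_context="B", new_threads=[100, 101])
print(proc.blocks[0].context, proc.blocks[1].context)  # prints "B A"
```

In this model only block 0 changes context; block 1 keeps running context "A", which is the property the abstract describes as separate preemptability.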

First claim

Opening claim text (preview).

What is claimed is:
1. A general-purpose graphics processing unit (GPGPU) comprising: a host interface; a memory interface; a processing array coupled with the host interface and the memory interface, the processing array including multiple processing clusters to perform parallel operations, the processing array configured to address memory accessed via the memory interface via a virtual address mapping and includes circuitry to resolve a page fault for the virtual address mapping, wherein each of the multiple processing clusters is separately preemptable and is associated with a dedicated region of context save memory; and a scheduler to schedule a workload to the multiple processing clusters, the scheduler configured to track an average latency to resolve a page fault, enable page fault preemption for a first context in response to a determination that the average latency to resolve a page fault for the first context is above a high-watermark threshold, and disable page fault preemption for the first context in response to a determination that the average latency to resolve a page fault for the first context is below a low-watermark threshold that is different from the high-watermark threshold, wherein to preempt a processing cluster includes to halt execution at an instruction boundary of a first plurality of threads of a first context during execution of the first plurality of threads, save context state associated with the first plurality of threads to the dedicated region of context save memory, and replace the first plurality of threads of the first context with a second plurality of threads of a second context.
2. The GPGPU as in claim 1, wherein each of the multiple processing clusters includes a plurality of graphics and general-purpose processing elements.
3. The GPGPU as in claim 2, wherein the GPGPU is a single instruction multiple thread (SIMT) processor.
4. The GPGPU as in claim 1, wherein the scheduler is configured to schedule a plurality of work groups associated with the workload to the multiple processing clusters.
5. The GPGPU as in claim 4, further comprising an embedded microcontroller, wherein the embedded microcontroller includes the scheduler.
6. The GPGPU as in claim 5, wherein the scheduler is configured to track an execution state of the plurality of work groups.
7. The GPGPU as in claim 6, wherein a first processing cluster of the multiple processing clusters is configured to preempt a first work group of the workload while a second processing cluster is to concurrently execute a second work group of the workload.
8. The GPGPU as in claim 7, wherein the first work group is associated with the first context, and in response to a determination that page fault preemption is enabled for the first context, a first processing cluster of the multiple processing clusters is configured to detect that a first work group has a number of unhandled page faults over a threshold, halt execution of the first work group, and save a context state for the first work group.
9. The GPGPU as in claim 8, wherein the first processing cluster is to save the context state for the first work group to the dedicated region of context save memory associated with the first processing cluster.
10. The GPGPU as in claim 9, wherein to save the context state for the first work group to the dedicated region of context save memory, the first processing cluster is to save internal state of the first processing cluster and state for shared assets accessed by the first processing cluster during execution of the first work group.
11. A data processing system comprising: an input/output interconnect; and a general-purpose graphics processor device coupled with the input/output interconnect, the general-purpose graphics processor device including: a host interface; a memory device; a parallel processor including a processing array coupled with the memory device and the host interface, the parallel processor including multiple processing clusters to perform parallel operations, the processing array configured to address the memory device via a virtual address mapping and includes circuitry to resolve a page fault for the virtual address mapping, wherein each of the multiple processing clusters is separately preemptable and is associated with a dedicated region of context save memory; and a scheduler to schedule a workload to the multiple processing clusters, the scheduler configured to track an average latency to resolve a page fault, enable page fault preemption for a first context in response to a determination that the average latency to resolve a page fault for the first context is above a high-watermark threshold, and disable page fault preemption for the first context in response to a determination that the average latency to resolve a page fault for the first context is below a low-watermark threshold that is different from the high-watermark threshold, wherein to preempt a processing cluster includes to halt execution at an instruction boundary of a first plurality of threads of a first context during execution of the first plurality of threads, save context state associated with the first plurality of threads to the dedicated region of context save memory, and replace the first plurality of threads of the first context with a second plurality of threads of a second context.
12. The data processing system as in claim 11, wherein each of the multiple processing clusters includes a plurality of graphics and general-purpose processing elements.
13. The data processing system as in claim 12, wherein the general-purpose graphics processor device includes a single instruction multiple thread (SIMT) processor.
14. The data processing system as in claim 11, wherein the scheduler is configured to schedule a plurality of work groups associated with the workload to the multiple processing clusters.
15. The data processing system as in claim 14, further comprising an embedded microcontroller, wherein the embedded microcontroller includes the scheduler.
16. The data processing system as in claim 15, wherein the scheduler is configured to track an execution state of the plurality of work groups.
17. The data processing system as in claim 16, wherein a first processing cluster of the multiple processing clusters is configured to preempt a first work group of the workload while a second processing cluster is to concurrently execute a second work group of the workload.
18. The data processing system as in claim 17, wherein the first work group is associated with the first context, and in response to a determination that page fault preemption is enabled for the first context, a first processing cluster of the multiple processing clusters is configured to detect that a first work group has a number of unhandled page faults over a threshold, halt execution of the first work group, and save a context state for the first work group.
19. The data processing system as in claim 18, additionally including a context save memory to store a context state for each of the multiple processing clusters, wherein the first processing cluster is to save the context state for the first work group to the context save memory in a region of the context save memory that is dedicated to the first processing cluster.
20. The data processing system as in claim 19, wherein to save the context state for the first work group to the dedicated region of context save memory, the first processing cluster is to save internal state of the first processing cluste
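The two-threshold rule in claims 1 and 11 (enable page fault preemption above a high watermark, disable it below a different low watermark) behaves like hysteresis. The sketch below is one possible software reading, not the claimed hardware: the class name, the microsecond units, the threshold values, and the sliding-window average are all assumptions, since the claims only say that the scheduler tracks "an average latency".

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class FaultPreemptionPolicy:
    """Per-context on/off switch driven by average fault latency (hypothetical)."""
    high_watermark_us: float           # assumed units: microseconds
    low_watermark_us: float            # must differ from the high watermark
    window: int = 8                    # assumed: average over the last N faults
    enabled: bool = False
    _samples: deque = field(default_factory=deque, init=False, repr=False)

    def record_fault_latency(self, latency_us: float) -> bool:
        """Record one resolved page fault and return the updated switch state."""
        self._samples.append(latency_us)
        if len(self._samples) > self.window:
            self._samples.popleft()
        average = sum(self._samples) / len(self._samples)
        if average > self.high_watermark_us:
            self.enabled = True        # faults are slow: allow preemption
        elif average < self.low_watermark_us:
            self.enabled = False       # faults are fast: stop preempting
        # Between the two watermarks the previous state is kept.
        return self.enabled


policy = FaultPreemptionPolicy(high_watermark_us=100, low_watermark_us=40, window=2)
states = [policy.record_fault_latency(x) for x in (50, 200, 60, 60, 10, 90)]
print(states)  # prints [False, True, True, True, False, False]
```

With two distinct thresholds, an average that hovers between them does not flip the setting back and forth: the fourth sample (average 60) leaves preemption enabled, and the last sample (average 50) leaves it disabled.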

Assignees

Intel Corp

Inventors

Classifications

G06F9/4843
by program, e.g. task dispatcher, supervisor, operating system
G06F9/461
Saving or restoring of program or task context
G06F9/30185
according to one or more bits in the instruction, e.g. prefix, sub-opcode
G06F9/3009
Thread control instructions
G06F9/3888
controlled by a single instruction for multiple threads [SIMT] in parallel

Patent family

Related publications grouped by family.

View patent family 61557135

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12067641B2 cover?

One embodiment provides a parallel processor comprising a memory interface and a processing array coupled with the memory interface. The processing array is configured to address memory accessed via the memory interface via a virtual address mapping and includes circuitry to resolve a page fault for the virtual address mapping, wherein each of the multiple compute blocks is separately preemptable.

Who is the assignee on this patent?

Intel Corp

What technology area does this patent fall under?

Primary CPC classification G06T1/20. Mapped technology areas include Physics.

When was this patent published?

Publication date Aug 20, 2024 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 5 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Related patents

Sparse convolutional neural network accelerator

Sparse convolutional neural network accelerator

Sparse convolutional neural network accelerator

Sparse convolutional neural network accelerator

Performing multi-convolution operations in a parallel processing system
