What technology area does this patent fall under?

Primary CPC classification G06F9/3851. Mapped technology areas include Physics.

When was this patent published?

Publication date Oct 6, 2020 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Method and apparatus for simultaneously executing multiple contexts on a graphics engine

US10796472B2 · US · B2

Patent metadata
Field	Value
Publication number	US-10796472-B2
Application number	US-201816024821-A
Country	US
Kind code	B2
Filing date	Jun 30, 2018
Priority date	Jun 30, 2018
Publication date	Oct 6, 2020
Grant date	Oct 6, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Apparatus and method for simultaneous command streamers. For example, one embodiment of an apparatus comprises: a plurality of work element queues to store work elements for a plurality of thread contexts, each work element associated with a context descriptor identifying a context storage region in memory; a plurality of command streamers, each command streamer associated with one of the plurality of work element queues, the command streamers to independently submit instructions for execution as specified by the work elements; a thread dispatcher to evaluate the thread contexts including priority values, to tag each instruction with an execution identifier (ID), and to responsively dispatch each instruction including the execution ID in accordance with the thread context; and a plurality of graphics functional units to independently execute each instruction dispatched by the thread dispatcher and to associate each instruction with a thread context based on its execution ID.
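The data flow described in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented hardware design: the class names, the `priority` field on the descriptor, the "larger value is more urgent" convention, and the choice to allocate one execution ID per thread context are all hypothetical and are not taken from the publication.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class ContextDescriptor:
    context_id: int    # hypothetical handle for one thread context
    storage_base: int  # start address of the context storage region in memory
    priority: int      # assumed convention: larger value = more urgent


@dataclass
class WorkElement:
    descriptor: ContextDescriptor
    instructions: list  # opaque instruction names, e.g. ["draw"]


class CommandStreamer:
    """Owns one work element queue and submits its instructions independently."""

    def __init__(self, name):
        self.name = name
        self.queue = deque()

    def submit(self):
        # Pop one work element and return (descriptor, instruction) pairs.
        if not self.queue:
            return []
        element = self.queue.popleft()
        return [(element.descriptor, ins) for ins in element.instructions]


class ThreadDispatcher:
    """Orders submissions by priority and tags each with an execution ID."""

    def __init__(self):
        self.context_by_exec_id = {}    # execution ID -> context descriptor
        self._exec_id_by_context = {}   # context_id -> execution ID

    def dispatch(self, submissions):
        # Stable sort: higher priority first, submission order kept within ties.
        ordered = sorted(submissions, key=lambda s: -s[0].priority)
        tagged = []
        for descriptor, instruction in ordered:
            # Assumption: one execution ID per context, allocated on first use.
            exec_id = self._exec_id_by_context.setdefault(
                descriptor.context_id, len(self._exec_id_by_context) + 1
            )
            self.context_by_exec_id[exec_id] = descriptor
            tagged.append((exec_id, instruction))
        return tagged


class FunctionalUnit:
    """Recovers the thread context of an instruction from its execution ID."""

    def __init__(self, context_by_exec_id):
        self.context_by_exec_id = context_by_exec_id

    def execute(self, exec_id, instruction):
        descriptor = self.context_by_exec_id[exec_id]
        return (descriptor.context_id, instruction)


def run_demo():
    render, compute = CommandStreamer("render"), CommandStreamer("compute")
    ctx_a = ContextDescriptor(context_id=1, storage_base=0x1000, priority=1)
    ctx_b = ContextDescriptor(context_id=2, storage_base=0x2000, priority=5)
    render.queue.append(WorkElement(ctx_a, ["draw"]))
    compute.queue.append(WorkElement(ctx_b, ["kernel0", "kernel1"]))

    dispatcher = ThreadDispatcher()
    tagged = dispatcher.dispatch(render.submit() + compute.submit())
    unit = FunctionalUnit(dispatcher.context_by_exec_id)
    results = [unit.execute(exec_id, ins) for exec_id, ins in tagged]
    return tagged, results
```

Calling `run_demo()` feeds one work element through each of two streamers. The point of the model is that the functional unit never sees the queue or the streamer, only the execution ID travelling with the instruction, which is enough to look up the owning context.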

First claim

Opening claim text (preview).

What is claimed is:

1. An apparatus comprising: a plurality of work element queues to store work elements for a plurality of thread contexts, each work element associated with a context descriptor identifying a context storage region in memory; a plurality of command streamers, each command streamer associated with one of the plurality of work element queues, the command streamers to independently submit instructions for execution as specified by the work elements, wherein command streamers use different types of address space identifiers for simultaneously executing the plurality of thread contexts; a thread dispatcher to evaluate the thread contexts including priority values, to tag each instruction with an execution identifier (ID), and to responsively dispatch each instruction including the execution ID in accordance with the thread context; and a plurality of graphics functional units to independently execute each instruction dispatched by the thread dispatcher and to associate each instruction with a thread context based on its execution ID, wherein the execution IDs of instructions are propagated downstream from the thread dispatcher to individual graphics function units within the plurality of graphics functional units.

2. The apparatus of claim 1, wherein the plurality of command streamers comprises: a first set of one or more command streamers to process three-dimensional (3D) graphics processing workloads; and a second set of one or more command streamers to process compute workloads.

3. The apparatus of claim 2 wherein the first set includes command streamers which also process compute workloads in addition to the 3D graphics processing workloads.

4. The apparatus of claim 1 wherein each command streamer is associated with a different application having a different thread context.

5. The apparatus of claim 1 wherein each context descriptor comprises a logical render context address (LRCA) comprising a starting address for an associated storage region in memory.

6. The apparatus of claim 5 wherein the associated storage region comprises a hardware status subregion, a ring context subregion, and an engine context subregion.

7. The apparatus of claim 1 wherein the thread dispatcher comprises prioritization circuitry/logic to determine priority values associated with each thread and responsively dispatch instructions in accordance with relative priority values.

8. The apparatus of claim 7 wherein the thread dispatcher dispatches the instructions based on both relative priority values and instruction execution counter values associated with each thread.

9. A method comprising: queuing a plurality of work elements for a plurality of thread contexts in a plurality of work queues, each work element associated with a context descriptor identifying a context storage region in memory; independently reading the work elements from the work queues by a plurality of command streamers, each command streamer having a work queue associated therewith, wherein command streamers use different types of address space identifiers for simultaneously executing the plurality of thread contexts; submitting instructions from the command streamers for execution as specified by the work elements; evaluating the thread contexts including priority values associated with the submitted instructions; dispatching instructions indicated by the work elements from a thread dispatcher to a plurality of graphics functional units in accordance with the evaluation, tagging each instruction with a corresponding execution identifier (ID); and independently executing each instruction, associating the instruction with its thread context based on the execution ID, wherein the execution IDs of the instructions are propagated downstream from the thread dispatcher to individual graphics function units within the plurality of graphics functional units.

10. The method of claim 9 further comprising: processing three-dimensional (3D) graphics processing workloads on a first set of one or more command streamers; and processing compute workloads on a second set of one or more command streamers.

11. The method of claim 10 wherein the first set includes command streamers which also process compute workloads in addition to the 3D graphics processing workloads.

12. The method of claim 9 wherein each command streamer is associated with a different application having a different thread context.

13. The method of claim 9 wherein each context descriptor comprises a logical render context address (LRCA) comprising a starting address for an associated storage region in memory.

14. The method of claim 13 wherein the associated storage region comprises a hardware status subregion, a ring context subregion, and an engine context subregion.

15. The method of claim 9, wherein the method further comprises determining priority values associated with each thread and responsively dispatching instructions in accordance with relative priority values.

16. The method of claim 15, wherein the method further comprises dispatching the instructions based on both relative priority values and instruction execution counter values associated with each thread.

17. A non-transitory machine-readable medium having program code stored thereon which, when executed by a machine, causes the machine to perform the operations of: queuing a plurality of work elements for a plurality of thread contexts in a plurality of work queues, each work element associated with a context descriptor identifying a context storage region in memory; independently reading the work elements from the work queues by a plurality of command streamers, each command streamer having a work queue associated therewith, wherein command streamers use different types of address space identifiers for simultaneously executing the plurality of thread contexts; submitting instructions from the command streamers for execution as specified by the work elements; evaluating the thread contexts including priority values associated with the submitted instructions; dispatching instructions indicated by the work elements from a thread dispatcher to a plurality of graphics functional units in accordance with the evaluation, tagging each instruction with a corresponding execution identifier (ID); and independently executing each instruction, associating the instruction with its thread context based on the execution ID, wherein the execution IDs of the instructions are propagated downstream from the thread dispatcher to individual graphics function units within the plurality of graphics functional units.

18. The non-transitory machine-readable medium of claim 17 further comprising program code to cause the machine to perform the operations of: processing three-dimensional (3D) graphics processing workloads on a first set of one or more command streamers; and processing compute workloads on a second set of one or more command streamers.

19. The non-transitory machine-readable medium of claim 18 wherein the first set includes command streamers which also process compute workloads in addition to the 3D graphics processing workloads.

20. The non-transitory machine-readable medium of claim 17 wherein each command streamer is associated with a different application having a different thread context.

21. The non-transitory machine-readable medium of claim 17 wherein each context descriptor comprises a logical render context address (LRCA) comprising a starting address for an associated storage re
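Claims 7, 8, 15, and 16 say that dispatch depends on both relative priority values and per-thread instruction execution counter values, but the claim text shown here does not state how the two are combined. The sketch below illustrates one possible policy, chosen purely for illustration: pick the ready thread with the smallest ratio of executed instructions to priority, breaking ties in favor of the higher priority. The `ThreadState` type, the ratio rule, and the function names are assumptions, not the method disclosed in the patent.

```python
from collections import deque
from dataclasses import dataclass, field
from fractions import Fraction


@dataclass
class ThreadState:
    name: str
    priority: int                  # assumed positive; larger = more urgent
    pending: deque = field(default_factory=deque)
    executed: int = 0              # instruction execution counter


def pick_next(threads):
    """Return the ready thread with the lowest executed/priority ratio."""
    ready = [t for t in threads if t.pending]
    if not ready:
        return None
    # Hypothetical policy: exact ratio first, then higher priority wins ties.
    return min(ready, key=lambda t: (Fraction(t.executed, t.priority), -t.priority))


def dispatch_all(threads):
    """Drain every thread, recording the order in which instructions go out."""
    order = []
    while (thread := pick_next(threads)) is not None:
        order.append((thread.name, thread.pending.popleft()))
        thread.executed += 1       # counter feeds back into the next decision
    return order


def run_demo():
    hi = ThreadState("hi", priority=2, pending=deque(["h0", "h1", "h2", "h3"]))
    lo = ThreadState("lo", priority=1, pending=deque(["l0", "l1"]))
    return dispatch_all([hi, lo])
```

Under this assumed rule a thread with twice the priority receives roughly twice the dispatch slots, while the counter keeps the lower-priority thread from being starved. A strict-priority policy with the counter used only as a tie-breaker would be an equally valid reading of the claim language.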

Assignees

Intel Corp

Inventors

Classifications

G06F9/3851 (Primary)
from multiple instruction streams, e.g. multistreaming · CPC title
G06F9/3888
controlled by a single instruction for multiple threads [SIMT] in parallel · CPC title
G06F9/4843
by program, e.g. task dispatcher, supervisor, operating system · CPC title
G06F9/4831 (Primary)
with variable priority · CPC title
G06T2210/52
Parallel processing · CPC title

Patent family

Related publications grouped by family.

View patent family 66866950

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10796472B2 cover?: Apparatus and method for simultaneous command streamers. For example, one embodiment of an apparatus comprises: a plurality of work element queues to store work elements for a plurality of thread contexts, each work element associated with a context descriptor identifying a context storage region in memory; a plurality of command streamers, each command streamer associated with one of the plura…
Who is the assignee on this patent?: Intel Corp
What technology area does this patent fall under?: Primary CPC classification G06F9/3851. Mapped technology areas include Physics.
When was this patent published?: Publication date Oct 6, 2020 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Related patents

Fifo queue, memory resource, and task management for graphics processing

Position only shader context submission through a render command streamer

Data driven scheduler on multiple computing cores
