US11210265B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11210265-B2 |
| Application number | US-202016869223-A |
| Country | US |
| Kind code | B2 |
| Filing date | May 7, 2020 |
| Priority date | Apr 1, 2017 |
| Publication date | Dec 28, 2021 |
| Grant date | Dec 28, 2021 |
Official abstract text for this publication.
In an example, an apparatus comprises a plurality of execution units, and a first shared memory communicatively coupled to the plurality of execution units, wherein the first shared memory is shared by the plurality of execution units, and a copy engine to copy context state data from at least a first of the plurality of execution units to the first shared memory. Other embodiments are also disclosed and claimed.
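As a rough illustration of the arrangement the abstract describes, the sketch below models execution units, a memory they share, and a copy engine that snapshots one unit's context state into that memory. All class and field names (`ExecutionUnit`, `SharedMemory`, `CopyEngine`, `slots`, `pc`, `regs`) are hypothetical and chosen only for this example; the patent describes hardware and does not define a software interface.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class ExecutionUnit:
    """Hypothetical stand-in for one execution unit and its live context state."""
    unit_id: int
    context_state: dict = field(default_factory=dict)


@dataclass
class SharedMemory:
    """Hypothetical memory shared by all execution units, keyed by unit id."""
    slots: dict = field(default_factory=dict)


class CopyEngine:
    """Copies context state from an execution unit into the shared memory."""

    def __init__(self, shared: SharedMemory):
        self.shared = shared

    def save_context(self, unit: ExecutionUnit) -> None:
        # Deep-copy so later changes on the unit do not alter the saved state.
        self.shared.slots[unit.unit_id] = copy.deepcopy(unit.context_state)


# Four units with made-up state; the engine saves the state of the first one.
shared = SharedMemory()
units = [ExecutionUnit(i, {"pc": 100 + i, "regs": [i, i + 1]}) for i in range(4)]
engine = CopyEngine(shared)
engine.save_context(units[0])
```

Keying the shared memory by unit id is an assumption made for readability; the abstract only requires that the memory be shared by the execution units and writable by the copy engine.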
Opening claim text (preview).
The invention claimed is:

1. An apparatus, comprising: a general-purpose graphics processing compute block comprising a plurality of processing resources to execute graphics instructions; a first shared memory communicatively coupled to the plurality of processing resources; a first communication fabric to communicatively couple the plurality of processing resources to the first shared memory; and a copy engine to: receive a signal from an instruction scheduler indicating an initiation of a preemption process; stop an execution of an existing context on at least a first of the plurality of processing resources; and copy context state data from the existing context on the at least a first of the plurality of processing resources to the first shared memory in parallel with executing a new context on the plurality of processing resources.
2. The apparatus of claim 1, wherein the copy engine is to: generate a signal to indicate that a context preemption process is complete; and upload the context data from the first shared memory to a second memory, separate from the first shared memory.
3. The apparatus of claim 2, further comprising: a second communication fabric to communicatively couple the copy engine to the second memory.
4. The apparatus of claim 1, wherein the copy engine is coupled to the first shared memory by the first communication fabric.
5. The apparatus of claim 1, the copy engine to: copy context state data from the existing context on at least a first of the plurality of processing resources to the first shared memory in parallel with executing a new context on the plurality of processing resources.
6. The apparatus of claim 5, the copy engine to: restore the first shared memory.
7. A method comprising: receiving, in a copy engine of general-purpose graphics processing compute block comprising a plurality of processing resources to execute graphics instructions, a signal from an instruction scheduler indicating an initiation of a preemption process; stopping an execution of an existing context on at least a first of the plurality of processing resources; and copying context state data from the existing context on the at least a first of the plurality of processing resources to a first shared memory in parallel with executing a new context on the plurality of processing resources via a first communication fabric communicatively coupled to the plurality of processing resources to the first shared memory.
8. The method of claim 7, further comprising: generating a signal to indicate that a context preemption process is complete; and uploading the context data from the first shared memory to a second memory, separate from the first shared memory.
9. The method of claim 7, wherein the general-purpose graphics processing compute block further comprises: a first communication fabric to communicatively couple the plurality of processing resources to the first shared memory.
10. The method of claim 9, wherein the copy engine is coupled to the first shared memory by the first communication fabric.
11. The method of claim 7, wherein the general-purpose graphics processing compute block further comprises: a second communication fabric to communicatively couple the copy engine to a second memory.
12. The method of claim 7, further comprising: copying context state data from the existing context on at least a first of the plurality of processing resources to the first shared memory in parallel with executing a new context on the plurality of processing resources.
13. The method of claim 12, further comprising: restoring the first shared memory.
14. One or more non-transitory computer-readable media comprising a plurality of instructions stored thereon that, when executed by a processor, causes the processor to: receive, in a copy engine of general-purpose graphics processing compute block comprising a plurality of processing resources to execute graphics instructions, a signal from an instruction scheduler indicating an initiation of a preemption process; stop an execution of an existing context on at least a first of the plurality of processing resources; and copy context state data from the existing context on the at least a first of the plurality of processing resources to a first shared memory in parallel with executing a new context on the plurality of processing resources via a first communication fabric communicatively coupled to the plurality of processing resources to the first shared memory.
15. The one or more computer-readable media of claim 14, wherein the plurality of instructions further cause the processor to: generate a signal to indicate that a context preemption process is complete; and upload the context data from the first shared memory to a second memory, separate from the first shared memory.
16. The one or more computer-readable media of claim 14, wherein the general-purpose graphics processing compute block further comprises: a first communication fabric to communicatively couple the plurality of processing resources to the first shared memory.
17. The one or more computer-readable media of claim 16, wherein the copy engine is coupled to the first shared memory by the first communication fabric.
18. The one or more computer-readable media of claim 14, wherein the general-purpose graphics processing compute block further comprises: a second communication fabric to communicatively couple the copy engine to the second memory.
19. The one or more computer-readable media of claim 14, wherein the plurality of instructions further cause the processor to: copy context state data from the existing context on at least a first of the plurality of processing resources to the first shared memory in parallel with executing a new context on the plurality of processing resources.
20. The one or more computer-readable media of claim 19, wherein the plurality of instructions further cause the processor to: restore the first shared memory.
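The preemption sequence recited in claims 1 and 2 (receive a signal, stop the existing context, copy its state to the first shared memory while a new context executes, then signal completion and upload to a second memory) can be sketched in software as follows. This is only an illustrative model under stated assumptions: the names `ProcessingResource`, `CopyEngine`, `on_preempt_signal`, `ctx_A`, and `ctx_B` are hypothetical, the two memories are plain dictionaries, and a Python thread merely models the overlap between the copy and the new context; it is not the claimed hardware.

```python
import copy
import threading


class ProcessingResource:
    """Hypothetical processing resource bound to one context at a time."""

    def __init__(self, rid, context, state):
        self.rid = rid
        self.context = context
        self.state = state
        self.running = True

    def stop(self):
        """Halt the current context and hand back its (name, state) pair."""
        self.running = False
        old = (self.context, self.state)
        self.context, self.state = None, {}
        return old

    def run(self, context):
        """Start executing a new context with fresh state."""
        self.context, self.state, self.running = context, {"pc": 0}, True


class CopyEngine:
    """Hypothetical copy engine with two memories modeled as dictionaries."""

    def __init__(self):
        self.shared_memory = {}   # first shared memory: (context, rid) -> state
        self.second_memory = {}   # separate second memory
        self.preemption_done = threading.Event()

    def _save(self, stopped):
        # Copy the stopped context's state into the first shared memory.
        for rid, (context, state) in stopped.items():
            self.shared_memory[(context, rid)] = copy.deepcopy(state)
        # Signal that preemption is complete, then upload to the second memory.
        self.preemption_done.set()
        self.second_memory.update(copy.deepcopy(self.shared_memory))

    def on_preempt_signal(self, resources, new_context):
        """Handle the scheduler's preemption signal; returns the copy thread."""
        self.preemption_done.clear()
        stopped = {r.rid: r.stop() for r in resources}
        saver = threading.Thread(target=self._save, args=(stopped,))
        saver.start()              # the copy proceeds in the background...
        for r in resources:
            r.run(new_context)     # ...while the new context starts executing
        return saver


resources = [ProcessingResource(i, "ctx_A", {"pc": 40 + i}) for i in range(2)]
engine = CopyEngine()
saver = engine.on_preempt_signal(resources, "ctx_B")
saver.join()  # wait until the save and the upload have both finished
```

Detaching the old state in `stop()` before the new context starts is a design choice of this sketch: it lets the copy and the new context proceed without touching the same data. The claims do not specify how the hardware achieves that separation.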
Saving or restoring of program or task context · CPC title
Concurrent instruction execution, e.g. pipeline or look ahead · CPC title
associated with a data cache · CPC title
Using snapshots, i.e. a logical point-in-time copy of the data · CPC title
using a bus scheme, e.g. with bus monitoring or watching means · CPC title