Sparse convolutional neural network accelerator
US-10891538-B2 · Jan 12, 2021 · US
US11650928B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11650928-B2 |
| Application number | US-202217715734-A |
| Country | US |
| Kind code | B2 |
| Filing date | Apr 7, 2022 |
| Priority date | Apr 21, 2017 |
| Publication date | May 16, 2023 |
| Grant date | May 16, 2023 |
Official abstract text for this publication.
A mechanism is described for facilitating optimization of cache associated with graphics processors at computing devices. A method of embodiments, as described herein, includes introducing coloring bits to contents of a cache associated with a processor including a graphics processor, wherein the coloring bits to represent a signal identifying one or more caches available for use, while avoiding explicit invalidations and flushes.
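As an illustration of the idea in the abstract, the following is a minimal software sketch, not the patented hardware design. It assumes a single cache-wide "current color" and a 2-bit color field stored with each line; the names `ColoredCache`, `Line`, `fill`, `lookup`, and `invalidate_all` are hypothetical and do not come from the patent. Incrementing the current color makes every line that still carries an older color read as a miss, so nothing is flushed or walked explicitly.

```python
from dataclasses import dataclass
from typing import Dict, Optional

COLOR_BITS = 2  # assumed width of the color field; the patent does not fix one here


@dataclass
class Line:
    color: int   # color the line was filled under (conceptually part of the tag)
    data: bytes


class ColoredCache:
    """Toy model: a line is usable only if its color matches the current color."""

    def __init__(self, color_bits: int = COLOR_BITS) -> None:
        self.num_colors = 1 << color_bits
        self.current_color = 0
        self.lines: Dict[int, Line] = {}

    def fill(self, address: int, data: bytes) -> None:
        # New contents are stamped with the current color.
        self.lines[address] = Line(self.current_color, data)

    def lookup(self, address: int) -> Optional[bytes]:
        line = self.lines.get(address)
        # An older color signals invalid data or a miss.
        if line is None or line.color != self.current_color:
            return None
        return line.data

    def invalidate_all(self) -> None:
        # Bump the color instead of clearing or flushing each line.
        self.current_color = (self.current_color + 1) % self.num_colors
```

In this sketch the stale entry stays physically present after `invalidate_all()`, but `lookup` treats it as a miss until it is refilled. The sketch deliberately ignores what happens when the color counter wraps around and an old color value is reused.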
Opening claim text (preview).
What is claimed is:
1. An apparatus comprising: one or more processors including a graphics processor; and one or more caches associated with the graphics processor; wherein the one or more processors are to: define cache coloring bits to color contents of a plurality of cache lines of the one or more caches, the cache coloring bits associated with cache lines of a cache to provide a signal whether the cache lines are available for use; determine that data in one or more cache lines of a first cache is not valid; and in response to the determination, increment one or more cache coloring bits associated with the one or more cache lines of the first cache, wherein older cache coloring bits provide a signal that there is invalid data or a miss in the one or more cache lines of the first cache.
2. The apparatus of claim 1, wherein the one or more cache coloring bits associated with the one or more cache lines of the first cache are attached as a part of a tag of the one or more cache lines.
3. The apparatus of claim 1, wherein the one or more processors are further to facilitate replacement or reallocation of cache locations in the one or more cache lines of the first cache based at least in part on the one or more cache coloring bits.
4. The apparatus of claim 1, wherein the first cache is a read-only cache, and wherein the older cache coloring bits are deemed to indicate invalid data for the one or more cache lines of the first cache.
5. The apparatus of claim 4, wherein the one or more processors are to utilize a first set of one or more counters to track a presence of allocations of cache coloring bits.
6. The apparatus of claim 5, wherein the one or more processors are further to ensure that entries associated with a set of cache coloring bits are reallocated before the set of cache coloring bits are reallocated based at least in part on the first set of one or more counters.
7. The apparatus of claim 1, wherein the first cache is a read-write cache, and wherein the older cache coloring bits are deemed to be misses and to be victimized first for reallocation for the one or more cache lines of the first cache.
8. The apparatus of claim 7, wherein the one or more processors are to utilize a second set of one or more counters to track a number of entries of the one or more cache lines of the first cache, and wherein the one or more processors are further to allocate priority for contents waiting to be written to the one or more cache lines of the first cache based at least in part on the second set of one or more counters.
9. The apparatus of claim 1, wherein the graphics processor is co-located with an application processor on a common semiconductor package.
10. A method comprising: defining cache coloring bits to color contents of a plurality of cache lines of one or more caches associated with a graphics processor, the cache coloring bits associated with cache lines of a cache to provide a signal whether the cache lines are available for use; determining that data in one or more cache lines of a first cache of the one or more caches is not valid; and in response to the determination, incrementing one or more cache coloring bits associated with the one or more cache lines of the first cache, wherein older cache coloring bits provide a signal that there is invalid data or a miss in the one or more cache lines of the first cache.
11. The method of claim 10, wherein the one or more cache coloring bits associated with the one or more cache lines of the first cache are attached as a part of a tag of the one or more cache lines.
12. The method of claim 10, further comprising replacing or reallocating cache locations in the one or more cache lines of the first cache based at least in part on the one or more cache coloring bits associated with the one or more cache lines.
13. The method of claim 10, wherein the first cache is a read-only cache, and wherein the older cache coloring bits are deemed to indicate invalid data for the one or more cache lines of the first cache.
14. The method of claim 13, further comprising: tracking a presence of allocations of cache coloring bits utilizing a first set of one or more counters; and ensuring that entries associated with a set of cache coloring bits are reallocated before the set of cache coloring bits are reallocated based at least in part on the first set of one or more counters.
15. The method of claim 10, wherein the first cache is a read-write cache, and the older cache coloring bits are deemed to be misses and are to be victimized first for reallocation for the one or more cache lines of the first cache.
16. The method of claim 15, further comprising: tracking a number of entries of the one or more cache lines of the first cache utilizing a second set of one or more counters; and allocating priority for contents waiting to be written to the one or more cache lines of the first cache based at least in part on the second set of one or more counters.
17. At least one non-transitory machine-readable medium comprising instructions that when executed by a computing device, cause the computing device to perform operations comprising: defining cache coloring bits to color contents of a plurality of cache lines of one or more caches associated with a graphics processor, the cache coloring bits associated with cache lines of a cache to provide a signal whether the cache lines are available for use; determining that data in one or more cache lines of a first cache of the one or more caches is not valid; in response to the determination, incrementing one or more cache coloring bits associated with the one or more cache lines of the first cache, wherein older cache coloring bits provide a signal that there is invalid data or a miss in the one or more cache lines of the first cache; and replacing or reallocating cache locations in the one or more cache lines of the first cache based at least in part on the cache coloring bits associated with the one or more cache lines.
18. The machine-readable medium of claim 17, wherein the one or more cache coloring bits associated with the one or more cache lines of the first cache are attached as a part of a tag of the one or more cache lines.
19. The machine-readable medium of claim 17, wherein the first cache is a read-only cache, and wherein the older cache coloring bits are deemed to indicate invalid data for the one or more cache lines of the first cache.
20. The machine-readable medium of claim 17, wherein the first cache is a read-write cache, and the older cache coloring bits are deemed to be misses and are to be victimized first for reallocation for the one or more cache lines of the first cache.
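Claims 5, 6, and 14 describe counters that track which color values still have allocated entries, so that those entries are reallocated before the color value itself is reused. The sketch below is one possible software reading of that requirement for a read-only cache, under stated assumptions: a single cache-wide current color, one counter per color value, and a 1-bit color field chosen only so that wraparound happens quickly. The class and method names (`CountedColorCache`, `live_per_color`, and so on) are hypothetical, and the patent's hardware mechanism may differ.

```python
from typing import Dict, List, Optional, Tuple


class CountedColorCache:
    """Toy read-only cache with one live-entry counter per color value."""

    def __init__(self, color_bits: int = 1) -> None:
        self.num_colors = 1 << color_bits
        self.current = 0
        self.lines: Dict[int, Tuple[int, bytes]] = {}  # address -> (color, data)
        self.live_per_color: List[int] = [0] * self.num_colors

    def _drop(self, address: int) -> None:
        color, _ = self.lines.pop(address)
        self.live_per_color[color] -= 1

    def fill(self, address: int, data: bytes) -> None:
        if address in self.lines:
            self._drop(address)  # reallocating a location releases its old color
        self.lines[address] = (self.current, data)
        self.live_per_color[self.current] += 1

    def lookup(self, address: int) -> Optional[bytes]:
        entry = self.lines.get(address)
        if entry is None or entry[0] != self.current:
            return None  # older color: treated as invalid data
        return entry[1]

    def invalidate_all(self) -> None:
        next_color = (self.current + 1) % self.num_colors
        # Consult the counter before reusing a color: any entries still
        # stamped with it must be reclaimed first, or they would look valid.
        if self.live_per_color[next_color] > 0:
            stale = [a for a, (c, _) in self.lines.items() if c == next_color]
            for address in stale:
                self._drop(address)
        self.current = next_color
```

With a 1-bit color, two invalidations bring the current color back to its starting value. Without the counter check, an entry filled before the first invalidation would wrongly hit again at that point; with the check, the counter for the reused color is nonzero, so the leftover entry is reclaimed before the color becomes current.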
Space efficiency improvement · CPC title
with special data handling, e.g. priority of data or instructions, handling errors or pinning · CPC title
Memory management · CPC title
of parts of caches, e.g. directory or tag array · CPC title
Scalability · CPC title