Extended store forwarding for store misses without cache allocate

US10223266B2 · US · B2

Patent metadata
Publication number: US-10223266-B2
Application number: US-201615364411-A
Country: US
Kind code: B2
Filing date: Nov 30, 2016
Priority date: Nov 30, 2016
Publication date: Mar 5, 2019
Grant date: Mar 5, 2019

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A load store unit (LSU) in a processor core detects that new data produced by the processor core is ready to be drained to an L2 cache. In response to the LSU detecting that an earlier version of the new data is not stored in L1 cache, a memory controller sends the new data as L1 cache missed data to a store queue (STQ), where the STQ makes data available for deallocation from the STQ to the L2 cache. In response to determining that there is no newer data waiting to be stored in the STQ, or no cache line invalidate to the line containing the store data in the STQ that misses the cache, the memory controller maintains the new data in the STQ with a zombie stat bit that indicates that the new data is a zombie store entry that can be utilized by the processor core.
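The flow in the abstract can be sketched as a small software model. This is only an illustration under stated assumptions, not the patented hardware: the names `ToyCore`, `StqEntry`, `drain`, and `load` are hypothetical, the caches are plain dictionaries keyed by line address, the L1 is assumed to be store-through, and the two conditions in the abstract (no newer store waiting, no invalidate to the line) are read here as both needing to hold before an entry is kept as a zombie.

```python
from dataclasses import dataclass


@dataclass
class StqEntry:
    addr: int             # cache-line address of the store (hypothetical field)
    data: bytes           # data the core produced
    zombie: bool = False  # models the "zombie stat bit" named in the abstract


class ToyCore:
    """Hypothetical model of one core's L1, L2, and store queue (STQ)."""

    def __init__(self):
        self.l1 = {}   # line address -> data
        self.l2 = {}   # line address -> data
        self.stq = []  # list of StqEntry

    def drain(self, addr, data, newer_store_pending=False, line_invalidated=False):
        """Drain one store toward L2 and report which path it took."""
        if addr in self.l1:
            # An earlier version is in L1: write the new data to L1.
            self.l1[addr] = data
            self.l2[addr] = data  # assumption: store-through L1
            return "l1_hit"
        # L1 miss: the data goes to the STQ and drains to L2.
        # No L1 line is allocated for it.
        entry = StqEntry(addr, data)
        self.stq.append(entry)
        self.l2[addr] = data
        if newer_store_pending or line_invalidated:
            self.stq.remove(entry)  # ordinary deallocation
            return "deallocated"
        entry.zombie = True         # keep the entry so later loads can use it
        return "zombie"

    def load(self, addr):
        """Return (data, source); zombie entries are checked before the caches."""
        for entry in self.stq:
            if entry.zombie and entry.addr == addr:
                return entry.data, "stq_forward"
        if addr in self.l1:
            return self.l1[addr], "l1"
        return self.l2.get(addr), "l2"
```

In this model a load to a line that missed L1 is served from the retained STQ entry instead of going out to L2, which is the benefit the title describes as extended store forwarding.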

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: detecting, by a load store unit (LSU) in a processor core, that new data produced by the processor core is ready to be drained to an L2 cache in the processor core; checking, by the LSU, whether an earlier version of the new data is stored in an L1 cache in the processor core; in response to the LSU detecting that the earlier version of the new data is not stored in the L1 cache, sending, by a memory controller, the new data as L1 cache missed data to a store queue (STQ), wherein the STQ makes data available for deallocation from the STQ to the L2 cache; determining, by the memory controller, whether a newer data produced by the processor core is waiting to be stored in the STQ; and in response to the memory controller determining that there is no newer data waiting to be stored in the STQ, maintaining, by the memory controller, the new data in the STQ with a zombie stat bit, wherein the zombie stat bit indicates that the new data is a zombie store entry that can be utilized by the processor core.

2. The method of claim 1, further comprising: writing, by the memory controller, the new data to system memory; and maintaining, by the memory controller, the new data in the STQ past completion of writing the new data to system memory.

3. The method of claim 1, further comprising: in response to the LSU detecting that the earlier version of the new data is stored in the L1 cache, writing the new data to the L1 cache.

4. The method of claim 1, wherein the zombie store entry does not have a valid tag suitable for comparison to an internal core request by the processor core.

5. The method of claim 4, wherein the internal core request is for a pipeline flush of registers in the processor core.

6. The method of claim 4, wherein the internal core request is for a completion of an execution performed by one or more execution units within the processor core.

7. The method of claim 1, further comprising: determining, by the memory controller, that the STQ has no current vacancies; determining, by the memory controller, that a younger store is attempting to create a new STQ entry in the STQ; and in response to the memory controller determining that the STQ has no current vacancies and that the younger store is attempting to create the new STQ entry in the STQ, deeming the zombie store entry invalid and replacing, by the memory controller, the zombie store entry with the new STQ entry in the STQ.

8. The method of claim 1, further comprising: invalidating, by the memory controller, the zombie store entry in response to a new zombie store entry being stored in the STQ, wherein the new zombie store entry has a same memory address as the zombie store entry that was previously stored in the STQ, and wherein invalidating the zombie store entry frees space in the STQ for another cache line.

9. The method of claim 1, further comprising: invalidating, by the memory controller, the zombie store entry in response to data being loaded into an execution unit within the processor core partially overlapping data in the zombie store entry, wherein invalidating the zombie store entry frees space in the STQ for another cache line; and in response to the data being loaded into the execution unit within the processor core partially overlapping data in the zombie store entry, sending, by the memory controller, an L1 cache miss request to the L2 cache.

10. The method of claim 1, further comprising: invalidating, by the memory controller, the zombie store entry in response to a reload of the new data into the L1 cache, and wherein invalidating the zombie store entry frees space in the STQ for another cache line.

11. The method of claim 1, further comprising: invalidating, by the memory controller, the zombie store entry in response to the L2 cache requesting data from the L1 cache, wherein invalidating the zombie store entry frees space in the STQ for another cache line.

12. A computer program product comprising one or more non-transitory computer readable storage mediums, and program instructions loaded on at least one of the one or more non-transitory computer readable storage mediums, the loaded program instructions comprising: program instructions to detect, by a load store unit (LSU) in a processor core, that new data produced by the processor core is ready to be drained to an L2 cache in the processor core; program instructions to check, by the LSU, whether an earlier version of the new data is stored in an L1 cache in the processor core; program instructions to, in response to the LSU detecting that the earlier version of the new data is not stored in the L1 cache, send, by a memory controller, the new data as L1 cache missed data to a store queue (STQ), wherein the STQ makes data available for deallocation from the STQ to the L2 cache; program instructions to determine, by the memory controller, whether a newer data produced by the processor core is waiting to be stored in the STQ; and program instructions to, in response to the memory controller determining that there is no newer data waiting to be stored in the STQ, maintain, by the memory controller, the new data in the STQ with a zombie stat bit, wherein the zombie stat bit indicates that the new data is a zombie store entry that can be utilized by the processor core.

13. The computer program product of claim 12, wherein the method further comprises: program instructions to write, by the memory controller, the new data to system memory; and program instructions to maintain, by the memory controller, the new data in the STQ past completion of writing the new data to system memory.

14. The computer program product of claim 12, wherein the method further comprises: program instructions to, in response to the LSU detecting that the earlier version of the new data is stored in the L1 cache, write the new data to the L1 cache.

15. The computer program product of claim 12, wherein the zombie store entry does not have a valid tag suitable for comparison to an internal core request by the processor core.

16. The computer program product of claim 12, wherein the method further comprises: program instructions to determine, by the memory controller, that the STQ has no current vacancies; program instructions to determine, by the memory controller, that a younger store is attempting to create a new STQ entry in the STQ; and program instructions to, in response to the memory controller determining that the STQ has no current vacancies and that the younger store is attempting to create the new STQ entry in the STQ, deem the zombie store entry invalid and to replace, by the memory controller, the zombie store entry with the new STQ entry in the STQ.

17. The computer program product of claim 12, wherein the method further comprises: program instructions to invalidate, by the memory controller, the zombie store entry in response to a new zombie store entry being stored in the STQ, wherein the new zombie store entry has a same memory address as the zombie store entry that was previously stored in the STQ, and wherein invalidating the zombie store entry frees space in the STQ for another cache line.

18. The computer program product of claim 12, wherein the method further comprises: program instructions to invalidate, by the memory controller, the zombie store entry in response to data being loaded into an execution unit within the processor core partially overlapping data in the zombie store entry, wherein invalidating the zombie store entry frees space in
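
The invalidation conditions in claims 7 through 11 can be illustrated with a rough sketch. Everything here is an assumption made for illustration: the class and method names (`Stq`, `Entry`, `allocate`, `make_zombie`, `load`, `l1_reload`, `l2_request`) are hypothetical, byte ranges within a line are modeled as half-open intervals, and forwarding on a fully covered load is inferred from the patent title rather than stated in the claim preview.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    addr: int            # cache-line address
    start: int           # first byte written within the line
    end: int             # one past the last byte written
    zombie: bool = False


class Stq:
    """Hypothetical fixed-capacity store queue holding live and zombie entries."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = []

    def _drop_zombies(self, addr):
        # Invalidate every zombie entry for this line, freeing its slot.
        self.entries = [e for e in self.entries
                        if not (e.zombie and e.addr == addr)]

    def allocate(self, entry):
        """Claim 7: with no vacancies, a younger store replaces a zombie."""
        if len(self.entries) >= self.capacity:
            victim = next((e for e in self.entries if e.zombie), None)
            if victim is None:
                return False  # nothing to evict; the store has to wait
            self.entries.remove(victim)
        self.entries.append(entry)
        return True

    def make_zombie(self, entry):
        """Claim 8: a new zombie for the same address invalidates the old one."""
        self._drop_zombies(entry.addr)
        entry.zombie = True

    def load(self, addr, start, end):
        """Claim 9: a partially overlapping load invalidates the zombie."""
        for e in self.entries:
            if e.zombie and e.addr == addr and start < e.end and e.start < end:
                if e.start <= start and end <= e.end:
                    return "forward"      # assumption: full cover forwards
                self.entries.remove(e)
                return "l2_miss_request"  # L1 miss request goes to L2
        return "no_match"

    def l1_reload(self, addr):
        """Claim 10: a reload of the line into L1 invalidates the zombie."""
        self._drop_zombies(addr)

    def l2_request(self, addr):
        """Claim 11: an L2 request for the line invalidates the zombie."""
        self._drop_zombies(addr)
```

Each invalidation path simply removes the entry, which matches the claims' wording that invalidating a zombie store entry frees space in the STQ for another cache line.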

Assignees

IBM

Inventors

Classifications

  • with dedicated cache, e.g. instruction or stack

  • Coherency control relating to peripheral accessing, e.g. from DMA or I/O device

  • with multilevel cache hierarchies

  • Instruction code

  • Details of cache memory

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10223266B2 cover?
A load store unit (LSU) in a processor core detects that new data produced by the processor core is ready to be drained to an L2 cache. In response to the LSU detecting that an earlier version of the new data is not stored in L1 cache, a memory controller sends the new data as L1 cache missed data to a store queue (STQ), where the STQ makes data available for deallocation from the STQ to the L2…
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06F12/0815. Mapped technology areas include Physics.
When was this patent published?
Publication date Mar 5, 2019 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).