Multi-core communication acceleration using hardware queue device

US10445271B2 · US · B2

Patent metadata
Field: Value
Publication number: US-10445271-B2
Application number: US-201614987676-A
Country: US
Kind code: B2
Filing date: Jan 4, 2016
Priority date: Jan 4, 2016
Publication date: Oct 15, 2019
Grant date: Oct 15, 2019

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Apparatus and methods implementing a hardware queue management device for reducing inter-core data transfer overhead by offloading request management and data coherency tasks from the CPU cores. The apparatus include multi-core processors, a shared L3 or last-level cache (“LLC”), and a hardware queue management device to receive, store, and process inter-core data transfer requests. The hardware queue management device further comprises a resource management system to control the rate in which the cores may submit requests to reduce core stalls and dropped requests. Additionally, software instructions are introduced to optimize communication between the cores and the queue management device.
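The rate control mentioned in the abstract is described in claims 7 to 10 in terms of credits held in a global pool and per-requestor local pools. The following is a minimal software sketch of that idea, not the patented circuit. The class and method names (`CreditManager`, `replenish`, `try_submit`, `complete`) are hypothetical, a single credit type stands in for the separate enqueue and dequeue credits, and returning a credit to the global pool when a request completes is an assumption made for illustration.

```python
from typing import List


class CreditManager:
    """Toy credit-based admission control (hypothetical names, illustration only)."""

    def __init__(self, num_requestors: int, global_credits: int) -> None:
        self.global_pool: int = global_credits
        self.local_pools: List[int] = [0] * num_requestors
        self._next: int = 0  # next local pool to receive a credit

    def replenish(self) -> None:
        # Round Robin replenishment: move credits one at a time from the
        # global pool to the local pools, continuing where the last call stopped.
        while self.global_pool > 0:
            self.local_pools[self._next] += 1
            self.global_pool -= 1
            self._next = (self._next + 1) % len(self.local_pools)

    def try_submit(self, requestor: int) -> bool:
        # A requestor may submit only while it holds a local credit, so it can
        # back off in software instead of stalling or having a request dropped.
        if self.local_pools[requestor] == 0:
            return False
        self.local_pools[requestor] -= 1
        return True

    def complete(self, count: int = 1) -> None:
        # Assumption: a finished request hands its credit back to the global pool.
        self.global_pool += count


mgr = CreditManager(num_requestors=2, global_credits=3)
mgr.replenish()
print(mgr.local_pools)    # credits now held by each requestor
print(mgr.try_submit(1))  # succeeds while requestor 1 has a credit
print(mgr.try_submit(1))  # refused once its local pool is empty
```

A Weighted Round Robin or priority replenishment policy, as in claims 12 and 13, would change only the order in which `replenish` visits the local pools.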

First claim

Opening claim text (preview).

What is claimed is:

1. An apparatus comprising: one or more memory buffers to receive and store a plurality of incoming requests submitted by one or more processor cores (requestors), the plurality of incoming requests comprising enqueue requests and dequeue requests; a scheduler circuitry to select, in accordance to a scheduling policy, a request from the plurality of incoming requests stored in one of the one or more memory buffers; an enqueue circuitry to process an enqueue request selected by the scheduler circuitry and responsively insert data associated with the enqueue request into an internal storage unit; and a dequeue circuitry to process a dequeue request selected by the scheduler circuitry and responsively retrieve data associated with the dequeue request from the internal storage unit and sending the retrieved data to the one or more requestors; wherein each of the one or more memory buffers corresponds to a respective one of the one or more requestors and stores only the incoming requests from the corresponding requestor.

2. The apparatus of claim 1, wherein the one or more memory buffers are first in, first out (FIFO) buffers.

3. The apparatus of claim 1, wherein the scheduling policy is a Round Robin policy.

4. The apparatus of claim 1, wherein the scheduling policy is a Weighted Round Robin policy.

5. The apparatus of claim 1, wherein the scheduling policy is a preemptive priority policy.

6. The apparatus of claim 1, wherein the internal storage unit is configurable to support data of varying lengths and sizes.

7. The apparatus of claim 1, further comprising: a resource management circuitry to set, according to a resource policy, one or more limits for each of the one or more requestors on number of incoming requests that may be submitted by each of the one or more requestors.

8. The apparatus of claim 7, wherein the resource policy includes a global resource pools and a plurality of local resource pools, the global resource pools to provide resource credits to be distributed amongst the plurality of local resource pools based on a credit replenishment policy.

9. The apparatus of claim 8, wherein each of the plurality of local resource pools to correspond to one of the one or more requestors, and the resource credit in a given local resource pool determines the number of requests that may be submitted by the local resource pool's corresponding requestor.

10. The apparatus of claim 9, wherein the resource credits include enqueue credits to allow the one or more requestors to submit enqueue requests and dequeue credits to allow the one or more requestors to submit dequeue requests.

11. The apparatus of claim 8, wherein the credit replenishment policy is Round Robin policy.

12. The apparatus of claim 8, wherein the credit replenishment policy is Weighted Round Robin policy.

13. The apparatus of claim 8, wherein the credit replenishment policy is preemptive priority policy.

14. The apparatus of claim 1, wherein each block of data to be inserted into the internal storage unit is combined with a metadata tag to indicate how the data should be handled by the enqueue circuitry and the dequeue circuitry.

15. The apparatus of claim 14, wherein the metadata tag includes an atomic parameter to indicate whether the retrieved data should only be sent to one requestor at a time.

16. The apparatus of claim 14, wherein the metadata tag includes a load-balancing parameter to indicate whether the retrieved data should be load-balanced across a plurality of output queues.

17. The apparatus of claim 14, wherein the metadata tag includes a reordering parameter to indicate whether the retrieved data should be reordered based on a sequence number, before the retrieved data is sent to the one or more requestors.

18. The apparatus of claim 14, wherein the metadata tag includes a fragmentation parameter to indicate whether the retrieved data can be divided into smaller blocks for traffic shaping, before the retrieved data is sent to the one or more requestors.

19. The apparatus of claim 1, wherein the retrieved data is combined with a demotion instruction, such that an execution of the demotion instruction by the one or more requestors to cause data to be moved from the internal storage unit to a cache communicatively coupled to and shared by the one or more requestors.
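To make the structure of claim 1 easier to follow, here is a rough software model of its elements: one FIFO buffer per requestor (claims 1 and 2), a Round Robin scheduler (claim 3), and enqueue and dequeue steps that operate on an internal storage unit. It is a sketch under stated assumptions, not the hardware itself. All names (`Request`, `QueueDeviceModel`, `submit`, `step`) are hypothetical, the internal storage is modeled as a single queue, and a dequeue that finds the storage empty is simply discarded, which the claims do not specify.

```python
from collections import deque
from dataclasses import dataclass
from typing import Any, Deque, Dict, List, Optional


@dataclass
class Request:
    """One incoming request; kind is "enq" or "deq" (illustrative encoding)."""
    requestor: int
    kind: str
    data: Any = None


class QueueDeviceModel:
    """Toy model of the claim 1 apparatus (hypothetical names)."""

    def __init__(self, num_requestors: int) -> None:
        # One FIFO buffer per requestor; each holds only that requestor's requests.
        self.buffers: List[Deque[Request]] = [deque() for _ in range(num_requestors)]
        self.storage: Deque[Any] = deque()  # stands in for the internal storage unit
        self.delivered: Dict[int, List[Any]] = {i: [] for i in range(num_requestors)}
        self._next = 0  # Round Robin pointer

    def submit(self, req: Request) -> None:
        self.buffers[req.requestor].append(req)

    def _select(self) -> Optional[Request]:
        # Scheduler: visit the buffers in Round Robin order, starting after the
        # buffer served last, and take the oldest request from the first
        # non-empty one.
        count = len(self.buffers)
        for offset in range(count):
            idx = (self._next + offset) % count
            if self.buffers[idx]:
                self._next = (idx + 1) % count
                return self.buffers[idx].popleft()
        return None

    def step(self) -> bool:
        """Process one scheduled request; return False once all buffers are empty."""
        req = self._select()
        if req is None:
            return False
        if req.kind == "enq":
            self.storage.append(req.data)  # enqueue step
        elif self.storage:
            # Dequeue step: retrieve data and send it back to the requestor.
            self.delivered[req.requestor].append(self.storage.popleft())
        # Assumption: a dequeue against empty storage is dropped in this sketch.
        return True


dev = QueueDeviceModel(num_requestors=2)
dev.submit(Request(0, "enq", "pkt-A"))  # requestor 0 acts as the producer
dev.submit(Request(0, "enq", "pkt-B"))
dev.submit(Request(1, "deq"))           # requestor 1 acts as the consumer
dev.submit(Request(1, "deq"))
while dev.step():
    pass
print(dev.delivered[1])
```

Because the scheduler alternates between the two buffers, the consumer receives the producer's items in submission order without either side touching a shared software queue.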

Assignees

Intel Corp

Inventors

Classifications

  • Using a specific cache allocation policy other than replacement policy · CPC title

  • Queue · CPC title

  • Data transfer between cache memory and other subsystems, e.g. storage devices or host systems · CPC title

  • Addressing variable-length words or parts of words · CPC title

  • Speculative instruction execution · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10445271B2 cover?
Apparatus and methods implementing a hardware queue management device for reducing inter-core data transfer overhead by offloading request management and data coherency tasks from the CPU cores. The apparatus include multi-core processors, a shared L3 or last-level cache (“LLC”), and a hardware queue management device to receive, store, and process inter-core data transfer requests. The hardwar…
Who is the assignee on this patent?
Intel Corp
What technology area does this patent fall under?
Primary CPC classification G06F13/37. Mapped technology areas include Physics.
When was this patent published?
Publication date Oct 15, 2019 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).