What technology area does this patent fall under?

Primary CPC classification G06F12/0875. Mapped technology areas include Physics.

When was this patent published?

Publication date Aug 15, 2017 (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Sending packets using optimized PIO write sequences without sfences

US9734077B2 · US · B2

Patent metadata
Field	Value
Publication number	US-9734077-B2
Application number	US-201615277527-A
Country	US
Kind code	B2
Filing date	Sep 27, 2016
Priority date	Jun 26, 2014
Publication date	Aug 15, 2017
Grant date	Aug 15, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Method and apparatus for sending packets using optimized PIO write sequences without sfences. Sequences of Programmed Input/Output (PIO) write instructions to write packet data to a PIO send memory are received at a processor supporting out of order execution. The PIO write instructions are received in an original order and executed out of order, with each PIO write instruction writing a store unit of data to a store buffer or a store block of data to the store buffer. Logic is provided for the store buffer to detect when store blocks are filled, resulting in the data in those store blocks being drained via PCIe posted writes that are written to send blocks in the PIO send memory at addresses defined by the PIO write instructions. Logic is employed for detecting the fill size of packets and when a packet's send blocks have been filled, enabling the packet data to be eligible for egress.
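The fill-and-drain behavior summarized in the abstract can be illustrated with a small simulation. This is a minimal sketch, not the patented implementation: the 8-byte store unit, the 64-byte store block, and every class and function name here (StoreBuffer, SendMemory, demo) are assumptions made for illustration. The number of send blocks per packet is passed in directly, whereas the abstract only says that logic detects the fill size.

```python
from dataclasses import dataclass, field

UNIT_BYTES = 8        # assumed store-unit size (not stated in the abstract)
UNITS_PER_BLOCK = 8   # assumed units per store block, giving 64-byte blocks
BLOCK_BYTES = UNIT_BYTES * UNITS_PER_BLOCK


@dataclass
class SendMemory:
    """Toy PIO send memory for a single packet (hypothetical model)."""
    first_block: int
    num_blocks: int   # assumed to be known up front
    blocks: dict = field(default_factory=dict)   # block address -> payload

    def posted_write(self, block_addr: int, payload: bytes) -> None:
        # Stands in for a PCIe posted write landing in one send block.
        self.blocks[block_addr] = payload

    def eligible_for_egress(self) -> bool:
        # The packet is eligible only when every one of its send blocks is filled.
        expected = (self.first_block + i * BLOCK_BYTES
                    for i in range(self.num_blocks))
        return all(addr in self.blocks for addr in expected)


@dataclass
class StoreBuffer:
    """Toy store buffer: a store block drains as soon as all its units are written."""
    send_memory: SendMemory
    drain_order: list = field(default_factory=list)
    _partial: dict = field(default_factory=dict)   # block address -> {unit: bytes}

    def pio_write(self, addr: int, data: bytes) -> None:
        if len(data) != UNIT_BYTES or addr % UNIT_BYTES:
            raise ValueError("writes must be one aligned store unit")
        block_addr = addr - addr % BLOCK_BYTES
        units = self._partial.setdefault(block_addr, {})
        units[(addr - block_addr) // UNIT_BYTES] = data
        if len(units) == UNITS_PER_BLOCK:
            # Block is full: drain it, with no fence needed to force the flush.
            payload = b"".join(units[i] for i in range(UNITS_PER_BLOCK))
            del self._partial[block_addr]
            self.drain_order.append(block_addr)
            self.send_memory.posted_write(block_addr, payload)


def demo():
    packet = bytes(range(2 * BLOCK_BYTES))       # a 128-byte, two-block packet
    mem = SendMemory(first_block=0, num_blocks=2)
    buf = StoreBuffer(mem)
    # Out-of-order execution: interleave units of block 1 with units of block 0.
    order = [u for pair in zip(range(8, 16), range(0, 8)) for u in pair]
    history = []
    for unit in order:
        addr = unit * UNIT_BYTES
        buf.pio_write(addr, packet[addr:addr + UNIT_BYTES])
        history.append(mem.eligible_for_egress())
    reassembled = b"".join(mem.blocks[a] for a in sorted(mem.blocks))
    return buf.drain_order, history, reassembled == packet
```

In this run the second block fills and drains before the first, yet the packet only becomes eligible after the final write, and the data in send memory matches the original packet because each block is written at the address its writes targeted.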

First claim

Opening claim text (preview).

What is claimed is:

1. An apparatus, comprising: a processor, having a plurality of processor cores supporting out of order execution and including a memory interface, at least one store buffer, and a first Peripheral Component Interconnect Express (PCIe) interface; a second PCIe interface coupled to the first PCIe interface via a PCIe link; a transmit engine including, a Programmed Input/Output (PIO) send memory operatively coupled the second PCIe interface; and an egress block, operatively coupled to the PIO send memory; and a network port including a transmit port operatively coupled to the egress block, the processor further to, receive sequences of PIO write instructions to write packet data for respective packets stored in a memory when coupled to the memory interface to the PIO send memory; execute the sequences of PIO write instructions as an instruction thread on a processor core, wherein execution of PIO write instructions cause data to be written to store units in a store buffer, the store units grouped into store blocks, wherein a portion of the PIO write instructions are executed out of order resulting in data being written to store units in different store blocks prior to the store blocks being filled; detect when store blocks are filled; and in response to detecting a store block is filled, drain the data in the store block via a PCIe posted write to a send block in the PIO send memory sent over the PCIe interconnect, and the transmit engine further to, partition the PIO send memory into a plurality of send contexts, each comprising a plurality of sequential send blocks; receive inbound PCIe posted writes over the PCIe interconnect, wherein packet data for a given packet is written to one send block or a plurality of sequential send blocks, wherein packet data for a packet to be written to a plurality of sequential send blocks is enabled to be received out of order; detect when a plurality of sequential send blocks for a packet have been filled; and mark packet data in the plurality of sequential send blocks as eligible for egress to the egress block when all of the sequential send blocks for a packet are detected as being filled.

2. The apparatus to claim 1, further to implement an arbiter to select a packet from among packets in the plurality of send contexts that have been filled to be egressed from the egress block to the transmit port.

3. The apparatus of claim 1, wherein the transmit engine further comprises a send direct memory access (SDMA) memory and a plurality of SDMA engines to pull data from memory coupled to the processor using DMA transfers to write data to buffers in the SDMA memory.

4. The apparatus of claim 1, wherein the apparatus comprises a host fabric interface further comprising: a receive engine, coupled to the PCIe interface; and a receive port, coupled to the receive engine.

5. The apparatus of claim 4, wherein the apparatus comprises multiple host fabric interfaces having a configuration defined for the host fabric interface of claim 4.

6. The apparatus of claim 1, further to: receive a sequence of PIO write instructions for writing data for a packet to a plurality of sequential send blocks in a sequential order; and write the data for the packet to the sequential send blocks in a non-sequential order.

7. The apparatus of claim 6, further to: detect that all of the plurality of sequential send blocks have been filled with the packet data; and enable data in the plurality of send blocks to be egressed once all of the plurality of send blocks are filled.

8. The apparatus of claim 7, further to: encode a header field in each packet with virtual lane (VL) indicia used to identify a VL associated with that packet; enable packets with different VLs within the same send context to be egressed out of FIFO order; and enforce FIFO ordering for egress of data for packets associated with the same VL within the same send context.

9. An apparatus, comprising: a PCIe (Peripheral Component Interconnect Express) interface; a transmit engine including, a Programmed Input/Output (PIO) send memory operatively coupled the PCIe interface; and an egress block, operatively coupled to the PIO send memory; and a network port including a transmit port operatively coupled to the egress block, the transmit engine to, partition the PIO send memory into a plurality of send contexts, each comprising a plurality of sequential send blocks; receive inbound PCIe posted writes from a processor coupled to the PCIe interface via a PCIe interconnect, each PCIe posted write containing packet data corresponding to a packet stored in memory coupled to the processor and being written to a single send block via a PIO write instruction, wherein packet data for a given packet is written to one send block or a plurality of sequential send blocks, wherein packet data for a packet to be written to a plurality sequential send blocks is enabled to be received out of order; detect when a plurality of sequential send blocks for a packet have been filled; mark packet data in the plurality of sequential send blocks as eligible for egress to the egress block when all of the sequential send blocks for a packet are detected as being filled; detect that all of the plurality of sequential send blocks have been filled with the packet data; enable data in the plurality of send blocks to be egressed once all of the plurality of send blocks are filled; encode a header field in each packet with virtual lane (VL) indicia used to identify a VL associated with that packet; enable packets with different VLs within the same send context to be egressed out of FIFO order; and enforce FIFO ordering for egress of data for packets associated with the same VL within the same send context.

10. The apparatus to claim 9, further to implement an arbiter to select a packet from among packets in the plurality of send contexts that have been filled to be egressed from the egress block to the transmit port.

11. The apparatus of claim 9, wherein the transmit engine further comprises a send direct memory access (SDMA) memory and a plurality of SDMA engines to pull data from memory coupled to the processor using DMA transfers to write data to buffers in the SDMA memory.

12. The apparatus of claim 9, wherein the PCIe interfaces comprises a first PCIe interface, the apparatus further comprising: a processor, having a plurality of processor cores supporting out of order execution and including a memory interface, at least one store buffer, and a second PCIe (Peripheral Component Interconnect Express) interface coupled to the first PCIe interface via a PCIe interconnect; the apparatus further to, receive sequences of PIO write instructions to write packet data for respective packets stored in a memory when coupled to the memory interface to the PIO send memory; execute the sequences of PIO write instructions as an instruction thread on a processor core, wherein execution of PIO write instructions cause data to be written to store units in a store buffer, the store units grouped into store blocks comprising a line of store units; wherein a portion of the PIO write instructions are executed out of order resulting in data being written to store units in different store blocks prior to the store blocks being filled; detect when store blocks are filled; and in response to detecting a store block is filled, drain the data in the store block via a PCIe posted write to a buffer in the PIO send memory sent over the PCIe interconnect.

13. The apparatus of claim 9, wherein the apparatus comprises a host fabric interface further comprising
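The virtual lane (VL) ordering rule recited in claims 8 and 9 (packets on different VLs may leave a send context out of FIFO order, while packets on the same VL must leave in FIFO order) can be sketched as a simple selection loop. This is an illustrative model only: the Packet and SendContext names, the single `filled` flag standing in for "all send blocks filled", and the selection policy itself are assumptions and are not taken from the claims.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    name: str
    vl: int               # virtual lane identified by the packet header
    filled: bool = False  # assumed flag: all send blocks for the packet are filled


class SendContext:
    """Toy send context holding packets in arrival (FIFO) order."""

    def __init__(self):
        self._fifo = []

    def enqueue(self, pkt: Packet) -> None:
        self._fifo.append(pkt)

    def next_egress(self):
        """Return the next packet allowed to egress, or None if none is."""
        seen_vls = set()
        for i, pkt in enumerate(self._fifo):
            if pkt.vl in seen_vls:
                continue          # an older packet on this VL is still waiting
            seen_vls.add(pkt.vl)  # pkt is the oldest packet on its VL
            if pkt.filled:
                return self._fifo.pop(i)
        return None


def demo_vl():
    ctx = SendContext()
    a = Packet("A", vl=0, filled=False)
    b = Packet("B", vl=1, filled=True)
    c = Packet("C", vl=0, filled=True)
    for pkt in (a, b, c):
        ctx.enqueue(pkt)

    picks = []
    first = ctx.next_egress()      # B passes A because it is on a different VL
    picks.append(first.name if first else None)
    second = ctx.next_egress()     # C must wait behind A on VL 0
    picks.append(second.name if second else None)
    a.filled = True
    third = ctx.next_egress()      # A is now ready and goes first on VL 0
    picks.append(third.name if third else None)
    fourth = ctx.next_egress()     # then C
    picks.append(fourth.name if fourth else None)
    return picks
```

Here packet B overtakes the older packet A because they are on different VLs, while C stays behind A because both are on VL 0.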

Assignees

Intel Corp

Inventors

Classifications

G06F12/0875 · Primary
with dedicated cache, e.g. instruction or stack · CPC title
G06F5/14
for overflow or underflow handling, e.g. full or empty flags · CPC title
G06F5/065
Partitioned buffers, e.g. allowing multiple independent queues, bidirectional FIFO's · CPC title
G06F13/4282
on a serial bus, e.g. I2C bus, SPI bus (on daisy chain buses G06F13/4247) · CPC title
G06F2205/126
Monitoring of intermediate fill level, i.e. with additional means for monitoring the fill level, e.g. half full flag, almost empty flag · CPC title

Patent family

Related publications grouped by family.

View patent family 54930564

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9734077B2 cover?: Method and apparatus for sending packets using optimized PIO write sequences without sfences. Sequences of Programmed Input/Output (PIO) write instructions to write packet data to a PIO send memory are received at a processor supporting out of order execution. The PIO write instructions are received in an original order and executed out of order, with each PIO write instruction writing a store …
Who is the assignee on this patent?: Intel Corp