Remote direct memory access (RDMA) multipath

US12323320B2 · US · B2

Patent metadata
Field: Value
Publication number: US-12323320-B2
Application number: US-202418443928-A
Country: US
Kind code: B2
Filing date: Feb 16, 2024
Priority date: Sep 1, 2022
Publication date: Jun 3, 2025
Grant date: Jun 3, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Technologies for spreading a burst of data across multiple network paths in remote direct memory access (RDMA) over converged Ethernet (RoCE) and InfiniBand are described. A network interface controller sends a first burst of a transport flow directed to a second node over a first network path. The network interface controller determines that a second burst is to be sent over a different network path, and identifies a second network path using a multipath context. The multipath context stores a first weight value or a first state associated with the first network path and a second weight value or a second state associated with the second network path. The network interface controller sends the second burst of data to the second node via the second network path.
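The per-burst path switch described in the abstract can be illustrated with a minimal sketch. The names `MultipathContext`, `next_path`, and `send_burst` are hypothetical, and the selection policy (take the highest-weight alternative to the current path) is an assumption made for illustration; the abstract states only that the multipath context stores a weight value or state per network path.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class MultipathContext:
    """Hypothetical per-flow table mapping routing identifier -> weight."""
    weights: Dict[int, float] = field(default_factory=dict)
    current: Optional[int] = None  # routing identifier used by the last burst

    def next_path(self) -> int:
        """Choose a routing identifier different from the current one.

        Assumed policy: pick the alternative with the highest weight.
        """
        candidates = {rid: w for rid, w in self.weights.items()
                      if rid != self.current}
        if not candidates:
            raise ValueError("no alternative path available")
        self.current = max(candidates, key=candidates.get)
        return self.current


def send_burst(ctx: MultipathContext, packets: List[str]) -> List[Tuple[int, str]]:
    """Tag every packet of one burst with the chosen routing identifier."""
    rid = ctx.next_path()
    return [(rid, p) for p in packets]


# The first burst went out on path 1; the second burst moves to another path.
ctx = MultipathContext(weights={1: 1.0, 2: 3.0, 3: 2.0}, current=1)
second_burst = send_burst(ctx, ["pkt0", "pkt1"])
```

All packets of a burst share one routing identifier in this sketch, so the path changes only at burst boundaries.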

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: sending, by a network interface controller of a first node, a first burst of data of a transport flow directed to a second node via a first network path between the first node and the second node, the transport flow using a network protocol that allows remote direct memory access (RDMA); determining, by the network interface controller using one or more parameters, that a second burst of data of the transport flow is to be sent on a different network path than the first network path; identifying, by the network interface controller using a multipath context, a second network path between the first node and the second node, the multipath context storing a first weight value associated with a first network routing identifier corresponding to the first network path and a second weight value associated with a second network routing identifier corresponding to the second network path, wherein the first weight value and the second weight value are each based on a difference from an average round-trip time (RTT) of available paths between the first node and the second node; assigning, by the network interface controller, the second network routing identifier to one or more packets of the second burst of data; and sending, by the first node, the second burst of data to the second node via the second network path.

2. The method of claim 1, wherein the one or more parameters comprises at least one of i) a number of bursts since a last route change, ii) random entropy, iii) a requirement of an input fence, or iv) a comparison of the first weight value, corresponding to a current route over the first network path, and other weight values corresponding to other available network paths to the second node.

3. The method of claim 1, further comprising: measuring a first RTT of the first network path; measuring a second RTT of the second network path; determining the average RTT; updating the first weight value based on the first RTT and the average RTT; and updating the second weight value based on the second RTT and the average RTT.

4. The method of claim 1, further comprising: measuring a RTT of the first network path; determining whether the RTT is less than the average RTT of available network paths to the second node; and increasing the first weight value responsive to the RTT being less than the average RTT; or decreasing the first weight value responsive to the RTT being greater than the average RTT.

5. The method of claim 1, further comprising: measuring a RTT of the second network path; determining whether the RTT is less than the average RTT of available network paths to the second node; and increasing the second weight value responsive to the RTT being less than the average RTT; or decreasing the second weight value responsive to the RTT being greater than the average RTT.

6. The method of claim 1, wherein determining that the second burst of data of the transport flow is to be sent on the different network path than the first network path comprises determining that the first weight value is greater than a weight value of at least one available network path to the second node.

7. The method of claim 1, wherein the network protocol is RDMA over Converged Ethernet (RoCE).

8. The method of claim 1, wherein the network protocol is InfiniBand.

9. The method of claim 1, further comprising: assigning a first queue pair (QP) and a second QP to the multipath context, wherein the first burst of data and the second burst of data are stored in the first QP; scheduling, by a scheduler of the network interface controller, the second burst of data from the second QP, wherein the second network routing identifier is assigned to the one or more packets of the second burst of data after the scheduling the second burst of data; scheduling, by the scheduler, a third burst of data from the second QP; assigning, by the network interface controller using the multipath context, a third network routing identifier to one or more packets of the third burst of data after the scheduling the third burst of data, the third network routing identifier corresponds to a third network path between the first node and the second node; and sending, by the first node, the third burst of data to the second node via the third network path.

10. A first node comprising: memory to store a multipath context and a plurality of queue pairs assigned to the multipath context; and a network interface controller coupled to the memory, the network interface controller to: send a first burst of data of a transport flow directed to a second node via a first network path between the first node and the second node, the transport flow using a network protocol that allows remote direct memory access (RDMA); determine, using one or more parameters, that a second burst of data of the transport flow is to be sent on a different network path than the first network path; identify, using the multipath context, a second network path between the first node and the second node, the multipath context storing a first weight value associated with a first network routing identifier corresponding to the first network path and a second weight value associated with a second network routing identifier corresponding to the second network path, wherein the first weight value and the second weight value are each based on a difference from an average round-trip time (RTT) of available paths between the first node and the second node; assign the second network routing identifier to one or more packets of the second burst of data; and send the second burst of data to the second node via the second network path.

11. The first node of claim 10, wherein the network protocol is RDMA over Converged Ethernet (RoCE).

12. The first node of claim 10, wherein the network protocol is InfiniBand.

13. The first node of claim 10, wherein to determine that the second burst of data of the transport flow is to be sent on the different network path than the first network path, the network interface controller is to determine that the first weight value is greater than a weight value of at least one available network path to the second node.

14. The first node of claim 10, wherein the one or more parameters comprises at least one of i) a number of bursts since a last route change, ii) random entropy, iii) a requirement of an input fence, or iv) a comparison of the first weight value, corresponding to a current route over the first network path, and other weight values corresponding to other available network paths to the second node.

15. The first node of claim 10, wherein the network interface controller is further to: measure a first RTT of the first network path; measure a second RTT of the second network path; determine the average RTT; update the first weight value based on the first RTT and the average RTT; and update the second weight value based on the second RTT and the average RTT.

16. The first node of claim 10, wherein the network interface controller is further to: measure a RTT of the first network path; determine whether the RTT is less than the average RTT of available network paths to the second node; and increase the first weight value responsive to the RTT being less than the average RTT; or decrease the first weight value responsive to the RTT being greater than the average RTT.

17. The first node of claim 10, wherein the network interface controller is further to: measure a RTT of the second network path; determine whether the RTT is less than
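The RTT-based weight update recited in claims 3 to 5 (and mirrored in claims 15 and 16) can be sketched as follows. The function name `update_weights` and the fixed `step` size are hypothetical; the claims say only that a weight is increased when a path's measured RTT is below the average RTT of available paths and decreased when it is above, without specifying by how much.

```python
from typing import Dict


def update_weights(weights: Dict[int, float],
                   rtts: Dict[int, float],
                   step: float = 1.0) -> Dict[int, float]:
    """Return updated weights keyed by routing identifier.

    rtts holds one measured round-trip time per available path.
    The fixed step is an assumption, not taken from the claims.
    """
    if not rtts:
        return dict(weights)  # nothing measured, nothing to update
    average_rtt = sum(rtts.values()) / len(rtts)
    updated = dict(weights)
    for rid, rtt in rtts.items():
        if rtt < average_rtt:
            # Faster than average: increase the weight.
            updated[rid] = weights.get(rid, 0.0) + step
        elif rtt > average_rtt:
            # Slower than average: decrease the weight.
            updated[rid] = weights.get(rid, 0.0) - step
    return updated


# Three paths with equal starting weights and RTTs in microseconds.
new_weights = update_weights({1: 5.0, 2: 5.0, 3: 5.0},
                             {1: 10.0, 2: 20.0, 3: 30.0})
```

A path whose RTT equals the average keeps its weight unchanged in this sketch, since the claims address only the less-than and greater-than cases.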

Assignees

Mellanox Technologies Ltd

Inventors

Classifications

  • Multipath · CPC title

  • by attributing bandwidth to queues · CPC title

  • Round trip delays · CPC title

  • Assignment of logical groups to network elements · CPC title

  • Network utilisation, e.g. volume of load or congestion level · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12323320B2 cover?
Technologies for spreading a burst of data across multiple network paths in remote direct memory access (RDMA) over converged Ethernet (RoCE) and InfiniBand are described. A network interface controller sends a first burst of a transport flow directed to a second node over a first network path. The network interface controller determines that a second burst is to be sent over a different networ…
Who is the assignee on this patent?
Mellanox Technologies Ltd
What technology area does this patent fall under?
Primary CPC classification H04L45/124. Mapped technology areas include Electricity.
When was this patent published?
Publication date Jun 3, 2025 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 10 related publications on this page (citations in our corpus or others sharing the same primary CPC).