US10686729B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10686729-B2 |
| Application number | US-201815939227-A |
| Country | US |
| Kind code | B2 |
| Filing date | Mar 28, 2018 |
| Priority date | Mar 29, 2017 |
| Publication date | Jun 16, 2020 |
| Grant date | Jun 16, 2020 |
Official abstract text for this publication.
A network system for a data center is described in which a switch fabric provides full mesh interconnectivity such that any servers may communicate packet data to any other of the servers using any of a number of parallel data paths. Moreover, according to the techniques described herein, edge-positioned access nodes, optical permutation devices and core switches of the switch fabric may be configured and arranged in a way such that the parallel data paths provide single L2/L3 hop, full mesh interconnections between any pairwise combination of the access nodes, even in massive data centers having tens of thousands of servers.
Opening claim text (preview).
What is claimed is:
1. A network system comprising: a plurality of servers; a switch fabric comprising a plurality of core switches; and a plurality of access nodes, each of the access nodes coupled to a subset of the core switches, wherein the plurality of access nodes includes a first access node coupled to a source server included within the plurality of servers and a second access node coupled to a destination server included within the plurality of servers, and wherein the first access node and the second access node are configured to establish a logical tunnel over a plurality of parallel data paths across the core switches included within the switch fabric between the first access node and the second access node, wherein, when communicating a packet flow of packets between the source server and the destination server, the first access node is configured to encapsulate the packets and send the packets over the logical tunnel by spraying the packets of the packet flow across the plurality of parallel data paths to the second access node by directing each of the packets to one of the parallel data paths selected based on bandwidth characteristics of the one of the parallel data paths, and wherein the second access node is configured to reorder the packets into an original sequence of the packet flow and deliver the reordered packets to the destination server.
2. The network system of claim 1, wherein the access nodes and the core switches are configured to provide full mesh connectivity between any pairwise combination of the servers, and wherein the full mesh connectivity of the switch fabric between the servers is non-blocking and drop-free.
3. The network system of claim 1, wherein the access nodes and the core switches are configured to connect any pairwise combination of the access nodes by at most a single layer three (L3) hop.
4. The network system of claim 1, wherein the access nodes and the core switches are configured to provide a plurality of parallel data paths between each pairwise combination of the access nodes.
5. The network system of claim 1, wherein the first access node sprays the packets of the packet flow across the plurality of parallel data paths by directing each of the packets to a randomly or round-robin selected one of the parallel data paths.
6. The network system of claim 1, wherein to direct each of the packets, the first access node sprays the packets of the packet flow across the plurality of parallel data paths by directing each of the packets to one of the parallel data paths selected based on a bandwidth weight associated with the one of the parallel data paths.
7. The network system of claim 1, wherein the first access node has full mesh connectivity to a subset of the access nodes included in a logical rack as a first-level network fanout, and wherein the first access node is configured to spray the packets of the packet flow across the first-level network fanout to the subset of the access nodes included in the logical rack.
8. The network system of claim 7, wherein each of the access nodes has full mesh connectivity to the subset of the core switches as a second-level network fanout, and wherein each of the subset of access nodes included in the logical rack is configured to spray the packets of the packet flow across the second-level network fanout to the subset of the core switches.
9. The network system of claim 1, wherein each of the access nodes comprises: a source component operable to receive traffic from a server; a source switching component operable to switch source traffic to other source switching components of different access nodes or toward the core switches; a destination switching component operable to switch inbound traffic received from other source switching components or from the core switches; and a destination component operable to reorder packet flows received via the destination switching component and provide the packet flows to a destination server coupled to the access node.
10. The network system of claim 1, further comprising a plurality of intermediate network devices, wherein each of the access nodes is coupled to the subset of the core switches via a subset of the intermediate network devices.
11. The network system of claim 10, wherein the plurality of intermediate network devices comprises one of a plurality of top of rack (TOR) Ethernet switches, a plurality of electrical permutation devices, or a plurality of optical permutation devices.
12. The network system of claim 1, further comprising a plurality of optical permutation devices optically coupling the access nodes to the core switches by optical links to communicate the data packets between the access nodes and the core switches as optical signals, wherein each the optical permutation devices comprises a set of input optical ports and a set of output optical ports to direct optical signal between the access nodes and the core switches to communicate the data packets, and wherein each of the optical permutation devices is configured such that optical communications received from the input optical ports are permuted across the output optical ports based on wavelength so as to provide full-mesh optical connectivity between the input optical ports and the output optical ports without optical interference.
13. The network system of claim 1, wherein two or more of the access nodes are arranged as an access node group including storage devices coupled to each of the two or more access nodes, and wherein the access node group and the subset of the servers coupled to each of the two or more access nodes of the access node group are arranged as a network storage compute unit (NSCU).
14. The network system of claim 13, wherein two NSCUs are arranged as a logical rack, and four NSCUs are arranged as a physical rack.
15. The network system of claim 1, wherein four of the access nodes are arranged as a first access node group, and another four of the access nodes are arranged as a second access node group, wherein the subset of the servers coupled to each of the access nodes of the first and second access node groups are arranged as a logical rack, and wherein the eight access nodes of the logical rack are interconnected via a full mesh of Ethernet connections.
16. The network system of claim 15, wherein, for a given one of the access nodes of the first access node group, the full mesh of Ethernet connections comprises three intra-access node group Ethernet connections to each of the other three access nodes of the first access node group, and four inter-access node group Ethernet connections to each of the four access nodes of the second access node group.
17. The network system of claim 1, wherein a first set of the access nodes are arranged as a first access node group, a second set of the access nodes are arranged as a second access node group, a third set of the access nodes are arranged as a third access node group, and a fourth set of the access nodes are arranged as a fourth access node group, and wherein the first access node group, the second access node group, the third access node group, the fourth access node group, and the subset of the servers coupled to each of the access nodes of the access node groups are arranged as a physical rack.
18. The network system of claim 1, wherein one or more of the access nodes comprise storage devices configured to provide network accessible storage for use by applications executing on the servers.
19. The network system of claim 18, wherein the
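Illustrative sketch (not part of the patent text). Claims 1 and 6 describe a source access node that sprays the packets of one flow across parallel paths selected by bandwidth weight, and a destination access node that restores the original order. The Python below is a minimal model of that idea under stated assumptions: the per-packet sequence number, the names `Packet`, `spray`, `ReorderBuffer`, and `demo`, the path labels, and the weights are all hypothetical, and the claims do not say how path selection or reordering is actually implemented.

```python
import random
from dataclasses import dataclass


@dataclass
class Packet:
    seq: int        # assumed per-flow sequence number added at encapsulation
    payload: bytes


def spray(packets, path_weights, rng):
    """Pick a path for each packet with probability proportional to its weight.

    path_weights maps a hypothetical path label to a bandwidth weight.
    Returns a list of (path, packet) pairs in send order.
    """
    paths = list(path_weights)
    weights = [path_weights[p] for p in paths]
    return [(rng.choices(paths, weights=weights, k=1)[0], pkt) for pkt in packets]


class ReorderBuffer:
    """Holds out-of-order packets and releases them in sequence order."""

    def __init__(self, first_seq=0):
        self.next_seq = first_seq
        self.pending = {}

    def receive(self, pkt):
        # Store the packet, then release every packet that is now in order.
        self.pending[pkt.seq] = pkt
        released = []
        while self.next_seq in self.pending:
            released.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        return released


def demo(seed=0):
    rng = random.Random(seed)
    flow = [Packet(i, f"p{i}".encode()) for i in range(8)]
    # Assumed weights: path_a has three times the bandwidth of the others.
    sprayed = spray(flow, {"path_a": 3.0, "path_b": 1.0, "path_c": 1.0}, rng)
    rng.shuffle(sprayed)  # model paths with different latencies
    buf = ReorderBuffer()
    delivered = []
    for _path, pkt in sprayed:
        delivered.extend(buf.receive(pkt))
    return [p.seq for p in delivered]
```

Calling `demo()` returns the sequence numbers in their original order regardless of the shuffled arrival order, because the buffer only releases a packet once every earlier sequence number has arrived. A real design would also need to bound the buffer and handle loss, which this sketch leaves out.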
Arrangements for redundant switching, e.g. using parallel planes · CPC title
Switch control, e.g. arbitration · CPC title
Switch interfaces, e.g. port details · CPC title
using M+N parallel active paths · CPC title
Multipath · CPC title