US10067900B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10067900-B2 |
| Application number | US-201514835646-A |
| Country | US |
| Kind code | B2 |
| Filing date | Aug 25, 2015 |
| Priority date | Aug 25, 2015 |
| Publication date | Sep 4, 2018 |
| Grant date | Sep 4, 2018 |
Official abstract text for this publication.
A system that includes a switched fabric hierarchy (e.g., a PCIe hierarchy) may realize efficient utilization of a shared I/O device (e.g., a network or storage switch) across multiple physically separate processing nodes (endpoints). For example, each processing node (endpoint) in a distributed processing system may be allocated a portion of the address map of a shared I/O device and may host a device driver for one of multiple virtual functions implemented on the shared device. Following enumeration and initialization of the hierarchy by the root complex, the endpoints may access the virtual functions directly (without intervention by the root complex). Data and interrupt traffic between endpoints and virtual functions may take place over peer-to-peer connections. Interrupt reception logic in each endpoint may receive and handle interrupts generated by the virtual functions. The root complex may host a device driver for a physical function on the shared device.
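The allocation step described in the abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that the root complex gives every processing endpoint one virtual function and one equal-sized, contiguous window of the shared device's address map. The names `Window`, `divide_address_map`, and `check_access`, the base address, and the window size are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Window:
    """Slice of the shared device's address map allocated to one endpoint."""
    endpoint: str
    vf_index: int
    base: int
    size: int

    def contains(self, addr: int) -> bool:
        return self.base <= addr < self.base + self.size


def divide_address_map(base, vf_size, endpoints):
    """Root-complex step (assumed policy): one VF and one equal window per endpoint."""
    return {
        ep: Window(ep, i, base + i * vf_size, vf_size)
        for i, ep in enumerate(endpoints)
    }


def check_access(windows, endpoint, addr):
    """Return the VF index a direct endpoint access targets.

    Raises PermissionError if the address lies outside the window that was
    allocated to this endpoint during initialization.
    """
    window = windows[endpoint]
    if not window.contains(addr):
        raise PermissionError(f"{endpoint} has no window covering {addr:#x}")
    return window.vf_index


# Hypothetical base address and per-VF window size.
windows = divide_address_map(0x9000_0000, 0x1000, ["node0", "node1", "node2"])
```

After this one-time setup, an access such as `check_access(windows, "node1", 0x9000_1004)` resolves to that endpoint's own virtual function without consulting the root complex again, which mirrors the direct access the abstract describes.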
Opening claim text (preview).
What is claimed is:
1. An apparatus, comprising: one or more processors; a memory comprising program instructions that when executed on the plurality of processors cause the plurality of processors to perform at least a portion of a distributed application; a network interface that connects the apparatus to a switched fabric hierarchy; interrupt reception logic configured to receive interrupts generated by one of a plurality of virtualized functions of a shared endpoint device in the switched fabric hierarchy; and two or more device drivers; wherein a first one of the two or more device drivers is configured to exchange communication traffic with a root complex component in the switched fabric hierarchy during initialization of the apparatus; and wherein, during execution of the distributed application, a second one of the two or more device drivers is configured to provide access, by the at least a portion of the distributed application, to the one of the plurality of virtualized functions of the shared endpoint device in the switched fabric hierarchy, wherein the one of the plurality of virtualized functions is allocated to the apparatus.
2. The apparatus of claim 1, wherein other ones of the plurality of virtualized functions of the shared endpoint device are not allocated to the apparatus.
3. The apparatus of claim 1, wherein, during execution of the distributed application, data traffic is communicated between the apparatus and the shared endpoint device via a peer-to-peer connection through a network switch of the switched fabric network.
4. The apparatus of claim 1, wherein the interrupts are received from the shared endpoint device via a peer-to-peer connection between the apparatus and the shared endpoint device.
5. A method, comprising: assigning, by a root complex component in a switched fabric hierarchy, one of a plurality of virtualized functions of a shared endpoint device in the switched fabric hierarchy to one of a plurality of processing endpoint devices in the switched fabric hierarchy, wherein said assigning comprises allocating a portion of an address map associated with the shared endpoint device to the one of the plurality of processing endpoint devices; initializing interrupt reception logic in the one of the plurality of processing endpoint devices, wherein initializing the interrupt reception logic comprises configuring the interrupt reception logic to receive interrupts from the shared endpoint device on behalf of the one of the plurality of virtualized functions; assigning, by the root complex component, another one of the plurality of virtualized functions of the shared endpoint device to another one of the plurality of processing endpoint devices in the switched fabric hierarchy, wherein said assigning comprises allocating a portion of an address map associated with the shared endpoint device to the other one of the plurality of processing endpoint devices; accessing, by a portion of a distributed application executing on the one of the plurality of processing endpoint devices, a location within the portion of the address map that is allocated to the one of the plurality of processing endpoint devices; wherein said accessing is performed over a peer-to-peer connection between the one of the plurality of endpoint devices and the shared endpoint device and is performed without intervention from the root complex component.
6. The method of claim 5, wherein said accessing is performed using a device driver that is hosted on the one of the plurality of processing endpoint devices for accessing the one of the plurality of virtualized functions of a shared endpoint device.
7. The method of claim 6, wherein said initializing comprises dividing the address map associated with the shared endpoint device among the plurality of processing endpoint devices.
8. The method of claim 6, wherein said initializing comprises performing an enumeration operation to discover devices within the switched fabric hierarchy.
9. The method of claim 5, wherein the method further comprises initializing the switched fabric hierarchy; and wherein said assigning and said allocating are performed during said initializing.
10. The method of claim 5, wherein the method further comprises receiving, by the interrupt reception logic, an interrupt that was generated by the one of the plurality of virtualized functions; and wherein the interrupt is received from the shared endpoint device via a peer-to-peer connection between the one of the plurality of processing endpoint devices and the shared endpoint device.
11. The method of claim 5, further comprising: generating, by the shared endpoint device in response to an error or exception condition involving communication traffic between the one of the plurality of processing endpoint devices and the shared endpoint device, an error message; and communicating the error message to the root complex component.
12. The method of claim 11, further comprising: performing, by the root complex component in response to receiving the error message, an exception handling operation; and communicating, by the root complex component to the one of the plurality of processing endpoint devices, an indication of the error or exception condition.
13. A system, comprising: a computing node configured as a root complex component in a switched fabric hierarchy; two or more computing nodes configured as processing endpoints in the switched fabric hierarchy; a shared endpoint device in the switched fabric hierarchy; and a network switch for the switched fabric network that connects the root complex component, the processing endpoints, and the shared endpoint device; wherein the shared endpoint device implements multiple virtual functions, each of which is accessible by a respective single one of the processing endpoints through a device driver hosted on the processing endpoint; wherein each of the processing endpoints comprises interrupt reception logic configured to receive interrupts generated by one of the multiple virtual functions; and wherein accesses to each of the multiple virtual functions by the respective single one of the processing endpoints are performed via peer-to-peer connections.
14. The system of claim 13, wherein the shared endpoint device is an input/output (I/O) device, a network adapter, or a storage adapter.
15. The system of claim 13, wherein the shared endpoint device further comprises a physical function; and wherein the root complex component comprises a device driver for the physical function.
16. The system of claim 15, wherein accesses to the physical function by the two or more computing nodes configured as processing endpoints are made by the root complex component on behalf of the two or more computing nodes configured as processing endpoints.
17. The system of claim 13, wherein the shared endpoint device in the switched fabric hierarchy implements single root input/output (I/O) virtualization.
18. The system of claim 13, wherein the two or more computing nodes configured as processing endpoints collectively execute a distributed application; and wherein, during execution of the distributed application, one of the two or more computing nodes configured as processing endpoints accesses the one of the multiple virtual functions that is accessible by the one of the two or more computing nodes.
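The two traffic paths recited in claims 10 through 12 can be contrasted in a small model: interrupts from a virtual function travel peer-to-peer to the endpoint that owns it, while error messages go to the root complex, which then informs the affected endpoint. This is only a sketch under stated assumptions. The class names, method names, the interrupt vector number, and the "completion timeout" error string are all hypothetical and are not taken from the patent.

```python
class ProcessingEndpoint:
    """Endpoint with interrupt reception logic (modeled as a simple list)."""

    def __init__(self, name):
        self.name = name
        self.interrupts = []      # (vf_index, vector) pairs received directly
        self.error_notices = []   # indications relayed by the root complex

    def receive_interrupt(self, vf_index, vector):
        self.interrupts.append((vf_index, vector))

    def notify_error(self, detail):
        self.error_notices.append(detail)


class RootComplex:
    """Assigns each VF to a single endpoint and handles device error messages."""

    def __init__(self):
        self.owner_of = {}   # vf_index -> ProcessingEndpoint
        self.error_log = []

    def assign(self, vf_index, endpoint):
        if vf_index in self.owner_of:
            raise ValueError(f"VF {vf_index} is already assigned")
        self.owner_of[vf_index] = endpoint

    def handle_error(self, vf_index, detail):
        # Stand-in for the exception handling operation, followed by an
        # indication sent to the endpoint that owns the VF.
        self.error_log.append((vf_index, detail))
        self.owner_of[vf_index].notify_error(detail)


class SharedDevice:
    """Shared endpoint device exposing several virtual functions."""

    def __init__(self, root):
        self.root = root
        # Interrupt routes are fixed at initialization time (assumption).
        self.routes = dict(root.owner_of)

    def raise_interrupt(self, vf_index, vector):
        # Peer-to-peer path: delivered to the owner, bypassing the root complex.
        self.routes[vf_index].receive_interrupt(vf_index, vector)

    def report_error(self, vf_index, detail):
        # Error path: the message goes to the root complex, not the endpoint.
        self.root.handle_error(vf_index, detail)


root = RootComplex()
node0, node1 = ProcessingEndpoint("node0"), ProcessingEndpoint("node1")
root.assign(0, node0)
root.assign(1, node1)
device = SharedDevice(root)
device.raise_interrupt(0, vector=5)
device.report_error(1, "completion timeout")
```

In this model `node0` sees its interrupt without any root complex involvement, whereas `node1` learns of the error only after the root complex has logged it.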
using switching circuits, e.g. switching matrix, connection or expansion network (G06F13/4009 takes precedence) · CPC title
Electrical coupling · CPC title
with centralised access control · CPC title
where the program performs an input/output emulation function · CPC title