Techniques to manage non-disruptive SAN availability in a partitioned cluster
US-9639437-B2 · May 2, 2017 · US
US10484472B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10484472-B2 |
| Application number | US-201514840512-A |
| Country | US |
| Kind code | B2 |
| Filing date | Aug 31, 2015 |
| Priority date | Jul 31, 2015 |
| Publication date | Nov 19, 2019 |
| Grant date | Nov 19, 2019 |
Official abstract text for this publication.
Exemplary embodiments provide methods, mediums, and systems for efficiently moving data between cluster nodes. Upon receiving a request to read or write data at a first cluster node that is in communication with a client, the first node effects the transfer to or from a second cluster node. The transfer is carried out using a combination of remote direct memory access (“RDMA”), or a similar technique that bypasses a part of the network stack, and transport control protocol (“TCP”), or a similar technique that does not bypass a part of the network stack. The data is transferred using RDMA, while certain control messages are sent using TCP. By combining RDMA content transfers and TCP control messages, data transfers can be carried out faster, more efficiently, and with less processing overhead. Other embodiments are described and claimed.
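The split described in the abstract, with small control messages on one path and bulk content on another, can be illustrated with a minimal in-process sketch. Nothing here is taken from the patent's implementation: the class names, the `ControlMessage` layout, and the `rdma_ok` and `length` fields are hypothetical, a writable `memoryview` stands in for a registered RDMA buffer address, and plain method calls stand in for the TCP control channel.

```python
from dataclasses import dataclass, field


@dataclass
class ControlMessage:
    """Small message that would travel over TCP (hypothetical layout)."""
    kind: str
    fields: dict = field(default_factory=dict)


class SecondNode:
    """Stands in for the cluster node that owns the volume."""

    def __init__(self, volume: dict):
        self.volume = volume  # maps a key to stored bytes

    def handle_read(self, request: ControlMessage,
                    remote_buffer: memoryview) -> ControlMessage:
        content = self.volume[request.fields["key"]]
        ok = len(content) <= len(remote_buffer)
        if ok:
            # Stand-in for an RDMA write: the content is placed directly
            # into the requesting node's buffer, not into the response.
            remote_buffer[:len(content)] = content
        # The control response carries only metadata and a success flag.
        return ControlMessage("read_response",
                              {"rdma_ok": ok, "length": len(content)})


class FirstNode:
    """Stands in for the cluster node that talks to the client."""

    def __init__(self, peer: SecondNode):
        self.peer = peer

    def read(self, key: str, max_size: int) -> bytes:
        buffer = bytearray(max_size)  # buffer that receives the content
        request = ControlMessage("read_request", {"key": key})
        response = self.peer.handle_read(request, memoryview(buffer))
        if not response.fields["rdma_ok"]:
            raise IOError("direct write into the buffer did not succeed")
        # Rebuild the data from the metadata plus the buffer contents.
        return bytes(buffer[:response.fields["length"]])
```

In this sketch the response never contains the content itself, which mirrors the idea that only control information crosses the slower path. A real deployment would register memory with an RDMA-capable adapter and exchange the buffer address over a socket.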
Opening claim text (preview).
The invention claimed is:

1. A method comprising: receiving, via a transport control protocol (TCP) by a first node, a read request from a client device to read data, comprising metadata and content, that is stored on a remote volume associated with a second node content; allocating, by the first node, a buffer within memory of the first node for receiving the content using remote direct memory access (RDMA) based upon a size of the content exceeding a size threshold and resource consumption for allocating the buffer being below a threshold; transmitting an address of the buffer to the second node via the TCP to trigger the second node to perform an RDMA write operation to write the content into the buffer using the address; receiving, via the TCP by the first node from the second node, the metadata comprising instructions for reconstructing the data using the content within the buffer, wherein a response header comprises an indication of whether the RDMA write operation was successful; and reconstructing and transmitting the data to the client device using the metadata and the content based upon the instructions.

2. The method of claim 1, comprising: deallocating the buffer from the memory based upon transmitting the data to the client device.

3. The method of claim 1, further comprising: extract the content from the buffer based upon the flag indicating that the RDMA write operation by the second node wrote the content into the buffer within the memory of the first node.

4. The method of claim 1, wherein the reconstructing comprises: combining the metadata received via the TCP and the content received through the buffer via the RDMA to construct the data.

5. The method of claim 1, further comprising: receiving the content via TCP as opposed to the RDMA when the size of the content is less than the size threshold.

6. The method of claim 1, wherein the size threshold is between about 16 kilobytes and about 32 kilobytes.

7. The method of claim 1, further comprising: reverting to data transmission via the TCP when communication via the RDMA is impossible.

8. The method of claim 1, wherein the RDMA write operation is performed by the second node to facilitate execution of the read request by the first node.

9. A non-transitory computer readable medium storing instructions that, when executed, cause circuitry of a computing device to: receive, via a transport control protocol (TCP) by a first node, a read request from a client device to read data, comprising metadata and content, that is stored on a remote volume associated with a second node; allocate, by the first node, a buffer within memory of the first node for receiving the content using remote direct memory access (RDMA) based upon a size of the content exceeding a size threshold and resource consumption for allocating the buffer being below a threshold; transmit an address of the buffer to the second node using the TCP to trigger the second node to perform an RDMA write operation to write the content into the buffer using the address; receive, via the TCP by the first node from the second node, the metadata comprising instructions for reconstructing the data using the content within the buffer, wherein a response header comprises an indication of whether the RDMA write operation was successful; and reconstruct and transmit the data to the client device using the metadata and the content based upon the instructions.

10. The medium of claim 9, wherein the instructions cause the computing device: deallocate the buffer from the memory based upon transmitting the data to the client device.

11. The medium of claim 9, wherein the instructions cause the computing device to: extract the content from the buffer based upon the flag indicating that the RDMA write operation by the second node wrote the content into the buffer within the memory of the first node.

12. The medium of claim 9, wherein the instructions cause the computing device: combine the metadata received via the TCP and the content received through the buffer via the RDMA to construct the data.

13. The medium of claim 9, wherein the instructions cause the computing device: receiving the content via TCP as opposed to the RDMA when the size of the content is less than the size threshold.

14. The medium of claim 9, wherein the instructions cause the computing device: revert to data transmission via the TCP when communication via the RDMA is impossible.

15. The medium of claim 9, wherein the RDMA write operation is performed by the second node to facilitate execution of the read request by the first node.

16. A computing device, comprising: a memory comprising machine executable code; and a processor coupled to the memory, the processor configured to execute the machine executable code to cause the processor to: receive, via a transport control protocol (TCP) by a first node, a read request from a client device to read data, comprising metadata and content, that is stored on a remote volume associated with a second node; allocate, by the first node, a buffer within memory of the first node for receiving the content using remote direct memory access (RDMA) based upon a size of the content exceeding a size threshold and resource consumption for allocating the buffer being below a threshold; transmit an address of the buffer to the second node using the TCP to trigger the second node to perform an RDMA write operation to write the content into the buffer using the address; receive, via the TCP by the first node from the second node, the metadata comprising instructions for reconstructing the data using the content within the buffer, wherein a response header comprises an indication of whether the RDMA write operation was successful; and reconstruct and transmit the data to the client device using the metadata and the content based upon the instructions.

17. The computing device of claim 16, wherein the buffer is deallocated from the memory based upon transmitting the data to the client device.

18. The computing device of claim 16, wherein the content is transmitted via the TCP when the size of the content is less than the size threshold.

19. The computing device of claim 18, wherein the size threshold is between about 16 kilobytes and about 32 kilobytes.

20. The computing device of claim 16, wherein the machine executable code causes the processor to revert to data transmission via the TCP when communication via the RDMA is impossible.
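The conditions under which the claims use RDMA rather than TCP for the content (size above a threshold, buffer allocation cost below a threshold, and RDMA being usable at all) can be read as a small decision function. The following is a sketch under stated assumptions, not the patented logic: `choose_transport`, `RESOURCE_LIMIT`, and the utilization measure are hypothetical, the 32 KB default is simply one value inside the "about 16 kilobytes to about 32 kilobytes" range recited in claims 6 and 19, and treating a size exactly equal to the threshold as TCP is an arbitrary choice the claims do not settle.

```python
from enum import Enum


class Transport(Enum):
    TCP = "tcp"
    RDMA = "rdma"


# Assumed value; the claims recite only a range of about 16 KB to about 32 KB.
SIZE_THRESHOLD = 32 * 1024
# Hypothetical limit on buffer-pool utilization, expressed as a fraction.
RESOURCE_LIMIT = 0.8


def choose_transport(content_size: int,
                     buffer_utilization: float,
                     rdma_available: bool = True,
                     size_threshold: int = SIZE_THRESHOLD,
                     resource_limit: float = RESOURCE_LIMIT) -> Transport:
    """Pick the path for the content; control messages always use TCP."""
    if not rdma_available:
        # Revert to TCP when RDMA communication is impossible.
        return Transport.TCP
    if content_size <= size_threshold:
        # Small content is sent over TCP (equality is an assumption here).
        return Transport.TCP
    if buffer_utilization >= resource_limit:
        # Allocating another RDMA buffer would cost too much right now.
        return Transport.TCP
    return Transport.RDMA
```

For example, `choose_transport(64 * 1024, 0.1)` selects RDMA, while the same call with `rdma_available=False` or with a 4 KB payload falls back to TCP.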
Implementation details of TCP/IP or UDP/IP stack architecture; Specification of modified or new header fields · CPC title
Multiprotocol handlers, e.g. single devices capable of handling multiple protocols · CPC title
for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS] · CPC title
Threshold monitoring · CPC title
using a common memory, e.g. mailbox · CPC title