RDMA transfers in MapReduce frameworks

US9923726B2 · US · B2

Patent metadata
Publication number: US-9923726-B2
Application number: US-201414559266-A
Country: US
Kind code: B2
Filing date: Dec 3, 2014
Priority date: Dec 3, 2014
Publication date: Mar 20, 2018
Grant date: Mar 20, 2018

Abstract

Embodiments of the present invention provide methods, systems, and computer program products for transferring data in a MapReduce framework. In one embodiment, MapReduce jobs are performed such that data spills are stored by mapper systems in memory and are transferred to reducer systems via one-sided RDMA transfers, which can reduce CPU overhead of mapper systems and the latency of data transfer to reducer systems.
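
The idea in the abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions: plain Python buffers stand in for RDMA-registered memory, no real RDMA verbs are called, and the names `RegisteredRegion`, `spill`, and `one_sided_read` are hypothetical and do not come from the patent. It shows a mapper keeping its spill in a preallocated memory region and a reducer pulling that data without running any mapper-side code, which is the property that makes a transfer "one-sided".

```python
# Simulation only: bytearrays stand in for RDMA-registered memory regions.
# All class and function names are hypothetical illustrations.


class RegisteredRegion:
    """A preallocated buffer standing in for a pinned, fixed-address region."""

    def __init__(self, size: int) -> None:
        self.buf = bytearray(size)  # allocated once, never resized
        self.used = 0               # number of bytes written so far

    def spill(self, record: bytes) -> None:
        """Mapper side: append output to memory instead of writing it to disk."""
        end = self.used + len(record)
        if end > len(self.buf):
            raise MemoryError("region too small for this spill")
        self.buf[self.used:end] = record
        self.used = end


def one_sided_read(remote: RegisteredRegion, local: RegisteredRegion) -> int:
    """Reducer side: copy the remote contents without invoking any mapper method."""
    n = remote.used
    if n > len(local.buf):
        raise MemoryError("local region too small for the transfer")
    local.buf[:n] = memoryview(remote.buf)[:n]
    local.used = n
    return n


# The mapper spills three records into its in-memory region.
mapper_region = RegisteredRegion(64)
for word in (b"pear", b"apple", b"fig"):
    mapper_region.spill(word + b"\n")

# The reducer pulls the bytes into its own region.
reducer_region = RegisteredRegion(64)
copied = one_sided_read(mapper_region, reducer_region)
received = bytes(reducer_region.buf[:copied]).split(b"\n")[:-1]
```

In a real deployment the copy inside `one_sided_read` would be carried out by the network adapter, so the mapper's CPU would not take part in the transfer; the simulation only mirrors the data flow.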

First claim

What is claimed is:

1. A method for transferring data in a MapReduce framework comprising a mapper system and a reducer system, the method comprising:
    receiving, by one or more computer processors, a data split assigned to a mapper system;
    registering, by one or more computer processors, a first fixed-address memory region for the mapper system to be used for a remote direct memory access (RDMA) reducer buffer, wherein the first fixed-address memory region is a locked memory region, expressed as a fixed memory address followed by a specified byte range, that is separated from dynamic memory regions on a virtual machine used by the mapper system, and wherein registering includes determining a size for the RDMA reducer buffer to be created based on a size of an RDMA mapper buffer of the mapper system to which the reducer system is assigned, wherein the RDMA reducer buffer is sized such that no data is spilled to disk;
    executing, by one or more computer processors, one or more mapper tasks on the data split to generate output results;
    spilling, by one or more computer processors, generated output results to the first fixed-address memory region, such that no data is spilled to disk;
    transferring, by one or more computer processors, generated output results from the first fixed-address memory region to the reducer system using RDMA, wherein transferring includes performing an RDMA transfer of generated output results from the first fixed-address memory region to a second fixed-address memory region;
    sorting, by one or more computer processors, the output results in the second fixed-address memory region; and
    transferring, by one or more computer processors, the sorted output results from the second fixed-address memory region to a primary memory buffer, such that no data is spilled to disk.

2. The method of claim 1, further comprising:
    registering, by one or more computer processors, the second fixed-address memory region for the reducer system.

3. The method of claim 1, wherein the RDMA transfer is performed using both InfiniBand and RDMA over Converged Ethernet (RoCE).

4. The method of claim 1, further comprising:
    transferring, by one or more computer processors, the generated output results from the second fixed-address memory region to a dynamic memory region; and
    sorting, by one or more computer processors, the generated output results in the dynamic memory region.

5. A computer program product for transferring data in a MapReduce framework comprising a mapper system and a reducer system, the computer program product comprising: one or more computer readable storage memory and program instructions stored on the one or more computer readable storage memory, the program instructions comprising:
    program instructions to receive a data split assigned to a mapper system;
    program instructions to register a first fixed-address memory region for the mapper system to be used for a remote direct memory access (RDMA) reducer buffer, wherein the first fixed-address memory region is a locked memory region, expressed as a fixed memory address followed by a specified byte range, that is separated from dynamic memory regions on a virtual machine used by the mapper system, and wherein registering includes determining a size for the RDMA reducer buffer to be created based on a size of an RDMA mapper buffer of the mapper system to which the reducer system is assigned, wherein the RDMA reducer buffer is sized such that no data is spilled to disk;
    program instructions to execute one or more mapper tasks on the data split to generate output results;
    program instructions to spill generated output results to the first fixed-address memory region, such that no data is spilled to disk;
    program instructions to transfer generated output results from the first fixed-address memory region to the reducer system using RDMA, wherein transferring includes performing an RDMA transfer of generated output results from the first fixed-address memory region to a second fixed-address memory region;
    sorting, by one or more computer processors, the output results in the second fixed-address memory region; and
    transferring, by one or more computer processors, the sorted output results from the second fixed-address memory region to a primary memory buffer.

6. The computer program product of claim 5, wherein the program instructions stored on the one or more computer readable storage memory further comprise:
    program instructions to register a second fixed-address memory region for the reducer system.

7. The computer program product of claim 5, wherein the RDMA transfer is performed using both InfiniBand and RDMA over Converged Ethernet (RoCE).

8. The computer program product of claim 5, wherein the program instructions stored on the one or more computer readable storage memory further comprise:
    program instructions to transfer the generated output results from the second fixed-address memory region to a dynamic memory region; and
    program instructions to sort the generated output results in the dynamic memory region.

9. A computer system for transferring data in a MapReduce framework comprising a mapper system and a reducer system, the computer system comprising: one or more computer processors; one or more computer readable storage memory; program instructions stored on the one or more computer readable storage memory for execution by at least one of the one or more processors, the program instructions comprising:
    program instructions to receive a data split assigned to a mapper system;
    program instructions to register a first fixed-address memory region for the mapper system to be used for a remote direct memory access (RDMA) reducer buffer, wherein the first fixed-address memory region is a locked memory region, expressed as a fixed memory address followed by a specified byte range, that is separated from dynamic memory regions on a virtual machine used by the mapper system, and wherein registering includes determining a size for the RDMA reducer buffer to be created based on a size of an RDMA mapper buffer of the mapper system to which the reducer system is assigned, wherein the RDMA reducer buffer is sized such that no data is spilled to disk;
    program instructions to execute one or more mapper tasks on the data split to generate output results;
    program instructions to spill generated output results to the first fixed-address memory region, such that no data is spilled to disk;
    program instructions to transfer generated output results from the first fixed-address memory region to the reducer system using RDMA, wherein transferring includes performing an RDMA transfer of generated output results from the first fixed-address memory region to a second fixed-address memory region;
    sorting, by one or more computer processors, the output results in the second fixed-address memory region; and
    transferring, by one or more computer processors, the sorted output results from the second fixed-address memory region to a primary memory buffer.

10. The computer system of claim 9, wherein the program instructions stored on the one or more computer readable storage memory further comprise:
    program instructions to register a second fixed-address memory region for the reducer system.

11. The computer system of claim 9, wherein the RDMA transfer is performed using both InfiniBand and RDMA over Converged Ethernet (RoCE).

12. The computer system of claim 9, wherein the program instructions stored on the one or more computer readable storage memory further comprise:
    program instructions to transfer the generated output results from the second fixed-address memory region to a dynamic memory re…
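
The sequence of steps recited in claim 1 can be walked through with a toy example. This is a rough sketch, not the patented implementation: Python lists of (key, value) pairs stand in for memory regions, the address and sizes are made-up values, and the sizing rule in `size_reducer_buffer` (sum of the assigned mapper buffer sizes) is an assumption, since the claim only says the reducer buffer size is based on the mapper buffer size. The names `FixedRegion`, `size_reducer_buffer`, and `shuffle_in_memory` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FixedRegion:
    """A region written as a fixed address followed by a byte range (made-up values)."""
    address: int
    length: int

    def __str__(self) -> str:
        return f"0x{self.address:08x}+{self.length}"


def size_reducer_buffer(mapper_buffer_sizes):
    """Assumed policy: hold every assigned mapper buffer at once, so nothing goes to disk."""
    return sum(mapper_buffer_sizes)


def shuffle_in_memory(mapper_outputs):
    """Follow the claimed order of steps, with lists standing in for buffers."""
    # "Transfer" each mapper's spilled output into the second region.
    second_region = [pair for output in mapper_outputs for pair in output]
    # Sort the output results inside the second region, by key.
    second_region.sort(key=lambda pair: pair[0])
    # Hand the sorted results to the primary memory buffer.
    primary_buffer = list(second_region)
    return primary_buffer


mapper_region = FixedRegion(address=0x10000000, length=4096)
reducer_size = size_reducer_buffer([4096, 4096, 2048])
result = shuffle_in_memory([[("b", 1), ("a", 1)], [("a", 2), ("c", 1)]])
```

Here `str(mapper_region)` gives `0x10000000+4096`, one possible way to write "a fixed memory address followed by a specified byte range". Every stage keeps its data in memory, matching the claim's repeated condition that no data is spilled to disk.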

Assignees

IBM

Classifications

  • G06F9/5066 (Primary)

    Algorithms for mapping a plurality of inter-dependent sub-tasks onto a plurality of physical CPUs (mapping at compile time, see G06F8/451)

  • H04L12/06 (Primary)

    Answer-back mechanisms or circuits

Frequently asked questions

What does patent US9923726B2 cover?
Embodiments of the present invention provide methods, systems, and computer program products for transferring data in a MapReduce framework. In one embodiment, MapReduce jobs are performed such that data spills are stored by mapper systems in memory and are transferred to reducer systems via one-sided RDMA transfers, which can reduce CPU overhead of mapper systems and the latency of data transf…
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06F9/5066. Mapped technology areas include Physics.
When was this patent published?
Publication date: Mar 20, 2018 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).