This blog post gives an overview of the InfiniBand functionality offered by the transport stack of KIARA. KIARA is a new, advanced middleware that is part of the FI-WARE project, which in turn is part of the very large European FI-PPP programme. Several members of the ICCLab are currently working on the implementation of this middleware.

In an earlier blog post I gave an introduction to the InfiniBand technology and described the structure and function of a simple InfiniBand application. I recommend reading that post, or at least having a basic understanding of the technology, before continuing with this one.

When I first researched InfiniBand, it quickly became clear that although general information is available, tutorials and basic examples on how to write InfiniBand applications are quite rare, even though there is some very good material such as the tutorials from “The Geek in the Corner”. My problem with these tutorials was that they rely on the so-called RDMA Connection Manager (CM), which does not work on the systems available to me. The RDMA CM is event driven and, among other things, takes care of connecting two queue pairs with each other and readying them for sending and receiving. Since I could not use the RDMA CM, I extracted the information on how to connect queue pairs via TCP and ready them for sending and receiving from the source code of the ‘ib_rdma_bw’ application. More on this can be found in the blog post mentioned earlier.
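To make the TCP-based connection setup more concrete: without the RDMA CM, each peer typically has to tell the other its Local Identifier (LID), Queue Pair Number (QPN) and starting Packet Sequence Number (PSN). The following is a minimal sketch of how such a record could be turned into a text message for a TCP socket and parsed again. The struct, the function names and the fixed-width hexadecimal format are assumptions made for illustration; they are not necessarily what ib_rdma_bw or KIARA use.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical record of the values two peers swap before their
 * queue pairs can be connected. */
struct qp_conn_info {
    uint16_t lid; /* Local Identifier of the port */
    uint32_t qpn; /* Queue Pair Number (24 bits are significant) */
    uint32_t psn; /* initial Packet Sequence Number (24 bits) */
};

/* Size of "llll:qqqqqq:pppppp" including the terminating NUL. */
#define QP_CONN_MSG_LEN sizeof "0000:000000:000000"

/* Write the record as fixed-width hex text into buf.
 * Returns 0 on success, -1 if the buffer is too small. */
int qp_conn_info_pack(const struct qp_conn_info *info, char *buf, size_t len)
{
    assert(info != NULL && buf != NULL);
    if (len < QP_CONN_MSG_LEN)
        return -1;
    int n = snprintf(buf, len, "%04x:%06x:%06x",
                     (unsigned)info->lid,
                     (unsigned)(info->qpn & 0xffffffu),
                     (unsigned)(info->psn & 0xffffffu));
    return n == (int)QP_CONN_MSG_LEN - 1 ? 0 : -1;
}

/* Parse a message produced by qp_conn_info_pack().
 * Returns 0 on success, -1 if the text does not match the format. */
int qp_conn_info_unpack(const char *buf, struct qp_conn_info *info)
{
    unsigned lid, qpn, psn;

    assert(info != NULL && buf != NULL);
    if (sscanf(buf, "%4x:%6x:%6x", &lid, &qpn, &psn) != 3)
        return -1;
    info->lid = (uint16_t)lid;
    info->qpn = qpn;
    info->psn = psn;
    return 0;
}
```

A fixed-width text format keeps the message length constant, so each side can read exactly QP_CONN_MSG_LEN bytes from the socket without any further framing.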

Requirements for InfiniBand Application / Functionality

As a next step, that application had to be improved, extended and integrated into the transport stack of the KIARA middleware. That meant offering all transport functionality through a generic API with the following functions:

  • bind / unbind
  • connect / disconnect
  • send / receive
  • register_callback / callback
  • get_context / set_context
  • get_session / set_session
  • get_configuration / set_configuration

Bind, unbind, register_callback and callback are primarily used by a server, while connect and disconnect are primarily used by a client. Send and receive are used by both client and server. The context, session and configuration getters and setters are important but complementary to the other functions of the KIARA API.
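One common way to offer such a generic API in C is a table of function pointers that each transport backend fills in. The sketch below illustrates the idea for a subset of the functions (connect, disconnect, send, receive and the context accessors). All type and function names are hypothetical and not taken from the KIARA code base, and an in-memory loopback backend stands in for a real transport so that the example is self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct transport; /* defined below */

/* Hypothetical function table; each backend (TCP, InfiniBand, ...)
 * would provide its own instance. */
struct transport_ops {
    int (*connect)(struct transport *t, const char *address);
    int (*disconnect)(struct transport *t);
    int (*send)(struct transport *t, const void *data, size_t len);
    int (*recv)(struct transport *t, void *data, size_t cap, size_t *len);
};

struct transport {
    const struct transport_ops *ops;
    void *context;             /* opaque user pointer */
    int connected;
    unsigned char pending[64]; /* used by the loopback backend only */
    size_t pending_len;
};

/* Loopback backend: send() stores one message, recv() hands it back. */
static int loopback_connect(struct transport *t, const char *address)
{
    (void)address; /* a real backend would resolve and dial this */
    t->connected = 1;
    return 0;
}

static int loopback_disconnect(struct transport *t)
{
    t->connected = 0;
    t->pending_len = 0;
    return 0;
}

static int loopback_send(struct transport *t, const void *data, size_t len)
{
    if (!t->connected || len > sizeof t->pending)
        return -1;
    memcpy(t->pending, data, len);
    t->pending_len = len;
    return 0;
}

static int loopback_recv(struct transport *t, void *data, size_t cap,
                         size_t *len)
{
    if (!t->connected || t->pending_len == 0 || t->pending_len > cap)
        return -1;
    memcpy(data, t->pending, t->pending_len);
    *len = t->pending_len;
    t->pending_len = 0;
    return 0;
}

const struct transport_ops loopback_ops = {
    loopback_connect, loopback_disconnect, loopback_send, loopback_recv
};

/* Bind a transport object to a backend and clear its state. */
void transport_init(struct transport *t, const struct transport_ops *ops)
{
    assert(t != NULL && ops != NULL);
    memset(t, 0, sizeof *t);
    t->ops = ops;
}

void transport_set_context(struct transport *t, void *context)
{
    t->context = context;
}

void *transport_get_context(const struct transport *t)
{
    return t->context;
}
```

With this layout, calling code only ever goes through t->ops, so swapping the loopback table for an InfiniBand one would not change the caller. The server-side functions (bind, unbind, register_callback) could be added to the same table in the same manner.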

Further requirements were that a server must be able to handle multiple connections and that the implementation should use the RDMA write or read functionality of InfiniBand to provide high throughput and low latency.

KIARA InfiniBand: High Level Workflow

These requirements led to the following concept, shown as a sequence diagram, of what the flow of sending and receiving should look like:

[Figure: KIARA InfiniBand application workflow]

It includes the following steps:

  1. With the bind() function, the server starts to listen for incoming client requests / connections.
  2. With the connect() function, the client establishes an initial TCP connection and uses it to exchange the information needed to later send and receive over InfiniBand: Local Identifier (LID), Queue Pair Number (QPN) and Packet Sequence Number (PSN).
  3. The client-side user then passes the data to be sent to the server to the send() function. This function does not send the data itself, but rather the information about where the data lies in memory. This information is sent to the server via InfiniBand. As soon as the server receives such a message, it performs an RDMA read to fetch the client’s data.
  4. When the server-side user calls the callback() or the recv() function, the application checks whether the RDMA read has already finished and, if so, returns the data to the user.
  5. Once the server-side user has processed the client data and computed a result, the user passes this result to the send() function to send it back to the client. What follows is a repetition of steps 3 and 4 with the server as the initiating peer.
  6. At the end, the client finishes with the disconnect() function and the server with the unbind() function.
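Step 3 relies on a small message that describes where the payload lies in memory. The post does not spell out its fields. In the verbs API, an RDMA read work request needs the remote virtual address and the remote key (rkey) of the registered memory region, and the reader also has to know how many bytes to fetch, so the sketch below assumes exactly these three values. The struct name, the function names and the 16-byte big-endian layout are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor the sender transmits instead of the payload. */
struct rdma_buf_desc {
    uint64_t addr;   /* virtual address of the registered buffer */
    uint32_t rkey;   /* remote key of the memory region */
    uint32_t length; /* payload size in bytes */
};

#define RDMA_BUF_DESC_WIRE_SIZE 16

/* Store the low nbytes of v at p, most significant byte first. */
static void put_be(unsigned char *p, uint64_t v, size_t nbytes)
{
    for (size_t i = 0; i < nbytes; i++)
        p[i] = (unsigned char)(v >> (8 * (nbytes - 1 - i)));
}

/* Read nbytes from p as a big-endian unsigned integer. */
static uint64_t get_be(const unsigned char *p, size_t nbytes)
{
    uint64_t v = 0;
    for (size_t i = 0; i < nbytes; i++)
        v = (v << 8) | p[i];
    return v;
}

/* Encode the descriptor into a byte-order independent wire format. */
void rdma_buf_desc_encode(const struct rdma_buf_desc *d,
                          unsigned char out[RDMA_BUF_DESC_WIRE_SIZE])
{
    assert(d != NULL && out != NULL);
    put_be(out, d->addr, 8);
    put_be(out + 8, d->rkey, 4);
    put_be(out + 12, d->length, 4);
}

/* Decode a descriptor; the receiver would then issue the RDMA read. */
void rdma_buf_desc_decode(const unsigned char in[RDMA_BUF_DESC_WIRE_SIZE],
                          struct rdma_buf_desc *d)
{
    assert(d != NULL && in != NULL);
    d->addr = get_be(in, 8);
    d->rkey = (uint32_t)get_be(in + 8, 4);
    d->length = (uint32_t)get_be(in + 12, 4);
}
```

Encoding the fields byte by byte avoids depending on struct padding or on both hosts sharing the same endianness. On the receiving side, the decoded address and rkey would be placed into the RDMA read work request, and the length into its scatter/gather entry.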

KIARA InfiniBand: Detailed Workflow

The following depiction of the workflow goes a little further beneath the surface and shows which internal functions are called and what they do.

[Figure: detailed KIARA InfiniBand application workflow]

This should give you an idea of how data is transferred within the InfiniBand part of the KIARA transport stack. The code of the actual library implementation, as well as an example implementation of a client and a server, can be found at the end of this blog post. A more detailed explanation of that code will follow in a later blog post.

Source Code

Related Blog Post