With the advent of large scale architectures came a need to improve the distribution of requests to optimize the throughput of the system while keeping a minimum response time. This is especially true for large web services. Load balancing is the ability to make many servers participate in the same service and do the same tasks.
The goal of this post is to explain the different approaches in traditional load balancing as well as a list of existing software. The last section will be about the integration of these approaches in a cloud-environment as nowadays the large scale architecture described in the previous paragraph may be entirely cloud-based. This blog post is not meant to be an exhaustive study of Load balancing as this is a mature topic with a lot of research and available products, but rather tries to be an introduction for someone who might need to use Load Balancing in his project and would like to have knowledge of the basic types of load balancers as well as a list of the most well-known products. To investigate further, a list of useful links is provided at the end of the post.
Load Balancing is often confused with high-availability as with the growing number of servers, risk of failure anywhere increases and must be addressed, and the ability to maintain unaffected services during these failures is also part of a load-balancer’s job, redirecting requests to working resources.
The focus of this post will be on Load-Balancing HTTP applications, which is one of the most classic applications of load balancing.
Load balancing approaches
DNS load balancing is probably the technique which is the easiest to implement. When accessing a service through an address, a DNS server is tasked to translate the address into a comprehensible IP. Through this URL translation, the DNS can select any node from the cluster it manages based on its scheduling policy. It also provides a validity period (Time-To-Live), used to cache the translation. After the expiry of this TTL, the next request is routed again to the DNS server. Round-Robin is the simplest policy to implement, so the addresses are returned by the server in a rotating order.
Example of DNS load-balancing
host -t a www.google.com
www.google.com has address 18.104.22.168
www.google.com has address 22.214.171.124
www.google.com has address 126.96.36.199
Using a round-robin algorithm, each request is routed to one of these different IP.
In this approach, the load-balancing architecture consists of a hardware or software equipment installed in a dedicated frond-end server that will work at the network packets level. This type of LB is also called Layer 3/4 LB, distributing requests based upon data found in network and transport layer protocols such as TCP or UDP. They will act on routing, using one of the following methods: Direct Routing (the LB routes the same service address through different local, physical servers on the same network segment), Tunneling (tunnels are established between the LB and the servers, so they can be located on remote networks) or NAT (the user connects to a virtual destination address, which the load balancer translates to one of the servers’ addresses).
Application level LBs, also called Layer 7 LBs, act as reverse proxies and distribute requests based upon data found in application layer protocols such as HTTP. They provide a first level of security by only forwarding what they understand. They can also be combined with the previous type of Load Balancer to ensure a fine-grain request distribution.
Example of an architecture using both Layer 4 and Layer 7 LB.
Historically, most of the offers in the Load Balancing sectors come from major hardware network vendors such as Big-IP, Juniper and F5 but recently software load-balancers are increasingly used, especially in a cloud environment where the network might be virtual. As the number of existing Load-Balancers is huge we chose to focus on a handful of them, especially those released under an Open Source license.
Layer-4 capable software LB
IPVS is built in the Linux kernel, and thus does not suffer from context switching between user space and kernel space, which introduces delays, especially under heavy traffic with many short lived connections.
HAProxy is an hybrid load balancer both capable of Layer 4 (TCP) and Layer 7 (HTTP) Load-Balancing. It implements an event-driven, single-process model which enables support for very high number of simultaneous connections. The idea behind this choice, which dates back to the early versions of the tool, is that because of memory limits, system scheduler limits and lock contention, multi-process/multi-threaded models are not able to cope with thousands of simultaneous connections. Since version 1.5 it supports SSL connections.
Layer-7 capable software LB
Primarily built as a lightweight HTTP server, nginx also serves quite well as an HTTP(S) load balancer. Of the listed options, nginx provides the most number of features, including many options for caching and file serving.
Through the module mod_proxy_balancer available since Apache 2.1, Apache can be used an HTTP Load Balancer retrieving requested pages from two or more backend web servers and delivering them to users, while keeping track of sessions, which allows a single user to always deal with the same backend webserver.
Between nginx and HAProxy, Pound is a lightweight HTTP-only load balancer. It offers many of the load balancing features of nginx without any of the web server capabilities and can thus be used behind any web server. This keeps Pound small and efficient.
Although primarily used as a reverse proxy cache, Varnish also includes functionality to act as a load balancer. It does not offer a great deal of configuration, but, if already using Varnish for caching, it is possible to also make use of its load balancing abilities to simplify an architecture and avoid using too many different components.
Load-Balancing in the Cloud
Many of the Infrastructure-as-a-Service management suites provide their own component dedicated to Load Balancing, among them Apache CloudStack and Openstack. This component is in fact a connector between the virtual instances and a real load balancer such as the ones described in the previous paragraph. For instance OpenStack Neutron LoadBalancing works together with HAProxy. Cloud providers such as Amazon also provide their own LB services. The common point in all these LB are that they work “as a service”, that is a tenant can dynamically add a LB to a set of virtual servers to optimize request routing.