How To: Load Balancing & Failover With Dual / Multi WAN / ADSL / Cable Connections on Linux
By Angsuman Chakraborty, Gaea News Network
Saturday, October 20, 2007
In many locations, including but definitely not limited to India, a single ADSL / Cable connection can be unreliable and may not provide sufficient bandwidth for your purposes. One way to increase the reliability and bandwidth of your internet connection is to distribute the load (load balancing) across multiple connections. It is also imperative to have transparent fail-over so that routes are automatically adjusted depending on the availability of the connections. With load balancing and fail-over you can have reliable connectivity over two or more unreliable broadband connections (like BSNL or Tata Indicom in India). I present you with the simplest solution to a complex problem, with live examples.
Note: Load balancing doesn’t increase connection speed for a single connection. Its benefits are realized over multiple connections like in an office environment. The benefits of fail-over are however realized even in a single user environment.
The Linux load balancing mechanism discussed in the example below caches routes and doesn’t provide transparent fail-over support. There are two ways to add transparent fail-over: 1. compile and use a custom Linux kernel with Julian Anastasov’s kernel patches for dead gateway detection, or 2. use a user space script that monitors the connections and dynamically changes the routing information.
Julian Anastasov’s patches have two problems:
1. They work only when the first hop gateway is down. In many cases, including ours, the first hop gateway is the ADSL modem cum router, which is always up. So we need a more robust solution for our purposes.
2. You have to compile a custom kernel with patches. This is a somewhat complex procedure with a reasonable chance of breaking something. It also forces you to re-patch the kernel every time you update it. Overall I wouldn’t recommend the kernel patching route unless it is the only option. Even then, you should look for an RPM based solution (like the livna rpm for nVidia drivers) which does it automatically for you.
A better solution is to use a userspace program which monitors your connections and updates routes as necessary. I will provide a script which we use to constantly monitor our connections. It provides transparent fail-over support with two ADSL connections. It is fully configurable and can be used with any standard dual ADSL / Cable setup. It can also be easily modified to work with more than two connections. You can also use it to log the uptime / downtime of your connections like we did.
Let’s first discuss load balancing with two ADSL / Cable connections and then we will see how to provide transparent fail-over support. The ideas and script provided here can be easily used for more than two connections with minor modifications.
Requirements for Load Balancing multiple ADSL / Cable Connections
1. Obviously you need to have multiple (A)DSL or Cable connections in the first place. Log in as root for this job.
2. Find out the LAN / internal IP address of the modems. They may be the same, like 192.168.1.1.
Check if the internal / LAN IP addresses of both (or multiple) modems are the same. In that case, use the web / telnet interface of the modems to configure one of them to have a different internal IP address, preferably in a different network like 192.168.0.1 or 192.168.2.1 etc. If you are using multiple modems then you should configure each of them to be in a different subnet. This is important because you can then easily access the different modems from their web interfaces without having to bother about connecting to a modem through a particular interface. It is also important because you can then easily configure the interfaces to be associated with different netmasks / sub-networks.
3. Connect each modem to the computer using a different interface (eth0, eth1 etc.). You may be able to use the same interface but this guide doesn’t cover that. In short, you will make your life complicated by using the same interface or even different virtual interfaces. My recommendation is that you should use one interface per modem. Ethernet adapters are cheap, so don’t scrimp on them. This has the added benefit of redundancy should one adapter go bad down the road.
4. Configure the IP address of each interface to be in the same sub-network as the modem. For example my modems have IP addresses of 192.168.0.1 and 192.168.1.1. The corresponding addresses & netmasks of the interfaces are: 192.168.0.10 (netmask: 255.255.255.0) and 192.168.1.10 (netmask: 255.255.255.0).
5. Find out the following information before you proceed with the rest of the guide:
- IP address of external interfaces (interfaces connected to your modems). This is not the gateway address.
- Gateway IP address of each broadband connection. This is the first hop gateway, which could be your DSL modem IP address if it has been configured as the gateway following the tip below.
- Name, IP address & netmask of external interfaces like eth1, eth2 etc. My external interfaces are eth1 & eth2.
- Relative weights you want to assign to each connection. My Tata connection is 4 times faster than my BSNL connection, so I assign a weight of 4 to Tata and 1 to BSNL. You must use low positive integer values for weights. For connections of the same speed, weights of 1 & 1 are appropriate. The weights determine how the load is balanced across multiple connections. In my case Tata is 4 times as likely as BSNL to be used as the route for a particular site.
Note: Refer to Netmask guide for details on netmasks.
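Steps 4 and 5 above lend themselves to a quick sanity check before you touch any routing. The following is a minimal bash sketch, assuming dotted-quad addresses without leading zeros; the helper names ip_to_int, same_subnet and route_share are made up for this illustration and are not part of any tool mentioned in this guide. It checks that an interface address and its modem share a subnet, and shows roughly what share of new routes a pair of weights implies.

```shell
#!/bin/bash
# Sketch only: all function names here are hypothetical helpers.

# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
    local a b c d
    IFS=. read -r a b c d <<< "$1"
    echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

# Succeed if address $1 and address $2 are in the same network under netmask $3.
same_subnet() {
    local ip gw mask
    ip=$(ip_to_int "$1"); gw=$(ip_to_int "$2"); mask=$(ip_to_int "$3")
    [ $(( ip & mask )) -eq $(( gw & mask )) ]
}

# Print the approximate percentage of new routes each of two weights attracts
# (integer division, so the figures are rounded down).
route_share() {
    local w1=$1 w2=$2 total=$(( $1 + $2 ))
    echo "$(( 100 * w1 / total ))% $(( 100 * w2 / total ))%"
}

# Example values taken from this guide.
if same_subnet 192.168.1.10 192.168.1.1 255.255.255.0; then
    echo "interface and modem share a subnet"
fi
route_share 1 4    # prints "20% 80%"
```

With weights of 1 and 4, about one in five new destinations should go out through the slower connection. Because routes are cached per destination, the split you observe in practice is only approximate.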
Optional step
Check the tips on configuring (A)DSL modems. They are not required for using this guide. However they are beneficial in maximizing your benefits.
How to setup default load balancing for multiple ADSL / Cable connections
Unlike other guides on this topic I will use a real example - the configuration on our internal network. So to begin with here are the basic data for my network:
#IP address of external interfaces. This is not the gateway address.
IP1=192.168.1.10
IP2=192.168.0.10

#Gateway IP addresses. This is the first (hop) gateway, could be your router IP
#address if it has been configured as the gateway
GW1=192.168.1.1
GW2=192.168.0.1

# Relative weights of routes. Keep this to a low integer value. I am using 4
# for TATA connection because it is 4 times faster
W1=1
W2=4

# Broadband providers name; use your own names here.
NAME1=bsnl
NAME2=tata
You must change the example below to use your own IP addresses and other details. Even with that inconvenience a real example is much easier to understand than examples with complex notations. The example given below is copy-pasted from our intranet configuration. It works perfectly as advertised.
Note: In this step fail-over is not addressed. It is provided later with a script which runs on startup.
First you need to register two (or more) routing tables in /etc/iproute2/rt_tables. Open the file and make changes similar to what is shown below. I added the following for my two connections:
1 bsnl
2 tata
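If you prefer not to edit the file by hand, the same change can be scripted. This is a minimal sketch, not something from our setup: add_rt_table is a hypothetical helper, and RT_FILE defaults to a scratch file so that you can try it without touching /etc. Point RT_FILE at /etc/iproute2/rt_tables (as root) once you are happy with the result. Avoid the IDs that the stock file already reserves (0, 253, 254 and 255).

```shell
#!/bin/bash
# Append "<id> <name>" to a routing-table file unless that name is already registered.
add_rt_table() {
    local file=$1 id=$2 name=$3
    if grep -Eq "^[[:space:]]*[0-9]+[[:space:]]+${name}[[:space:]]*$" "$file" 2>/dev/null; then
        return 0    # already present, nothing to do
    fi
    printf '%s\t%s\n' "$id" "$name" >> "$file"
}

# Scratch file by default; set RT_FILE=/etc/iproute2/rt_tables for real use.
RT_FILE=${RT_FILE:-$(mktemp)}
add_rt_table "$RT_FILE" 1 bsnl
add_rt_table "$RT_FILE" 2 tata
add_rt_table "$RT_FILE" 1 bsnl    # repeated call is a no-op
cat "$RT_FILE"
```

Making the helper idempotent means it is safe to leave in a startup script: running it on every boot does not pile up duplicate entries.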
To add a default load balancing route for our outgoing traffic using our dual internet connections (ADSL broadband connections from BSNL & Tata Indicom) here are the lines I included in rc.local file:
ip route add 192.168.1.0/24 dev eth1 src 192.168.1.10 table bsnl
ip route add default via 192.168.1.1 table bsnl
ip route add 192.168.0.0/24 dev eth2 src 192.168.0.10 table tata
ip route add default via 192.168.0.1 table tata
ip rule add from 192.168.1.10 table bsnl
ip rule add from 192.168.0.10 table tata
ip route add default scope global nexthop via 192.168.1.1 dev eth1 weight 1 nexthop via 192.168.0.1 dev eth2 weight 4
Adding them to rc.local ensures that they are executed automatically on startup. You can also run them manually from the command line.
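To adapt the lines above to your own network without retyping every address, you can generate them from the variables listed earlier. The sketch below only prints the commands so that you can review them first. NET1 and NET2 are names I am introducing here for the two modem subnets, and print_routes is a hypothetical helper; everything else uses the values from our network.

```shell
#!/bin/bash
# Values from the example network in this guide; replace them with your own.
IP1=192.168.1.10;    IP2=192.168.0.10
GW1=192.168.1.1;     GW2=192.168.0.1
NET1=192.168.1.0/24; NET2=192.168.0.0/24    # assumed names for the modem subnets
EXTIF1=eth1;         EXTIF2=eth2
W1=1;                W2=4
NAME1=bsnl;          NAME2=tata

# Print (do not execute) the routing commands for the two connections.
print_routes() {
    echo "ip route add $NET1 dev $EXTIF1 src $IP1 table $NAME1"
    echo "ip route add default via $GW1 table $NAME1"
    echo "ip route add $NET2 dev $EXTIF2 src $IP2 table $NAME2"
    echo "ip route add default via $GW2 table $NAME2"
    echo "ip rule add from $IP1 table $NAME1"
    echo "ip rule add from $IP2 table $NAME2"
    echo "ip route add default scope global nexthop via $GW1 dev $EXTIF1 weight $W1 nexthop via $GW2 dev $EXTIF2 weight $W2"
}

print_routes
```

Once the printed commands look right, you can paste them into rc.local, or run `print_routes | sh` as root to apply them immediately.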
This completes the load balancing part. Let’s now see how we can achieve fail-over so that the routes are automatically changed when one or more connections go down, and changed again when they come back up. To do this magic I used a script.
How to setup fail-over over multiple load balanced ADSL / Cable connections
Please follow the steps below and preferably in the same order:
- First download the script which checks for and provides fail-over over dual ADSL / Cable internet connections and save it to the /usr/sbin directory (or any other directory which is mounted and available while the OS is loading).
- Change the file permissions to 755:
chmod 755 /usr/sbin/gwping
- Open the file (as root) in an editor like vi or gedit and edit the following parameters for your environment:
#IP Address or domain name to ping. The script relies on the domain being pingable and always available
TESTIP=www.yahoo.com

#Ping timeout in seconds
TIMEOUT=2

# External interfaces
EXTIF1=eth1
EXTIF2=eth2

#IP address of external interfaces. This is not the gateway address.
IP1=192.168.1.10
IP2=192.168.0.10

#Gateway IP addresses. This is the first (hop) gateway, could be your router IP
#address if it has been configured as the gateway
GW1=192.168.1.1
GW2=192.168.0.1

# Relative weights of routes. Keep this to a low integer value. I am using 4
# for TATA connection because it is 4 times faster
W1=1
W2=4

# Broadband providers name; use your own names here.
NAME1=BSNL
NAME2=TATA

#No of repeats of success or failure before changing status of connection
SUCCESSREPEATCOUNT=4
FAILUREREPEATCOUNT=1

Note: In my environment, four consecutive successes indicate that the gateway is up and one failure indicates that the gateway went down. You may want to modify these values to better match your environment.
- Add the following line to the end of /etc/rc.local file:
nohup /usr/sbin/gwping &
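To make the role of SUCCESSREPEATCOUNT and FAILUREREPEATCOUNT concrete, here is a minimal sketch of the counting logic for a single connection. It is an illustration written for this explanation, not an excerpt from the gwping script; record_probe, STATE and COUNT are hypothetical names, and the probe results are simulated rather than taken from real pings.

```shell
#!/bin/bash
# Hypothetical sketch of the success / failure counting; not the gwping script itself.
SUCCESSREPEATCOUNT=4
FAILUREREPEATCOUNT=1

STATE=up    # last confirmed status of the connection: up or down
COUNT=0     # consecutive probe results that contradict STATE

# Feed one probe result (0 = ping succeeded, non-zero = ping failed).
# Returns 0 when the confirmed status has just changed, 1 otherwise.
record_probe() {
    local result=$1 observed=down needed
    [ "$result" -eq 0 ] && observed=up
    if [ "$observed" = "$STATE" ]; then
        COUNT=0     # result agrees with the current status, reset the streak
        return 1
    fi
    COUNT=$(( COUNT + 1 ))
    if [ "$observed" = up ]; then
        needed=$SUCCESSREPEATCOUNT
    else
        needed=$FAILUREREPEATCOUNT
    fi
    if [ "$COUNT" -ge "$needed" ]; then
        STATE=$observed
        COUNT=0
        return 0
    fi
    return 1
}

# Simulated probe results: one failure followed by four successes.
for r in 1 0 0 0 0; do
    if record_probe "$r"; then
        echo "connection is now $STATE"
    fi
done
```

The simulated run prints "connection is now down" after the first failure and "connection is now up" only after the fourth consecutive success. In a real monitor the result would come from a probe such as `ping -c 1 -W "$TIMEOUT" -I "$EXTIF1" "$TESTIP"`, run once per interface. Requiring more successes than failures makes the monitor quick to drop a bad connection and slow to trust one that is flapping.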
In the end my /etc/rc.local file has the following lines added in total:
ip route add 192.168.1.0/24 dev eth1 src 192.168.1.10 table bsnl
ip route add default via 192.168.1.1 table bsnl
ip route add 192.168.0.0/24 dev eth2 src 192.168.0.10 table tata
ip route add default via 192.168.0.1 table tata
ip rule add from 192.168.1.10 table bsnl
ip rule add from 192.168.0.10 table tata
ip route add default scope global nexthop via 192.168.1.1 dev eth1 weight 1 nexthop via 192.168.0.1 dev eth2 weight 4
nohup /usr/sbin/gwping &
An astute reader may note that the default setup with dual load balanced routing (7th line) is not really required, as the script is configured to force routing based on the current status the very first time it runs. However it is there to ensure proper routing before the script forces the routing for the first time, which takes about 40 seconds in my setup (can you tell why it takes 40 seconds the first time?).
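Forcing the routing based on the current status boils down to choosing one of three default routes. The sketch below shows one way this could be done; it is an assumption on my part about a reasonable approach, not a listing of what gwping does internally, and default_route_cmd is a hypothetical helper. It prints the command instead of running it.

```shell
#!/bin/bash
# Values from the example network in this guide.
GW1=192.168.1.1; GW2=192.168.0.1
EXTIF1=eth1;     EXTIF2=eth2
W1=1;            W2=4

# Print the default-route command matching the current status of the connections.
# $1 and $2 are "up" or "down" for the first and second connection.
default_route_cmd() {
    case "$1:$2" in
        up:up)   echo "ip route replace default scope global nexthop via $GW1 dev $EXTIF1 weight $W1 nexthop via $GW2 dev $EXTIF2 weight $W2" ;;
        up:down) echo "ip route replace default scope global via $GW1 dev $EXTIF1" ;;
        down:up) echo "ip route replace default scope global via $GW2 dev $EXTIF2" ;;
        *)       return 1 ;;    # both connections down: leave the routing table alone
    esac
}

default_route_cmd up down
```

Using `ip route replace` rather than `ip route add` means the command succeeds whether or not a default route already exists, which is convenient when the same code runs at startup and on every later status change.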
Concluding thoughts
In the process of finding and coding the simple solution above, I read several documents on routing, including the famous LARTC how-to (many of whose commands didn’t work as described on my Fedora Core system) & nano.txt among several others. I think I have described the simplest possible solution for load balancing and transparent failover of two or more DSL / Cable connections from one or more providers where channel bonding is not provided upstream (it requires cooperation from one or more DSL providers), which is the most common scenario. I would welcome suggestions and improvements to this document.
The solution has been well tested under multiple real and artificial load conditions and works extremely well, with users never realizing when a connection went down or came back up again.
Networking is a complex thing and it is conceivable that you may run into issues not covered here. Feel free to post your problems and solutions here. However, while I would like to, I will not be able to debug and solve individual problems due to time constraints.
I may however be able to offer useful suggestions to your unique problems. It may however be noted that I respond well to Café Estima Blend™ by Starbucks and move much quicker on my todo list. It is also great as a token of appreciation for my hard work. The “velvety smooth and balanced with a roasty-sweet flavor this blend of coffees is a product of the relationships formed between” us.
In a followup article I discussed how to configure single / dual / multiple ADSL / cable connections, firewall, gateway / NAT With Shorewall Firewall.
July 13, 2010: 8:09 pm
Hi Angsuman, Is there a way to maintain failover feature but disable load balancing? Meaning I only allow I am thinking to increase the weight of first line Pls advise ..Thanks Lai
Ajayan
June 15, 2010: 4:04 pm
Very good article. But when I try to download the script, it shows a new page with the script, and I am not able to view the script properly. If anybody has an exact copy of the script, please upload it once again.
May 22, 2010: 5:53 am
Hello,
Gabriel
May 10, 2010: 7:41 pm
Congratulations for the great article, it helped me a lot!!!
May 10, 2010: 8:34 am
If you want to look at a product-based approach there is Elfiq - https://www.elfiq.com - simpler to have a product deliver this in many scenarios
RichWalu
May 4, 2010: 1:21 pm
This is a great piece of work. I am a newbie in Linux and my server is running Suse10. Please, can someone point out the equivalent of /etc/rc.local and /usr/sbin in the Suse10 environment? Thanx
Jimmy
March 25, 2010: 7:15 am
Hi amitbiswas, What part do you mean? I’ve set up the load balancing and fail over just like the howto from Angsuman. Besides that I have shorewall running and used the tcrules from shorewall (just google tcrules + shorewall) to mark specific data, which is then routed through a specific internet connection. The only problem I see is that when you specify too many things in there you don’t use the load balancing really optimally, because I’ve set it up myself and the load balancer handles all the traffic that is not specified.
amitbiswas
March 16, 2010: 6:00 am
Hi Angsuman, Great howto, perfect logic. Hi Jimmy, Could you please give me some details on how you have deployed this on your ubuntu. Thanks in advance.
Jimmy
January 27, 2010: 8:33 am
Hello, Thanks for the great howto. I have set up a ubuntu server and used the howto to set up load balancing and fail over just like the howto. Everything works great, except for one thing: when I use an internet radio or my accounting software with citrix or MSN messenger, the connection is lost every now and then (happens a couple of times an hour, totally randomly), one day more than another. Because I have 1 slow ADSL connection and one fast cable broadband connection I have set the weights to 1 and 20; we have around 14 computers connected to the lan. I also put a line in the tcrules of my shorewall firewall which (I think) routes, for example, all the internet radio using port 8000 through one specific internet connection. Do you have any idea how I can fix this, or what may cause the problem?
Joe
January 26, 2010: 8:46 am
My Messenger keeps reconnecting after I applied this solution for fail-over, what should I do?
M.Azath
January 19, 2010: 4:45 am
Can you give me some pdf material which is very relevant to load balancing in internet.
Travis
January 14, 2010: 8:26 pm
Thank you for the above scripts, I’ve found them very useful. I do however have a problem. My staff make use of remote desktop from our LAN out the broadband services to the greater internet. When using the above script, every 600 seconds or so the remote desktop session drops out and re-connects? I read that I need to: is this correct? Whilst it improved my drop out issue from every 600 seconds to every 4 hours, am I missing something or do you all experience the same problem? Thanks,
Fausto
December 14, 2009: 7:10 pm
Thank you very much for this! It was perfect with two dedicated links that I manage. Success for you!
Polleke
November 26, 2009: 11:25 am
I have used (a modified version) for a three-way connection.. it needed some hacking.. I will see if I can post it somewhere. Thnx for the script. It _would_ however be so much more configurable if it was split up into a config file /etc/gwping/gwping.conf, the script itself and the aforementioned init.d script.. I will rewrite my version next august and try to keep you guys posted
yermet
November 24, 2009: 3:19 am
Thank you for this post!! nohup /usr/sbin/lbinit 3 & And it works!! Thank you again!!
balaji
November 19, 2009: 11:48 pm
hi
Jaime
November 15, 2009: 4:10 pm
Thank you for this blog post. It really helped me a lot to setup load balancing and failover on my machine.
Gregorio
Pravin Oza
September 29, 2009: 10:50 am
I have a 10 mbps internet leased line used in a college. I want to divide the bandwidth as 6:4 mbps between two buildings on the same campus. I have a Cisco 2811 router. Which router and L2 switch configuration is required to divide the bandwidth between the 2 buildings?
Jan Marc Hoffmann
August 27, 2009: 1:21 pm
Heya! This guide is very nice, but it’s quite complicated. Why do you use so many interfaces and subnets? It’s not necessary. Here is an easy setup: Balancer Gateway Config: echo "1" > /proc/sys/net/ipv4/ip_forward And add some dns proxy or some iptables rules for dns… That’s all. Works like a charm. You won’t need tons of interfaces, nics or routing tables. greetings
zaw hkawng
August 12, 2009: 6:52 am
Hi
zaw hkawng
August 12, 2009: 6:40 am
Hi
Thiha
July 1, 2009: 12:25 pm
Could you please tell me how to test whether our linux box is acting as a load balancing server or not. Although I use the traceroute command, the output does not show multiple paths.
Walter
March 30, 2009: 8:16 pm
Hi, Thanks,
Daniel
March 10, 2009: 2:05 am
Hey. I managed to set up load balancing between two interfaces too, made firewall rules, etc. and it all works pretty fine, except for one fact, which I’d like to share with you to see if somebody had the same problem. Thanks, Daniel (and thanks angsuman for your contribution to this topic)
Bernardo Burnier
February 28, 2009: 6:18 pm
Who wants to mess with his kernel and deal with complex and poorly documented solutions, or even to fork out some big buck$ to solve this problem, when it can be done, as you presented here, with a smart little bash script on a linux box?
Wilson
February 24, 2009: 4:53 am
Great job. I would like to implement it with 3 ISPs. How can it be done? Thanks in advance.
Martin Pusch
February 20, 2009: 5:30 pm
Thank you for sharing your experience with others. I am working in Africa, where the Internet connections are not always easy to manage, and your information helped me a lot. Martin
Murali Krishna
February 18, 2009: 11:36 am
Hi Angsuman,
Lai