How To: Load Balancing & Failover With Dual / Multi WAN / ADSL / Cable Connections on Linux

By Angsuman Chakraborty, Gaea News Network
Saturday, October 20, 2007

In many locations, including but definitely not limited to India, a single ADSL / Cable connection can be unreliable and may not provide sufficient bandwidth for your purposes. One way to increase the reliability and bandwidth of your internet connection is to distribute the load (load balancing) across multiple connections. It is also imperative to have transparent fail-over, so that routes are automatically adjusted depending on the availability of the connections. With load balancing and fail-over you can have reliable connectivity over two or more unreliable broadband connections (like BSNL or Tata Indicom in India). I present you with the simplest solution to a complex problem, with live examples.

Note: Load balancing doesn’t increase connection speed for a single connection. Its benefits are realized over multiple connections like in an office environment. The benefits of fail-over are however realized even in a single user environment.

The load balancing mechanism in Linux, discussed with an example below, caches routes and doesn’t provide transparent fail-over support. There are two ways to add transparent fail-over: 1. compiling and using a custom Linux kernel with Julian Anastasov’s kernel patches for dead gateway detection, or 2. using a user space script to monitor the connections and dynamically change routing information.

Julian Anastasov’s patches have two problems:
1. They work only when the first hop gateway is down. In many cases, including ours, the first hop gateway is the ADSL modem cum router, which is always up. So we need a more robust solution for our purposes.

2. You have to compile a custom kernel with the patches. This is a somewhat complex procedure with a reasonable chance of screwing something up. It also forces you to re-patch the kernel every time you decide to update it. Overall I wouldn’t recommend going the kernel patching route unless it is the only option. Even in that case you should look for an rpm based solution (like the livna rpm for nVidia drivers) which does it automatically for you.

A better solution is to use a userspace program which monitors your connections and updates routes as necessary. I will provide a script which we use to constantly monitor our connections. It provides transparent fail-over support with two ADSL connections. It is fully configurable and can be used with any standard dual ADSL / Cable setup. It can also be easily modified for use with more than two connections. You can also use it to log the uptime / downtime of your connections, like we did.

Let’s first discuss load balancing with two ADSL / Cable connections and then we will see how to provide transparent fail-over support. The ideas and script provided here can be easily used for more than two connections with minor modifications.

Requirements for Load Balancing multiple ADSL / Cable Connections

1. Obviously you need to have multiple (A)DSL or Cable connections in the first place. Log in as root for this job.

2. Find out the LAN / internal IP address of the modems. They may be the same, like 192.168.1.1.
Check whether the internal / LAN IP addresses of both (or all) modems are the same. In that case, use the web / telnet interface of the modems to configure one of them to have a different internal IP address, preferably in a different network, like 192.168.0.1 or 192.168.2.1. If you are using more than two modems, configure each of them on a different subnet. This is important because you can then easily access each modem from its web interface without having to bother about connecting through a particular interface. It is also important because you can then easily configure the interfaces to be associated with different netmasks / sub-networks.

3. Connect each modem to the computer using a different interface (eth0, eth1 etc.). You may be able to use the same interface, but this guide doesn’t cover that. In short, you will make your life complicated by using the same interface or even different virtual interfaces. My recommendation is that you use one interface per modem. Ethernet adapters are cheap, so don’t scrimp on them. This has the added benefit of redundancy should one adapter go bad down the road.

4. Configure the IP address of each interface to be in the same sub-network as the modem. For example my modems have IP addresses of 192.168.0.1 and 192.168.1.1. The corresponding addresses & netmasks of the interfaces are: 192.168.0.10 (netmask: 255.255.255.0) and 192.168.1.10 (netmask: 255.255.255.0).

5. Find out the following information before you proceed with the rest of the guide:

  1. IP address of external interfaces (interfaces connected to your modems). This is not the gateway address.
  2. Gateway IP address of each broadband connection. This is the first hop gateway; it could be your DSL modem’s IP address if the modem has been configured as the gateway following the tip below.
  3. Name, IP address & netmask of external interfaces like eth1, eth2 etc. My external interfaces are eth1 & eth2.
  4. Relative weights you want to assign to each connection. My Tata connection is 4 times faster than my BSNL connection, so I assign a weight of 4 to Tata and 1 to BSNL. You must use low positive integer values for weights. For connections of the same speed, weights of 1 & 1 are appropriate. The weights determine how the load is balanced across the connections. In my case Tata is 4 times as likely as BSNL to be used as the route for a particular site.
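
As a rough illustration of what the weights mean, the small sketch below turns the two weights into the approximate share of new destinations each connection can be expected to carry. The helper name share is made up for this example, and the figures are only an expectation over many destinations, since routes are cached per destination.

```shell
#!/bin/sh
# share WEIGHT TOTAL: integer percentage of new destinations expected
# to be routed over a link with the given weight (hypothetical helper).
share() {
    echo $(( $1 * 100 / $2 ))
}

W1=1   # BSNL
W2=4   # Tata
TOTAL=$((W1 + W2))

echo "bsnl: $(share "$W1" "$TOTAL")%"
echo "tata: $(share "$W2" "$TOTAL")%"
```

With weights of 1 and 4 this prints 20% for BSNL and 80% for Tata.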

Note: Refer to Netmask guide for details on netmasks.
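
For step 4, here is a minimal sketch of assigning the addresses with the ip tool, assuming the interface names (eth1, eth2) and addresses of my network; a netmask of 255.255.255.0 is written as /24. The run helper is something I introduce only for this example: it prints the commands unless APPLY=1 is set, so you can check them first. Your distribution’s own network configuration files are the better place to make such settings permanent.

```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed.
# Set APPLY=1 (as root) to run them for real.
run() {
    if [ "${APPLY:-0}" = "1" ]; then "$@"; else echo "$@"; fi
}

# Put each interface in the same /24 network as the modem it is wired to.
# Interface names and addresses are those of the example network.
configure_interfaces() {
    run ip addr add 192.168.1.10/24 dev eth1
    run ip link set eth1 up
    run ip addr add 192.168.0.10/24 dev eth2
    run ip link set eth2 up
}

configure_interfaces
```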

Optional step
Check the tips on configuring (A)DSL modems. They are not required for using this guide. However they are beneficial in maximizing your benefits.

How to setup default load balancing for multiple ADSL / Cable connections

Unlike other guides on this topic I will use a real example - the configuration on our internal network. So to begin with here are the basic data for my network:

#IP address of external interfaces. This is not the gateway address.
IP1=192.168.1.10
IP2=192.168.0.10

#Gateway IP addresses. This is the first (hop) gateway, could be your router IP
#address if it has been configured as the gateway
GW1=192.168.1.1
GW2=192.168.0.1

# Relative weights of routes. Keep this to a low integer value. I am using 4
# for TATA connection because it is 4 times faster
W1=1
W2=4

# Broadband providers name; use your own names here.
NAME1=bsnl
NAME2=tata

You must change the example below to use your own IP addresses and other details. Even with that inconvenience a real example is much easier to understand than examples with complex notations. The example given below is copy-pasted from our intranet configuration. It works perfectly as advertised.

Note: In this step fail-over is not addressed. It is provided later with a script which runs on startup.

First you need to define two (or more) routing tables by naming them in /etc/iproute2/rt_tables. Open the file and make changes similar to what is shown below. I added the following for my two connections:

1 bsnl
2 tata
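
If you would rather script this change than edit the file by hand, something along the following lines should work. It is only a sketch: the add_table helper and the RT_TABLES variable are names I made up for the example, and by default it writes to a scratch file so that it is safe to try. Point RT_TABLES at /etc/iproute2/rt_tables (as root) to apply it for real.

```shell
#!/bin/sh
# add_table FILE ID NAME: append "ID NAME" to FILE unless a table called
# NAME is already listed there, so running it twice is harmless.
add_table() {
    file=$1 id=$2 name=$3
    if ! grep -qE "^[[:space:]]*[0-9]+[[:space:]]+${name}([[:space:]]|\$)" "$file" 2>/dev/null; then
        printf '%s %s\n' "$id" "$name" >> "$file"
    fi
}

# Scratch file by default; set RT_TABLES=/etc/iproute2/rt_tables to apply.
RT_TABLES=${RT_TABLES:-$(mktemp)}

add_table "$RT_TABLES" 1 bsnl
add_table "$RT_TABLES" 2 tata
cat "$RT_TABLES"
```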

To add a default load balancing route for our outgoing traffic using our dual internet connections (ADSL broadband connections from BSNL & Tata Indicom) here are the lines I included in rc.local file:

ip route add 192.168.1.0/24 dev eth1 src 192.168.1.10 table bsnl
ip route add default via 192.168.1.1 table bsnl
ip route add 192.168.0.0/24 dev eth2 src 192.168.0.10 table tata
ip route add default via 192.168.0.1 table tata
ip rule add from 192.168.1.10 table bsnl
ip rule add from 192.168.0.10 table tata
ip route add default scope global nexthop via 192.168.1.1 dev eth1 weight 1 nexthop via 192.168.0.1 dev eth2 weight 4

Adding them to rc.local ensures that they are executed automatically on startup. You can also run them manually from the command line.
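
The same commands can also be written once in terms of the variables listed earlier, which makes it easier to adapt them to your own network. The sketch below is one way to do that. NET1 and NET2, the setup_link function and the run helper are names I introduce here for illustration only. It prints the commands by default and executes them only when APPLY=1 is set.

```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed.
# Set APPLY=1 (as root) to run them for real.
run() {
    if [ "${APPLY:-0}" = "1" ]; then "$@"; else echo "$@"; fi
}

# Values from the example network above. NET1 and NET2 are names
# introduced for this sketch only.
EXTIF1=eth1; IP1=192.168.1.10; NET1=192.168.1.0/24; GW1=192.168.1.1; W1=1; NAME1=bsnl
EXTIF2=eth2; IP2=192.168.0.10; NET2=192.168.0.0/24; GW2=192.168.0.1; W2=4; NAME2=tata

# setup_link IFACE IP NET GW TABLE: fill the per-provider table and add a
# rule so that traffic sourced from IP is looked up in that table.
setup_link() {
    run ip route add "$3" dev "$1" src "$2" table "$5"
    run ip route add default via "$4" table "$5"
    run ip rule add from "$2" table "$5"
}

setup_link "$EXTIF1" "$IP1" "$NET1" "$GW1" "$NAME1"
setup_link "$EXTIF2" "$IP2" "$NET2" "$GW2" "$NAME2"

# Weighted multipath default route for everything else.
run ip route add default scope global \
    nexthop via "$GW1" dev "$EXTIF1" weight "$W1" \
    nexthop via "$GW2" dev "$EXTIF2" weight "$W2"
```

After applying the commands you can inspect the result with ip rule show, ip route show table bsnl and ip route show.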

This completes the load balancing part. Let’s now see how we can achieve fail-over, so that the routes are automatically changed when one or more connections go down and changed again when they come back up. To do this magic I used a script.

How to setup fail-over over multiple load balanced ADSL / Cable connections

Please follow the steps below and preferably in the same order:

  1. First download the script which checks for and provides fail-over over dual ADSL / Cable internet connections and save it to the /usr/sbin directory (or any other directory which is mounted and available while the OS is loading).
  2. Change the file permissions to 755:
    chmod 755 /usr/sbin/gwping
  3. Open the file (as root) in an editor like vi or gedit and edit the following parameters for your environment:

    #IP Address or domain name to ping. The script relies on the domain being pingable and always available
    TESTIP=www.yahoo.com

    #Ping timeout in seconds
    TIMEOUT=2

    # External interfaces
    EXTIF1=eth1
    EXTIF2=eth2

    #IP address of external interfaces. This is not the gateway address.
    IP1=192.168.1.10
    IP2=192.168.0.10

    #Gateway IP addresses. This is the first (hop) gateway, could be your router IP
    #address if it has been configured as the gateway
    GW1=192.168.1.1
    GW2=192.168.0.1

    # Relative weights of routes. Keep this to a low integer value. I am using 4
    # for TATA connection because it is 4 times faster
    W1=1
    W2=4

    # Broadband providers name; use your own names here.
    NAME1=BSNL
    NAME2=TATA

    #No of repeats of success or failure before changing status of connection
    SUCCESSREPEATCOUNT=4
    FAILUREREPEATCOUNT=1

    Note: In my environment four consecutive successes indicate that the gateway is up, and a single failure indicates that the gateway went down. You may want to modify these counts to better match your environment.

  4. Add the following line to the end of /etc/rc.local file:
    nohup /usr/sbin/gwping &

In the end my /etc/rc.local file has the following lines added in total:

ip route add 192.168.1.0/24 dev eth1 src 192.168.1.10 table bsnl
ip route add default via 192.168.1.1 table bsnl
ip route add 192.168.0.0/24 dev eth2 src 192.168.0.10 table tata
ip route add default via 192.168.0.1 table tata
ip rule add from 192.168.1.10 table bsnl
ip rule add from 192.168.0.10 table tata
ip route add default scope global nexthop via 192.168.1.1 dev eth1 weight 1 nexthop via 192.168.0.1 dev eth2 weight 4
nohup /usr/sbin/gwping &

An astute reader may note that the default setup with dual load balanced routing (7th line) is not strictly required, as the script is configured to force routing based on the current status the very first time. However it is there to ensure proper routing before the script forces the routing for the first time, which takes about 40 seconds in my setup (can you tell why it takes 40 seconds the first time?).
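
To give an idea of how such a monitor can work, here is a stripped-down sketch of the logic. It is not the gwping script itself, only an illustration under assumptions of my own: the function names, the run helper, the INTERVAL setting and the choice to start with both links assumed up are all made up for this example. Each link is probed with a ping sent from its own address, a link changes status only after the configured number of consecutive contrary results, and the default route is replaced whenever a status changes.

```shell
#!/bin/sh
# Illustrative sketch only; this is NOT the gwping script from this guide.
# Dry run by default: route changes are printed. Set APPLY=1 (as root) to apply.
run() {
    if [ "${APPLY:-0}" = "1" ]; then "$@"; else echo "$@"; fi
}

# Parameters as named in the guide; INTERVAL is an assumed extra.
TESTIP=www.yahoo.com
TIMEOUT=2
EXTIF1=eth1; IP1=192.168.1.10; GW1=192.168.1.1; W1=1
EXTIF2=eth2; IP2=192.168.0.10; GW2=192.168.0.1; W2=4
SUCCESSREPEATCOUNT=4
FAILUREREPEATCOUNT=1
INTERVAL=10   # assumed pause between probe rounds, in seconds

# probe SRC_IP: one ping sent from the given local address, so that the
# "ip rule add from ..." entries steer it out through that link.
probe() {
    ping -c 1 -W "$TIMEOUT" -I "$1" "$TESTIP" >/dev/null 2>&1
}

# next_state STATUS COUNT RESULT: print the new "STATUS COUNT" pair.
# STATUS is 1 (up) or 0 (down), COUNT is the number of consecutive probes
# contradicting STATUS, RESULT is 1 for a good probe and 0 for a failed one.
next_state() {
    status=$1 count=$2 result=$3
    if [ "$result" = "$status" ]; then
        count=0
    else
        count=$((count + 1))
        if [ "$status" = 1 ]; then need=$FAILUREREPEATCOUNT; else need=$SUCCESSREPEATCOUNT; fi
        if [ "$count" -ge "$need" ]; then status=$result; count=0; fi
    fi
    echo "$status $count"
}

# set_routes UP1 UP2: install the default route matching the link status.
# With both links down the existing route is left untouched.
set_routes() {
    if [ "$1" = 1 ] && [ "$2" = 1 ]; then
        run ip route replace default scope global \
            nexthop via "$GW1" dev "$EXTIF1" weight "$W1" \
            nexthop via "$GW2" dev "$EXTIF2" weight "$W2"
    elif [ "$1" = 1 ]; then
        run ip route replace default scope global via "$GW1" dev "$EXTIF1"
    elif [ "$2" = 1 ]; then
        run ip route replace default scope global via "$GW2" dev "$EXTIF2"
    fi
}

# monitor: endless probe loop; both links are assumed up at the start.
monitor() {
    s1=1 c1=0 s2=1 c2=0
    while :; do
        if probe "$IP1"; then r1=1; else r1=0; fi
        if probe "$IP2"; then r2=1; else r2=0; fi
        old="$s1$s2"
        set -- $(next_state "$s1" "$c1" "$r1"); s1=$1 c1=$2
        set -- $(next_state "$s2" "$c2" "$r2"); s2=$1 c2=$2
        [ "$old" = "$s1$s2" ] || set_routes "$s1" "$s2"
        sleep "$INTERVAL"
    done
}

# Demonstration: the route that would be installed if only link 1 is up.
set_routes 1 0

# Start the endless loop only when explicitly requested.
if [ "${RUN_MONITOR:-0}" = "1" ]; then monitor; fi
```

Run as is, the sketch only prints the single-gateway route for the first link. Setting RUN_MONITOR=1 starts the loop, and adding APPLY=1 (as root) lets it actually change the routes.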

Concluding thoughts
In the process of finding and coding the simple solution above, I read several documents on routing, including the famous lartc how-to (many of whose commands didn’t work as described on my Fedora Core system) and nano.txt, among several others. I think I have described the simplest possible solution for load balancing and transparent failover of two or more DSL / Cable connections from one or more providers where channel bonding is not provided upstream (it requires cooperation from one or more DSL providers), which is the most common scenario. I would welcome suggestions and improvements to this document.

The solution has been well tested under multiple real and artificial load conditions and works extremely well, with users never realizing when a connection went down or came back up again.

Networking is a complex thing and it is conceivable that you may run into issues not covered here. Feel free to post your problems and solutions here. However, while I would like to, I will not be able to debug and solve individual problems due to time constraints.

I may however be able to offer useful suggestions to your unique problems. It may however be noted that I respond well to Café Estima Blend™ by Starbucks and move much quicker on my todo list. It is also great as a token of appreciation for my hard work. The “velvety smooth and balanced with a roasty-sweet flavor this blend of coffees is a product of the relationships formed between” us.

In a followup article I discussed how to configure single / dual / multiple ADSL / cable connections, firewall, gateway / NAT With Shorewall Firewall.

Discussion

Lai
July 13, 2010: 8:09 pm

Hi Angsuman,

Is there a way to maintain failover feature but disable load balancing ? Meaning I only allow
LAN users to use first internet line and failover
to second line if first line down.

I am thinking to increase the weight of first line
to very high value (may be 100) so that almost
all LAN users forced to use the first line ..
My purpose is to eliminate frequent disconnect issues.

Pls advise ..Thanks

Lai


Ajayan
June 15, 2010: 4:04 pm

Very good Article..But when i am trying to download the Script ,its showing a new page with Script.I am not able to view the Script in proper manner…Any body having Exact copy Of the Script,Please upload it once again…..


jay
May 22, 2010: 5:53 am

Hello,
i just wanted to merge the multiple(different ISP) bandwidth for my Internal LAN & the can able to use total bandwidth in a single pool.
If you have a script & the solution for this please let me know, i will be ver thankfull to you
thanks n regards,
Jay Dixit


Gabriel
May 10, 2010: 7:41 pm

Congratulations for the great article, it helped me a lot!!!
I’m having only one issue with the connection being load balanced. When I access a bank site, after the login, the page tells me that that session was closed and when I use applications like MSN, it connects and disconnects a lot of times. This happens when I’m using NAT or directly via a proxy on the same machine.
I tought that the connections would be persistent after they started over a link.
Could you shed some light over here?
Thanks in advance.


JP
May 10, 2010: 8:34 am

If you want to look at a product-based approach there is Elfiq - https://www.elfiq.com - simpler to have a product deliver this in many scenarios


RichWalu
May 4, 2010: 1:21 pm

This is a great piece of work. I am newbie in Linux and my server is running Suse10. Please, can someone point out the equivalent of /etc/rc.local and /usr/sbin in Suse10 environment.

Thanx


Jimmy
March 25, 2010: 7:15 am

Hi amitbiswa,

What part do you mean? I’ve set up the load balancing and fail over just like the howto from Angusman.

besides that i have shorewall running and used the tcrules from shorewall (just google tcrules + shorewall) to mark specific data, which is then routed through a specific internet connection.

The only problem i see is that when you specify to many things in there you don’t use the load balancing really optimal because i’ve set it up myself and the load balancer does all the not specified traffic.


amitbiswas
March 16, 2010: 6:00 am

Hi Angsuman, Great howto, perfect logic.

Hi Jimmy, Could you please give me some details how you have deploy this into your ubuntu.

Thanks in advance.


Jimmy
January 27, 2010: 8:33 am

Hello,

Thanks for the great howto.

I have set up a ubuntu server and used the howto to set up load balancing and fail over just like the howto.

Everythig works great, except for one thing, when i use an internet radio or my accounting software with citrix or MSN messenger, the connection is lost every now and then (happens a couple of times an hour, totaly randomly) one day more then another.

Because i have 1 slow ADSL connection and one fast cable broadband connection i have set the weight to 1 to 20, we have around the 14 computers connected to the lan. I als put a line in the tcrules from my shorewall firewall which ( i think) routes, for example, all the internet radio using port 8000 throug one specific internet connection.

do you have any idea how i can fix this, or what the problem may cause?


Joe
January 26, 2010: 8:46 am

My Messenger keeps reconnecting after i did this solutions for fail-over, what should i do ?


M.Azath
January 19, 2010: 4:45 am

can you give me some pdf material which is very relevant to load balancing in internet.


Travis
January 14, 2010: 8:26 pm

Thank you for the above scripts, I’ve found them very useful. I do however have a problem. My staff made use of remote desktop from our LAN out the broadband services to the greater internet. When using the above script ever 600 seconds or so the remote desktop session drops out and re-connects? I read that I need to:
echo "144000" > /proc/sys/net/ipv4/route/secret_interval

is this correct? whilst it resolved my drop out issue from ever 600 seconds to every 4 hours am I missing something or do you all experience the same problem?

Thanks,
Trav
to set the

January 11, 2010: 8:36 am

came from google, your article is very helpful,
thank you very much.


Fausto
December 14, 2009: 7:10 pm

Thank you very much for this! It was perfect with two dedicated links that I manage. Success for you!


Polleke
November 26, 2009: 11:25 am

I have used (a modified version) for a three-way connection.. it needed some hacking.. I will see if i can post it somewhere. Thnx for the script. It _would_ however be so much more configurable if it it was split up in a config file /etc/gwping/gwping.conf the script itself and forementioned init.d script..

I will rewrite my version next august and try to keep you guys posted


yermet
November 24, 2009: 3:19 am

Thank you for this post!!
I needed to do this but for a unique installation in several computers for various environments from 1 to 5 WAN.
I managed to create 2 scripts, one for the initial configuration of the interfaces and the second one modifying your script, so indicating the quantity of WAN available by parameter to the script it will balance over 2-5 WAN.
Finally you only have to modify the file rc.local to configure the whole loadbalance and failover.

nohup /usr/sbin/lbinit 3 &
nohup /usr/sbin/gwping 3 &

And it works!! ;)

Thank you again!!


balaji
November 19, 2009: 11:48 pm

hi
Is there a way wherein in can load balance two tata photon usb modem on linux
pc?


Jaime
November 15, 2009: 4:10 pm

Thank you for this blog post. It really helped me a lot to setup load balancing and failover on my machine.


Gregorio
November 2, 2009: 7:45 am

Which linux distribution do you suggest to make this solution?

Greg


Pravin Oza
September 29, 2009: 10:50 am

I have 10 mbps Internet Leased L. used in College. I want to divide bandwidth as 6:4 mbps in two building in the same campus. I have Cisco 2811 router. Which router and L2 switch configuration require for BW divide respectively in between 2 buildings.


Jan Marc Hoffmann
August 27, 2009: 1:21 pm

Heya!

This guide is very nice, but its quite complicated. Why do you use so many interfaces and subnets? Its not necessary.

Here is an easy setup:
Router1: 192.168.0.2 (eth0)
Router2: 192.168.0.3 (eth0)
Balancer Gateway: 192.168.0.1 (eth0)

Balancer Gateway Config:
ip route add default scope global nexthop via 192.168.0.2 dev eth0 weight 1 nexthop via 192.168.0.3 dev eth0 weight 4

echo "1" > /proc/sys/net/ipv4/ip_forward
iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE

And add some dns proxy or some iptables rules for dns…

Thats all. Works like a charm. You wont need tons of interfaces, nics or routing tables.

greetings
Jan Marc


zaw hkawng
August 12, 2009: 6:52 am

Hi
Some 1 help me ! Pls
I m linux beginner user. I want to test for internet share.I heard squid is very good for it,but I don’t know how to do it, could you send me step by step configuration by mail.My email is here zawhkawng@gmail.com
Thank u very much


Thiha
July 1, 2009: 12:25 pm

Could you please tell me how to test whether our linux box is being as a load balancing server or not. Although i use traceroute command, the output do not show multiple path.


Walter
March 30, 2009: 8:16 pm

Hi,
Excelent How To. I’m using it with shorewall and it works fine, buy I have one question.. if I have a service which is reached from the outside for example a security IP camera. How can I assure that this camera will be reached from only 1 of my 2 IP addresses.

Thanks,
Walter


Daniel
March 10, 2009: 2:05 am

Hey. I accomplished to setup a load balancing between two interfaces too, made firewall rules, etc. and it all works pretty fine, except of one fact, which I’d like to share with you to see if somebody had the same problem.
So, i load-balance with the same ip route commands like Angsuman showed above, except that my ISP speeds are the same, so i use a ‘weight 1′ twice. Now, every once in a while it happens that a user (fyi: we have about 20-25 users) gets stuck while browsing the web. once you hit refresh the site will be loaded though. It’s not bound to certain sites, it happens randomly. I suspect my DNS setup but the fact that i use a non-authoritative DNS (forward only) makes me think it’s another problem. Did anybody experience similar problem s and if yes, how did you fix it ? Any help is very appreciated.

Thanks, Daniel

(and thanks angsuman for your contribution to this topic)


Bernardo Burnier
February 28, 2009: 6:18 pm

Who wants to mess with his kernel and deal with complex and poorly documented solutions, or even to fork out some big buck$ to solve this problem, when it can be done, as you presented here, with a smart little bash script on a linux box?
After reading all the cluttter floating around the internet about how to LB (and there’s a lot) I can only say that this is a simple yet powerfull solution.
Thank you for showing the way.


Wilson
February 24, 2009: 4:53 am

Great job.

I will like to implement it with 3 isp’s. How can it be done?

Thanks in advanced.


Martin Pusch
February 20, 2009: 5:30 pm

Thank you for sharing your experience with others. I am working in Africa, where the Internet connections are not always easy to manage, and your informations helped me a lot.

Martin


Murali Krishna
February 18, 2009: 11:36 am

Hi Angsuman,
“fail-over over multiple load balanced ADSL/Cable connections” (gwping) script you have given only about 2 ISP’s. How can i change that script for 3 ISP’s?
can you pls help me out.
Thanks in advance.
