My Postmortem: A History of a Server Down
Issue Summary
I was setting up my first load balancer. I installed and configured HAProxy, following the steps and, as far as I could tell, doing everything correctly.
After that, I opened my server's address in the browser and found this:
I got a little scared, but I quickly started looking on my server for the cause of the issue in the HAProxy configuration. When I checked the status of the service, I discovered this:
Timeline
Root Cause
The root cause was that the HAProxy service had no available backend servers. To get a clearer picture of the issue, I turned to the online documentation and asked my peers for help.
After asking several people (including TA staff members), we went through the config files and the HAProxy setup, which included removing the HAProxy installation and configuring it again.
We found the key piece of the issue when we carefully checked haproxy.cfg.
Resolution and recovery
Before resolving the error, we went through a debugging process: pings, ncat connection tests, traceroute, and restarting the servers (hard and soft reboots).
With the help of my peer David Arias, we used the help section, /usr/local/sbin/haproxy --help,
and found the line with the error in haproxy.cfg. We corrected it with the right server information, as in the lines below:
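The connection tests from that debugging process can be sketched as a small shell helper. This is only an illustration, not the exact commands we ran: the function name check_backend and the NC_BIN variable are hypothetical, and the host names and port are taken from the server lines in haproxy.cfg.

```shell
#!/bin/sh
# Hypothetical helper: report whether a backend answers on a TCP port.
# NC_BIN is an assumption; the tool may be called "nc" or "ncat" on your system.
NC_BIN="${NC_BIN:-nc}"

check_backend() {
    host="$1"
    port="$2"
    # -z: only test the connection, send no data; -w 2: give up after 2 seconds
    if "$NC_BIN" -z -w 2 "$host" "$port" >/dev/null 2>&1; then
        echo "$host:$port reachable"
    else
        echo "$host:$port NOT reachable"
        return 1
    fi
}

# Example usage (not run here):
#   ping -c 3 server-a          # is the host up at all?
#   check_backend server-a 8080 # is the web server listening?
#   traceroute server-a         # where does the route stop?
```

If the ping works but check_backend fails, the host is up and the problem is more likely the web server or the port in the config than the network.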
server server-a server-a:8080 check
server server-b server-b:8080 check
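For context, server lines like these live inside a backend section of haproxy.cfg. The excerpt below is a minimal sketch, not my actual file: the section names http-in and web-servers, the bind port, and the timeout values are assumptions chosen for illustration.

```cfg
# Hypothetical excerpt of /etc/haproxy/haproxy.cfg
defaults
    mode http
    timeout connect 5s
    timeout client  30s
    timeout server  30s

frontend http-in
    bind *:80
    default_backend web-servers

backend web-servers
    balance roundrobin
    # "check" enables health checks, so HAProxy knows which servers are up
    server server-a server-a:8080 check
    server server-b server-b:8080 check
```

If the address or port on a server line is wrong, the health checks fail and HAProxy ends up with no available backend servers.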
After this, we started the HAProxy process again and tried once more in the browser:
And everything was OK.
Corrective and Preventative Measures
For the next HAProxy issue, I can start with the help section
/usr/local/sbin/haproxy --help
and then use one of these two ways to check haproxy.cfg:
1- /usr/local/sbin/haproxy -c -V -f /etc/haproxy/haproxy.cfg
which validates the file syntax. The -c switch stands for "check", while -V and -f stand for "verbose" and "file".
2- sudo service haproxy configtest
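One way to make this check a habit is to validate the file every time before restarting the service. The sketch below assumes the same binary and config paths used above. The function names config_ok and safe_restart and the RESTART_CMD variable are hypothetical, and the restart command may differ on your system.

```shell
#!/bin/sh
# Paths as used in this post; override them if your install differs.
HAPROXY_BIN="${HAPROXY_BIN:-/usr/local/sbin/haproxy}"
HAPROXY_CFG="${HAPROXY_CFG:-/etc/haproxy/haproxy.cfg}"

# Returns 0 only if haproxy accepts the config file in check mode.
config_ok() {
    "$HAPROXY_BIN" -c -V -f "$HAPROXY_CFG"
}

# Restart haproxy only when the config passes the check.
safe_restart() {
    if config_ok; then
        echo "config valid, restarting haproxy"
        # RESTART_CMD is an assumption; adjust it to your init system.
        ${RESTART_CMD:-sudo service haproxy restart}
    else
        echo "config invalid, haproxy left untouched"
        return 1
    fi
}

# Example usage (not run here):
#   safe_restart
```

With this, a typo in haproxy.cfg shows up as a validation error instead of a server that is down.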
You can also check the service status, for example with sudo service haproxy status.
I hope this helps anyone who needs to check the syntax of haproxy.cfg while dealing with a similar issue.