For the record, I have implemented the HAProxy solution and it is working beautifully.
That’s very true. Hopefully, in a situation like this, you would not have DCs going up and down with any regularity. If you do, that would create a lot of other problems, and would explain why the DCs aren’t working as it is. DNS round robin is, in theory, how DCs are supposed to work by default, actually. It just has automated harvesting.
Which means that a load balancer like HAProxy would not actually do anything, as there is already load balancing in place that is not working properly.
A few years ago, we moved from a Linux-based environment (Novell eDirectory) to Microsoft, and since then we have seen some issues with Microsoft DCs.
The first problem is a memory leak (small but noticeable).
Second, Microsoft Servers need a TON more resources!
Third, when the DCs are about to fail, they do so in weird ways. They still respond on the network, and none of our SNMP and server monitoring tools can detect any problems, but DNS, LDAP and other requests start failing here and there!
HAProxy is intelligent enough to detect this and stop sending requests to that server for a period of time!
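To make the idea concrete, here is a rough Python sketch of that behaviour: a round-robin pool that takes a server out of rotation for a while after several failed checks in a row. This is not HAProxy's code or configuration, just an illustration of the concept. The names `Pool`, `report`, `pick`, `fall` and `cooldown` are made up for the example (HAProxy's real health checks have their own settings, such as `inter`, `fall` and `rise` on a `server` line), and the `dc1`/`dc2` names are placeholders.

```python
class Pool:
    """Round-robin pool that skips a server for a cooldown period
    after `fall` consecutive failed checks (hypothetical sketch)."""

    def __init__(self, servers, fall=3, cooldown=30.0):
        self.servers = list(servers)
        self.fall = fall            # failed checks in a row before removal
        self.cooldown = cooldown    # seconds a server stays out of rotation
        self.failures = {s: 0 for s in self.servers}
        self.down_until = {s: 0.0 for s in self.servers}
        self._next = 0              # index of the next server to try

    def report(self, server, ok, now):
        """Record the result of an application-level check (e.g. an LDAP or DNS query)."""
        if ok:
            self.failures[server] = 0
            return
        self.failures[server] += 1
        if self.failures[server] >= self.fall:
            # Take the server out of rotation until the cooldown expires.
            self.down_until[server] = now + self.cooldown
            self.failures[server] = 0

    def pick(self, now):
        """Return the next server in rotation that is not cooling down, or None."""
        for _ in range(len(self.servers)):
            server = self.servers[self._next]
            self._next = (self._next + 1) % len(self.servers)
            if now >= self.down_until[server]:
                return server
        return None


pool = Pool(["dc1", "dc2"], fall=2, cooldown=10.0)
first, second = pool.pick(0.0), pool.pick(0.0)   # normal round robin
pool.report("dc1", ok=False, now=0.0)
pool.report("dc1", ok=False, now=1.0)            # dc1 is now skipped until t=11
during_cooldown = [pool.pick(2.0), pool.pick(2.0)]
after_cooldown = pool.pick(11.0)                 # dc1 is eligible again
```

The point is that the check has to be a real request against the service (an LDAP bind, a DNS lookup), not just a ping or an SNMP poll, because the failing DCs described above still answer on the network.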