Configuring VMware High Availability fails

We had an interesting situation yesterday when we tried to add an ESX 4.1 host to an existing cluster. The new host had its VMkernel and Service Console ports on different VLANs from the hosts already in the cluster.

Configuring VMware High Availability on the new host failed with the error “Cannot complete the configuration of the HA agent on the host”.

VMware's guidance for this error says:

This issue occurs if all the hosts in the cluster do not share the same service console or management network configurations. Some hosts may have service consoles using a different name or may have more service consoles than other hosts.
For example, this error may also occur if the VMkernel gateway settings are not the same across all hosts in the cluster. To reconfigure the setting, right-click on the hosts with this error and select Reconfigure for HA.
Address the network configuration differences between the hosts if you are going to use the Shut Down or Power Off isolation responses because these options trigger a VMware HA isolation in the event of Service Console or Management Network failures.
If you are using the Leave VM Powered on isolation response, the option to ignore these messages is available in VMware VirtualCenter 2.5 Update 3.
To configure VirtualCenter to ignore these messages, set the advanced option das.bypassNetCompatCheck to true:
Note: When using the das.bypassNetCompatCheck option, the heartbeat mechanism during configuration used in VirtualCenter 2.5 only pairs symmetric IP addresses within subnets across nodes. For example, in a two node cluster, if host A has vswif0 “Service Console” 10.10.1.x 255.255.255.0 and vswif1 “Service Console 2” 10.10.5.x and host B has vswif0 “Service Console” 10.10.2.x 255.255.255.0 and vswif1 “Service Console 2” 10.10.5.x, the heartbeats only happen on vswif1. Starting in vCenter Server 4.0, they can be paired across subnets if pings are allowed across the subnets. However, VMware recommends having them within subnets.
  1. Right-click the cluster, then click Edit Settings.
  2. Deselect Turn on VMware HA.
  3. Wait for all the hosts in the cluster to unconfigure HA.
  4. Right-click the cluster, and choose Edit Settings.
  5. Select Turn on VMware HA, then select VMware HA from the left pane.
  6. Select Advanced options.
  7. Add the option das.bypassNetCompatCheck with the value True.
  8. Click OK on the Advanced Options screen, then click OK again to accept the cluster setting changes.
  9. Wait for all the ESX hosts in the cluster to reconfigure for HA.
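
To make the note about subnet pairing more concrete, here is a minimal Python sketch that works out which service console interfaces on two hosts sit in the same subnet, which is where VirtualCenter 2.5 would pair heartbeats according to the note. The concrete host addresses (the note only gives 10.10.1.x and so on), the /24 mask on vswif1, and the function name shared_subnet_pairs are all assumptions made for illustration; this is not how HA itself is implemented.

```python
from ipaddress import ip_interface

# Hypothetical addresses filling in the "x" from the note's example.
# The /24 mask on vswif1 is also an assumption (the note does not state it).
host_a = {"vswif0": "10.10.1.10/24", "vswif1": "10.10.5.10/24"}
host_b = {"vswif0": "10.10.2.10/24", "vswif1": "10.10.5.11/24"}


def shared_subnet_pairs(a, b):
    """Return (iface_a, iface_b, network) for interfaces on the same subnet."""
    pairs = []
    for name_a, addr_a in a.items():
        net_a = ip_interface(addr_a).network
        for name_b, addr_b in b.items():
            if ip_interface(addr_b).network == net_a:
                pairs.append((name_a, name_b, str(net_a)))
    return pairs


pairs = shared_subnet_pairs(host_a, host_b)
for name_a, name_b, net in pairs:
    print(f"{name_a} <-> {name_b} on {net}")
```

With these example addresses only the vswif1 interfaces share a subnet, which matches the note's statement that heartbeats only happen on vswif1.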

We made the change in a test environment, retested, and voilà: problem solved.

I would not recommend running a production environment with this setting, as a router or switch outage could cause false alerts or even trigger an HA response. But for test environments that are sometimes not as ‘pretty’ as you’d like, it is worth considering so that you can still use and test HA.
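
If you would rather fix the underlying differences than bypass the check, it helps to see exactly which settings differ between hosts. The sketch below is one rough way to do that in Python, assuming you have noted each host's service console names and VMkernel gateway by hand. The host names, addresses, dictionary keys and the function find_mismatches are all hypothetical; nothing here talks to vCenter.

```python
# Hypothetical per-host settings, typed in by hand for illustration.
hosts = {
    "esx01": {"consoles": ["Service Console"], "vmk_gateway": "10.10.1.1"},
    "esx02": {"consoles": ["Service Console"], "vmk_gateway": "10.10.1.1"},
    "esx03": {
        "consoles": ["Service Console", "Service Console 2"],
        "vmk_gateway": "10.10.2.1",
    },
}


def find_mismatches(hosts):
    """Return {setting: {host: value}} for settings that differ across hosts."""
    mismatches = {}
    # Collect every setting name used by any host.
    settings = sorted({key for cfg in hosts.values() for key in cfg})
    for setting in settings:
        values = {name: cfg.get(setting) for name, cfg in hosts.items()}
        # repr() makes list values comparable inside a set.
        if len({repr(v) for v in values.values()}) > 1:
            mismatches[setting] = values
    return mismatches


mismatches = find_mismatches(hosts)
for setting, values in mismatches.items():
    print(f"{setting} differs:")
    for name, value in values.items():
        print(f"  {name}: {value}")
```

Here esx03 stands out on both settings, which is the kind of difference the error message is complaining about.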
