There are several big motivators to moving over to vSphere 4.1 with respect to storage. Firstly, there’s support for vStorage APIs in new EqualLogic array firmwares (starting at v5.0.0 which sadly, together with 5.0.1 have been withdrawn pending some show-stopping bugs). VM snapshot and copy operations will be done by the SAN at no I/O cost to the hypervisor. Next there’s the support for vendor-specific Multipathing Extension Modules – EqualLogic’s one is available for download under the VMware Integration category. Finally, there’s the long overdue TCP Offload Engine (TOE) support for Broadcom bnx2 NICs. All of this means a healthy increase in storage efficiency.
If you’re upgrading to vSphere 4.1 and have everything set up as per Dell EqualLogic’s vSphere 4.0 best practice documents you’ll first need to:
- Upgrade vCenter and move it to a 64bit OS (which can fail)
- Upgrade the hypervisors using vihostupdate.pl as per VMware’s upgrade guide, taking care to backup their configs first with esxcfg-cfgbackup.pl
Once that’s done choose an ESXi host to update, and put it in Maintenance Mode.
Make a note of your iSCSI VMkernel port IP addresses.
Make sure your ScratchConfig (Configuration -> Advanced Settings) is set to local storage. Reboot and check the change has persisted.
If the server has any Broadcom bnx2 family adapters they will now be treated as iSCSI HBAs so they will each have a vmhba designation. So, to unassign the previous explicit bindings to the Software iSCSI Initiator you need to check for its new name in the Storage Adapters configuration page.
You can’t unbind the VMkernel ports while there is an active iSCSI session using them so edit the properties of the Software iSCSI Initiator and remove the Dynamic and Static targets, then perform a rescan. Find your bound VMkernel ports using the vSphere CLI (replacing vmhba38 with the name of your software initiator):
bin\esxcli --server svr --username user --password pass swiscsi nic list -d vmhba38
Remove each bound VMkernel port like so (assuming vmk1-4 were listed as bound in the last step):
bin\esxcli --server svr --username user --password pass swiscsi nic remove -n vmk1 -d vmhba38 bin\esxcli --server svr --username user --password pass swiscsi nic remove -n vmk2 -d vmhba38 bin\esxcli --server svr --username user --password pass swiscsi nic remove -n vmk3 -d vmhba38 bin\esxcli --server svr --username user --password pass swiscsi nic remove -n vmk4 -d vmhba38
Now you can disable the Software iSCSI Initiator using the vSphere Client and then remove all the VMkernel ports and your iSCSI vSwitches.
Take note at this point that, according to the release notes PDF for the EqualLogic MEM driver, the Broadcom bnx2 TOE-enabled driver in vSphere 4.1 does not support jumbo frames. This information is further on in the document and unfortunately I only read it after I had already configured everything with jumbo frames so I had to start again. Any improvement they offer is kind of moot here since the Broadcom TOE will take over all the strenuous TCP calculation duties from the CPU, and is probably able to cope with traffic at line speed even at 1500 bytes per packet. I guess it could affect performance at the SAN end so perhaps they will work on supporting a 9000 byte MTU in forthcoming releases.
Make sure you set the MTU back to 1500 for any software initiators running in your VMs that used jumbo frames!
Re-patch your cables so you’re using your available TOE NICs for storage. On a server like the Dell PowerEdge R710 the four Broadcom TOE NICs are in fact two dual chips. So if you want to maximize your fault tolerance, be sure to use vmnic0 & vmnic2 as your iSCSI pair, or vmnic1 & vmnic3.
Log in to your EqualLogic Group Manager and delete the CHAP user you were using for the Software iSCSI Initiator for this ESXi host. Create new entries for each hardware HBA you will be using. Copy the intiator names from the vSphere GUI, and be sure to grant them access in the VDS/VSS pane too. Add these users to the volume permissions, and remove the old one.
Using vSphere CLI install the Mutipath Extension Module:
setup.pl --install --server svr --username root --password pass --bundle dell-eql-mem-188.8.131.52413.zip
Reboot the ESXi host and run the setup script in interactive configuration mode. For multiple value answers, comma separate them:
setup.pl --server svr --username root --password pass --configure
If you have Broadcom TOE NICs say yes to hardware support. This script will set up the vSwitch and the VMkernel ports and take care of the bindings (thanks Dell!):
Configuring networking for iSCSI multipathing:
vswitch = vSwitchISCSI
mtu = 1500
nics = vmnic1 vmnic3
ips = 192.168.100.95 192.168.100.96
netmask = 255.255.255.0
vmkernel = iSCSI
EQL group IP = 192.168.100.112
Creating vSwitch vSwitchISCSI.
Setting vSwitch MTU to 1500.
Creating portgroup iSCSI0 on vSwitch vSwitchISCSI.
Assigning IP address 192.168.100.95 to iSCSI0.
Creating portgroup iSCSI1 on vSwitch vSwitchISCSI.
Assigning IP address 192.168.100.96 to iSCSI1.
Creating new bridge.
Adding uplink vmnic1 to vSwitchISCSI.
Adding uplink vmnic3 to vSwitchISCSI.
Setting new uplinks for vSwitchISCSI.
Setting uplink for iSCSI0 to vmnic1.
Setting uplink for iSCSI1 to vmnic3.
Bound vmk1 to vmhba34.
Bound vmk2 to vmhba36.
Refreshing host storage system.
Adding discovery address 192.168.100.112 to storage adapter vmhba34.
Adding discovery address 192.168.100.112 to storage adapter vmhba36.
Rescanning all HBAs.
Network configuration finished successfully.
Now go back to your active HBAs and enter the new CHAP credentials. Re-scan and you should see your SAN datastores.
Recreate a pair of iSCSI VM Port Groups for any VMs that may use their own software initiators (very convenient for off-host backup of Exchange or SQL), making sure to explicitly set only one network adapter active, and the other to unused. Reverse the order for the second VM port group. Notice that setup.pl has done this for the VMkernel ports which it created.
Reboot again for good measure since we’ve made big changes to the storage config. I noticed at this point that on my ESXi hosts the Path Selection Policy for my EqualLogic datastore reset itself to Round Robin (VMware). I had to manually set it back to DELL_PSP_EQL_ROUTED. Once I had done that it persisted after a reboot.