Some months ago I implemented these steps and saw a striking improvement in the performance of our applications (between 2x and 4x depending on the query):
- Firstly, if you’re using iSCSI, make sure the network switches are ones which have been validated as ok by your storage vendor. I’ve run into poor performance using ones which ought to have worked and offered all the required features (Flow Control, Jumbo Frames), but which in reality were causing problems.
- If you’re using iSCSI with a software initiator (be it at hypervisor or guest OS level), consider using Jumbo Frames to reduce I/O related CPU activity.
- Move your VMDKs to a new SAN VMFS volume that is Thick-Provisioned. Although in my environment the EqualLogic array extends LUNs by 16MB at a time, over time this can fragment things appreciably. With a 1TB LUN this can get pretty bad.
- Use Storage vMotion to make the VMDK files Thick Provisioned too. This eliminates fragmentation of the VMDK since it’s no longer growing in small increments. I think this made quite a big difference, despite a whitepaper from VMware denying a performance impact. The reasoning is that since the storage array has a big cache, having the data fragmented all over the disks shouldn’t really matter that much. I don’t really believe it, and my own results seemed to prove otherwise (what about a backup operation which will need to read your data sequentially in one long pass?). My SQL server vMotion operations were very slow compared to other servers, suggesting they were heavily fragmented in their old location.
- Move all of your databases (including these system ones: msdb, model, master), their logs, and fulltext catalogs to a SAN LUN directly attached inside the Guest VM using Microsoft iSCSI initiator and your SAN vendor’s Integration Tools. If you use Vmxnet3 adapters then the TCP calculation overhead will be handled by the hypervisor which in turn can be passed to Broadcom bnx2 TOE NICs if you’re using vSphere 4.1. Having the databases on a separate LUN allows off-host backup of the databases using Backup Exec with your SAN vendors’ SQL-aware VSS Hardware Provider. Database backups can then occur at any time without any impact to the SQL server’s performance. I have written a dedicated post on this subject.
- Create that SAN partition with its NTFS blocks aligned with the SAN’s own disk blocks to ensure no needless multiplication of I/O (64KB offset for EqualLogic – full explanation here).
- Keep TempDB on the C: drive in its default location. That way I/O to that database is segregated and can be cached differently since it is using VMware’s iSCSI initiator and not the Microsoft initiator. Typically TempDB has high I/O, but it’s not a database that you need to back up so you don’t need to be able to snapshot it on the SAN.
- Create an SQL management task to rebuild and defragment the database indexes and update their statistics every week (say, Sunday at 3:00am).
- Change database file autogrow amounts from 1MB to 64MB to mitigate NTFS-level fragmentation of the database MDF files as they grow.