Last week, before the end of VMworld, we had a session with one of my customers to discuss ESX performance. The discussion was led by one of VMware’s performance gurus, Scott Drummonds. Scott is the Manager of VMware’s Performance Marketing team and works with the VMware field teams and customers to advise on product performance issues.
During this conversation, I wrote down some key quotes from Scott that I wanted to share. If you have been around VMTN for a while, you might already know some or all of these performance-related suggestions. However, they are key enough that I wanted to draw attention to them.
- “Administration work on the LUN has an impact on performance more than the number of hosts.”
SCSI reservation locking is used only when administrative work is done on the LUN, so placing more VMDKs on the same LUN doesn’t directly impact performance. See the Scalable Storage Performance with VMware ESX Server 3.5 post on the Vroom! blog for additional details.
- “Put Windows VMs and Linux VMs in separate clusters because they can share memory more efficiently…”
Part of the memory optimization done within the ESX hypervisor is to share common memory used by multiple VMs. So if you are running 10 Windows VMs, the memory used to store the majority of the Windows OS is shared amongst all 10 VMs, since the memory contents are the same and do not change. This is part of what enables memory overcommitting. If you start to mix and match VMs with different OSes on the same host, this advantage can be minimized.
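To make the sharing argument concrete, here is a rough sketch of content-based page deduplication. It is a toy model, not how ESX actually implements page sharing: the page size, the `make_vm` helper, and the split of 8 "OS" pages and 2 private pages per VM are all assumptions chosen purely for illustration.

```python
import hashlib

PAGE_SIZE = 4096  # bytes; assumed page size for this illustration


def make_vm(os_name, vm_id, os_pages=8, private_pages=2):
    """Build fake VM memory: OS pages identical per OS, private pages unique per VM."""
    pages = [f"{os_name}-os-{i}".encode().ljust(PAGE_SIZE, b"\0")
             for i in range(os_pages)]
    pages += [f"vm{vm_id}-private-{i}".encode().ljust(PAGE_SIZE, b"\0")
              for i in range(private_pages)]
    return b"".join(pages)


def shared_page_count(vm_memories):
    """Return (total_pages, unique_pages) across all VMs on one host."""
    seen = set()
    total = 0
    for memory in vm_memories:
        for offset in range(0, len(memory), PAGE_SIZE):
            page = memory[offset:offset + PAGE_SIZE]
            # Pages with identical contents hash to the same digest,
            # so they only need to be stored once.
            seen.add(hashlib.sha256(page).digest())
            total += 1
    return total, len(seen)


# Ten Windows VMs on one host versus five Windows plus five Linux VMs.
homogeneous = [make_vm("windows", n) for n in range(10)]
mixed = ([make_vm("windows", n) for n in range(5)]
         + [make_vm("linux", n) for n in range(5, 10)])

homogeneous_result = shared_page_count(homogeneous)
mixed_result = shared_page_count(mixed)
print("homogeneous (total, unique):", homogeneous_result)
print("mixed       (total, unique):", mixed_result)
```

In this toy model, both hosts hold the same total number of pages, but the mixed host has to keep more unique pages in physical memory because the Windows and Linux OS pages have nothing in common.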
- “High storage latencies is the largest source of performance problems that I see…”
You can monitor the storage latencies within VC by changing the stats level. See Scott’s Understanding VirtualCenter Performance Statistics performance community doc for more details.
- “Once you’re swapping, you’re in trouble…”
Seeing swapping within your VM means that you are not allocating enough memory to the VM for the applications that are running. Watch this closely and add memory or migrate VMs when and where needed.
- “RHEL5 does 1000 interrupts per second for ‘greater precision’ (versus most OSes, which only do 100), which can add up to unwanted overhead in the OS…”
This is the Linux timer rate issue that Scott talks about. There is a small configuration change that you can make in RHEL5 SMP that will provide an across-the-board increase in system performance.
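A bit of back-of-the-envelope arithmetic shows why the tick rate matters once you multiply it across a host. The sketch below assumes each virtual CPU receives its own timer tick at the guest's rate; the VM and vCPU counts are made-up numbers for illustration, not figures from Scott.

```python
def timer_interrupts_per_second(tick_hz, vcpus_per_vm, vm_count):
    """Estimate timer interrupts the host must deliver per second.

    Assumes every vCPU gets its own tick at tick_hz (a simplification).
    """
    return tick_hz * vcpus_per_vm * vm_count


# Hypothetical host: 20 guests with 2 vCPUs each.
at_1000_hz = timer_interrupts_per_second(1000, vcpus_per_vm=2, vm_count=20)
at_100_hz = timer_interrupts_per_second(100, vcpus_per_vm=2, vm_count=20)

print("1000 Hz guests:", at_1000_hz, "timer interrupts/sec")
print(" 100 Hz guests:", at_100_hz, "timer interrupts/sec")
```

Under these assumptions, the same set of guests generates ten times as many timer interrupts at 1000 Hz as at 100 Hz, and the hypervisor has to deliver every one of them even when the guests are idle.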
Check out all of Scott’s VMTN articles; there is a wealth of information in them.