I posted a tweet yesterday about a migration and upgrade I was doing and received a couple of replies asking me to let them know how everything went. I decided to write a blog post for everyone to read rather than replying in 140-character tweets. =)
Background: This customer has a very simple SMB setup as far as VMware goes: 3 servers in a single cluster, only a few vSwitches per host, and an HDS AMS disk subsystem. They wanted to upgrade to vSphere and are able to take some downtime to do it. (Not that downtime is required.) They bought 3 new Dell 710 servers to run vSphere on.
There are at least 20 different ways to go about this, and I wanted to keep this upgrade as simple as possible. Since the customer can take some downtime for their VMs, I decided to do a cold migration of the VMs. This is by far the simplest approach and carries the least risk of problems with the VMs. The customer, while understanding VMware and how to administer it, would not be able to troubleshoot issues once I was no longer onsite.
Here is the process that we walked through for a successful upgrade and migration:
Upgrade vCenter to version 4 - You need to read through the upgrade guide before you attempt this upgrade. There are some specific permissions changes to networks and datastores that could bite you after an upgrade if you don't understand the way that vCenter changes read-only attributes. We'll leave it at that. RTFM!
Install new ESXi servers on new hardware and add them to the existing cluster. Ensure all updates and patches have been applied. - Let's discuss a few key reasons that I pushed this user to go with ESXi. Every new implementation I have done over the past year has been on ESXi.
A) They were not running any fancy monitoring tools in the Service Console.
B) ESXi presents a much smaller attack surface for hackers since Red Hat is no longer running underneath.
C) Because there is no Red Hat, VMware has full control over all patches, which will allow for much quicker turnaround on bug fixes, etc.
D) The user didn't have any scripts or jumpstart servers that required the Service Console to run. Even if they did, we could have rewritten them to run against the vMA (vSphere Management Assistant).
E) No Service Console! Personally, I see more people get on the Service Console and make a mistake that causes a lot of problems. "Let's see what this command does."
F) THERE IS NO TECHNICAL REASON NOT TO GO TO ESXi!!!!!!
Map out all networks and datastores attached to the 3.5 cluster - Self-explanatory (SE). You need to know what you need to know.
Create identical networks on the new ESXi hosts - Again, self-explanatory. The VMs will have to connect to the network after the migration. If the networks aren't there, migration validation will fail.
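If you want a quick sanity check before the outage, the network comparison can be sketched in a few lines of Python. This is only an illustration under assumptions: the port group names and VLAN IDs below are made up, the inventories are typed in by hand (or exported however you like), and `compare_port_groups` is a hypothetical helper, not part of any VMware tool. It compares names as exact strings.

```python
from typing import Dict, List, Tuple

# Hypothetical inventory format: port group name -> VLAN ID.
PortGroups = Dict[str, int]


def compare_port_groups(old: PortGroups, new: PortGroups) -> Tuple[List[str], List[str]]:
    """Return (missing, vlan_mismatch) for port groups defined on the old host.

    missing: names present on the old host but absent on the new host.
    vlan_mismatch: names present on both hosts but with different VLAN IDs.
    """
    missing = sorted(name for name in old if name not in new)
    vlan_mismatch = sorted(
        name for name in old if name in new and old[name] != new[name]
    )
    return missing, vlan_mismatch


# Made-up example data for one old ESX host and one new ESXi host.
old_host = {"VM Network": 0, "DMZ": 20, "Backup": 30}
new_host = {"VM Network": 0, "DMZ": 21}

missing, mismatched = compare_port_groups(old_host, new_host)
print("Missing on new host:", missing)
print("VLAN mismatch:", mismatched)
```

Anything reported as missing or mismatched is something to fix on the new host before the real migration.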
Test cold migration to local storage and test networks with a test VM - Just test things out prior to the big outage.
Remediate any changes that need to be made. - SE. Fix any problems that pop up.
Present all datastores that reside on the old ESX servers to new ESXi servers. Verify connectivity, specifically LUN IDs. - Since the new ESXi boxes are part of the same cluster we can present the same LUNs/Datastores to them without any potential problems. Scan them and bring in the VMFS Datastores and verify everything looks OK.
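The LUN ID check is easy to get wrong by eye when there are more than a handful of LUNs, so here is a rough Python sketch of the idea. The host names, device names, and LUN IDs are invented for illustration, and `find_lun_problems` is a hypothetical helper. It assumes you have already collected, per host, a mapping of device identifier to LUN ID by whatever means you prefer.

```python
from collections import defaultdict
from typing import Dict

# Hypothetical inventory format: host name -> {device identifier -> LUN ID}.
HostLuns = Dict[str, Dict[str, int]]


def find_lun_problems(host_luns: HostLuns) -> Dict[str, dict]:
    """Report devices that are not presented identically to every host.

    A device is flagged if any host sees it under a different LUN ID,
    or if at least one host does not see it at all.
    """
    by_device = defaultdict(dict)
    for host, luns in host_luns.items():
        for device, lun_id in luns.items():
            by_device[device][host] = lun_id

    problems = {}
    for device, seen in by_device.items():
        missing_on = sorted(set(host_luns) - set(seen))
        if len(set(seen.values())) > 1 or missing_on:
            problems[device] = {"lun_ids": dict(seen), "missing_on": missing_on}
    return problems


# Made-up example: one old ESX host and two new ESXi hosts.
inventory = {
    "old-esx-1": {"dev-A": 0, "dev-B": 1, "dev-C": 2},
    "new-esxi-1": {"dev-A": 0, "dev-B": 1, "dev-C": 2},
    "new-esxi-2": {"dev-A": 0, "dev-B": 5},
}

problems = find_lun_problems(inventory)
for device, detail in sorted(problems.items()):
    print(device, detail)
```

An empty result means every host sees every device under the same LUN ID, which is what you want before shutting anything down.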
Shut down all VMs - SE.
Cold migrate all VMs to new ESXi servers. - SE.
Remove all ESX servers from the cluster. - This ensures that during the next step DRS won't try to spread the workload to the old servers. It just makes things cleaner and leaves DRS fewer hosts to check.
Upgrade the VMware Tools and Virtual Machine Hardware on all VMs. - Using Update Manager, go ahead and upgrade things prior to putting them back online for your users. A reboot is required, so the users would see an interruption anyway. I'm not going to walk you through this. RTFM!
Install the vMA for command line management of the ESXi hosts. - SE.
On the ESX vs. ESXi front, I truly believe that once you try it you won't even notice a difference and you'll probably like ESXi better. If you're an old ESX shop that loves the Service Console, then I would challenge you to use the vMA for a month and I would bet that you won't go back.