Estimating the time needed for NSX upgrades and maintenance windows is a topic that has needed attention for some time now. Many VMware NSX field engineers know from experience how long an NSX upgrade may take, but I’ve found little documentation on how to determine the time required to perform an upgrade based on the size of the environment.
VMware NSX-v upgrades are performed in order: NSX Manager first, then the Controllers, then the Edge Gateways, and finally the vSphere hosts themselves. So a good way to determine how long an upgrade will take is to estimate each component’s upgrade time, add some buffer for the unexpected, and then sum it all up. I’ve detailed the NSX upgrade process here in a previous blog post, with step-by-step screenshots, to show you what to expect. Official VMware NSX documentation should be used to perform the actual upgrade.
*As a special note, NSX-T upgrades are done in reverse order, starting with the hosts / transport nodes and then moving on to the Edge Gateways, the Controllers and finally NSX Manager.
After performing a fair number of upgrades in the field, I’ve found NSX Managers and Controllers to be very reliable in terms of component upgrades. Edge Service Gateways in an HA pair will, on occasion, fail an NSX component upgrade, but the fix has been fairly quick: power the VM off, power it back on, wait for services to start and then retry the upgrade.
NSX component upgrade times are as follows:
- NSX Manager – 30 minutes
- NSX Controller – 5-10 minutes (each)
- NSX Edge Service Gateway – 15 minutes (each)
- NSX vSphere Host – 15 minutes (each)
*Be sure to add time for DRS evacuation and reboot to each host’s time, if applicable. NSX host upgrades after 6.3 are reboot-less, but evacuation still applies.
Each of these times includes a small buffer for testing each component’s return to service. Conditions can vary based on load and scale. If you have a test NSX deployment, a dry run there will show you how your environment performs and let you tune these times a bit closer. Disk I/O and VM performance account for a fair amount of the Manager and Edge upgrade time, but the number of vSphere hosts to upgrade is usually the biggest single factor in the total. Remember, host density and host memory have a lot to do with estimating vSphere host upgrade times. Hosts with high VM densities can take in excess of an hour to evacuate, and physical servers with >1TB of memory take quite a bit of time to “count up” at BIOS boot. These are all things to consider and add in to your estimate.
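To make the host adjustment concrete, here is a minimal Python sketch of how evacuation and reboot time could be added on top of the 15-minute per-host figure. The host names and the evacuation times are made-up illustration values, not measurements, and the sketch assumes hosts are upgraded one at a time.

```python
from dataclasses import dataclass

# Per-host upgrade figure from the list above (minutes).
BASE_HOST_UPGRADE_MIN = 15


@dataclass
class Host:
    name: str                # hypothetical host name
    evacuation_min: int = 0  # time for DRS to evacuate VMs (assumed value)
    reboot_min: int = 0      # 0 for reboot-less upgrades (after 6.3)


def host_minutes(host: Host) -> int:
    """Upgrade time for one host, including evacuation and reboot."""
    return BASE_HOST_UPGRADE_MIN + host.evacuation_min + host.reboot_min


def cluster_minutes(hosts: list[Host]) -> int:
    """Total host time, assuming hosts are upgraded serially."""
    return sum(host_minutes(h) for h in hosts)


# Illustration only: two lightly loaded hosts and one dense host.
hosts = [
    Host("esx01", evacuation_min=10),
    Host("esx02", evacuation_min=10),
    Host("esx03", evacuation_min=60),
]
print(cluster_minutes(hosts))  # 125
```

A dry run in a test deployment is the best source for real evacuation and reboot numbers to plug in here.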
Here’s an example time estimate calculation for an NSX 6.3 upgrade on a five (5) host cluster:
- NSX Manager (1) – 30 minutes
- NSX Controller (3) – 30 minutes
- NSX Edge Service Gateway HA Pair (2) – 30 minutes
- NSX vSphere Hosts (5) – 75 minutes
The estimated time for this example is 165 minutes, or 2 hours and 45 minutes, which is very close to the actual 2.5 hours it took to perform the upgrade in this lab. As I mentioned, make sure to check out the preview of the upgrade and (please) use the official documentation to create your upgrade “runbook”. As always, opening a support ticket with VMware support containing the version details of your upgrade, the number of components, and an architectural drawing will greatly reduce the time needed to engage support, should you need it.
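The same arithmetic can be written as a short Python sketch, handy if you want to rerun the estimate for different component counts. It uses the per-component times listed earlier (taking the upper 10-minute figure for each Controller, as the example does); the function and variable names are my own and purely illustrative.

```python
# Per-unit upgrade times in minutes, from the component list above.
# The Controller uses the upper end of the 5-10 minute range.
PER_UNIT_MINUTES = {
    "NSX Manager": 30,
    "NSX Controller": 10,
    "NSX Edge Service Gateway": 15,
    "NSX vSphere Host": 15,
}


def estimate_minutes(counts: dict[str, int]) -> int:
    """Sum per-unit time multiplied by count for every component type."""
    return sum(PER_UNIT_MINUTES[name] * count for name, count in counts.items())


def format_duration(minutes: int) -> str:
    """Render a minute count as hours and minutes."""
    hours, mins = divmod(minutes, 60)
    return f"{hours} h {mins:02d} min"


# The five-host example from this post.
example = {
    "NSX Manager": 1,
    "NSX Controller": 3,
    "NSX Edge Service Gateway": 2,
    "NSX vSphere Host": 5,
}

total = estimate_minutes(example)
print(total, "minutes =", format_duration(total))  # 165 minutes = 2 h 45 min
```

Host evacuation and reboot time are not included here; add them to the host figure if they apply to your environment.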