Proxmox - Implementing Live Migration of Virtual Machines
Intro
Live migration in Proxmox VE allows you to move running virtual machines (VMs) between nodes in a cluster with minimal downtime. This feature is essential for load balancing, hardware maintenance, and high availability. In this guide, we’ll explore advanced concepts in live migration, including prerequisites, shared storage configurations, troubleshooting common issues, and optimizing the migration process for large VMs.
Step 1: Prerequisites for Live Migration
1.1 Cluster Setup
Live migration requires a Proxmox cluster. To create a cluster:
- On the primary node:
pvecm create my-cluster
- On additional nodes, join the cluster:
pvecm add <primary-node-ip>
Verify the cluster status:
pvecm status
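Once every node has joined, you can also list the cluster members and their quorum votes from any node:
pvecm nodes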
1.2 Shared Storage
Shared storage is required so that VM disk images are accessible by all nodes in the cluster. Supported options include:
- NFS:
apt install nfs-common
mount <nfs-server-ip>:/shared-storage /mnt/nfs
- iSCSI with LVM:
iscsiadm -m discovery -t sendtargets -p <iscsi-server-ip>
iscsiadm -m node --login
pvcreate /dev/sdX
vgcreate vg_iscsi /dev/sdX
- Ceph RBD (for distributed storage):
pveceph install
ceph-deploy new <node-names>
Add the shared storage to Proxmox via Datacenter > Storage > Add.
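If you prefer the CLI to the web interface, storage entries can also be created with pvesm. A minimal sketch for the NFS example above; the storage ID shared-nfs is an arbitrary placeholder:
pvesm add nfs shared-nfs --server <nfs-server-ip> --export /shared-storage --content images,iso
pvesm status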
1.3 Resource Availability
Ensure the target node has sufficient CPU, memory, and storage to accommodate the migrating VM. Proxmox automatically checks these requirements before migration.
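As a quick manual check before moving a large VM, the target node's current CPU and memory load can be queried through the Proxmox API; a sketch using pvesh, with <target-node> as a placeholder:
pvesh get /nodes/<target-node>/status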
Step 2: Initiating a Live Migration
2.1 Using the Web Interface
- Navigate to the VM you want to migrate.
- Click on Migrate.
- Select the target node from the list.
- Click Migrate to start the process.
2.2 Using the Command Line
Run the following command to live-migrate a running VM (e.g., VM ID 103) to another node; the --online flag keeps the VM running during the move:
qm migrate 103 <target-node> --online
Step 3: Understanding the Migration Process
3.1 Pre-Copy Phase
Proxmox begins by copying memory pages from the source node to the target node while the VM continues running.
3.2 Stop-and-Copy Phase
Once most memory pages are copied, Proxmox briefly pauses the VM to synchronize remaining memory pages and CPU state.
3.3 Resume on Target Node
The VM resumes operation on the target node with minimal downtime (usually milliseconds).
Step 4: Advanced Configurations
4.1 Optimizing Migration for Large VMs
For VMs with large memory allocations or high disk I/O:
- Use a high-speed network (e.g., 10GbE) for faster data transfer; a dedicated migration network can also be set cluster-wide (see the sketch after this list).
- Migrate online and, on a trusted private network, consider an unencrypted transfer to reduce encryption CPU overhead:
qm migrate <vmid> <target-node> --online --with-local-disks --migration_type insecure
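Rather than passing these options on every call, the migration transport and a dedicated migration network can be set cluster-wide in /etc/pve/datacenter.cfg. A minimal sketch, assuming 10.10.10.0/24 is a network reserved for migration traffic; check the datacenter.cfg documentation for the exact syntax on your Proxmox version:
migration: type=insecure,network=10.10.10.0/24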
4.2 Handling Local Disks
If a VM uses local storage instead of shared storage, add --with-local-disks (together with --online for a running VM) during migration:
qm migrate <vmid> <target-node> --online --with-local-disks
This transfers local disk data over the network.
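If the target node does not have a storage with the same ID as the source, the transferred disks can be redirected with --targetstorage; a sketch, where local-lvm is just an example storage ID that must exist on the target node:
qm migrate <vmid> <target-node> --online --with-local-disks --targetstorage local-lvm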
Step 5: Post-Migration Verification
After migration:
- Confirm that the VM is running on the target node via the web interface or CLI:
qm status <vmid>
- Check resource usage on both nodes to ensure proper load balancing.
- Verify application functionality within the VM.
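These checks can also be run from the shell; <vmid> and <guest-ip> below are placeholders:
qm status <vmid>
pvesh get /cluster/resources --type vm | grep -w <vmid>
ping -c 3 <guest-ip>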
Step 6: Troubleshooting Common Issues
6.1 Shared Storage Not Accessible
Ensure that all nodes can access shared storage:
- Test NFS mounts:
ls /mnt/nfs
- Verify iSCSI connections:
iscsiadm -m session
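For a broader check, Proxmox's storage tooling reports whether each configured storage is active on the node where it runs, and showmount (part of nfs-common) lists what the NFS server actually exports. A quick sketch:
pvesm status
showmount -e <nfs-server-ip>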
6.2 CD/DVD Drive Attached
Live migration fails if the VM has a CD/DVD drive with locally stored media (e.g., an ISO on local storage) attached. Detach the media before migrating:
qm set <vmid> -ide2 none
6.3 CPU Compatibility Issues
Ensure CPUs on source and target nodes are compatible:
- Set the CPU type in /etc/pve/qemu-server/<vmid>.conf. Using cpu: host exposes all host CPU flags, but it only migrates cleanly between nodes with identical CPUs; on mixed hardware, pick a generic model (e.g., kvm64) supported by both nodes:
cpu: host
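To see how the two hosts differ, and to switch the VM to a common CPU model without editing the config file by hand, something like the following can be used (kvm64 is just an example baseline model):
grep -m 2 -E 'model name|flags' /proc/cpuinfo
qm set <vmid> --cpu kvm64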
Step 7: High Availability and Automation
7.1 High Availability (HA)
Enable HA to automatically migrate VMs in case of node failure:
- Add the VM as an HA resource via Datacenter > HA > Add.
- Configure HA policies (e.g., restart, migrate).
Verify HA status:
ha-manager status
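HA resources can also be registered from the shell with ha-manager. A minimal sketch, reusing VM ID 103 from earlier; the group name prod-group is a placeholder that must already exist under Datacenter > HA > Groups:
ha-manager add vm:103 --state started --group prod-group
ha-manager status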
7.2 Automate Migrations with Scripting
Use Proxmox’s API or CLI tools to automate migrations during maintenance windows.
Example script for migrating all VMs from one node to another:
#!/bin/bash
# Run this script on the source node: qm list only shows VMs on the local node.
SOURCE_NODE="node1"
TARGET_NODE="node2"

# Live-migrate every running VM on this node to the target node
for VMID in $(qm list | grep running | awk '{print $1}'); do
  echo "Migrating VM $VMID from $SOURCE_NODE to $TARGET_NODE..."
  qm migrate "$VMID" "$TARGET_NODE" --online
done
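The same migration can also be triggered through the Proxmox API, which is handy when orchestrating from a remote host. A sketch using pvesh on a cluster node; the node names and VM ID are placeholders:
pvesh create /nodes/node1/qemu/103/migrate --target node2 --online 1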
Best Practices for Live Migration
- Test Before Production: Test live migration on non-critical VMs before relying on it in production environments.
- Use Redundant Networks: Configure multiple network interfaces for migration traffic to avoid disruptions.
- Monitor Performance: Use tools like htop or Proxmox's built-in monitoring to track resource usage during and after migration.
- Plan Maintenance Windows: Schedule migrations during low-traffic periods to minimize user impact.
Conclusion
Live migration in Proxmox VE is a powerful feature that enables seamless movement of VMs between cluster nodes with minimal downtime, ensuring flexibility and high availability in virtualized environments. By following these best practices, you can optimize your live migration process for even large-scale deployments while maintaining performance and reliability.