Hot-swapping virtio disks on QEMU

Many of the articles that cover hot-swapping disks on QEMU or QEMU-KVM either assume you’re using virt-manager/virsh, cover USB disks and not virtio, or are for an old version of QEMU. The QEMU monitor has changed quite a bit over the years, apparently! This article will explain how to hot-remove and hot-add virtio-blk drives using QEMU 3.1 (the version in Debian Buster).

Before we begin, keep in mind that QEMU separates the concept of devices into two halves: the host part, and the guest part. For storage devices, the host part (called the drive) consists of properties like the image file, format, caching/AIO settings, etc. The guest part (called the device) is the virtio-blk PCI device. The device uses the drive. This separation allows you to use any kind of drive “backing” (file, memory, iSCSI target, etc.) with any kind of guest-visible device (virtio-blk, virtualized SATA, etc.).

The virtio-blk PCI device does not actually support hot swapping a drive. If you try to remove the drive, you’ll be informed that virtio-blk doesn’t support hot-swapping and you may incorrectly conclude that you can’t hot-swap virtio-blk drives.

This isn’t true! Instead of removing the drive (remember, this is the host part) you have to remove the PCI device (the guest part).

When you remove a virtio-blk PCI device, the guest will see a PCI hot-remove and QEMU will automatically destroy both the device and drive objects.

Hot-adding a disk works similarly, in the opposite order: first we create the drive object in QEMU and then we create the PCI device, giving it the ID of the drive it should use.

With that conceptual model, let’s actually hot-swap a disk!

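First, we have to gain access to the QEMU monitor. Presumably you already know how to do this, but if you do not, the easiest way is to press CTRL+a and then c from the VM console in your terminal. (This works when the serial console and the monitor share your terminal, as they do when QEMU is started with -nographic.)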
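If you would rather script the monitor than type into it, here is a minimal sketch in Python. It assumes QEMU was started with an extra option like -monitor unix:/tmp/qemu-monitor.sock,server,nowait; the socket path is hypothetical, so pick your own. The helper names (read_until_prompt, run_command, connect) are mine, not part of QEMU. The text that comes back is whatever the human monitor prints, which may include an echo of your command and terminal control sequences.

```python
import socket

PROMPT = b"(qemu) "  # the human monitor prints this when it is ready for input


def read_until_prompt(sock):
    """Collect bytes from the monitor until it prints its prompt again."""
    data = b""
    while not data.endswith(PROMPT):
        chunk = sock.recv(4096)
        if not chunk:  # the monitor closed the connection
            break
        data += chunk
    return data.decode(errors="replace")


def run_command(sock, command):
    """Send one monitor command and return everything printed up to the next prompt."""
    sock.sendall(command.encode() + b"\n")
    return read_until_prompt(sock)


def connect(path="/tmp/qemu-monitor.sock"):  # hypothetical path, an assumption
    """Connect to the monitor's Unix socket and skip the greeting banner."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.connect(path)
    read_until_prompt(sock)
    return sock
```

With that in place, print(run_command(connect(), "info block")) would show the same listing you get by typing the command at the prompt.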

Once you are at the monitor prompt, you need to get a list of the block devices by typing info block. You’ll see a human-readable dump of all of the block storage drives (not devices!) that exist in QEMU. Here’s the output for a VM I was using to test LVM caching:

virtio0 (#block145): root1.img (raw)
    Attached to:      /machine/peripheral-anon/device[1]/virtio-backend
    Cache mode:       writeback, direct

virtio1 (#block307): root2.img (raw)
    Attached to:      /machine/peripheral-anon/device[2]/virtio-backend
    Cache mode:       writeback, direct

virtio2 (#block598): cache1.img (raw)
    Attached to:      /machine/peripheral-anon/device[3]/virtio-backend
    Cache mode:       writeback, direct

virtio3 (#block787): cache2.img (raw)
    Attached to:      /machine/peripheral-anon/device[4]/virtio-backend
    Cache mode:       writeback, direct

ide1-cd0: [not inserted]
    Attached to:      /machine/unattached/device[24]
    Removable device: not locked, tray closed

floppy0: [not inserted]
    Attached to:      /machine/unattached/device[17]
    Removable device: not locked, tray closed

sd0: [not inserted]
    Removable device: not locked, tray closed

Each pair of disks (root1+root2 and cache1+cache2) is in an md-raid mirror, so we can yank one from the system. We do that using device_del to remove the PCI device… but how do we know which device to remove?

Thankfully, the output includes this as part of the attachment information. Let’s remove the PCI device for the cache2.img disk. Note that it is attached to /machine/peripheral-anon/device[4]/virtio-backend. You might conclude that we need to remove this device, but that won’t work. We don’t want to remove the virtio backend; we want to remove the device itself, which is the same path without the /virtio-backend suffix: device_del /machine/peripheral-anon/device[4].
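If you have more than a handful of disks, picking the path out by eye gets tedious. Here is a rough sketch that pulls it out of the info block text for you. It assumes the output looks like the QEMU 3.1 listing above (other versions may format it differently), and the function name device_paths is just something I made up for illustration.

```python
import re

# A shortened copy of the `info block` output shown earlier.
SAMPLE = """\
virtio2 (#block598): cache1.img (raw)
    Attached to:      /machine/peripheral-anon/device[3]/virtio-backend
    Cache mode:       writeback, direct

virtio3 (#block787): cache2.img (raw)
    Attached to:      /machine/peripheral-anon/device[4]/virtio-backend
    Cache mode:       writeback, direct

ide1-cd0: [not inserted]
    Attached to:      /machine/unattached/device[24]
    Removable device: not locked, tray closed
"""

HEADER = re.compile(r"^\S+ \(#\w+\): (.+) \([^()]*\)$")
ATTACHED = re.compile(r"^\s+Attached to:\s+(\S+)$")
BACKEND_SUFFIX = "/virtio-backend"


def device_paths(info_block_text):
    """Map each inserted image file to the device path to give device_del."""
    paths = {}
    current_image = None
    for line in info_block_text.splitlines():
        if not line.startswith((" ", "\t")):
            # An unindented line starts a new drive entry (or is a blank separator).
            match = HEADER.match(line)
            current_image = match.group(1) if match else None
            continue
        match = ATTACHED.match(line)
        if match and current_image:
            path = match.group(1)
            # Drop the backend component: we want the device, not its backend.
            if path.endswith(BACKEND_SUFFIX):
                path = path[: -len(BACKEND_SUFFIX)]
            paths[current_image] = path
    return paths


print("device_del " + device_paths(SAMPLE)["cache2.img"])
```

Drives with nothing inserted (the CD-ROM, floppy and SD card entries) are skipped, since there is no image name to key them by.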

Voilà! The disk is gone, and the guest operating system notices. The removed device is automatically kicked from the mirror, which becomes degraded.

Now, let’s put the disk back. We’ll use the QEMU command-line argument that initially created this disk as a reference for its settings, so we can recreate it exactly:

-drive file=cache2.img,media=disk,cache=none,aio=native,if=virtio,format=raw

Easy enough, right? Let’s use drive_add:

(qemu) drive_add 0 file=cache2.img,media=disk,cache=none,aio=native,if=virtio,format=raw
Can't hot-add drive to type 7

Well, that’s not very helpful. It turns out, you can’t do this in one step. You have to first create the drive without an attachment, then create the device in a second step. To do this, we need to give the drive an ID so we can refer to it later.

(qemu) drive_add 0 file=cache2.img,media=disk,cache=none,aio=native,if=none,id=cache2,format=raw
OK
(qemu) device_add virtio-blk-pci,drive=cache2

Let’s look in the guest and… yep! The guest sees the disk. We can re-add it to the mirror and be on our way.
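The rewrite from a -drive argument to the two monitor commands is mechanical, so it can be sketched in a few lines of Python. This is only an illustration: hot_add_commands is a made-up helper, it assumes a virtio-blk disk, and it splits on plain commas, so it will not cope with file names containing commas (which QEMU expects to be escaped as ",,").

```python
def hot_add_commands(drive_arg, drive_id):
    """Turn a -drive option string into the monitor commands for a hot-add."""
    options = []
    for item in drive_arg.split(","):
        key, _, _value = item.partition("=")
        # Drop the original interface and any old ID; both are replaced below.
        if key in ("if", "id"):
            continue
        options.append(item)
    # if=none creates the drive without attaching it to a guest device.
    options += ["if=none", "id=" + drive_id]
    return [
        "drive_add 0 " + ",".join(options),
        "device_add virtio-blk-pci,drive=" + drive_id,
    ]


original = "file=cache2.img,media=disk,cache=none,aio=native,if=virtio,format=raw"
for command in hot_add_commands(original, "cache2"):
    print(command)
```

The options come out in a slightly different order than the ones I typed by hand, but the set of options is the same.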

(Wait, what’s that 0 in the drive_add command? That’s just the slot number for the drive. For drives to be used with a virtio-blk device it’s always zero, because a virtio-blk controller always manages exactly one device. Presumably you’d use other numbers with virtualized controllers that manage multiple drives, such as an IDE controller with a master and slave.)
