PCI passthru is a technique to reserve PCI devices at boot time and later hand these resources to a (qemu) VM. A common use is for VGA cards (then often called “GPU passthru”), but it is not limited to GPUs. I used it more extensively here. Everything is done on Gentoo Linux with kernel-4.4.26 and qemu-2.8.0. I assume some knowledge of qemu configuration and of how to set up a standard VM.
The most important thing is that the hardware MUST support IOMMU. Unfortunately it is not a common feature in consumer hardware as of today. I think I don't have to mention virtualization as a requirement separately. I used the following hardware:
- Mainboard MSI C236M (Mini ATX) with chipset C236 (graphic, network, audio on-board)
- Intel Xeon E3-1235L v5
- Nvidia GeForce GT630 (GK208) as second graphic card (PCIe)
- some secondary PCIe cards (not important here)
In the BIOS, IOMMU (VT-d) and virtualization (VT-x) have to be enabled. The on-board graphics also has to be initialized as the primary device. Of course, with an AMD CPU the corresponding AMD features have to be used instead.
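Whether the virtualization extension is visible to the running kernel can be checked from the CPU flags. The following is a minimal sketch; the function name cpu_virt_flag and its optional file argument are made up for illustration. Note that the flag only covers VT-x/AMD-V; it says nothing about IOMMU support.

```shell
#!/bin/sh
# Report which hardware virtualization flag the CPU advertises.
# Usage: cpu_virt_flag [cpuinfo-file]   (defaults to /proc/cpuinfo)
# The optional argument is a hypothetical convenience for testing.
cpu_virt_flag() {
    cpuinfo="${1:-/proc/cpuinfo}"
    if grep -Eq '^flags.*[[:space:]]vmx([[:space:]]|$)' "$cpuinfo"; then
        echo "vmx"      # Intel VT-x
    elif grep -Eq '^flags.*[[:space:]]svm([[:space:]]|$)' "$cpuinfo"; then
        echo "svm"      # AMD-V
    else
        echo "none"     # disabled in the BIOS or not supported
        return 1
    fi
}
```

Calling cpu_virt_flag without arguments on the host should print vmx on an Intel system like the one described here.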
Configure the kernel
Some features have to be compiled into the kernel:
- kvm support
- iommu support
- virtio drivers
I recommend building the virtio drivers as modules. Unfortunately the options are spread over different submenus, e.g. the virtio-scsi driver sits by itself under “SCSI low-level drivers”. Rebuild the kernel if necessary, but don't reboot yet.
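Because the options are scattered, it can help to query the kernel config file directly instead of browsing the menus. This is a sketch under assumptions: the helper name check_kconfig is invented here, and the symbol names in the usage note are my guess at what matches the list above on an Intel system (the vfio symbols are not in that list, but the vfio-pci driver used later needs them).

```shell
#!/bin/sh
# Print the state of selected kernel config symbols from a config file.
# Usage: check_kconfig <config-file> SYMBOL...
# Prints "SYMBOL=y" or "SYMBOL=m", or "SYMBOL is not set" if absent.
check_kconfig() {
    config="$1"
    shift
    for sym in "$@"; do
        # "|| true" keeps the loop going when a symbol is missing.
        line=$(grep -E "^CONFIG_${sym}=" "$config" || true)
        if [ -n "$line" ]; then
            echo "${line#CONFIG_}"
        else
            echo "${sym} is not set"
        fi
    done
}
```

On Gentoo this could be called as check_kconfig /usr/src/linux/.config KVM KVM_INTEL INTEL_IOMMU VFIO VFIO_PCI VIRTIO_PCI SCSI_VIRTIO, assuming the kernel sources live in the usual place.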
To check whether IOMMU is available, put the following into a script (e.g. find-iommu-groups):
#!/bin/bash
# Print every IOMMU group together with the PCI devices it contains.
for iommu_group in $(find /sys/kernel/iommu_groups/ -maxdepth 1 -mindepth 1 -type d); do
    echo "IOMMU group $(basename "$iommu_group")"
    for device in $(ls -1 "$iommu_group"/devices/); do
        echo -n $'\t'
        lspci -nns "$device"
    done
done
Make it executable and run it. The output should be like this (shortened):
$ ./find-iommu-groups
IOMMU group 1
    00:01.0 PCI bridge : Intel Corporation Sky Lake PCIe Controller (x16) [8086:1901] (rev 07)
    01:00.0 VGA compatible controller : NVIDIA Corporation GK208 [GeForce GT 630 Rev. 2] [10de:1284] (rev a1)
    01:00.1 Audio device : NVIDIA Corporation GK208 HDMI/DP Audio Controller [10de:0e0f] (rev a1)
IOMMU group 2
    00:02.0 VGA compatible controller : Intel Corporation Device [8086:191d] (rev 06)
IOMMU group 3
    00:08.0 System peripheral : Intel Corporation Sky Lake Gaussian Mixture Model [8086:1911]
IOMMU group 4
...
First, this shows that IOMMU is available. Second, the important group in our case is IOMMU group 1, which contains the hardware we want to separate. The Nvidia audio device is not used further here, but it could of course be used as well.
Beware that the IOMMU implementation is broken on some hardware!
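To look at a single device rather than the whole list, the group can be read from the device's iommu_group link in sysfs. A minimal sketch follows; the function name iommu_group_of and the optional sysfs-root argument are illustrative assumptions, and sysfs expects the address with its domain prefix (0000:01:00.0 rather than 01:00.0).

```shell
#!/bin/sh
# Print the IOMMU group number of one PCI device.
# Usage: iommu_group_of <pci-address> [sysfs-root]
# sysfs-root defaults to /sys; the parameter exists only so the function
# can be tried against a fake directory tree.
iommu_group_of() {
    link="${2:-/sys}/bus/pci/devices/$1/iommu_group"
    if [ ! -L "$link" ]; then
        echo "no IOMMU group for $1 (IOMMU off or unknown device?)" >&2
        return 1
    fi
    # The link points to .../kernel/iommu_groups/<number>.
    basename "$(readlink "$link")"
}
```

On the system described here, iommu_group_of 0000:01:00.0 would be expected to report the same group as the listing above.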
The vfio-pci driver
Let's check out the hardware with lspci -k. I shortened the output to the graphics cards:
00:02.0 VGA compatible controller: Intel Corporation Device 191d (rev 06)
    Subsystem: Micro-Star International Co., Ltd. [MSI] Device 7972
    Kernel driver in use: i915
    Kernel modules: i915
...
01:00.0 VGA compatible controller: NVIDIA Corporation GK208 [GeForce GT 630 Rev. 2] (rev a1)
    Subsystem: CardExpert Technology GK208 [GeForce GT 630 Rev. 2]
    Kernel driver in use: nvidia
    Kernel modules: nvidia_drm, nvidia
...
The important lines are “Kernel driver in use: <driver name>”. What we want to achieve is to replace the nvidia driver with vfio-pci. This is done by adding some options to the kernel command line, so we first have to identify the hardware IDs (we need them for the Nvidia card only):
$ lspci -nn | grep NVIDIA
01:00.0 VGA compatible controller : NVIDIA Corporation GK208 [GeForce GT 630 Rev. 2] [10de:1284] (rev a1)
01:00.1 Audio device : NVIDIA Corporation GK208 HDMI/DP Audio Controller [10de:0e0f] (rev a1)
The ID is the vendor:device pair in square brackets, the second to last field of each line above. Now we add the following options to the kernel command line:
kernel <image> ... intel_iommu=on vfio-pci.ids=10de:1284,10de:0e0f
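Since typos in the IDs are an easy mistake to make, the vfio-pci.ids= option can also be assembled from the lspci output instead of being typed by hand. The sketch below is only one possible way to do it; the function name vfio_ids_option is hypothetical.

```shell
#!/bin/sh
# Read `lspci -nn` lines on stdin and print a vfio-pci.ids= kernel option
# built from every [vendor:device] ID found, in input order.
# Other bracketed text (model names, 4-digit class codes) does not match
# the xxxx:xxxx pattern and is ignored.
vfio_ids_option() {
    ids=$(grep -oE '\[[0-9a-f]{4}:[0-9a-f]{4}\]' | tr -d '[]' | paste -sd, -)
    [ -n "$ids" ] && echo "vfio-pci.ids=$ids"
}
```

Used as lspci -nn | grep NVIDIA | vfio_ids_option, this should reproduce the option shown above. Duplicate IDs (two identical cards) are not filtered.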
The vfio-pci driver can also be bound dynamically at runtime, but I prefer not to do it that way!
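For completeness, runtime binding usually goes through the driver_override file in sysfs. The following is a sketch and not what is used in this setup. It assumes the vfio-pci module is already loaded (modprobe vfio-pci), that it runs as root, and that the kernel provides driver_override; the function name rebind_to_vfio and the sysfs-root parameter are invented for illustration.

```shell
#!/bin/sh
# Rebind one PCI device to vfio-pci at runtime via sysfs (needs root).
# Usage: rebind_to_vfio <pci-address> [sysfs-root]
# The address needs its domain prefix, e.g. 0000:01:00.0.
rebind_to_vfio() {
    bdf="$1"
    pci="${2:-/sys}/bus/pci"
    dev="$pci/devices/$bdf"
    [ -d "$dev" ] || { echo "no such PCI device: $bdf" >&2; return 1; }

    # Detach the current driver (e.g. nvidia), if one is bound.
    if [ -e "$dev/driver/unbind" ]; then
        echo "$bdf" > "$dev/driver/unbind"
    fi
    # Tell the PCI core that only vfio-pci may claim this device ...
    echo vfio-pci > "$dev/driver_override"
    # ... and ask it to probe the device again.
    echo "$bdf" > "$pci/drivers_probe"
}
```

Unbinding a GPU that is in use by the host (X server, console) can hang the machine, which is one more reason to claim the device at boot time.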
Now it is time to reboot. Afterwards it is a good idea to verify that the vfio-pci driver is in use. Check with lspci -k again:
00:02.0 VGA compatible controller: Intel Corporation Device 191d (rev 06)
    Subsystem: Micro-Star International Co., Ltd. [MSI] Device 7972
    Kernel driver in use: i915
    Kernel modules: i915
...
01:00.0 VGA compatible controller: NVIDIA Corporation GK208 [GeForce GT 630 Rev. 2] (rev a1)
    Subsystem: CardExpert Technology GK208 [GeForce GT 630 Rev. 2]
    Kernel driver in use: vfio-pci
    Kernel modules: nvidia_drm, nvidia
...
If something goes wrong, check dmesg and the logs. Common causes are a typo in an ID or a module that was not loaded.
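The same verification can be scripted by reading the driver link of the device in sysfs, which is handy in a VM start script that should refuse to run when the card is still owned by nvidia. This is a minimal sketch; pci_driver_of and its optional sysfs-root argument are hypothetical names.

```shell
#!/bin/sh
# Print the kernel driver currently bound to a PCI device, or "none".
# Usage: pci_driver_of <pci-address> [sysfs-root]
# The address needs its domain prefix, e.g. 0000:01:00.0.
pci_driver_of() {
    link="${2:-/sys}/bus/pci/devices/$1/driver"
    if [ -L "$link" ]; then
        # The link points to .../bus/pci/drivers/<driver name>.
        basename "$(readlink "$link")"
    else
        echo "none"
    fi
}
```

A start script could then test [ "$(pci_driver_of 0000:01:00.0)" = "vfio-pci" ] before launching qemu. If the result is unexpected, grep -o 'vfio-pci.ids=[^ ]*' /proc/cmdline shows whether the option actually reached the kernel.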
The following is not a full qemu start script. It shows the important options only!
I like to use a variable to group options. The $GPU variable contains the options I use. Add the following (or similar) to the qemu start script:
GPU="-device vfio-pci,host=01:00.0,multifunction=on,x-vga=on -vga none"
qemu-system-x86_64 -cpu host,kvm=off ... $GPU ... -cdrom <bootable-cd-image>
The -cdrom option is for an initial test. After connecting the output of the second VGA card (here: GT630) to a monitor, the VM can be started for an initial test. It may take a few seconds, but the second screen should show something. Otherwise go through the steps above carefully again. To stop the VM, simply kill it on the host system.
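If further functions of the card should be passed through as well, for example the HDMI audio function 01:00.1 that is left unused here, the option string can be generated instead of written out. This is an untested sketch under that assumption; vfio_qemu_args is a made-up helper name.

```shell
#!/bin/sh
# Print qemu options that pass the given host PCI functions to the guest.
# The first address is treated as the GPU (x-vga=on); any further address
# (e.g. the HDMI audio function) is passed as a plain vfio-pci device.
# Usage: vfio_qemu_args <gpu-address> [other-address...]
vfio_qemu_args() {
    args="-device vfio-pci,host=$1,multifunction=on,x-vga=on"
    shift
    for fn in "$@"; do
        args="$args -device vfio-pci,host=$fn"
    done
    # Disable the emulated VGA adapter, as in the $GPU variable.
    echo "$args -vga none"
}
```

With GPU=$(vfio_qemu_args 01:00.0) the result is the same string as the $GPU variable above. All devices passed this way must have been claimed by vfio-pci first.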
As you may have noticed, there was no keyboard or mouse available during the initial test, because this depends on the emulated qemu VGA driver. Thus we have to bind a keyboard/mouse to the VM through its device ID. Assuming USB devices, list them with lsusb:
$ lsusb
Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 001 Device 003: ID 10d5:000d Uni Class Technology Co., Ltd
Bus 001 Device 002: ID 046d:c330 Logitech, Inc.
Bus 001 Device 127: ID 046d:c52b Logitech, Inc. Unifying Receiver
Bus 001 Device 126: ID 05e3:0608 Genesys Logic, Inc. Hub
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
In the output above, the keyboard is the third device listed (046d:c330) and the mouse is the fourth (046d:c52b). Add the ID to the qemu start script like this (example: keyboard only):
KBD="-usb -usbdevice host:046d:c330"
qemu-system-x86_64 -cpu host,kvm=off ... $GPU $KBD ... -cdrom <bootable-cd-image>
Again, this is just an example. The mouse can be bound similarly with its own ID. However, I don't use this technique, because the USB device is no longer available on the host as long as the VM is up and running! But it is good enough for testing purposes. If you bind the keyboard and/or the mouse, it is a good idea to have a second one available.
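The USB options can also be derived from the lsusb output. The sketch below uses the -device usb-host,vendorid=...,productid=... form that qemu offers as an alternative to -usbdevice host:...; whether it behaves identically on qemu-2.8.0 is an assumption I have not verified. Both function names are hypothetical.

```shell
#!/bin/sh
# Print the vendor:product ID of the first lsusb line (read from stdin)
# that contains the given text.
# Usage: lsusb | usb_id_by_name 'Unifying Receiver'
usb_id_by_name() {
    # lsusb format: Bus NNN Device NNN: ID vvvv:pppp description...
    awk -v pat="$1" 'index($0, pat) { print $6; exit }'
}

# Turn a vendor:product ID into a qemu usb-host device option.
# Usage: usb_host_arg 046d:c330
usb_host_arg() {
    case "$1" in
        [0-9a-f][0-9a-f][0-9a-f][0-9a-f]:[0-9a-f][0-9a-f][0-9a-f][0-9a-f]) ;;
        *) echo "expected vendor:product, got '$1'" >&2; return 1 ;;
    esac
    echo "-device usb-host,vendorid=0x${1%%:*},productid=0x${1##*:}"
}
```

For the keyboard above, KBD="-usb $(usb_host_arg 046d:c330)" would take the place of the -usbdevice variant. The device is detached from the host in exactly the same way.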
The above shows how to use PCI hardware of the qemu host exclusively inside a VM. The VGA card is just an example; other PCI devices can be separated this way as well.