I surrender, AMD MxGPU GIM on XCP-ng

Hello!
New to the forum (and glad it exists), longtime fan of the channel. I’m killing myself trying to get AMD GIM 2.0 running on XCP-ng 8.1. I’ve googled, I’ve bookmarked, and I’ve got nothing to show for it (except a few insights). The project is a proof of concept: an AMD S7150X2 on a 3900X/B450 (consumer grade, but with the features needed to run it). I’ve got about 4 solid days into the project and it’s about to get the axe due to the cost of time. I’m really hoping someone can shed some light on what the deal is.

Yes, SR-IOV, IOMMU, ARI, Above 4G and UEFI are all enabled, and I even set MMIO High Size to 2T (41bit) just in case. XCP-ng Center shows the GPU as a vGPU and offers assignable configurations when creating a VM. Starting a VM with a vGPU assigned fails with “No free virtual functions found” because the GPU is never actually initialized. lspci -k shows no kernel driver in use, since amdgpu is blacklisted and GIM fails to load.
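
For anyone retracing the "no kernel driver in use" check, here is a minimal Python sketch that scans `lspci -k` output for devices with no "Kernel driver in use" line. The function name and the sample text are made up for illustration (the sample is not my real output), and the parsing assumes the usual layout where each device starts at column 0 and its detail lines are indented.

```python
import re

# First line of a device stanza, e.g. "09:00.0 VGA compatible controller: ..."
# (an optional PCI domain prefix such as "0000:" is also accepted).
DEVICE_LINE = re.compile(r"^([0-9a-fA-F:]+\.[0-7])\s+(.*)$")


def devices_without_driver(lspci_k_output):
    """Return [(slot, description)] for stanzas lacking 'Kernel driver in use'."""
    unbound = []
    current = None       # (slot, description) of the stanza being read
    has_driver = False
    for line in lspci_k_output.splitlines():
        match = DEVICE_LINE.match(line)
        if match:
            # A new stanza begins: close out the previous one first.
            if current is not None and not has_driver:
                unbound.append(current)
            current = (match.group(1), match.group(2))
            has_driver = False
        elif line.strip().startswith("Kernel driver in use:"):
            has_driver = True
    if current is not None and not has_driver:
        unbound.append(current)
    return unbound


# Illustrative text only; device names here are placeholders.
SAMPLE = (
    "09:00.0 VGA compatible controller: Example GPU\n"
    "\tKernel modules: amdgpu\n"
    "0a:00.0 Ethernet controller: Example NIC\n"
    "\tKernel driver in use: igb\n"
    "\tKernel modules: igb\n"
)

for slot, description in devices_without_driver(SAMPLE):
    print(slot, description)
```

To run it against a real host, pass in the text captured from `lspci -k` instead of `SAMPLE`.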

Grub: I’ve removed xen-pciback because GIM wouldn’t detect the GPUs while they were allocated to it.
I’ve added amd_iommu=on but it hasn’t made any difference.

	search --label --set root root-brgkot
	multiboot2 /boot/xen.gz dom0_mem=2656M,max:2656M watchdog ucode=scan dom0_max_vcpus=1-8 crashkernel=256M,below=4G vga=mode-0x0311
	module2 /boot/vmlinuz-4.19-xen root=LABEL=root-brgkot ro nolvm hpet=disable console=hvc0 quiet vga=785 splash plymouth.ignore-serial-consoles pci=realloc pci=assign-busses amd_iommu=on
	module2 /boot/initrd-4.19-xen.img
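
A small Python sketch for double-checking which options actually ended up on the dom0 kernel line above. The `parse_cmdline` helper is a hypothetical name; it simply splits on whitespace and on the first `=`, so it does not handle quoted values. On a running host the same string could be read from /proc/cmdline.

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into {name: [values]}.

    Bare flags such as "quiet" get the value None. Repeated options
    (e.g. two pci= entries) keep every value in order.
    """
    options = {}
    for token in cmdline.split():
        name, sep, value = token.partition("=")
        options.setdefault(name, []).append(value if sep else None)
    return options


# The dom0 kernel arguments copied from the module2 line above.
DOM0_CMDLINE = (
    "root=LABEL=root-brgkot ro nolvm hpet=disable console=hvc0 quiet "
    "vga=785 splash plymouth.ignore-serial-consoles pci=realloc "
    "pci=assign-busses amd_iommu=on"
)

opts = parse_cmdline(DOM0_CMDLINE)
print("amd_iommu:", opts.get("amd_iommu"))
print("pci:", opts.get("pci"))
# Confirm no xen-pciback option is left on the line.
print("xen-pciback present:", any(k.startswith("xen-pciback") for k in opts))
```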

modprobe GIM: everything looks great until it errors out unexpectedly. SR-IOV is detected and enabled and the VFs are created, then it fails, dumps the GPU state and moves on to the next GPU (the S7150X2 is dual GPU) to repeat the failure all over again.

[  488.525608] AMD Virt GIM API
[  488.528239]        gim info:(gim_init:197) *******AMD GIM init
[  488.528241]        gim info:(print_gim_version:62) GPU IOV MODULE (GIM) - version 2.00.0000
[  488.528242]        gim info:(gim_init:200) Copyright (c) 2014-2016 AMD Corporation.
[  488.528257]        gim info:(parse_config_file:295) AMD GIM fb_option = 0
[  488.528258]        gim info:(parse_config_file:295) AMD GIM sched_option = 0
[  488.528259]        gim info:(parse_config_file:295) AMD GIM vf_num = 0
[  488.528260]        gim info:(parse_config_file:295) AMD GIM pf_fb = 0
[  488.528261]        gim info:(parse_config_file:295) AMD GIM vf_fb = 0
[  488.528263]        gim info:(parse_config_file:295) AMD GIM sched_interval = 7
[  488.528264]        gim info:(parse_config_file:295) AMD GIM fb_clear = 1
[  488.528265]        gim info:(parse_config_file:295) AMD GIM hang_detect_timeout = 100
[  488.528266]        gim info:(parse_config_file:295) AMD GIM max_quanta = 1000
[  488.528268]        gim info:(parse_config_file:295) AMD GIM self_switch = 500
[  488.528269]        gim info:(parse_config_file:295) AMD GIM exclusive = 1600
[  488.528270]        gim info:(parse_config_file:295) AMD GIM fair_scheduling = 0
[  488.528272]        gim info:(parse_config_file:295) AMD GIM debug_level = 3
[  488.528273]        gim info:(parse_config_file:295) AMD GIM clear_fb_on_flr = 0
[  488.528274]        gim info:(parse_config_file:295) AMD GIM clear_fb_on_free_vf = 1
[  488.528276]        gim info:(init_config:445) INIT CONFIG
[  488.541744]        gim error:(gim_probe:123) gim_probe(09:00.0)
[  488.541754]        gim info:(alloc_adapter:454) allocate adapter for PF 0x0900
[  488.541755]        gim info:(alloc_adapter:457) Found free adapter at index 0
[  488.541759] PF0    gim info:(SetNewAdapter:1096) curr allocated at 00000000fcfb74fe
[  488.541760] PF0    gim info:(SetNewAdapter:1102) Can't disable ATS --> Not enabled in the first place
[  488.541761] PF0    gim info:(SetNewAdapter:1113) SRIOV is supported
[  488.541762] PF0    gim info:(SetNewAdapter:1121) found PCI bridge device
[  488.541763] PF0    gim info:(SetNewAdapter:1124) found: 08:8.0
[  488.541821] PF0    gim info:(SetNewAdapter:1147) mmio_base = 0000000005501ffd
[  488.542117] PF0    gim info:(SetNewAdapter:1149) doorbell = 000000008524b443
[  488.571753] PF0    gim info:(SetNewAdapter:1151) pf.fb_va = 00000000e3e3d491
[  488.571774]        gim info:(sriov_is_ari_enabled:180) PCI_SRIOV_CAP = 0x00000002
[  488.571775]        gim info:(sriov_is_ari_enabled:190) PCI_SRIOV_CTRL = 0x00000010
[  488.571776]        gim info:(sriov_is_ari_enabled:194) PCI_SRIOV_CTRL_ARI is set --> ARI is supported
[  488.571778] PF0    gim info:(program_ari_mode:957) Read bif_strap8 = 0x00000004
[  488.571778] PF0    gim info:(program_ari_mode:963) program_ari_mode - Set ARI_Mode = PF_BUS
[  488.571779] PF0    gim info:(program_ari_mode:978) Write bif_strap8 = 0x00000004
[  488.571779] PF0    gim info:(gim_read_rom_from_reg:634) Reading VBios from ROM
[  488.571946] PF0    gim info:(gim_read_VBIOS:695) VBIOS starts:  0x55, 0xaa
[  488.571947] PF0    gim info:(gim_read_VBIOS:698) VBios size is 0x10000
[  488.571968] PF0    gim info:(gim_read_VBIOS:708) pVBIOS allocated at 000000004aad0df3 for size of 0x80000
[  488.571968] PF0    gim info:(gim_read_rom_from_reg:634) Reading VBios from ROM
[  490.035704] PF0    gim info:(gim_read_VBIOS:718) BIOS Version Major 0xF Minor 0x31
[  490.035770] PF0    gim info:(gim_read_VBIOS:729) VBios Checksum = 0x541800
[  490.035771] PF0    gim info:(gim_read_VBIOS:738) Valid video BIOS image, size = 0x10000, check sum is 0x541800
[  490.035771] PF0    gim info:(gim_read_VBIOS:739) Read in full Vbios image of size = 0x80000
[  490.035774] PF0    gim info:(SetNewAdapter:1248) Scheduler Time interval set to 7 msec
[  490.035775]        gim info:(EnableSriov:398) Enable SRIOV
[  490.035775]        gim info:(EnableSriov:399) Enable SRIOV vfs count = 16 
[  490.139921] pci 0000:09:02.0: [1002:692f] type 00 class 0x030000
[  490.139955] pci 0000:09:02.0: enabling Extended Tags
[  490.140197] pci 0000:09:02.0: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none
[  490.140273] pci 0000:09:02.1: [1002:692f] type 00 class 0x030000
[  490.140303] pci 0000:09:02.1: enabling Extended Tags
[  490.140505] pci 0000:09:02.1: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none
[  490.140565] pci 0000:09:02.2: [1002:692f] type 00 class 0x030000
[  490.140597] pci 0000:09:02.2: enabling Extended Tags
[  490.140810] pci 0000:09:02.2: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none
[  490.140874] pci 0000:09:02.3: [1002:692f] type 00 class 0x030000
[  490.140912] pci 0000:09:02.3: enabling Extended Tags
[  490.141121] pci 0000:09:02.3: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none
[  490.141167] pci 0000:09:02.4: [1002:692f] type 00 class 0x030000
[  490.141199] pci 0000:09:02.4: enabling Extended Tags
[  490.141406] pci 0000:09:02.4: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none
[  490.141471] pci 0000:09:02.5: [1002:692f] type 00 class 0x030000
[  490.141508] pci 0000:09:02.5: enabling Extended Tags
[  490.141721] pci 0000:09:02.5: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none
[  490.141770] pci 0000:09:02.6: [1002:692f] type 00 class 0x030000
[  490.141806] pci 0000:09:02.6: enabling Extended Tags
[  490.142010] pci 0000:09:02.6: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none
[  490.142058] pci 0000:09:02.7: [1002:692f] type 00 class 0x030000
[  490.142091] pci 0000:09:02.7: enabling Extended Tags
[  490.142329] pci 0000:09:02.7: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none
[  490.142381] pci 0000:09:03.0: [1002:692f] type 00 class 0x030000
[  490.142415] pci 0000:09:03.0: enabling Extended Tags
[  490.142623] pci 0000:09:03.0: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none
[  490.142671] pci 0000:09:03.1: [1002:692f] type 00 class 0x030000
[  490.142702] pci 0000:09:03.1: enabling Extended Tags
[  490.142904] pci 0000:09:03.1: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none
[  490.142954] pci 0000:09:03.2: [1002:692f] type 00 class 0x030000
[  490.142991] pci 0000:09:03.2: enabling Extended Tags
[  490.143231] pci 0000:09:03.2: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none
[  490.143280] pci 0000:09:03.3: [1002:692f] type 00 class 0x030000
[  490.143312] pci 0000:09:03.3: enabling Extended Tags
[  490.143520] pci 0000:09:03.3: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none
[  490.143569] pci 0000:09:03.4: [1002:692f] type 00 class 0x030000
[  490.143602] pci 0000:09:03.4: enabling Extended Tags
[  490.143811] pci 0000:09:03.4: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none
[  490.143860] pci 0000:09:03.5: [1002:692f] type 00 class 0x030000
[  490.143928] pci 0000:09:03.5: enabling Extended Tags
[  490.144161] pci 0000:09:03.5: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none
[  490.144212] pci 0000:09:03.6: [1002:692f] type 00 class 0x030000
[  490.144248] pci 0000:09:03.6: enabling Extended Tags
[  490.144452] pci 0000:09:03.6: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none
[  490.144501] pci 0000:09:03.7: [1002:692f] type 00 class 0x030000
[  490.144531] pci 0000:09:03.7: enabling Extended Tags
[  490.144747] pci 0000:09:03.7: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none
[  490.144804]        gim info:(EnumerateVFs:128) vf found: 09:2.0
[  490.144806]        gim info:(EnumerateVFs:128) vf found: 09:2.1
[  490.144807]        gim info:(EnumerateVFs:128) vf found: 09:2.2
[  490.144810]        gim info:(EnumerateVFs:128) vf found: 09:2.3
[  490.144811]        gim info:(EnumerateVFs:128) vf found: 09:2.4
[  490.144812]        gim info:(EnumerateVFs:128) vf found: 09:2.5
[  490.144813]        gim info:(EnumerateVFs:128) vf found: 09:2.6
[  490.144814]        gim info:(EnumerateVFs:128) vf found: 09:2.7
[  490.144816]        gim info:(EnumerateVFs:128) vf found: 09:3.0
[  490.144817]        gim info:(EnumerateVFs:128) vf found: 09:3.1
[  490.144818]        gim info:(EnumerateVFs:128) vf found: 09:3.2
[  490.144819]        gim info:(EnumerateVFs:128) vf found: 09:3.3
[  490.144820]        gim info:(EnumerateVFs:128) vf found: 09:3.4
[  490.144821]        gim info:(EnumerateVFs:128) vf found: 09:3.5
[  490.144822]        gim info:(EnumerateVFs:128) vf found: 09:3.6
[  490.144823]        gim info:(EnumerateVFs:128) vf found: 09:3.7
[  490.145512]        gim info:(pci_disable_error_reporting:830) Disable error reporting for device: 09:2.0
[  490.145515]        gim info:(pci_disable_error_reporting:834)     Mask before -> corr = 0x00000000, uncorr = 0x00000000
[  490.145526]        gim info:(pci_disable_error_reporting:844)     Mask after -> corr = 0x00000000, uncorr = 0x00000000
[  490.146217]        gim info:(pci_disable_error_reporting:830) Disable error reporting for device: 09:2.1
[  490.146220]        gim info:(pci_disable_error_reporting:834)     Mask before -> corr = 0x00000000, uncorr = 0x00000000
[  490.146231]        gim info:(pci_disable_error_reporting:844)     Mask after -> corr = 0x00000000, uncorr = 0x00000000
[  490.146940]        gim info:(pci_disable_error_reporting:830) Disable error reporting for device: 09:2.2
[  490.146943]        gim info:(pci_disable_error_reporting:834)     Mask before -> corr = 0x00000000, uncorr = 0x00000000
[  490.146953]        gim info:(pci_disable_error_reporting:844)     Mask after -> corr = 0x00000000, uncorr = 0x00000000
[  490.147654]        gim info:(pci_disable_error_reporting:830) Disable error reporting for device: 09:2.3
[  490.147657]        gim info:(pci_disable_error_reporting:834)     Mask before -> corr = 0x00000000, uncorr = 0x00000000
[  490.147667]        gim info:(pci_disable_error_reporting:844)     Mask after -> corr = 0x00000000, uncorr = 0x00000000
[  490.148415]        gim info:(pci_disable_error_reporting:830) Disable error reporting for device: 09:2.4
[  490.148418]        gim info:(pci_disable_error_reporting:834)     Mask before -> corr = 0x00000000, uncorr = 0x00000000
[  490.148428]        gim info:(pci_disable_error_reporting:844)     Mask after -> corr = 0x00000000, uncorr = 0x00000000
[  490.149185]        gim info:(pci_disable_error_reporting:830) Disable error reporting for device: 09:2.5
[  490.149188]        gim info:(pci_disable_error_reporting:834)     Mask before -> corr = 0x00000000, uncorr = 0x00000000
[  490.149198]        gim info:(pci_disable_error_reporting:844)     Mask after -> corr = 0x00000000, uncorr = 0x00000000
[  490.149973]        gim info:(pci_disable_error_reporting:830) Disable error reporting for device: 09:2.6
[  490.149976]        gim info:(pci_disable_error_reporting:834)     Mask before -> corr = 0x00000000, uncorr = 0x00000000
[  490.149986]        gim info:(pci_disable_error_reporting:844)     Mask after -> corr = 0x00000000, uncorr = 0x00000000
[  490.150793]        gim info:(pci_disable_error_reporting:830) Disable error reporting for device: 09:2.7
[  490.150796]        gim info:(pci_disable_error_reporting:834)     Mask before -> corr = 0x00000000, uncorr = 0x00000000
[  490.150806]        gim info:(pci_disable_error_reporting:844)     Mask after -> corr = 0x00000000, uncorr = 0x00000000
[  490.151645]        gim info:(pci_disable_error_reporting:830) Disable error reporting for device: 09:3.0
[  490.151648]        gim info:(pci_disable_error_reporting:834)     Mask before -> corr = 0x00000000, uncorr = 0x00000000
[  490.151660]        gim info:(pci_disable_error_reporting:844)     Mask after -> corr = 0x00000000, uncorr = 0x00000000
[  490.152526]        gim info:(pci_disable_error_reporting:830) Disable error reporting for device: 09:3.1
[  490.152529]        gim info:(pci_disable_error_reporting:834)     Mask before -> corr = 0x00000000, uncorr = 0x00000000
[  490.152539]        gim info:(pci_disable_error_reporting:844)     Mask after -> corr = 0x00000000, uncorr = 0x00000000
[  490.153415]        gim info:(pci_disable_error_reporting:830) Disable error reporting for device: 09:3.2
[  490.153420]        gim info:(pci_disable_error_reporting:834)     Mask before -> corr = 0x00000000, uncorr = 0x00000000
[  490.153432]        gim info:(pci_disable_error_reporting:844)     Mask after -> corr = 0x00000000, uncorr = 0x00000000
[  490.154340]        gim info:(pci_disable_error_reporting:830) Disable error reporting for device: 09:3.3
[  490.154343]        gim info:(pci_disable_error_reporting:834)     Mask before -> corr = 0x00000000, uncorr = 0x00000000
[  490.154353]        gim info:(pci_disable_error_reporting:844)     Mask after -> corr = 0x00000000, uncorr = 0x00000000
[  490.155287]        gim info:(pci_disable_error_reporting:830) Disable error reporting for device: 09:3.4
[  490.155290]        gim info:(pci_disable_error_reporting:834)     Mask before -> corr = 0x00000000, uncorr = 0x00000000
[  490.155300]        gim info:(pci_disable_error_reporting:844)     Mask after -> corr = 0x00000000, uncorr = 0x00000000
[  490.156272]        gim info:(pci_disable_error_reporting:830) Disable error reporting for device: 09:3.5
[  490.156276]        gim info:(pci_disable_error_reporting:834)     Mask before -> corr = 0x00000000, uncorr = 0x00000000
[  490.156286]        gim info:(pci_disable_error_reporting:844)     Mask after -> corr = 0x00000000, uncorr = 0x00000000
[  490.157265]        gim info:(pci_disable_error_reporting:830) Disable error reporting for device: 09:3.6
[  490.157268]        gim info:(pci_disable_error_reporting:834)     Mask before -> corr = 0x00000000, uncorr = 0x00000000
[  490.157278]        gim info:(pci_disable_error_reporting:844)     Mask after -> corr = 0x00000000, uncorr = 0x00000000
[  490.158291]        gim info:(pci_disable_error_reporting:830) Disable error reporting for device: 09:3.7
[  490.158294]        gim info:(pci_disable_error_reporting:834)     Mask before -> corr = 0x00000000, uncorr = 0x00000000
[  490.158304]        gim info:(pci_disable_error_reporting:844)     Mask after -> corr = 0x00000000, uncorr = 0x00000000
[  490.158324]        gim info:(pci_gpu_iov_init:117) totalFBAvailable = 8190
[  490.158324]  
[  490.158325]        gim info:(pci_gpu_iov_init:118) AMD GIM pci_gpu_iov_init pos = 400
[  490.158326]        gim info:(pci_gpu_iov_init:119) AMD GIM pci_gpu_iov_init totalFBAvailable = 1ffe
[  490.158327]        gim info:(init_frame_buffer_partition:232) PCI defined PF FB size = 256 MB
[  490.158327]        gim info:(init_frame_buffer_partition:236) PCI defined VF FB size = 256 MB
[  490.158329]        gim info:(init_frame_buffer_partition:239) Total FB Available = 8190 MB, CSA = 8 MB, Max remaining FB size = 8160 MB
[  490.158329]        gim info:(init_frame_buffer_partition:240) max_fb_size = 8160
[  490.158330]        gim info:(init_frame_buffer_partition:253) PF FB size after checking limits from config file = 256 MB
[  490.158331]        gim info:(init_frame_buffer_partition:255) PF rounded down to nearest 16MB boundary = 256
[  490.158332]        gim info:(init_pf_fb:99) total framebuffer available = 1ffe
[  490.158333]        gim info:(init_pf_fb:100) pf framebuffer = 100
[  490.158334]        gim info:(init_pf_fb:101) total framebuffer consumed = 1efe
[  490.158336]        gim info:(init_frame_buffer_partition:262) CSA starts at offset 256MB
[  490.158337]        gim info:(init_context_save_area:84) AMD GIM init_context_save_area: base =100 size=1.
[  490.158340]        gim info:(init_frame_buffer_partition:267) VF FB base = 272MB (256 + 8)
[  490.158341]        gim info:(init_frame_buffer_partition:270) VF FB Size = 7904MB (8160 - 256)
[  490.158343]        gim info:(init_fb_static:160) AMD GIM init_fb_static: num_vf = 10, base= 110, total_size=1ee0, mini_size=100
[  490.158344]        gim info:(init_fb_static:189) AMD GIM init_fb_static: vf_fb_size = 1e0, base= 110
[  490.158344]        gim info:(init_fb_static:194) num_vf = 16
[  490.158345]        gim info:(init_fb_static:200) AMD GIM init_fb_static: partition 0 base = 110, size= 1e0
[  490.158348]        gim info:(init_fb_static:200) AMD GIM init_fb_static: partition 1 base = 2f0, size= 1e0
[  490.158351]        gim info:(init_fb_static:200) AMD GIM init_fb_static: partition 2 base = 4d0, size= 1e0
[  490.158354]        gim info:(init_fb_static:200) AMD GIM init_fb_static: partition 3 base = 6b0, size= 1e0
[  490.158356]        gim info:(init_fb_static:200) AMD GIM init_fb_static: partition 4 base = 890, size= 1e0
[  490.158359]        gim info:(init_fb_static:200) AMD GIM init_fb_static: partition 5 base = a70, size= 1e0
[  490.158362]        gim info:(init_fb_static:200) AMD GIM init_fb_static: partition 6 base = c50, size= 1e0
[  490.158365]        gim info:(init_fb_static:200) AMD GIM init_fb_static: partition 7 base = e30, size= 1e0
[  490.158367]        gim info:(init_fb_static:200) AMD GIM init_fb_static: partition 8 base = 1010, size= 1e0
[  490.158370]        gim info:(init_fb_static:200) AMD GIM init_fb_static: partition 9 base = 11f0, size= 1e0
[  490.158373]        gim info:(init_fb_static:200) AMD GIM init_fb_static: partition 10 base = 13d0, size= 1e0
[  490.158376]        gim info:(init_fb_static:200) AMD GIM init_fb_static: partition 11 base = 15b0, size= 1e0
[  490.158378]        gim info:(init_fb_static:200) AMD GIM init_fb_static: partition 12 base = 1790, size= 1e0
[  490.158381]        gim info:(init_fb_static:200) AMD GIM init_fb_static: partition 13 base = 1970, size= 1e0
[  490.158384]        gim info:(init_fb_static:200) AMD GIM init_fb_static: partition 14 base = 1b50, size= 1e0
[  490.158386]        gim info:(init_fb_static:200) AMD GIM init_fb_static: partition 15 base = 1d30, size= 1e0
[  490.158389] PF0    gim info:(init_scheduler_cycle:303) Setting cycle time = 112msec
[  490.158393] PF0    gim info:(SetNewAdapter:1304) enable MSI
[  490.158477] PF0    gim info:(ih_iv_ring_disable:446) disable iv ring successfully
[  490.158478] PF0    gim info:(alloc_iv_ring:144) ih->ivRingNumEntries = 256
[  490.158479] PF0    gim info:(alloc_iv_ring:147) ih->ivRingSizeInBytes = 4096
[  490.158480] PF0    gim info:(alloc_iv_ring:151) ih->ivRingAllocSizeInBytes = 4100
[  490.158481] PF0    gim info:(alloc_iv_ring:153) iv ring page_cnt = 2
[  490.158486] PF0    gim info:(alloc_iv_ring:183) ih->ivRing_VA = 00000000dd8dae9a
[  490.158487] PF0    gim info:(alloc_iv_ring:186) ih->ivRing_MA.QuadPart = 0x52ea56000
[  490.158488] PF0    gim info:(alloc_iv_ring:189) ih->ivRingWptrWB = 000000006d365f1a
[  490.158489] PF0    gim info:(alloc_iv_ring:192) ih->ivRingWptrWB_MA.QuadPart = 0x81e5a6000
[  490.158490] PF0    gim info:(alloc_iv_ring:236) update rptr via doorbell
[  490.158491] PF0    gim info:(ih_iv_ring_init:354) ih->rptrDoorbell = 00000000929502e5
[  490.158492] PF0    gim info:(ih_iv_ring_init:355) ih->rptrDoorbellOffset = 0x1e8
[  490.158494] PF0    gim info:(ih_iv_ring_hw_init:255) the physical address of ring buffer: 0x52ea560
[  490.158507] PF0    gim info:(ih_iv_ring_setupRPTR:507) write mmBIF_DOORBELL_APER_EN: 0x1
[  490.158508] PF0    gim info:(ih_iv_ring_enable:413) ih->ivRingWptr_Reg = 0x0
[  490.158509] PF0    gim info:(ih_iv_ring_enable:415) ih->ivRingWptr = 0
[  490.158510] PF0    gim info:(ih_iv_ring_enable:417) ih->ivRingRptr_Reg = 0x0
[  490.158510] PF0    gim info:(ih_iv_ring_enable:419) ih->ivRingRptr = 0
[  490.158512] PF0    gim info:(ih_iv_ring_enable:421) *(ih->rptrDoorbell) = 0x0
[  490.158515] PF0    gim info:(ih_iv_ring_init:362) init iv ring successfully
[  490.158595] PF0    gim info:(SetNewAdapter:1326) init work
[  490.158596] PF0    gim info:(SetNewAdapter:1334) register interrupt
[  490.158629] PF0    gim info:(ih_irq_source_enable:653) IH: read 0x00000000 from maskReg 0x14d1
[  490.158630] PF0    gim info:(ih_irq_source_enable:658) IH: write 0x00000001 to maskReg 0x14d1
[  490.158631] PF0    gim info:(ih_irq_source_enable:660) irq sourceID 0x89 get enabled
[  490.158634] PF0    gim info:(ih_irq_source_enable:653) IH: read 0x00000001 from maskReg 0x14d1
[  490.158635] PF0    gim info:(ih_irq_source_enable:658) IH: write 0x00000003 to maskReg 0x14d1
[  490.158636] PF0    gim info:(ih_irq_source_enable:660) irq sourceID 0x88 get enabled
[  490.158637] PF0    gim info:(init_vf:2428) Operation on PF!
[  490.161671]        gim error:(wait_cmd_complete:2387)  wait_cmd_complete -- time out after 0.003009083 sec
[  490.161689]        gim error:(wait_cmd_complete:2390)   Cmd = 0x17, Status = 0x0, cmd_Complete=0
[  490.161694] Current function = 
[  490.161695] PF0    gim warning:(dump_function_state:252) NULL
[  490.161695] PF0    gim warning:(dump_function_state:254) Last known states:
[  490.161696] PF0    gim warning:(dump_function_state:255) PF = Undefined
[  490.161697] VF0-0  gim warning:(dump_function_state:259) Undefined, Marked as Not Runable
[  490.161698] VF0-1  gim warning:(dump_function_state:259) Undefined, Marked as Not Runable
[  490.161699] VF0-2  gim warning:(dump_function_state:259) Undefined, Marked as Not Runable
[  490.161700] VF0-3  gim warning:(dump_function_state:259) Undefined, Marked as Not Runable
[  490.161701] VF0-4  gim warning:(dump_function_state:259) Undefined, Marked as Not Runable
[  490.161702] VF0-5  gim warning:(dump_function_state:259) Undefined, Marked as Not Runable
[  490.161703] VF0-6  gim warning:(dump_function_state:259) Undefined, Marked as Not Runable
[  490.161704] VF0-7  gim warning:(dump_function_state:259) Undefined, Marked as Not Runable
[  490.161705] VF0-8  gim warning:(dump_function_state:259) Undefined, Marked as Not Runable
[  490.161706] VF0-9  gim warning:(dump_function_state:259) Undefined, Marked as Not Runable
[  490.161707] VF0-10 gim warning:(dump_function_state:259) Undefined, Marked as Not Runable
[  490.161707] VF0-11 gim warning:(dump_function_state:259) Undefined, Marked as Not Runable
[  490.161708] VF0-12 gim warning:(dump_function_state:259) Undefined, Marked as Not Runable
[  490.161709] VF0-13 gim warning:(dump_function_state:259) Undefined, Marked as Not Runable
[  490.161710] VF0-14 gim warning:(dump_function_state:259) Undefined, Marked as Not Runable
[  490.161711] VF0-15 gim warning:(dump_function_state:259) Undefined, Marked as Not Runable
[  490.161712]        gim warning:(dump_gpu_status:1987) **** dump gpu status begin for Adapter 9:00.00
[  490.161717] PF0    gim info:(check_base_addrs:1974) CP_MQD_BASE_ADDR = 0x0:00000000
[  490.161753]        gim warning:(dump_gpu_status:2029)  mmGRBM_STATUS = 0x3028
[  490.161755]        gim warning:(dump_gpu_status:2032)  mmGRBM_STATUS2 = 0x8
[  490.161757]        gim warning:(dump_gpu_status:2035)  mmSRBM_STATUS = 0x20000040
[  490.161758]        gim warning:(dump_gpu_status:2038)  mmSRBM_STATUS2 = 0x0
[  490.161760]        gim warning:(dump_gpu_status:2041)  mmSDMA0_STATUS_REG = 0x46dee557
[  490.161762]        gim warning:(dump_gpu_status:2044)  mmSDMA1_STATUS_REG = 0x46dee557
[  490.161773] PF0    gim info:(check_ME_CNTL:1945) CP_ME_CNTL = 0x15000000 GPU dump
[  490.161773]        gim error:(check_ME_CNTL:1948)   ME HALTED!
[  490.161777]        gim error:(check_ME_CNTL:1952)   PFP HALTED!
[  490.161781]        gim error:(check_ME_CNTL:1956)   CE HALTED!
[  490.161786]        gim warning:(dump_gpu_status:2209) **** dump gpu status end
[  490.161786]        gim error:(init_register_init_state:4643) Failed to INIT PF for initial register 'init-state'
[  490.161789] PF0    gim info:(dump_pf_vm_regs:207) 0xf4000000 - HDP_NONSURFACE_BASE
[  490.161796] PF0    gim info:(dump_pf_vm_regs:207) 0xf5fff400 - MC_VM_FB_LOCATION
[  490.161798] PF0    gim info:(dump_pf_vm_regs:207) 0x00000000 - MC_VM_FB_OFFSET
[  490.161800] PF0    gim info:(dump_pf_vm_regs:207) 0x0f5fffff - MC_VM_SYSTEM_APERTURE_HI
[  490.161802] PF0    gim info:(dump_pf_vm_regs:207) 0x0f400000 - MC_VM_SYSTEM_APERTURE_LO
[  490.161804] PF0    gim info:(dump_pf_vm_regs:207) 0x00000000 - MC_VM_SYSTEM_APERTURE_DEF
[  490.161807] PF0    gim info:(dump_pf_vm_regs:207) 0x00000503 - MC_VM_MX_L1_TLB_CNTL
[  490.161809] PF0    gim info:(dump_pf_vm_regs:207) 0x00000000 - RLC_GPU_IOV_ACTIVE_FCN_ID
[  490.161811] PF0    gim info:(dump_pf_vm_regs:207) 0x00000000 - SMU_ACTIVE_FCN_ID
[  490.161813] PF0    gim info:(dump_pf_vm_regs:207) 0x00000000 - IH_ACTIVE_FCN_ID
[  490.161815] PF0    gim info:(dump_pf_vm_regs:207) 0x00000000 - MC_SHARED_ACTIVE_FCN_ID
[  490.161817] PF0    gim info:(dump_pf_vm_regs:207) 0x00000000 - SDMA0_ACTIVE_FCN_ID
[  490.161819] PF0    gim info:(dump_pf_vm_regs:207) 0x00000000 - SDMA1_ACTIVE_FCN_ID
[  490.161821] PF0    gim info:(dump_pf_vm_regs:207) 0x00000000 - SEM_ACTIVE_FCN_ID
[  490.161823] PF0    gim info:(dump_pf_vm_regs:207) 0x00000000 - VM_CONTEXT0_PROTECTION_FAULT_DEFAULT_ADDRESS
[  490.161824]        gim error:(SetNewAdapter:1359) Failed to init register state(ih) !!!!
[  490.161830] PF0    gim info:(dump_pf_vm_regs:207) 0xf4000000 - HDP_NONSURFACE_BASE
[  490.161832] PF0    gim info:(dump_pf_vm_regs:207) 0xf5fff400 - MC_VM_FB_LOCATION
[  490.161834] PF0    gim info:(dump_pf_vm_regs:207) 0x00000000 - MC_VM_FB_OFFSET
[  490.161836] PF0    gim info:(dump_pf_vm_regs:207) 0x0f5fffff - MC_VM_SYSTEM_APERTURE_HI
[  490.161838] PF0    gim info:(dump_pf_vm_regs:207) 0x0f400000 - MC_VM_SYSTEM_APERTURE_LO
[  490.161841] PF0    gim info:(dump_pf_vm_regs:207) 0x00000000 - MC_VM_SYSTEM_APERTURE_DEF
[  490.161843] PF0    gim info:(dump_pf_vm_regs:207) 0x00000503 - MC_VM_MX_L1_TLB_CNTL
[  490.161845] PF0    gim info:(dump_pf_vm_regs:207) 0x00000000 - RLC_GPU_IOV_ACTIVE_FCN_ID
[  490.161847] PF0    gim info:(dump_pf_vm_regs:207) 0x00000000 - SMU_ACTIVE_FCN_ID
[  490.161849] PF0    gim info:(dump_pf_vm_regs:207) 0x00000000 - IH_ACTIVE_FCN_ID
[  490.161851] PF0    gim info:(dump_pf_vm_regs:207) 0x00000000 - MC_SHARED_ACTIVE_FCN_ID
[  490.161853] PF0    gim info:(dump_pf_vm_regs:207) 0x00000000 - SDMA0_ACTIVE_FCN_ID
[  490.161855] PF0    gim info:(dump_pf_vm_regs:207) 0x00000000 - SDMA1_ACTIVE_FCN_ID
[  490.161857] PF0    gim info:(dump_pf_vm_regs:207) 0x00000000 - SEM_ACTIVE_FCN_ID
[  490.161859] PF0    gim info:(dump_pf_vm_regs:207) 0x00000000 - VM_CONTEXT0_PROTECTION_FAULT_DEFAULT_ADDRESS
[  490.161862] PF0    gim info:(ih_irq_source_disable:698) disabled irq sourceID 0x89
[  490.161864] PF0    gim info:(ih_irq_source_disable:698) disabled irq sourceID 0x88
[  490.161893] PF0    gim info:(free_iv_ring:307) unmap the iv ring
[  490.161942]        gim info:(DisableSriov:424) Disable SRIOV
[  491.235218]        gim error:(gim_probe:126) Failed to create new adapter
[  491.235231] gim: probe of 0000:09:00.0 failed with error -1
[  491.235239]        gim error:(gim_probe:123) gim_probe(0b:00.0)
*THE SECOND GPU INITIALIZATION WAS HERE, BUT REMOVED DUE TO CHARACTER LIMIT*
[  493.918339]        gim info:(gim_ioctl_init:567) IOCTL device created and ready for use
[  493.918339] Running Kaveri version of GIM
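
Since the log is long, here is a rough Python sketch for pulling out only the GIM lines at a given level. The regular expression is an assumption based purely on the line format visible above (timestamp, optional PF/VF tag, `gim <level>:(function:line) message`), and `gim_messages` is a made-up helper name. The sample lines are copied from the log.

```python
import re

# Assumed layout, inferred from the log above:
# [ timestamp ] [PF0|VF0-3] gim <level>:(<function>:<line>) <message>
GIM_LINE = re.compile(
    r"^\[\s*(?P<ts>\d+\.\d+)\]\s+"
    r"(?:(?P<fn>[PV]F\S*)\s+)?"
    r"gim (?P<level>info|warning|error):"
    r"\((?P<func>\w+):(?P<line>\d+)\)\s*(?P<msg>.*)$"
)


def gim_messages(log_text, levels=("error",)):
    """Return [(timestamp, function, message)] for GIM lines at the given levels."""
    found = []
    for raw in log_text.splitlines():
        m = GIM_LINE.match(raw.strip())
        if m and m.group("level") in levels:
            found.append(
                (float(m.group("ts")), m.group("func"), m.group("msg").strip())
            )
    return found


SAMPLE = """\
[  490.158637] PF0    gim info:(init_vf:2428) Operation on PF!
[  490.161671]        gim error:(wait_cmd_complete:2387)  wait_cmd_complete -- time out after 0.003009083 sec
[  490.161689]        gim error:(wait_cmd_complete:2390)   Cmd = 0x17, Status = 0x0, cmd_Complete=0
[  490.161695] PF0    gim warning:(dump_function_state:252) NULL
[  490.161773]        gim error:(check_ME_CNTL:1948)   ME HALTED!
"""

for ts, func, msg in gim_messages(SAMPLE):
    print(f"{ts:.6f} {func}: {msg}")
```

Note that GIM logs the harmless `gim_probe(09:00.0)` line at error level too, so the first error that matters in the full log is the `wait_cmd_complete` time out right after `Operation on PF!`.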

Have you gone through this thread?

Thanks for taking a look into it, but yes, I have. I mainly posted this here because the channel talks a lot about XCP-ng and I was hoping another enthusiast might have encountered a similar issue.

If it is a new deployment, you could always go with 7.6 until it gets sorted out.