Dell R730xd TrueNAS Build for XCP-NG - Performance Numbers

Hello All,

I’m currently in the process of integrating a new all-SSD TrueNAS storage server into my production XCP-ng cluster.

I thought this would be a straightforward process, but it doesn’t appear I’m getting the full performance out of the server. I’ve tried many different RAIDZ/mirror configurations and I’m beating my head against the wall trying to hit the numbers I think I should be getting.

The build is as follows:
Dell R730xd - 2.5in 24-drive + flex bay
TrueNAS-12.0-U1
2x Intel® Xeon® CPU E5-2620 v3
2x 16GB DDR4 1866MHz (32GB total)
2x Intel 10GbE X540-T2 (4 ports total)
Dell HBA330 Mini
10x 3.84TB SAMSUNG MZILS3T8HMLH0D3 SSD

The configuration I tested was 2 x 5-disk RAIDZ2 using the ten 3.84TB drives above. I did test a whole bunch of other configurations (5 x 2-way mirrors, 1 x 10 RAIDZ3, 1 x 10 stripe) and all achieved similar results over the network.
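
For reference, the 2 x 5 RAIDZ2 layout is simply two 5-disk RAIDZ2 vdevs striped in a single pool, which from the command line would look roughly like this (pool and device names are placeholders, not my actual ones):

zpool create tank raidz2 da0 da1 da2 da3 da4 raidz2 da5 da6 da7 da8 da9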

The results below are from a Phoronix Test Suite run from within a jail on TrueNAS (2 x 5 RAIDZ2):

Based on the numbers with sync turned off, I should have no problem saturating a 10Gb NIC, but that is just not what I’m seeing. iSCSI and NFS have almost identical performance on the virtual machines, coming in around 220-260 MB/s, and adding an SLOG doesn’t seem to affect read/write performance.

My iperf tests from the hypervisor to the NAS seem normal:

-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------
Accepted connection from 192.168.251.150, port 41570
[  5] local 192.168.251.231 port 5201 connected to 192.168.251.150 port 41572
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-1.00   sec   902 MBytes  7.56 Gbits/sec
[  5]   1.00-2.00   sec  1.15 GBytes  9.89 Gbits/sec
[  5]   2.00-3.00   sec   916 MBytes  7.69 Gbits/sec
[  5]   3.00-4.00   sec  1.14 GBytes  9.76 Gbits/sec
[  5]   4.00-5.00   sec   981 MBytes  8.23 Gbits/sec
[  5]   5.00-6.00   sec  1.09 GBytes  9.38 Gbits/sec
[  5]   6.00-7.00   sec  1.08 GBytes  9.24 Gbits/sec
[  5]   7.00-8.00   sec  1.15 GBytes  9.90 Gbits/sec
[  5]   8.00-9.00   sec  1.15 GBytes  9.90 Gbits/sec
[  5]   9.00-10.00  sec  1.15 GBytes  9.90 Gbits/sec
[  5]  10.00-10.00  sec  1.01 MBytes  9.90 Gbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-10.00  sec  10.6 GBytes  9.14 Gbits/sec                  receiver
-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------
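
The client side of that run was just a plain single-stream iperf3 from the XCP-ng host to the NAS; roughly what was run is below, and a parallel-stream pass might be worth adding to rule out a single-stream limit:

iperf3 -c 192.168.251.231 -t 10
iperf3 -c 192.168.251.231 -t 10 -P 4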

What irks me the most about the results is the base performance with sync set to always or standard. I feel like the results I’m getting without an SLOG are poor for the SSDs that I have. The results are better with a dedicated NVMe SLOG, yes, but shouldn’t the base numbers be much higher? I see people playing with ashift and recordsize to eke out more performance, but it just seems like I’m missing something bigger that is affecting my performance.
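
For what it’s worth, the current values are easy to check before touching anything; the pool/dataset names below are placeholders for mine, and on TrueNAS 12 the ashift pool property may simply read 0 (auto-detected):

zpool get ashift tank
zfs get recordsize,compression,sync tank/vm-storage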

Any insight or guidance would be much appreciated!

What OS and test are you running inside the hypervisor? I have a few SSDs in a TrueNAS Mini that give me about 700 MB/s with this quick (but admittedly not very in-depth) test that writes out about 2 GB of data:
time sh -c "dd if=/dev/zero of=testfile bs=1900k count=1k && sync"

The hypervisor is a Dell R620 running XCP-ng and the VM I’m using for testing is a fresh install of Debian 10.7.

The results of your dd command are below, and they seem to be a little better:

root@testing:~# time sh -c "dd if=/dev/zero of=testfile bs=1900k count=1k && sync"
1024+0 records in
1024+0 records out
1992294400 bytes (2.0 GB, 1.9 GiB) copied, 6.27079 s, 318 MB/s

real    0m7.457s
user    0m0.005s
sys     0m2.248s

The dd command I was using on the VM for testing was dd if=/dev/zero of=test bs=1M count=1024 oflag=dsync. I just ran that same dd command on the hypervisor storage repository mount and here are those results:

[21:29 xcp-ng-bench1 897f568a-4e99-83d2-c1a8-92e2440f9830]# dd if=/dev/zero of=test bs=1M count=1024 oflag=dsync
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 4.62474 s, 232 MB/s

And your dd test ran directly on the storage repository mount:

[21:34 xcp-ng-bench1 897f568a-4e99-83d2-c1a8-92e2440f9830]# time sh -c "dd if=/dev/zero of=testfile bs=1900k count=1k && sync"
1024+0 records in
1024+0 records out
1992294400 bytes (2.0 GB) copied, 3.29378 s, 605 MB/s

real    0m3.578s
user    0m0.008s
sys     0m3.287s

So am I missing something? Does the oflag=dsync on the dd command actually force ZFS to do synchronous writes even if sync is turned completely off?
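
For context, the sync setting I’m referring to is the ZFS property on the dataset (and on the zvol backing the iSCSI extent); checking it boils down to something like this, with placeholder names for my pool and zvol:

zfs get sync,compression tank
zfs get sync tank/xcp-zvol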

I know plenty of folks using these drives, but always front-ended with an NVMe such as the Samsung PM1725a.

While this drive is a SAS 12G SSD and should theoretically hit 1,500 MB/s on sequential reads and writes, its rated random writes cap at around 35,000 IOPS (to be fair, its random reads are 200K). That is far below what a 6G Samsung EVO can do.

This may be a simple case of a good but incomplete build. I guess it would really come down to what you were using as your SLOG.

That setup with the PM1725a NVMe was actually internally tested and certified by VMware for vSAN.

My gut tells me it is the HBA. I’ve built many whitebox SANs and this was one of the first hurdles I had to design around.

Sorry for not including that information! The SLOG device I have is an Intel Optane 900P 280GB. The Phoronix/OpenBenchmarking tests labelled “NVMe” are tests that were run with the NVMe as an SLOG.

I did see a decent performance increase with sync writes using an NVMe SLOG, but like I said, something irks me about getting a bit over 200 MB/s write on 10 enterprise SSDs with no other workload on the NAS.
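
If it helps narrow things down, the 900p itself can be benchmarked with FreeBSD’s built-in synchronous-write latency test; it writes to the raw device, so it should only be run while the drive is not part of the pool, and nvd0 below is just a guess at the device node:

diskinfo -wS /dev/nvd0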

I thought the HBA330 was a recommended route, but I’d be willing to accept that is where the limitation lies. I was looking at the RAID/HBA cards for Dell’s 13th generation and it doesn’t seem like they make any other HBA alternatives? Or can you set up the H830 or H730 to act as HBAs?

I was comparing the SAS3008 (HBA330) with the SAS3108 (H730), and am I missing something, or do they have the exact same specs aside from one having RAID capability and the other not? Same PowerPC CPU, same clock speed, same PCIe speeds.

SAS3008: https://docs.broadcom.com/doc/12351998
SAS3108: https://docs.broadcom.com/doc/LSISAS3108

What’s your configuration on the 720xd? I’d be interested to see your numbers to understand if I’m really being bottlenecked somewhere.

The HBA330 is a solid choice; perhaps you just need to make sure that the firmware is up to date on everything. I rechecked your benchmarks, and if you are only pulling 532.24 MB/s with the 900P installed, you may need to make sure you’re tuned correctly for the NVMe.
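
If you want to confirm the HBA330 firmware from within FreeBSD, the card attaches via the mpr driver, and something like this should show the controller firmware and driver versions (controller unit 0 is assumed):

mprutil show adapter
dmesg | grep -i mpr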

Okay, just as a little update: I went out and got an H730P, set it up in HBA mode with caching completely disabled, and I’m getting just about the same read/write speeds as with the HBA330.

I made sure all the firmware (BIOS, HBA/RAID card, etc.) was up-to-date before the tests were performed.

What tuning should I be performing on the NVMe? Are there PCIe options in BIOSes for tuning?

EDIT: One other thing I find particularly disturbing is that with the H730P, FreeBSD is reporting transfers of 150 MB/s, which is way lower than the 1200 MB/s the HBA330 was reporting. I can see the drives registering via iDRAC as 12Gbps, so I’m not sure why they are coming up like that.
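
(For reference, FreeBSD prints those negotiated transfer rates in the per-drive attach lines; something like this should pull them out if anyone wants to compare on their own box:)

dmesg | grep -E 'da[0-9]+:.*transfers'
camcontrol devlist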

I would not have expected better performance with the H730P. The only way that would have been possible is if it were flashed with IT-mode firmware, which, based on your reduced performance, I am guessing it is not.

When I was talking about tuning, I was specifically referring to TrueNAS tuning. I would recommend looking at the TrueNAS documentation and forums for recommended performance tuning. I have yet to make the jump to version 12 myself, so I don’t have any specific recommendations I could make at this time.

I would try a different OS to see if you get the same results, but I personally don’t use any of the Dell HBAs or RAID cards. Only LSIs for me.

Hello @FredFerrell, which exact LSI card in IT mode would you recommend? Thanks!

It depends on the number of drives and the interface type and speed of those drives. Also, you need to make sure the PCIe connection is a fit for your motherboard. The 9500s are the latest ones, but if you are using SATA 6Gb/s or less, you could go with a used or older card to save some money.