TrueNAS Replication Task Newbie Questions

Dear Community, I recently successfully replicated some data from one TrueNAS system to another. While the task was successful, I am a beginner and have some questions about TrueNAS’ replication. I hope you can help me answer them. Thanks in advance!

Performance

My setup: I was replicating data from machine A (located in North America) to machine B (located in Europe). In other words, I was moving data across the Atlantic Ocean. Speed isn’t that important here, as this is my worst-case offsite backup with data I don’t need to access quickly in an emergency. Still, I wondered why I was only getting about 50Mbps consistently over several days (I was moving several terabytes of data).

The upload capacity of machine A is 250Mbps, while the download speed of machine B is 100-150Mbps. Both machines are quite powerful (modern AMD CPUs, fast drives, 1Gbps network, etc.). My pfSense routers at both sites were also not heavily loaded in terms of CPU, and no other traffic was present during the replication time. The site-to-site connection is through Tailscale, which has handled much higher throughput in other transfers.

  1. Is 50Mbps realistic, given that the transfer latency eats up the rest of the bandwidth?
  2. I assume my bottleneck should be site B's 100-150Mbps download speed, yet I am only getting a third to a half of that. Is this true?
  3. How can I increase my transfer speed?
  4. Are there any notable settings I should change in the replication task settings in terms of performance? What about transport (currently I have it set to SSH)?
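
In case it helps, this is roughly how I would try to isolate the bottleneck myself. It is only a sketch: hostnames, pool and dataset names are placeholders, and it assumes iperf3 is available on both machines.

    # 1. Raw TCP throughput through the Tailscale tunnel, without ZFS or SSH.
    #    Run "iperf3 -s" on machine B first, then from machine A:
    iperf3 -c machine-b -t 30

    # 2. How fast the source pool can actually read the dataset, by sending
    #    it to /dev/null locally (dd reports the rate; interrupt it once
    #    the rate stabilizes):
    zfs send pool/dataset@snapshot | dd of=/dev/null bs=1M status=progress

If the iperf3 number is close to 100Mbps and the local send is much faster than that, the remaining gap would point at the SSH transport rather than the network or the disks.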

Security

I first tried to connect my machines using the Admin Password method, which failed. Then I manually configured an SSH connection in the Backup Credentials section.

  1. Is it possible to change the admin password after a successful connection?
  2. What security steps should I take to minimize the impact of a machine compromise on one of the sites (I am concerned that malware will be able to delete replicated data on the other site)?
  3. I have enabled (or it was the default) “Use sudo for ZFS commands”. I don’t really understand what this means; can you please explain?
  4. I want to replicate my dataset so that it cannot be read on machine B for privacy reasons. To do this, I set up encryption on my dataset and did not manually share the passphrase with site B. Is this the right way to do it, or are keys shared automatically? (See also the sketch after this list.)
  5. What is the additional “Encryption” setting in the replication task settings under Destination?
  6. Which user should be set in the SSH connection settings? I currently use admin instead of root. Any tips here?
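
Regarding question 4, my understanding of ZFS native encryption is that what I want corresponds to a raw (encrypted) send: the blocks travel and are stored still encrypted, and machine B never gets the key. At the ZFS level that would look roughly like this; pool and host names are placeholders, and this is only a sketch of the mechanism, not the exact commands TrueNAS runs.

    # Confirm the source dataset is encrypted and uses its own key
    zfs get encryption,keyformat,keylocation pool/private

    # A raw send (-w) transmits the blocks still encrypted; the receiver
    # stores them but cannot mount or read them without the key
    zfs send -w pool/private@snap | ssh admin@machine-b zfs receive -u backup/private

    # On machine B the received dataset should show keystatus "unavailable"
    # as long as nobody loads the key there
    zfs get keystatus backup/private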

Stability

My replication task failed in the middle of the sync process.

  1. Does TrueNAS replicate all data in a fail-safe manner, even if the process fails or one of the machines crashes/reboots mid-transfer?
  2. Are there any settings for additional replication stability?
  3. Is it recommended to use the “Replicate from scratch” setting?

Thank you for your help, even if I have to write down a lot of questions! If you don’t understand my questions or need additional information to answer them correctly, please ask again!

I have a video that answers all of those questions except the hardening one, which gets complicated; there is a write-up that dives into that: Improving Replication Security With OpenZFS Delegation | Klara Inc

Thanks for your answer. I missed that video. It already answers a lot of my questions, thanks. I still have some open questions I would like to discuss here.

  1. What about my performance issues? Am I correct that I should be getting 100Mbps, as this is the download speed of site B and the slowest point in the connection? I am currently only getting 60Mbps. Why is this happening?

  2. When I switch to SSH+NETCAT, I get an error saying that no TCP ports can be opened. Do I need to change my Tailscale ACLs for this? Still, I think my system can handle more than 60Mbps of SSH crypto, so where could the bottleneck be?

  3. I read the ZFS delegation article you linked. If I understand correctly, I could set up a ransomware-resistant system with TrueNAS, but only through terminal configuration, not via the web interface. Can you confirm this? What other ransomware precautions can I take, or is TrueNAS SCALE not really ready for ransomware-resistant replication?
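
If I understand the article correctly, the core of the setup is done in a shell on the receiving machine and boils down to delegating only the permissions needed to receive, and deliberately not delegating destroy or rollback. A rough sketch, with placeholder user and dataset names (the exact permission list depends on which properties the send stream carries):

    # Delegate only receive-related permissions to an unprivileged user
    zfs allow -u replication receive,create,mount backup/fromA

    # Crucially, destroy and rollback are NOT delegated, so a compromised
    # sender can push new snapshots but cannot delete what already exists
    zfs allow backup/fromA   # prints the current delegation set for review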

I am still not sure about this setting. Can you please help me out?

This basically ensures that if my replicated data is accidentally modified (partially deleted, etc.), TrueNAS will simply re-sync? And what about performance? Would a small change trigger a full sync, or only the blocks that have changed? (I ask because re-syncing all blocks would not be much in the spirit of ZFS, although the name of the setting suggests it.)

  1. Using SSH can be a big performance issue because you are slowing things down to the speed at which SSH can handle the tunnel.
  2. Not completely sure why it does not work.
  3. Yup, lots of command line
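
For point 1, a quick way to see how much the SSH tunnel itself costs is to push a dummy stream through it and compare the rate dd reports, once with the default cipher and once with a cheaper one. Hostnames are placeholders and this is only a rough measurement, not a tuning recommendation:

    # Default cipher
    dd if=/dev/zero bs=1M count=500 | ssh admin@machine-b 'cat > /dev/null'

    # A lighter AEAD cipher for comparison
    dd if=/dev/zero bs=1M count=500 | ssh -c aes128-gcm@openssh.com admin@machine-b 'cat > /dev/null'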

You would use sudo if you want a user other than root to be used, but that user has to be able to use sudo. I suggest reading up on sudo.
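
To make that concrete: as I understand it, enabling “Use sudo for ZFS commands” makes the replication prefix its zfs calls with sudo on the remote side, so the non-root user there needs a passwordless sudo rule for zfs. A way to check, plus an illustrative example of the kind of rule involved (the path and exact rule are assumptions, not what TrueNAS literally writes):

    # List what the admin user may run via sudo on the remote machine
    sudo -l -U admin

    # A rule along these lines has to exist for passwordless zfs via sudo:
    #   admin ALL=(root) NOPASSWD: /usr/sbin/zfs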

Replicate from scratch only occurs when the system cannot sync because something on the target has become corrupted, and it then has to send ALL the data again.
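
In other words, normal scheduled replication is incremental: only the blocks that differ between the last common snapshot and the new one go over the wire. The full re-send is the fallback for when that common base is gone. At the ZFS level the difference looks roughly like this (placeholder names, sketch only):

    # Incremental: only the blocks changed between @monday and @tuesday are sent
    zfs send -i pool/data@monday pool/data@tuesday | ssh admin@machine-b zfs receive backup/data

    # From scratch: the whole dataset up to @tuesday is sent again
    zfs send pool/data@tuesday | ssh admin@machine-b zfs receive -F backup/data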

Thanks for your help! For now, I will wait until my replication is complete and then continue to play around and test. I just want to make sure my data is replicated first.

I will come back to the question of setting up SSH+NETCAT. I am assuming some pfSense or Tailscale ACL misconfiguration.

I will also say that I know what sudo is; it is just that the wording and descriptions in TrueNAS sometimes only seem to make sense if you already know what they mean. It is not as beginner-friendly for the generally tech-savvy person as other projects, in my opinion.

Still, I cannot get over the fact that there is no ransomware protection built into TrueNAS. Shouldn’t this be a first-class feature? Like some kind of data immutability feature? What alternative strategies would you recommend for ransomware data protection? My naive idea would be to have a second independent machine that I replicate data to, but keep offline the rest of the time.
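
Another idea I came across, which could add a layer without a second offline machine, is ZFS holds. If I understand them correctly, a hold blocks zfs destroy on a snapshot until the hold is released, even for a privileged user. A sketch with placeholder names (note that retention tasks would then also be unable to prune held snapshots, so this needs manual housekeeping):

    # Place a named hold on a replicated snapshot on the backup box
    zfs hold keepme backup/data@auto-2024-01-01_00-00

    # Attempts to destroy it now fail while the hold exists
    zfs destroy backup/data@auto-2024-01-01_00-00

    # Release the hold only when you deliberately want it to be prunable
    zfs release keepme backup/data@auto-2024-01-01_00-00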

Also, I’d like to know if latency in general has a significant impact on throughput, or if TrueNAS/ZFS replication isn’t much affected by it. In other words, will the Atlantic eat up my bandwidth?
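
My back-of-the-envelope reasoning, assuming roughly 100 ms of transatlantic round-trip time and a 2 MiB effective window for a single TCP stream (both numbers are assumptions, not measurements): throughput is capped at about the window divided by the RTT.

    # throughput ceiling ~= window / RTT
    awk 'BEGIN { window_bits = 2*1024*1024*8; rtt = 0.100;
                 printf "ceiling ~ %.0f Mbit/s\n", window_bits / rtt / 1e6 }'
    # prints: ceiling ~ 168 Mbit/s

If the effective window is smaller, for example because SSH's own channel windowing limits it, the ceiling drops proportionally, which would be in the ballpark of the 50-60Mbps I am seeing.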

Finally, I would like to know if there are any recommendations regarding full-server replication/backup. Is it advisable to run one replication task for all datasets, or is it better to run one replication task for each dataset? Maybe there is no benefit at all. I would like to find out.

One assumption is that it might be impossible, or at least harder, to recover individual datasets if they were all handled by a single replication task. Is this true, or can I run a full-server (multiple datasets) replication task and still partially restore individual datasets in an emergency?
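
My current understanding, which I would like confirmed, is that a recursive send keeps each child dataset as its own dataset on the target, so a partial restore of just one of them should still be possible. Roughly, with placeholder names and assuming snapshots exist recursively:

    # One task replicating the whole tree recursively to machine B
    zfs send -R pool@auto-2024-01-01 | ssh admin@machine-b zfs receive -u backup/poolA

    # Restoring a single dataset later, run from machine A (pull):
    ssh admin@machine-b zfs send backup/poolA/photos@auto-2024-01-01 | zfs receive -u pool/photos-restored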

Immutability in this context means that data cannot be deleted through the control plane at which it was created, such as an NFS or SMB share. In that sense, TrueNAS snapshots are immutable from the client’s side and are good ransomware protection.

Latency is a factor, I am just not sure how much. And I prefer the granular per-dataset snapshots and replication.

Why do you prefer per-dataset replication? Does it make recovery easier? Can you do partial restores?

Considering TrueNAS immutability from the client side (SMB/NFS connection device) is a good point. Still, I would like to be able to compartmentalize between the storage servers themselves. I think security should always be a practice where multiple layers of defense are applied, and these layers are a bit lacking in TrueNAS. It would be great if iXsystems could bring an easy-to-manage ZFS delegation setup to TrueNAS.

I have multiple snapshot and replication tasks so I can be granular about frequency, retention, and restores.

If there were a simple way to do it, they would implement it.