TrueNAS - Dataset replication validation

vyre · May 11, 2024, 6:21pm

Hey, I “successfully” replicated a fairly large dataset from one TrueNAS server to another, but I’m unsure whether the data on the receiving side is intact. The first replication task ran for around 6 days, transmitted about 90% of the data, and then crashed with the error "Error: [EFAULT] Active side: Command failed with code -1".

Then I restarted the replication and the remaining 10% was sent successfully, or at least I assume it was. Now whenever I press Run on the replication task, it finishes in a minute or so with no data sent.

vyre · May 11, 2024, 6:22pm

(sorry for the double post - there is a limit of 1 embedded media item per post)

I then ran the zfs list -o space command on both systems, and the USEDDS value differs.
Original dataset

vyre · May 11, 2024, 6:22pm

(actually sorry for triple post)
Replicated dataset

What could cause this discrepancy, and is there any way to validate that all the data is intact? If that first replication hadn’t failed I probably wouldn’t be so nervous, but unfortunately it did and now I’m unsure.
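One way to narrow this down, as a rough sketch rather than a definitive check: USEDDS counts on-disk space, so it can legitimately differ between pools with different compression settings, record sizes, or vdev layouts. Comparing the logicalreferenced property on both sides is usually a fairer comparison, although the two values may still not be byte-identical. In the sketch below, logical_bytes and pct_diff are hypothetical helper names, and the dataset names tank/public and tank/xxxx/public are assumptions based on the mount paths.

```shell
#!/bin/sh
# Hypothetical helper: print a dataset's logical referenced size in bytes.
# logicalreferenced ignores compression, so it is more comparable across
# pools than USEDDS. Must be run on the host that owns the dataset.
logical_bytes() {
    zfs get -Hp -o value logicalreferenced "$1"
}

# pct_diff A B: absolute difference between two byte counts,
# printed as a percentage of A with two decimals.
pct_diff() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        d = a - b
        if (d < 0) d = -d
        printf "%.2f\n", (a ? 100 * d / a : 0)
    }'
}

# Example usage (dataset names are assumptions, run on the destination):
#   src=$(ssh root@10.0.1.3 zfs get -Hp -o value logicalreferenced tank/public)
#   dst=$(logical_bytes tank/xxxx/public)
#   pct_diff "$src" "$dst"
```

A small percentage here would suggest the same data is present on both sides, but it is only a sanity check and not proof that every file matches.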

I don’t want to delete the dataset “public” and try again because it takes around 6 days to transmit around 5 TB over the internet; the origin side has 1 Gb/300 Mb down/up and the replication side has 1 Gb/1 Gb internet.

LTS_Tom · May 12, 2024, 10:57am

If you post the text from the results it’s easier to read, easier to index, and gets around image post limitations.

Run a scrub on the destination system, as that will validate the data on the pool and make sure there is no corruption. There are some scripts people have made to do further validation, though none that I have tested, but you could also use rsync to check that there are no differences between the two.
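For anyone who prefers the shell over the web UI, a minimal sketch of the scrub step might look like this. It assumes the destination pool is named tank (adjust as needed); start_scrub and scrub_clean are hypothetical helper names. Keep in mind that a scrub checks every block on the destination against its own stored checksum; it does not compare anything against the source.

```shell
#!/bin/sh
# Assumption: the destination pool is called "tank".
POOL="${POOL:-tank}"

# start_scrub POOL: kick off a scrub; the command returns immediately
# and the scrub continues in the background.
start_scrub() {
    zpool scrub "$1"
}

# scrub_clean: reads `zpool status` output on stdin and succeeds only
# if the pool reports no known data errors.
scrub_clean() {
    grep -q 'errors: No known data errors'
}

# Example usage on the destination system:
#   start_scrub "$POOL"
#   zpool status -v "$POOL"                 # watch progress
#   zpool status "$POOL" | scrub_clean && echo "no errors reported"
```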

vyre · May 12, 2024, 11:36am

Thanks for the reply Tom, valid points about screenshots vs text. I’ll keep it in mind for next time.

I started a scrub on the destination system as you suggested, and also ran the following rsync command to compare the data:
rsync --recursive --verbose --checksum --dry-run --delete root@10.0.1.3:/mnt/tank/public/ /mnt/tank/xxxx/public | tee rsync.txt
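To make that output easier to summarize, one option (a sketch, not something tested in this thread) is to add rsync's --itemize-changes flag to the same command and then count the lines that indicate a real difference. count_diffs is a hypothetical helper name; it assumes rsync.txt was produced with --itemize-changes, where files whose content would be transferred start with >f or <f and files that exist only on the destination show up as *deleting.

```shell
#!/bin/sh
# count_diffs FILE: count itemized rsync lines that indicate a difference:
#   >f / <f    a regular file that would be transferred (missing or changed)
#   *deleting  a path that exists only on the destination
# Directory timestamp updates, headers, and summary lines are ignored.
# Note: grep exits non-zero when the count is 0, but still prints "0".
count_diffs() {
    grep -c -E '^([<>]f|\*deleting)' "$1"
}

# Example usage, after rerunning the dry run with --itemize-changes added:
#   count_diffs rsync.txt
```

A count of 0 would mean the checksum comparison found no file that differs or is missing, at least for the files rsync was able to read on both sides.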