I run a (primarily) break/fix shop. I’m trying to streamline some of our processes to increase efficiency, and with the ADD brain I have, I need to rely on more logical thinkers to point out the obvious - so bear with me.
One thing we do on a daily basis is replace HDDs with SSDs and either clean install Windows and then copy the user folders back, or clone the drives outright, but mostly the former. Here’s a snapshot of a typical workflow:
- Copy user folders and other necessary items to a network share (Synology).
- Replace the HDD (or reuse it), then clean install Windows and copy the user folders back from the share.
We spend a lot of time sitting and waiting for data to move. BTW, this is all done through a Gigabit switch. Anyhow, my initial AHA! thought was that I would upgrade everything to 10GbE - problem solved, right? No… I’m still limited by the throughput of the source devices/drives.
I asked about this on Reddit and was told that I should scrap the whole network idea and instead get into “data ingestion”. The user said I should build a machine with a Threadripper so I’ve got plenty of PCIe lanes, install Linux (which I’m very familiar with), and have someone write a script that would more or less dd whichever drive I connected via SATA dock to a local storage pool.
That sounds great… but how is that better than what I’m doing now? I wanted to come here for a close and personal discussion on this.
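Going back to the 10GbE idea for a second, here is a back-of-the-envelope sketch of why the source drive is the limit. The 100 MB/s drive speed and the 250 GB job size are assumed ballpark figures for illustration, not measurements from my shop, and the link speeds are theoretical line rates before any protocol overhead:

```python
# Rough transfer-time estimate: the slowest stage sets the pace.
# All throughput figures are assumed ballpark values, not measurements.

def effective_mb_per_s(*stage_speeds_mb_s: float) -> float:
    """A copy pipeline runs no faster than its slowest stage."""
    return min(stage_speeds_mb_s)

def transfer_minutes(size_gb: float, speed_mb_s: float) -> float:
    """Minutes to move size_gb at a sustained speed_mb_s (1 GB = 1000 MB)."""
    return size_gb * 1000 / speed_mb_s / 60

OLD_HDD = 100.0       # MB/s, assumed for an aging HDD reading large files
GIGABIT = 1000 / 8    # 125 MB/s theoretical line rate
TEN_GBE = 10000 / 8   # 1250 MB/s theoretical line rate

if __name__ == "__main__":
    for name, link in (("1GbE", GIGABIT), ("10GbE", TEN_GBE)):
        speed = effective_mb_per_s(OLD_HDD, link)
        minutes = transfer_minutes(250, speed)
        print(f"{name}: {speed:.0f} MB/s effective, {minutes:.0f} min for 250 GB")
```

With those assumed numbers both links come out the same, because the drive is slower than either network. Lots of small files would make the drive side slower still.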
If you were me, how would you improve the process of offloading data temporarily, then reloading it?
For reference, here is their input:
Hm, why not ditch the entire over-the-network copy process and set up a data ingest station? That way you could load a bunch of HDDs/SSDs and have the station dump the contents onto an NFS/CIFS share or even directly onto an SSD. The station would be some semi-fast PC with enough USB 3.1 ports (for the USB-to-SATA readers) - Threadripper gen 1 is dirt cheap and has a ton of PCIe lanes and USB 3.1 ports. Studios use that approach since they have a ton of footage that needs to be copied and they hate spending money and time waiting. There are ready-made solutions as well (the Copy station COPYLynx ATX4, for example), but I prefer building my own since it’s cheaper and fun. The actual software is very simple - Linux of some kind with AutoFS and a bash script to create a directory, set a date, and dump the data from the disk; you can even have it send you a Telegram message or an email when done. To be safe, always mount the source device with -o ro (read-only).
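(To make sure I understand the "create a directory, set a date and dump" part, here is a minimal sketch of how I picture it, in Python rather than bash. It assumes the source drive is already mounted read-only, for example by AutoFS, and that "set a date" means putting the date in the folder name. The function name, the label, and the example paths are all hypothetical.)

```python
import shutil
from datetime import date
from pathlib import Path

def ingest(source_mount: Path, pool: Path, label: str) -> Path:
    """Copy everything under source_mount into pool/<YYYY-MM-DD>_<label>/.

    Assumes source_mount is already mounted read-only; this function
    only reads from it.
    """
    dest = pool / f"{date.today().isoformat()}_{label}"
    # copytree creates dest itself and raises FileExistsError if it already
    # exists, so an earlier dump with the same date and label is never overwritten.
    shutil.copytree(source_mount, dest, symlinks=True)
    return dest
```

Usage would be something like `ingest(Path("/mnt/dock1"), Path("/srv/pool"), "ticket-1234")`, with those paths and the ticket label made up for the example. Logging and the Telegram/email notification are left out of this sketch.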
Another approach is to set up a data mirror station - you basically have two slots, one for the source and one for the destination. You plug in the old device and the new one, and the station does a dd between them with format detection - for example, if you mix up the source and destination, it will refuse to dd because the destination has a valid partition table. That way you avoid the liability of storing customers’ data.
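(The "refuse if the destination has a partition table" check could look roughly like the sketch below. This is my own simplified stand-in for format detection, not what any real station runs: it only looks for the MBR boot signature and the GPT header signature, and it assumes 512-byte logical sectors. The function names are hypothetical, and something like blkid would be a more thorough check in practice.)

```python
from pathlib import Path

SECTOR = 512  # assumed logical sector size; on 4Kn drives the GPT header sits at byte 4096

def has_partition_table(device: Path) -> bool:
    """True if the first two sectors carry an MBR boot signature or a GPT header."""
    with open(device, "rb") as f:
        head = f.read(2 * SECTOR)
    # MBR: the last two bytes of sector 0 are 0x55 0xAA.
    has_mbr = head[510:512] == b"\x55\xaa"
    # GPT: sector 1 starts with the ASCII signature "EFI PART".
    has_gpt = head[SECTOR:SECTOR + 8] == b"EFI PART"
    return has_mbr or has_gpt

def safe_to_overwrite(destination: Path) -> bool:
    """Guard for the clone step: only write to a destination that looks blank."""
    return not has_partition_table(destination)
```

The idea is to call `safe_to_overwrite()` on the destination slot before starting dd and bail out if it returns False. Reading a raw device such as /dev/sdX normally needs root.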
Given that you have the source and destination devices literally in your hands, I don’t see a reason to use a really expensive network solution to copy them. When the data has to travel a few inches, use USB, PCIe, SATA or a similar transport; there’s no need to get complicated with networking.
and…
The most profitable and efficient way.
Shelves beneath the Threadripper PC, with an LED on every shelf signaling the status of the copy operation and when it’s done. Set the hardware up and have someone code the copy operation (with a log) and the LED status for you, for 1/5 of the cost of the other upgrade - and you’ll be far more effective.