We have an issue where we can’t create new snapshots because the snapshot chain is too long.
We have started to resolve this, but it would be really nice to see a video on YouTube about it. Just an idea for you.
Our SR had 340 VDIs waiting to coalesce (now down to 18). This was caused by our reads and writes to the SAN being too slow (a fault we had missed). As a result, the backups and scheduled snapshots didn’t have time to coalesce before more backups and snapshots were created.
Even though this is now down to 18, the woods are still thick, as some disks have been marked as orphaned. We thought this meant they were outside of the chain, but it seems not. These need clearing too before we can start backing up again.
Deleting these triggers more coalesce jobs. Our storage still isn’t at 100%, but we need to get a round of backups in before we start to investigate these systems.
Anyway, feel free to ask for more input for a video etc., but it would be good to have a breakdown of a real case: what happened, what orphaned files actually are, and so on.
P.S. To check that your coalesce job is working, you can see whether a task is running with this command:
ps axf | grep coalesce
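
One thing to watch with that command: grep can match its own entry in the process list, so you may see a line even when nothing is coalescing. A rough sketch of a wrapper that filters that out is below. The function names are made up for illustration, and it assumes, like the command above, that an active coalesce shows up with "coalesce" somewhere in its command line.

```shell
# Hypothetical helper: read a process listing on stdin and keep only
# lines that mention "coalesce", dropping grep's own entry.
filter_coalesce() {
    grep -i 'coalesce' | grep -v 'grep'
}

# Hypothetical helper: print matching processes, or a short message
# if nothing matched. Assumes "ps axf" is available on the host.
coalesce_status() {
    if ! ps axf | filter_coalesce; then
        echo "no coalesce process found"
    fi
}
```

Run `coalesce_status` on the host after pasting the functions into your shell. No match only means nothing is coalescing at that moment, so it is worth checking a few times.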