
My Unraid NAS Backup Solution


Re: My Unraid NAS Backup Solution

Post by Pauven » Thu Nov 16, 2023 11:04 am

Pauven wrote: Wed Apr 06, 2022 7:05 pm The btrfs drive pool I created is not striped or RAIDed in any way, making it very similar to Unraid in that each file only exists on a single drive, and if I lose a backup drive I only lose the data that was backed up to that drive, plus potentially 1 or 2 files that happened to span the gap from one drive to the next. If I do lose and replace a drive, the way rsync works means the next backup will simply re-copy those files that no longer exist in the backup, making this backup solution very easy to maintain.
These last few days have been a brutal lesson that I don't always know what I'm talking about...

My homebuilt backup drive carrier has 7 bays, five of which I had filled with 16TB drives for an 80TB backup solution, mounted easily via a BTRFS non-RAIDed pool. With only a couple slots free, I've been waiting for larger drives to become available, and I really wanted a 20TB drive to bring my backup capacity up to an even 100TB. Well, the stars finally aligned and I procured said 20TB drive and added it to my backup carrier.

My next step was to simply expand my existing BTRFS pool with the new drive. Ironically, I did this correctly on the first attempt, but the on-screen results seemed off (I probably just needed to power cycle the backup pool again to refresh to the new config), and I hastily made some "corrections". I started by blowing away the partition on the new drive so I could try again, but I failed to first remove the partition from the BTRFS pool.

That promptly corrupted the entire pool. I could mount the pool in a degraded, read-only state, so I would have been able to retrieve my data if I had needed to, but because BTRFS JBOD pools have zero redundancy, it flat-out refused to let me mount it read-write, and also refused to let me update the config (to replace or remove the deleted partition/drive) unless it was mounted read-write. A real nice catch-22.
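For anyone who ends up in the same spot, the degraded read-only mount that still let me see my data looked roughly like this (a sketch only; the device name and mount point are assumptions, adjust to your own pool):

Code: Select all

# Mount the damaged BTRFS pool degraded and read-only so files can still be copied off.
# /dev/sdx1 is a placeholder device name -- check yours with lsblk or blkid first.
mount -o degraded,ro /dev/sdx1 /mnt/disks/Frankenstore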

After much research, I determined that the only two options were to either boot up a custom Linux Kernel that allowed mounting this corrupted pool in read-write mode (but with the specter that this could still leave the pool corrupted even after my fixes), or simply start over with a brand new pool and redo all my backups.

I chose to start over. I created the pool the same as before, but now in the back of my head I kept thinking about how a single drive failing would put me right back in this same situation. Still, I figured that since this is a backup of my Unraid movie data, which is itself a backup of my physical movie collection, this was an acceptable compromise.

With my new 100TB BTRFS JBOD pool, I started the array backup knowing it would likely take 1-2 weeks to complete. Interestingly, I noted that all the data was writing to disk #6, the new 20TB drive, when I expected it to start at disk #1. No biggie, I thought, as the data was definitely writing to just a single drive as I wanted.

That was yesterday. This morning my world opened up to a new reality. My backup had completed almost 10TB overnight, pretty nice progress. But looking at the drive activity I could see the writes jumping around to all 6 drives. Not quite striping, it wasn't writing simultaneously, but it was hitting each drive in turn, one after another, round and round.

So I investigated further with the btrfs filesystem show command:

Code: Select all

root@Tower:/boot/config# btrfs fi show /mnt/disks/Frankenstore/
Label: none  uuid: 10205afb-0e7f-4470-b3c9-6d40df3e437f
        Total devices 6 FS bytes used 9.39TiB
        devid    1 size 14.55TiB used 990.02GiB path /dev/sdx1
        devid    2 size 14.55TiB used 991.00GiB path /dev/sdy1
        devid    3 size 14.55TiB used 991.00GiB path /dev/sdz1
        devid    4 size 14.55TiB used 991.00GiB path /dev/sdaa1
        devid    5 size 14.55TiB used 991.00GiB path /dev/sdab1
        devid    6 size 18.19TiB used 4.61TiB path /dev/sdac1
Sure enough, data was being written to all 6 drives, almost equally, except the larger drive 6 had an extra 4TB written to it. What in the world was going on?

After some research, I learned that BTRFS always writes to the drive with the MOST free space. Each file is written whole, but the drive it ends up on is determined in real time by whichever drive currently has the most free space. Since drive 6 was 4TB larger, the first 4TB were written there, and from then on, each file was written nearly randomly to one of the 6 drives to keep free space equalized.
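If you want to watch this allocation behavior on your own pool, BTRFS can report per-device usage directly (same mount point as the btrfs fi show output above):

Code: Select all

# Overall allocation, broken down per device.
btrfs filesystem usage /mnt/disks/Frankenstore
# Just the per-device chunk allocation totals.
btrfs device usage /mnt/disks/Frankenstore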

This is NOT what I thought would happen, and would also render my backup nearly useless should a single drive fail.

I don't have a solution yet, but I wanted to go ahead and post this in case anyone else was using a similar backup approach.

I'm currently investigating alternative solutions - I still want a single filesystem, with no redundancy to maximize storage space, but the data must fill up each drive in whole, before moving on to the next drive, so that a single drive failure doesn't corrupt the surviving data on other drives.

My leading candidate is MergerFS: https://github.com/trapexit/mergerfs#readme

Here's an Unraid forum thread on using MergerFS with Unraid: https://forums.unraid.net/topic/144999- ... ort-topic/


Re: My Unraid NAS Backup Solution

Post by Pauven » Thu Nov 16, 2023 2:45 pm

How to use MergerFS to create a USB based JBOD Backup Pool
Instead of using a single BTRFS JBOD pool, which will be corrupted by a single failed drive, I am now using MergerFS to create a unified filesystem that operates as a middleman, merging separate independent filesystems into one for simplified backup copying.

The benefit of MergerFS is that if a drive fails, only the data on that drive is lost, and you can still easily access the remaining drives in the MergerFS unified filesystem. Add in new or replacement drives as needed.

This is similar to the BTRFS pool, but in this case the pool is virtual, so the pool cannot become corrupted by a failed drive. Additionally, MergerFS has options for where to create new files, and by using the epff option I can force each drive to fill up, one-by-one, before moving on to the next drive, so that data is not split up and spread out across multiple drives, maximizing recoverability in case of a failed backup drive.

These two features, virtual pooling and one-by-one drive filling, address the two issues I outlined above that make BTRFS JBOD pools (and likely most other pools) dangerous.

Step 1:
For initial setup only, format each drive, creating a BTRFS partition. It was a tossup on the filesystem, but I chose BTRFS over XFS mainly for personal preference. Label the partition as desired; I'm naming mine FS1 through FS6, as this will help me identify the various drives in my Frankenstore backup pool. This comes into play later when using MergerFS to fuse them together, and the short partition names also keep the MergerFS command shorter.
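If you'd rather do this step from the command line than through the Unraid GUI, a minimal sketch looks like this (the device name is an assumption; triple-check it with lsblk before formatting anything):

Code: Select all

# Identify the target drive first -- formatting is destructive.
lsblk -o NAME,SIZE,MODEL
# Create the BTRFS filesystem on the drive's partition and label it FS1.
mkfs.btrfs -f -L FS1 /dev/sdx1
# Confirm the label reads back correctly.
btrfs filesystem label /dev/sdx1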



Step 2:
Mount the drives. Unlike before where I only had to mount the first drive in a BTRFS pool, I will have to mount each drive individually. Also, partition sharing must be turned on for all drives, so that MergerFS has access to the mount points for fusing.
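I do this through Unassigned Devices, but as a plain-shell sketch the equivalent is simply mounting each drive by the label it was given in Step 1:

Code: Select all

# Mount each backup drive by filesystem label under /mnt/disks.
for n in 1 2 3 4 5 6; do
  mkdir -p /mnt/disks/FS$n
  mount -L FS$n /mnt/disks/FS$n
done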


Step 3:
Create all main paths on all drives. For me, these are the shares I'm backing up: "DVDs", "Blu-Rays", "4K", and "TV_Series". The reason I'm creating these directories now, and on all drives, is that the "epff" MergerFS option (described below) will only create a path on a new drive if that path doesn't already exist somewhere, and otherwise will keep filling the drive that has it until free space is gone, potentially causing errors. I want to make sure these paths already exist on every drive to hopefully prevent MergerFS from refusing to split a huge share across multiple drives. I'm thinking this will keep individual movies whole on a single drive, regardless of which drive each one ultimately resides on.
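Here's a quick sketch of this step, assuming the drives are mounted as in Step 2 and using my share names:

Code: Select all

# Pre-create the top-level share directories on every backup drive so the
# existing-path create policies always find a valid path on each branch.
for n in 1 2 3 4 5 6; do
  for share in 4K Blu-Rays DVDs TV_Series; do
    mkdir -p "/mnt/disks/FS$n/$share"
    chown nobody:users "/mnt/disks/FS$n/$share"
  done
done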


Step 4:
Install MergerFS plugin via URL: https://raw.githubusercontent.com/deser ... gerfsp.plg

NOTE: This requires at least Unraid 6.10


Step 5:
Run MergerFS command to mount new merged filesystem:

Code: Select all

mkdir /mnt/disks/Frankenstore
chown nobody:users /mnt/disks/Frankenstore
mergerfs -o cache.files=off,dropcacheonclose=true,category.create=epmfs,minfreespace=4G,moveonenospc=true,fsname=Frankenstore /mnt/disks/FS1:/mnt/disks/FS2:/mnt/disks/FS3:/mnt/disks/FS4:/mnt/disks/FS5:/mnt/disks/FS6 /mnt/disks/Frankenstore

Explanation of Chosen MergerFS Options:
cache.files=off : Since I will be using this merged filesystem just for backup purposes, I want to disable caching as much as possible to ensure file integrity

dropcacheonclose=true : There's still some caching going on, so this will drop any cached data when closing the filesystem

category.create=epmfs : This determines where new files are created. The default is epmfs, "existing path, most free space", which is similar to the BTRFS behavior that was problematic for my backup goals. epff is "existing path, first found", where "first found" follows the order of the branches at creation time (i.e. the command line). The expectation was that it would write to sdx1 first, then sdy1, then sdz1, and so on, finishing with sdac1 since it is the last drive in the mergerfs command line. The minfreespace parameter should control when it rolls to the next drive, but only for new paths. The "ep" part means paths are preserved, so writes go to a branch where the path already exists, and new paths are only created according to the "ff" first-found logic, which helps keep content together for future updates.

UPDATE: I was originally using epff here, but after modifying the mirror.sh script to handle directory creation, I discovered that epff causes more problems than it solves. Occasionally I have a movie collection folder with a deeper subdirectory structure than normal and a ton of movies in it (e.g. the James Bond collection with over 25 movies springs to mind), which breaks my mirror.sh script's logic. My script isn't smart enough to evaluate the source directory's size to make sure it will fit; I'm just counting on dumb luck that 99% of my directories will only have 1 or 2 discs at most. For these exception directories, directory creation auto-reverts to MergerFS control. But in this case, epff was forcing those directories to be created on the first couple of disks, FS1 and FS2, which were already effectively full (below 90G free) but still above the 4G min free space, so MergerFS chose them.

Then once FS1 or FS2 filled up (pretty quickly since they only had about 60-70GB free), the error handling logic in MergerFS would start writing to FS6 since it had the most free space.

Since these are exception directories that by their very nature typically have a lot more data in them, the right strategy here would be to put them on the disk with the most free space, so I've reverted back to epmfs. And since I'm filling up my drives sequentially from FS1 to FS6, and FS6 is my only larger 20TB drive, FS6 will always be my disk with the most free space - at least for the next few years until I expand my backup array again with FS7.
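If you want to experiment with the create policy without unmounting, MergerFS exposes a runtime control file inside the mount that can be read and written via extended attributes. This is a sketch based on the MergerFS docs and assumes getfattr/setfattr are installed; verify the key names against your installed version:

Code: Select all

# Inspect the current create policy via the .mergerfs control file.
getfattr -n user.mergerfs.category.create /mnt/disks/Frankenstore/.mergerfs
# Switch between epff and epmfs on the fly, no remount needed.
setfattr -n user.mergerfs.category.create -v epmfs /mnt/disks/Frankenstore/.mergerfs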

minfreespace=4G : In order to write to a drive, it must have at least this amount of space available. The default is 4G, and here I am explicitly setting it to the same 4G value, mainly as a placeholder for this value should anyone want to override it.

UPDATE: Originally, I was using 90G here, but I found it problematic to have this value control both where directories were created and where files were written, as a directory could be created on a nearly full branch, but then after a few files were copied into it, space would fill up and the directory would split, being recreated on the next branch and the files would span two drives.

I'm now using a customized mirror.sh script that creates directories based upon free space, and I have that set to 100GB min free space to create a disc title's parent directory, which leaves at least 96GB of writing space for the files to be copied into the directory before the 4GB MergerFS min free space limit is hit. This works around a major limitation of MergerFS: I'm no longer forced to use the same min free space limit for both creating directories and creating files.

moveonenospc=true : When enabled if a write fails with ENOSPC (no space left on device) or EDQUOT (disk quota exceeded) the policy selected will run to find a new location for the file. An attempt to move the file to that branch will occur (keeping all metadata possible) and if successful the original is unlinked and the write retried.

fsname=Frankenstore : If I don't use this, then the label in the system is an ugly hybrid of the partitions, i.e. "x1:y1:z1:aa1:ab1:ac1" when using something like "df -h" to list filesystem and space.

NOTE: For mounting the source partitions, I specified each full path separated by a colon, i.e. /mnt/disks/FS1:/mnt/disks/FS2:/mnt/disks/FS3 etc. An alternative would have been to use a wildcard to select them all, i.e. /mnt/disks/FS*, but I avoided this because I wanted to make sure the 6 drives were listed in a specific order, so they would fill up in order from 1 to 6.
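After mounting, the branch order MergerFS actually recorded can be double-checked through the same .mergerfs runtime control file shown earlier (again a sketch; assumes getfattr is available and the key name matches your MergerFS version):

Code: Select all

# Should list the branches in the FS1 -> FS6 order given on the command line.
getfattr -n user.mergerfs.branches /mnt/disks/Frankenstore/.mergerfs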

Step 6:
Check for successful mounting with "df -BM -H". I also piped through grep to filter on /mnt/disks/F so I could see just the 6 drives and MergerFS fused Frankenstore paths:

Code: Select all

root@Tower:~# df -BM -H | grep /mnt/disks/F
/dev/sdx1        17T  4.0M   16T   1% /mnt/disks/FS1
/dev/sdy1        17T  4.0M   16T   1% /mnt/disks/FS2
/dev/sdz1        17T  4.0M   16T   1% /mnt/disks/FS3
/dev/sdab1       17T  4.0M   16T   1% /mnt/disks/FS5
/dev/sdac1       21T  4.0M   20T   1% /mnt/disks/FS6
/dev/sdaa1       17T  4.0M   16T   1% /mnt/disks/FS4
Frankenstore    101T   24M  100T   1% /mnt/disks/Frankenstore
Here I can see the new 100TB Frankenstore fused partition mounted at /mnt/disks/Frankenstore. Looking inside, I see my previously created target backup directories for each of my shares:

Code: Select all

root@Tower:~# ls -l /mnt/disks/Frankenstore
total 0
drwxrwxrwx 1 root root 0 Nov 16 13:08 4K
drwxrwxrwx 1 root root 0 Nov 16 13:08 Blu-Rays
drwxrwxrwx 1 root root 0 Nov 16 13:07 DVDs
drwxrwxrwx 1 root root 0 Nov 16 13:09 TV_Series
Step 7:
Run modified backup job. I created a script last year (on the first page of this thread) but it doesn't work correctly with this backup strategy.

ISSUE: When running rsync directly, it created all directories in a folder before processing the files/folders in each subdirectory. For example, when backing up the 4K share, it first created all 56 movie folders on drive FS1 before copying a single file. At some point the drive fills up with data and is left holding empty directories whose contents never made it onto that drive; worse, the existing-path create policy then tells MergerFS to keep writing files to those existing paths first, which causes errors.


The solution is a custom wrapper around the RSYNC process that manually creates each directory just before copying the files into it.

For full credit, the bash script below came from: https://github.com/ashishpandey/scaffol ... /mirror.sh

However, I did make one change, adding the "--archive" parameter to the rsync command line, since I wanted that option in my backup to preserve metadata as much as possible. Save the code below into a new file named "mirror.sh" in the same directory as your backup job script:

Code: Select all

#!/bin/bash

set -e

# ensure unicode filenames are supported
export LANG="en_US.UTF-8"
export LC_ALL="en_US.UTF-8"
export G_FILENAME_ENCODING="@locale"
export G_BROKEN_FILENAMES="1"

function usage() {
    echo "Usage: mirror.sh [OPTIONS]"
    echo "   OPTIONS includes:"
    echo "   -x | --dry-run - do not copy only files, only echo what will be done. default mode"
    echo "   -m | --mirror - copy files from src to dest. overrides dry run"
    echo "   -s | --src  - source directory"
    echo "   -d | --dest - destination directory"
    echo "   -h | --help - displays this message"
}

run_type="dry-run"
while [ "$1" != "" ]
do
  case $1 in
    -x | --dry-run )
        run_type="dry-run"
        ;;
    -m | --mirror )
        run_type="mirror"
        ;;
    -s | --src )
        shift
        if [ -d "$1" ]; then
          src="${1}"
        else
          echo "$0: $1 is not a valid directory" >&2
          exit
        fi
        ;;
    -d | --dest )
        shift
        dest="${1%/}" # dest without trailing slash
        ;;
    -h | --help ) 
        usage
        exit 0
        ;;
    * ) 
        echo "Invalid option: $1"
        usage
        exit 1
        ;;
  esac
  shift
done

function ensure_vars() {
  for v in "$@"
  do
    if [ -z "${!v}" ]; then 
      echo "ERROR: $v is not specified"
      usage
      exit 1
    fi
  done
}

ensure_vars "run_type" "src" "dest"

function log() {
  echo "$(date +'%Y-%m-%d %T'): $1"
}

function debug_log() {
  if [ "x$debug" == "xtrue" ]; then
    log "$1"
  fi
}

progress_inc=500
progress_idx=0
function progress() {
  ((progress_idx+=1))
  if [ "x$debug" == "xtrue" ]; then
    log "$1"
  else
    if ! ((progress_idx % progress_inc)); then
      log "done $progress_idx"
    fi
  fi
}

function exec_cmd() {
  if [ "${run_type}" == "dry-run" ]; then
    echo "dry-run: $@"
  elif [ "${run_type}" == "mirror" ]; then
    "$@"
  else
    echo "warning: unknown run type ${run_type}"
    exit 2
  fi
}

log "run mode: $run_mode"
log "sync $src => $dest"
log "using extra excludes => $EXTRA_EXCLUDES"
log "-----------------------------------------------"

rsync --dry-run --archive --recursive --itemize-changes --delete --delete-excluded --iconv=utf-8 \
	--exclude '@eaDir' --exclude 'Thumbs.db' --exclude '*.socket' --exclude 'socket' $EXTRA_EXCLUDES \
  "$src" "$dest" | while read -r line ; do
    progress "$line"
    read -r op file <<< "$line"
    debug_log "from $file"

    if [ "x$op" == "x*deleting" ]; then
      log "removing $dest/$file"
      exec_cmd rm -rf "$dest/$file"
    else
      op1=$(echo $op | cut -b 1-2)
      sizeTsState=$(echo $op | cut -b 4-5)
      case "$src" in
      */)
          src_file="${src}${file}"  # src end in slash, $file starts under it
          ;;
      *)
          src_file="$(dirname $src)/$file"  # $file contains the src itself as root of path 
          ;;
      esac

      if [ "x$op1" == "xcd" ]; then
	      debug_log "not eagerly creating $dest/$file"
      elif [ "x$op1" == "x>f" ]; then
        dest_file="$dest/$file"
        dest_dir=$(dirname "${dest_file}")
        if [ "x$sizeTsState" == "x.T"  ]; then
          log "update ${dest_file} timestamp only"
          exec_cmd touch -r "${src_file}" "${dest_file}"
        elif [ "x$sizeTsState" != "x.."  ]; then
          if [ ! -d "${dest_dir}" ]; then
            exec_cmd sudo -u nobody mkdir -v -m 777 -p "${dest_dir}"
          fi
          exec_cmd install -o nobody -g users -m 666 -p -D -v "${src_file}" "${dest_file}"
        fi
      fi
    fi
done
Then modify your backup job script to call the bash mirror.sh wrapper instead of rsync directly:

Code: Select all

#!/bin/bash

LogFile=/var/log/array_backup.log
BackupDir=/mnt/disks/Frankenstore
Notify=/usr/local/emhttp/webGui/scripts/notify

echo `date` "Starting Array backup to " $BackupDir >> $LogFile

#Backup 4K via rsync
sleep 2
$Notify -i normal -s "Beginning 4K Backup" -d " 4K Backup started at `date`"
sleep 2
bash mirror.sh -m -s /mnt/user/4K -d $BackupDir >> $LogFile
#rsync -avrtH --delete /mnt/user/4K $BackupDir  >> $LogFile
sleep 2
$Notify -i normal -s "Finished 4K Backup" -d " 4K Backup completed at `date`"

sleep 2
$Notify -i normal -s "Beginning Blu-Rays Backup" -d " Blu-Rays Backup started at `date`"
sleep 2
bash mirror.sh -m -s /mnt/user/Blu-Rays -d $BackupDir >> $LogFile
#rsync -avrtH --delete /mnt/user/Blu-Rays $BackupDir  >> $LogFile
sleep 2
$Notify -i normal -s "Finished Blu-Rays Backup" -d " Blu-Rays Backup completed at `date`"

sleep 2
$Notify -i normal -s "Beginning DVDs Backup" -d " DVDs Backup started at `date`"
sleep 2
bash mirror.sh -m -s /mnt/user/DVDs -d $BackupDir >> $LogFile
#rsync -avrtH --delete /mnt/user/DVDs $BackupDir  >> $LogFile
sleep 2
$Notify -i normal -s "Finished DVDs Backup" -d " DVDs Backup completed at `date`"

sleep 2
$Notify -i normal -s "Beginning TV_Series Backup" -d " TV_Series Backup started at `date`"
sleep 2
bash mirror.sh -m -s /mnt/user/TV_Series -d $BackupDir >> $LogFile
#rsync -avrtH --delete /mnt/user/TV_Series $BackupDir  >> $LogFile
sleep 2
$Notify -i normal -s "Finished TV_Series Backup" -d " TV_Series Backup completed at `date`"

## RESTORE
## /usr/bin/rsync -avrtH --delete  $BackupDir  /mnt/cache/

echo `date` "backup Completed " $BackupDir >> $LogFile

# send notification
sleep 2
$Notify -i normal -s "Array Backup Completed" -d " Array Backup completed at `date`"
Step 8:
Unmount the MergerFS filesystem when you're done by calling the umount command with the previously used MergerFS mount point:

Code: Select all

umount /mnt/disks/Frankenstore

Step 9:
Unmount the individual drives and put the backup drive carrier back into offline storage.
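As with mounting, I do this through Unassigned Devices, but the command-line equivalent is just unmounting each drive in turn:

Code: Select all

# Unmount the six individual backup drives once the MergerFS mount is gone.
for n in 1 2 3 4 5 6; do
  umount /mnt/disks/FS$n
done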


Re: My Unraid NAS Backup Solution

Post by Pauven » Thu Nov 16, 2023 3:58 pm

The instructive post above is mostly complete now. I'm currently running a new backup, and it is working as intended. MergerFS is showing my 100TB virtual backup pool, data is currently writing to the 1st drive only (it will be tomorrow before it fills up and moves to the next drive), and the new backup script calling mirror.sh is successfully creating each directory only as it is copied in full, rather than pre-creating all the empty directories before filling them with data.

I'll report back as the backup progresses, but for now I'd say my issues are solved and I finally have "my" ideal backup solution working as intended.


UPDATE:
The backup is about 2.5 hours into the weeklong job, and I'm super happy about the progress.

First, all of the data is writing to FS1, perfect:

Code: Select all

root@Tower:~# df -BM -H  | grep /mnt/disks/F
/dev/sdx1        17T  1.6T   15T  10% /mnt/disks/FS1
/dev/sdy1        17T  4.0M   16T   1% /mnt/disks/FS2
/dev/sdz1        17T  4.0M   16T   1% /mnt/disks/FS3
/dev/sdab1       17T  4.0M   16T   1% /mnt/disks/FS5
/dev/sdac1       21T  4.0M   20T   1% /mnt/disks/FS6
/dev/sdaa1       17T  4.0M   16T   1% /mnt/disks/FS4
Frankenstore    101T  1.6T   99T   2% /mnt/disks/Frankenstore
Even better, something I've never seen happen in my backups before, the other 5 drives have spun down! (ignore the 4TB cctv drive, that's not part of this backup array)

[screenshot: Unassigned Devices view showing the idle backup drives spun down]

I guess I always thought the drives remained spun up in Unassigned Devices regardless of activity, and it never occurred to me that the previous BTRFS pool backup was incorrectly writing across all drives, keeping them spun up.

With the new MergerFS pool, the idle drives are able to spin down, though the active drive does seem to be a touch warmer than before - but I think that's just because it's seeing more constant write activity, and 26C is still really cool as far as drive temps go.

The next big milestone will probably be tomorrow afternoon, when the first drive should fill up with about 90GB of space remaining and roll over to the 2nd drive.


Re: My Unraid NAS Backup Solution

Post by Jamie » Thu Nov 16, 2023 6:47 pm

Thanks for the info on this, Paul. I have to start thinking of an Unraid backup solution soon. That can't happen until spring, though. I have to have a shoulder replacement in a few weeks, and I won't be completely healed, physically and financially, until April. I can plan it out, though, so I will be researching it now until I can purchase the parts. Right after I got my parity restored, another data drive failed 3 days later. I just finished the preclear for it today. Most of the data drives are over 10 years old and came from my Drobos, so these failures are not a surprise. My parity and new drives are all 10TB drives. I have 74TB of data with 15TB free.


Re: My Unraid NAS Backup Solution

Post by Pauven » Fri Nov 17, 2023 9:25 am

Hey Jamie, that sucks. I hope the surgery goes well.

Luckily, HDDs continue to get larger and cheaper per GB, so when you're ready you may be able to create a backup solution with just a few drives.

The 20TB drive I just purchased was under $300. Four of those would be enough for your array backup, possibly even just three if you only need to back up around 60TB.

I know 28TB drives are now coming out, 30TB drives are in the wings, and even larger sizes are in the works. While bigger drives are more expensive, they are often cheaper per GB (as long as you stay away from the newest/largest drives), so you might find that when you're ready you can do a sub-$1k backup pool.

EDIT: Actually, that would be possible right now. I just checked, and NewEgg has the WD Elements 18TB USB 3.0 drive for just $230: https://www.newegg.com/wd-elements-18tb ... 6822234436

That's under $13/TB! Four of those would give you 72TB of backup capacity for just $920, or just under $1k with tax. Amazing.


Re: My Unraid NAS Backup Solution

Post by Jamie » Fri Nov 17, 2023 6:34 pm

Thanks for the info Paul,

I don't have a 3D printer, so I can't make a backup enclosure like yours. Do you have any tips on what I should look for in a multi-drive enclosure that is not a NAS? I think Manni mentioned an enclosure he bought earlier in this thread. I guess I'll have to go back and take another look. I am looking for a 4- or 5-drive bay, or maybe more.

I went back one page and found out that it was you who mentioned it, not Manni.

https://www.amazon.com/dp/B07MD2LNYX/re ... RydWU&th=1

It looks like what I am looking for in a backup drive enclosure.


Re: My Unraid NAS Backup Solution

Post by Pauven » Fri Nov 17, 2023 7:02 pm

I forgot about those; they should do well, and they let you buy cheaper bare drives instead of drives already in an enclosure. The link you posted was the 8-bay model, but there's a second one on the same page that's 4 bays like you want, and it's half the price. As long as you use 20TB or larger drives, I think that would do you fine.


Re: My Unraid NAS Backup Solution

Post by Pauven » Sat Nov 18, 2023 9:43 am

The backup job is continuing, though there has been a hiccup.

My backup job processes each of my shares independently, 4K, Blu-Rays, DVDs, and finally TV_Series. 4K completed without any issues, and because I only have about 10TB of 4K content, it all fit onto the first backup disk, FS1.

The vast majority of my content is in Blu-Rays, around 64TB worth, so it will - or perhaps should - span 5 drives, from FS1 to FS5. This did not happen, though. I watched intently as FS1 dropped to around 100GB free, since that was the minimum free space limit I had configured in MergerFS, and since it was copying Blu-Rays I figured the actual free space left would likely end up somewhere in the 53GB to 100GB range. Sure enough, it dropped to 93.2GB, at which point the processing seemed to hang for a while, and all write activity stopped.

A few minutes later, notifications popped up that the Blu-Rays backup had completed (hah! far from it, since it left off in the B's), and the DVDs backup had commenced. I checked, and DVDs were being backed up to the second drive, FS2. Since then, DVDs have completed and TV_Series is currently progressing, still writing to FS2. Combined, I have about 14TB of data in DVDs and TV_Series, so this will all fit on FS2 and should complete without issue.

For whatever reason, writing did not roll over from FS1 to FS2 smoothly during the Blu-Rays backup. I've checked my backup log, and no error was recorded; it simply stopped midway on the title "Brooklyn" and jumped to doing the DVDs share, with no helpful information on what transpired.

Looking in the Brooklyn backup folder, I can tell it is incomplete; there are several files that were not copied after the main ISO file pushed FS1 below the 100GB min free space limit. I'm not entirely surprised that Brooklyn failed to finish, but I thought that the remaining Blu-Rays would have been processed beginning on the next drive, FS2. I think the error on Brooklyn was somehow catastrophic to the rsync process, possibly even due to the extra logic in the new mirror.sh script, and so the entire Blu-Rays backup step aborted.

I will say I was already expecting some kind of issue at the drive transition, and there's an option setting in MergerFS that I thought might be needed:
  • moveonenospc=BOOL|POLICY: When enabled if a write fails with ENOSPC (no space left on device) or EDQUOT (disk quota exceeded) the policy selected will run to find a new location for the file. An attempt to move the file to that branch will occur (keeping all metadata possible) and if successful the original is unlinked and the write retried. (default: false, true = mfs)
A branch in MergerFS is a filepath on a specific disk, so a new branch here would have been the same filepath but on a different disk. "ENOSPC" is the error code for "no space left on device", which I figured would happen. MoveOnENoSpc, or Move On Error: No Space, sounds like it would have retried writing the file to the next drive, FS2. I'm not sure if it would move the entire folder (which would be nice) or simply split the folder across to the next drive and continue writing files there. By default, MoveOnENoSpc is disabled, and I had thought about enabling it for this first test, but decided I wanted to see what would happen with more default settings.

After the TV_Series backup completes, I'll make the change and try again. And because RSYNC processes the delta between source and target, it should pick up right where it left off, on the Brooklyn title.

One of the nice things about how MergerFS is working with 6 independent drives is that I can see drive activity and space utilization of each drive in the Unraid GUI, something that wasn't available in the BTRFS pool:

[screenshot: Unraid GUI drive activity and space utilization for FS1 through FS6]

You can see that FS1 (at the bottom of the list) is full at 15.9TB and has spun down, and that FS2 (2nd in the list) is approaching 7TB written, so TV_Series has another 7TB or so to go before it completes sometime tonight. You can also see the other 4 FS# drives are sleeping and empty.

In spite of the drive transition hiccup, I'm super pleased with the new solution. Fingers crossed that enabling MoveOnENoSpc solves that issue.


Re: My Unraid NAS Backup Solution

Post by Pauven » Sun Nov 19, 2023 9:37 am

I updated my MergerFS post above with the MoveOnENoSpc=true option.

The TV_Series backup finished successfully last night around midnight. So this morning I unmounted the MergerFS pool and remounted it with the MoveOnENoSpc option enabled. I then restarted my backup job.

The backup fairly quickly arrived back at the Brooklyn title which originally caused it to abort. This time, the backup wrote the Brooklyn.ISO file to FS6 (not FS2 as I expected, possibly MergerFS chose it because FS6 is the larger 20TB drive and had the most free space, despite the config settings I had chosen).

After the ISO was written to FS6, it was removed from FS1, which freed up space. Something I hadn't noticed before was that the Brooklyn.ISO on FS1 was not complete; it was only 44.4GB when it should have been 47.2GB. This means the backup didn't fail on the next file after the ISO pushed free space below 100GB, but on that actual file mid-write.

With free space available now on FS1, Blu-ray backups resumed writing there, with the next title, Brothers, actually fitting. I was disappointed that Brooklyn was split across two drives and yet writing continued on the now mostly full drive.

After Brothers, the next title was Bullet Train, which definitely wouldn't fit. Yet the writes proceeded on FS1, until free space again plunged below 100GB (this time dropping to 77.3GB), at which point the backup failed again, aborting the Blu-Rays and proceeding through the DVDs and TV_Series (which both had no work to do since they were backed up just yesterday).

I did see an error message on the console where I was running my backup script:

Code: Select all

install: cannot create regular file '/mnt/disks/Frankenstore/Blu-Rays/Bullet Train/BDMV/STREAM/00080.m2ts': No space left on device
I'm not sure if this is the ENOSPC error or something similar but distinct. Regardless, MoveOnENoSpc wasn't the easy solution I was hoping it would be.

As a temporary measure, I think I can get around this by raising the min free space parameter to about 125GB, which should make FS1 appear full and prevent new writes from going there. I can also do a little cleanup, moving the Brooklyn metadata files on FS1 to live with the ISO file on FS6. I would also go ahead and delete Bullet Train on FS1, so that the backup should restart with that title and write it to the next available drive.
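For the record, the cleanup I have in mind is nothing fancy; something along these lines should do it (a sketch only; the paths are assumptions based on my share layout, and it assumes MergerFS already created the Brooklyn directory on FS6 when it moved the ISO there):

Code: Select all

# Merge the leftover Brooklyn metadata files from FS1 into the FS6 copy,
# then remove the partial Bullet Train folder so the next run redoes it cleanly.
cp -a /mnt/disks/FS1/Blu-Rays/Brooklyn/. /mnt/disks/FS6/Blu-Rays/Brooklyn/
rm -rf /mnt/disks/FS1/Blu-Rays/Brooklyn
rm -rf "/mnt/disks/FS1/Blu-Rays/Bullet Train"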

I would prefer to have this handled automatically via a command line MergerFS option, but I might be asking too much. I'll have to do more research.


Re: My Unraid NAS Backup Solution

Post by Pauven » Sun Nov 19, 2023 11:16 am

After further review of the MergerFS options, the best I came up with is applying branch-specific parameters for creation and minfreespace. For example, instead of simply mounting /mnt/disks/FS1:, I can apply options to that branch, e.g. /mnt/disks/FS1=NC for No Create, or /mnt/disks/FS1=RW,135G for Read-Write with Min Free Space overridden to 135G.
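As a concrete (hypothetical) example, marking FS1 as no-create while leaving everything else as in the Step 5 command would look like this:

Code: Select all

# FS1=NC: existing files on FS1 stay readable/writable, but no new files or
# directories get created there, so new content rolls to the later branches.
mergerfs -o cache.files=off,dropcacheonclose=true,category.create=epmfs,minfreespace=4G,moveonenospc=true,fsname=Frankenstore \
  /mnt/disks/FS1=NC:/mnt/disks/FS2:/mnt/disks/FS3:/mnt/disks/FS4:/mnt/disks/FS5:/mnt/disks/FS6 \
  /mnt/disks/Frankenstore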

The problem with this approach is that the backup job is going to fail at the end of each drive, and I will have to do some cleanup, unmount the MergerFS pool, and remount it with modified options to force the backup to resume on each next drive.

This isn't horrible, since I only have 6 drives this would only have to be done 5 times, and each drive takes around 30 hours to fill up.

But after much consideration, I decided I could do even better. I really only want to control the creation of the directories. MergerFS will write files to existing directories (the "ep" existing path parameter in epff), so if I can override where a directory is written, MergerFS will simply follow the leader.

This is where the Mirror.sh script is coming in handy. This little script is a wrapper around the rsync process that makes sure directories are only created right before they are written to, instead of all directories in advance. By tapping into this logic, I could also override which branch the directory is written to.

So I added the following code into the existing Mirror.sh logic:

Code: Select all

        if [ "x$sizeTsState" == "x.T"  ]; then
          log "update ${dest_file} timestamp only"
          exec_cmd touch -r "${src_file}" "${dest_file}"
        elif [ "x$sizeTsState" != "x.."  ]; then
          if [ ! -d "${dest_dir}" ]; then
          
		    # NEW CODE BEGINS TO OVERRIDE MergerFS /Frankenstore PATH WITH INDIVIDUAL DRIVE BRANCH
		    #Remove all characters except for forward / in the dest_dir, to get the total number of forward /'s
		    slashes="${dest_dir//[^\/]}"
		    #If the number of slashes is 6, then grab the parent directory and see if it exists
		    if [ "${#slashes}" -eq "6" ]; then
		      par_dir=$(dirname "${dest_dir}")
		      #If the parent directory doesn't exist, override the slashes value to 5 slashes to trigger the branch override logic below
		      if [ ! -d "${par_dir}" ]; then
		    	slashes="/////"
		      fi
		    fi
		    #If the number of slashes is 5, then override the creation branch based upon min free space
		    if [ "${slashes}" -eq "5" ]; then
		      fs1=$(df -P /mnt/disks/FS1 | tail -1 | awk '{print $4}') 
		      fs2=$(df -P /mnt/disks/FS2 | tail -1 | awk '{print $4}') 
		      fs3=$(df -P /mnt/disks/FS3 | tail -1 | awk '{print $4}') 
		      fs4=$(df -P /mnt/disks/FS4 | tail -1 | awk '{print $4}') 
		      fs5=$(df -P /mnt/disks/FS5 | tail -1 | awk '{print $4}') 
		      fs6=$(df -P /mnt/disks/FS6 | tail -1 | awk '{print $4}') 
		      if [ $fs1 -gt 104857600 ]; then
		        dest_dir="${dest_dir/Frankenstore/FS1}"
		      elif [ $fs2 -gt 104857600 ]; then
		        dest_dir="${dest_dir/Frankenstore/FS2}"
		      elif [ $fs3 -gt 104857600 ]; then
		        dest_dir="${dest_dir/Frankenstore/FS3}"
		      elif [ $fs4 -gt 104857600 ]; then
		        dest_dir="${dest_dir/Frankenstore/FS4}"
		      elif [ $fs5 -gt 104857600 ]; then
		        dest_dir="${dest_dir/Frankenstore/FS5}"
		      elif [ $fs6 -gt 104857600 ]; then
		        dest_dir="${dest_dir/Frankenstore/FS6}"
		      fi		
		    fi
		    log "creating directory ${dest_dir}"
		    # END OF NEW CODE
		    
		exec_cmd sudo -u nobody mkdir -v -m 777 -p "${dest_dir}"
          fi
          exec_cmd install -o nobody -g users -m 666 -p -D -v "${src_file}" "${dest_file}"
        fi

While this does make the code specific to my backup solution (referencing the Frankenstore filesystem name as well as the FS1 through FS6 backup drive names), it adds a ton of functionality.

Each backup drive is evaluated, in order from FS1 to FS6, and the first drive with more than 141,557,760 1K blocks available (as reported by df -P), which is 135GB, is chosen. This works along with the MinFreeSpace setting of 90G that I am using. If a directory doesn't exist, and FS1 only has 112G free, then the directory will NOT be created on FS1, even though FS1 has more than 90G free for MergerFS file writing. FS2 is then checked, and it has a couple of TB free, so the directory gets created on FS2. Then when the files for that directory are copied, they are written to FS2 automatically by MergerFS, because it is using the path-preservation option.

I chose 135G since it is 45G larger than my 90G min free space MergerFS setting, so in theory the directory can be created with >135G existing, and after copying a large Blu-ray there will still be >90G free, so it won't run into errors.
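For reference, the df -P "available" column used by the new code is in 1K blocks, so the thresholds discussed in this post convert like this:

Code: Select all

# 1 GiB = 1024 * 1024 one-kilobyte blocks
echo $((135 * 1024 * 1024))   # 141557760 -> 135G, the directory-creation threshold described here
echo $((100 * 1024 * 1024))   # 104857600 -> 100G, the value that appears in the code listing above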

I'm now thinking I may lower the min free space parameter, since the new directory creation parameter is doing most of the heavy lifting. I had originally chosen 90G as that is large enough for a 4K UHD disk. In theory, I could revert min free space back to the default value (a small 4G, but plenty large for metadata), and simply use the new directory creation logic to control which branch the directory is created within, and the files will always follow the directory. If I did this, I could also lower the directory creation threshold below 135G, perhaps using 100G so I can fit another disc on each backup drive.

Testing now, fingers crossed the new logic works.
