Can you please share your backup strategies for Linux? I’m curious to know what tools you use and why. How do you automate/schedule backups? Which files/folders do you back up? What is your preferred hardware/cloud storage and how do you manage storage space?
What’s a backup?
I use Borg Backup, automated with a bash script that Borg provides. A cron job runs the script at the desired frequency. I keep backups on different computers; ideally, I would recommend one copy in the cloud and one on a local machine. Borg compresses and encrypts its backups.
Edit: I migrated a server once using the backups from this system and it worked great.
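For anyone who wants to see roughly what such a scheduled run involves, here is a minimal sketch, written in Python rather than bash. It is not the script Borg ships. The repository path, source directories and retention counts are made-up placeholders; only the `borg create` and `borg prune` options are real Borg flags.

```python
import subprocess

# Hypothetical values for illustration; adjust to your own setup.
REPO = "/mnt/backup/borg-repo"
SOURCES = ["/home", "/etc"]


def create_command(repo, sources):
    """Build a `borg create` call for a compressed, timestamped archive."""
    # {hostname} and {now} are placeholders that Borg itself expands.
    archive = f"{repo}::{{hostname}}-{{now}}"
    return ["borg", "create", "--stats", "--compression", "lz4", archive, *sources]


def prune_command(repo, daily=7, weekly=4, monthly=6):
    """Build a `borg prune` call that thins out old archives."""
    return [
        "borg", "prune",
        "--keep-daily", str(daily),
        "--keep-weekly", str(weekly),
        "--keep-monthly", str(monthly),
        repo,
    ]


def run_backup(repo=REPO, sources=SOURCES):
    """Run create, then prune; raise if either step fails so the failure is visible."""
    for command in (create_command(repo, sources), prune_command(repo)):
        subprocess.run(command, check=True)
```

A crontab entry such as `0 2 * * * python3 /path/to/this_script.py` (path hypothetical) would run it nightly at 02:00, provided the script ends with a call to `run_backup()`. Encryption is chosen once, when the repository is created with `borg init`, so it does not appear here.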
I should really cron my Borg script rather than waiting for a sinking anxiety to set in and doing backups at random intervals
Make sure to check that it actually ran from the cron job; cron is a finicky tool
Borg backup is the gold standard, with Vorta as a very nice GUI on machines that need it. Otherwise, all my other Linux machines run on Proxmox hypervisors and get regular container/snapshot/VM backups through Proxmox Backup Server to another machine. All the backup data is then replicated remotely on a regular schedule via TrueNAS SCALE replication tasks.
Borg via Vorta handles the hard parts: encryption, compression, deduplication, and archiving. You can mount backup snapshots like drives, without needing to expand them. It splits archives into small chunks so you can easily upload them to your cloud service of choice.
Adding my “Me too” to Vorta/Borg. I use it with Borgbase, which I like because it’s legitimately cheap and they support Borg development. As well, you can set Borg backups with Borgbase to “append only,” which prevents ransomware or other unexpected “whoopsies” from wiping out your backup history.
I back up most of my computer every hour, but have pruning rules that make sure things don’t get too out of hand. I have a second backup that backs everything up to my NAS (using Vorta, again). This is helpful for things like my downloads folder, virtual machines, or Steam library - things I wouldn’t want to back up over the network, but on occasion I do find myself going “whoops, I wanted that.”
I also have Vorta working on my Mom’s Macbook, then have Borgbase send me an email when there isn’t any activity for longer than a couple of days. Once I got automatic pruning working right I never had to touch this again.
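To make "pruning rules" a bit more concrete, here is a simplified sketch of bucket-based retention: keep the newest archive from each of the last N hours, days and weeks. It is only an illustration of the idea, not Borg's or Vorta's actual algorithm, and the function name and default counts are invented.

```python
from datetime import datetime, timedelta


def archives_to_keep(timestamps, keep_hourly=24, keep_daily=7, keep_weekly=4):
    """Return the set of archive timestamps to keep; the rest may be pruned."""
    rules = [
        (keep_hourly, lambda t: (t.year, t.month, t.day, t.hour)),
        (keep_daily, lambda t: t.date()),
        (keep_weekly, lambda t: tuple(t.isocalendar())[:2]),  # (ISO year, ISO week)
    ]
    newest_first = sorted(timestamps, reverse=True)
    keep = set()
    for limit, bucket_of in rules:
        seen = []
        for taken in newest_first:
            bucket = bucket_of(taken)
            if bucket in seen:
                continue  # a newer archive already represents this period
            if len(seen) >= limit:
                break     # this rule has all the periods it is allowed
            seen.append(bucket)
            keep.add(taken)
    return keep


# Hourly archives over three days, thinned with deliberately small limits.
hourly = [datetime(2024, 1, 1) + timedelta(hours=h) for h in range(72)]
kept = archives_to_keep(hourly, keep_hourly=2, keep_daily=2, keep_weekly=1)
```

With these limits only three of the 72 archives survive: the two most recent hours, plus the last archive of the previous day.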
I plug in an external drive every so often and drag and drop my home dir into it like it’s 1997. I’m not running a data center here. The boomer method is good enough.
yeah about the same, old coot here, I plug a USB3-SSD (encrypted with LUKS) and rsync from internal HD to this external HD. That’s it.
I do exactly this but with a little shell script that just has some `rsync -av` and `mv -f` calls instead of dragging and dropping.
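A rough equivalent of that kind of script, sketched in Python instead of shell. The paths are hypothetical; `-a`, `-v` and `--dry-run` are standard rsync options.

```python
import subprocess


def rsync_command(source, destination, dry_run=False):
    """Build an `rsync -av` call that copies the contents of source into destination."""
    command = ["rsync", "-av"]
    if dry_run:
        command.append("--dry-run")  # list what would be transferred, change nothing
    # A trailing slash on the source means "the contents of this directory",
    # so files land directly in destination rather than in a nested folder.
    command += [source.rstrip("/") + "/", destination]
    return command


def backup_home(source="/home/me", destination="/mnt/usb-backup/home"):
    """Copy the home directory to the external drive (both paths are hypothetical)."""
    subprocess.run(rsync_command(source, destination), check=True)
```

Running it once with `dry_run=True` is a cheap way to check the paths before anything is written.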
Hope.
Not to save stuff
I too am raw-dogging my Linux install
Shout out to all the homies with nothing, I’m still waiting to buy a larger disk in hopes of rescuing as much data from a failing 3TB disk as I can. I got some read errors and unplugged it about 3 months ago.
I was talking with a techhead from the 80s about what he did when his tape drives failed and the folly that is keeping data alive on a system that doesn’t need to be. His foolproof backup storage is as follows.
- At Christmas buy a new hard drive. If Moore’s law allows, it should be double what you currently have
- Put your current backup hard drive into a SATA drive slot. Copy the backup over to the new hard drive.
- Write the date this was done on the hard drive with a Sharpie. The new hard drive is your current backup.
- Place the now old backup into your drawer and forget about it.
- On New Years Day, load each of the drives into a SATA drive slot and fix any filesystem issues.
- Put them back into the drawer. Go to step 1.
I use immutable NixOS installs. Everything to redeploy my OS is tracked in git, including most app configurations. The one exception is some GUI apps I’d have to set up manually on reinstall.
I have a persistence volume for things like:
- Rollbacks
- Personal files
- Git repos
- Logs
- Caches / Games
I have 30 days (or last 5 minimum) of system rollbacks using BTRFS volumes.
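The "30 days, or the last 5 at minimum" rule can be written down in a few lines. This is just a sketch of the rule as stated, not how NixOS or BTRFS tooling implements it; the function name and arguments are invented.

```python
from datetime import datetime, timedelta


def rollbacks_to_delete(snapshot_times, now, max_age_days=30, min_keep=5):
    """Return snapshots older than max_age_days, never touching the newest min_keep."""
    newest_first = sorted(snapshot_times, reverse=True)
    cutoff = now - timedelta(days=max_age_days)
    delete = []
    for index, taken in enumerate(newest_first):
        if index < min_keep:
            continue  # always keep the newest few, however old they are
        if taken < cutoff:
            delete.append(taken)
    return delete
```

On a machine that is rarely rebuilt, the minimum of 5 is what stops every rollback from expiring at once.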
The personal files are backed up hourly to a local server, which then backs up nightly to Backblaze B2 using rclone, into an encrypted volume using my private keys. The local server has a mishmash of drives in a mirrored LVM setup. While it works well for mixed drives, I’ll warn that I haven’t had a drive failure yet, so I’m not sure how difficult replacing a drive will be.
My phone uses the same flow with RoundSync (rclone + GUI).
Git repos are backed up in git.
Logs aren’t backed up. I just persist them for debugging and don’t want them lost after every reboot.
Caches/Games are persisted but not backed up. NixOS uses symlinks and BTRFS to be immutable, and that paradigm doesn’t work well for this case. The one exception is that a couple of game folders are part of my personal files: WoW plugin folder, EVE Online layouts, etc.
I used to use Dropbox (with rclone to encrypt). It was $20/mo for 2TB. It is cheaper on paper, but I don’t back up nearly that much. Backblaze started at $1/mo for what I use. I’m now up to $2/mo. It will be a few years before I need to clean up my backups for cost reasons.
The local server is a PC in a case with 8 drive bays plus some NVMe drives for fast storage. It has a couple of older drives, and for the last couple of years I’ve typically bought a pair of drives on sale (Black Friday, Prime Day, etc). I have a little over 30TB mirrored, so slightly over 60TB in total. NVMe is not counted in that. One NVMe drive is for the system; the others are a caching layer (Monero node) or temp storage (transcoding, as it is also my media server).
I like the case, but if I were to do it again, I’d probably get a rack mountable case.
You seem pretty organized in your strategy, I would suggest you just pull a drive in your LVM to check how that goes for you. I’ve had issues in JBOD style LVM volumes with drive swaps, but YMMV.
Frankly, I use ZFS now in anything that I would have used LVM in before. The feature set is way more robust.
Good call on a simulated failure. When I first set it up, it was LVM/BTRFS or ZFS as my top choices. It was a coin toss at the time because I hadn’t built this sort of setup before.
Hetzner’s storage boxes have caught my eye but I haven’t tried them yet.
All my code and projects are on GitHub/codeberg.
All my personal info and photos are on proton drive.
If Linux shits itself (and it does often) who cares. I can have it up and running again in a fresh install in ten minutes.
But Proton Drive doesn’t have a Linux client yet. I suppose you just upload your files there once through the web interface and don’t sync?
Scuse the cut and paste, but this is something I recently thought quite hard about and blogged, so stealing my own content:
What to back up? This is a core question to ask when you start planning. I think it’s quite simply answered by asking the secondary question: “Can I get the data again?” Don’t back up stuff you downloaded from the public internet unless it’s particularly rare. No TV, no Movies, no software installers. Don’t hoard data you can replace. Do back up stuff you’ve personally created and that doesn’t exist elsewhere, or stuff that would cause you a lot of effort or upset if it wasn’t available. Letters you’ve written, pictures you’ve taken, code you authored, configurations and systems that took you a lot of time to set up and fine tune.
If you want to be able to restore a full system, that’s something else and generally dealt best with imaging – I’m talking about individual file backups here!
Backup Scenario
Multiple household computers. Home Linux servers. Many services running natively and in Docker. A couple of Windows computers.
Daily backups
Once a day, automate backups of your important files.
On my Linux machines, that’s directories like /etc, /root, /docker-data, and some shared files.
On my Windows machines, that’s some mapping data, Word documents, pictures, geocaching files, generated backups and so on.
You work out the files and get an idea of how much space you need to set aside.
Then, with automated methods, have these files copied or zipped up to a common directory on an always-available server. Let’s call that /backup.
These should be versioned, so that older ones get expired automatically. You can do that with bash scripts or automated backup software (I use backup-manager for local machines, and backuppc or robocopy for Windows ones).
How many copies you keep depends on your preferences – 3 is a sound number, but choose what you want and what disk space you have. More than 1 is a good idea since you may not notice the next day if something is missing or broken.
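As an illustration of "versioned, with automatic expiry", here is a minimal sketch that archives one directory into the common backup location and keeps only the newest three copies. It stands in for what backup-manager or a bash script would do and is not taken from either; the function name and file naming scheme are invented.

```python
import tarfile
import time
from pathlib import Path


def make_versioned_backup(source_dir, backup_dir, name, keep=3, stamp=None):
    """Archive source_dir into backup_dir and expire all but the newest `keep` copies."""
    if keep < 1:
        raise ValueError("keep must be at least 1")
    source_dir, backup_dir = Path(source_dir), Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    # A sortable timestamp in the file name means alphabetical order is age order.
    stamp = stamp or time.strftime("%Y%m%d-%H%M%S")
    archive = backup_dir / f"{name}-{stamp}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(source_dir, arcname=source_dir.name)
    # Everything except the last `keep` entries is an expired version.
    versions = sorted(backup_dir.glob(f"{name}-*.tar.gz"))
    for old in versions[:-keep]:
        old.unlink()
    return archive
```

Called once a day as, say, `make_versioned_backup("/etc", "/backup/myhost", "etc")`, this leaves three dated archives per directory at any time.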
Monthly Backups – Make them Offline if possible
I puzzled a long time over the best way to do offline backups. For years I would manually copy the contents of /backup to large HDDs once a month. That took an hour or two for a few terabytes.
Now, I attach an external USB hard drive to my server, with a smart power socket controlled by Home Assistant.
This means it’s “cold storage”. The computer can’t access it unless the switch is turned on – something no ransomware knows about. But I can write a script that turns on the power, waits a minute for it to spin up, then mounts the drive and copies the data. When it’s finished, it’ll then unmount the drive and turn off the switch, and lastly, email me to say “Oi, change the drives, human”.
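The sequence in that script can be sketched like this. It is an outline under assumptions, not my actual script: the mount, copy, unmount and notify steps are passed in as functions, and the Home Assistant address, token and entity name in the usage note are placeholders. The REST endpoint used is Home Assistant's generic service-call API.

```python
import json
import time
import urllib.request


def switch_request(base_url, token, entity_id, turn_on):
    """Build a Home Assistant REST call that turns a switch on or off."""
    service = "turn_on" if turn_on else "turn_off"
    return urllib.request.Request(
        f"{base_url}/api/services/switch/{service}",
        data=json.dumps({"entity_id": entity_id}).encode(),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def offline_backup(power, mount, copy, unmount, notify,
                   spin_up_seconds=60, sleep=time.sleep):
    """Run one cold-storage backup; the drive is powered down even if a step fails."""
    power(True)
    try:
        sleep(spin_up_seconds)  # give the disk time to spin up
        mount()
        try:
            copy()
        finally:
            unmount()           # always attempt to unmount before the power goes off
    finally:
        power(False)
    # Only reached when every step succeeded.
    notify("Oi, change the drives, human")
```

Here `power` could be something like `lambda on: urllib.request.urlopen(switch_request("http://homeassistant.local:8123", TOKEN, "switch.backup_drive", on))`, with `TOKEN` being a long-lived access token. A failed copy raises instead of sending the "change the drives" email, which is the behaviour you want.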
Once I get that email, I open my safe (fireproof and in a different physical building) and take out the oldest of three USB caddies. I swap that with the one on the server and put that one away. Classic Grandfather/Father/Son backups.
Once a year, I change the oldest of those caddies to “Annual backup, 2024” and buy a new one. That way no monthly drive will be older than three years, and I have a (probably still viable) backup by year.
BTW – I use USB3 HDD caddies (and do test for speed – they vary hugely) because I keep a fair bit of data. But you can also use one of the large capacity USB Thumbdrives or MicroSD cards for this. It doesn’t really matter how slowly it writes, since you’ll be asleep when it’s backing up. But you do really want it to be reasonably fast to read data from, and also large enough for your data – the above system gets considerably less simple if you need multiple disks.
Error Check: Of course with automated systems, you need additional automated systems to ensure they’re working! When you complete a backup, touch a file to give you a timestamp of when it was done – online and offline. I find using “tree” to catalogue the files is worthwhile too, so you know what’s on there.
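A small sketch of that error check: touch a stamp file after each successful run, have a separate job complain when the stamp is missing or too old, and write a plain file listing as a stand-in for `tree`. The function names and the 36-hour threshold are arbitrary choices for illustration.

```python
import os
import time
from pathlib import Path


def mark_done(stamp_file):
    """Touch the stamp file so its modification time records the last good backup."""
    Path(stamp_file).touch()


def is_stale(stamp_file, max_age_hours=36, now=None):
    """Return True if the last successful backup is missing or too old."""
    path = Path(stamp_file)
    if not path.exists():
        return True
    now = time.time() if now is None else now
    return now - path.stat().st_mtime > max_age_hours * 3600


def write_catalogue(backup_root, catalogue_file):
    """Write a sorted list of every file under backup_root, similar in spirit to `tree`."""
    root = Path(backup_root)
    lines = sorted(
        str(Path(folder, name).relative_to(root))
        for folder, _dirs, files in os.walk(root)
        for name in files
    )
    Path(catalogue_file).write_text("\n".join(lines) + "\n")
    return len(lines)
```

The important part is that `is_stale` runs from a different job (ideally a different machine) than the backup itself, so a backup that silently stops running still gets noticed.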
Lastly – test your backups. Once or twice a year, pick a backup at random and ensure you can copy and unpack the files. Ensure they are what you expect and free from errors.
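One way to make that test less of a chore is to restore a backup to a scratch directory and compare a random sample of files against the live copies by checksum. This is a sketch of that idea with invented function names; files that legitimately changed since the backup was taken will show up as differences, so read the result with that in mind.

```python
import hashlib
import random
from pathlib import Path


def sha256_of(path):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def spot_check(original_root, restored_root, sample_size=20, rng=random):
    """Compare a random sample of restored files against the originals.

    Returns the relative paths that are missing or differ; an empty list is a pass.
    """
    original_root, restored_root = Path(original_root), Path(restored_root)
    files = sorted(
        p.relative_to(original_root) for p in original_root.rglob("*") if p.is_file()
    )
    sample = rng.sample(files, min(sample_size, len(files)))
    problems = []
    for relative in sample:
        restored = restored_root / relative
        if not restored.is_file() or sha256_of(restored) != sha256_of(original_root / relative):
            problems.append(str(relative))
    return sorted(problems)
```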
Here’s one that probably nobody else here is doing. The backup goes on my mobile device. Yes, the thing in my pocket.
- Mount it over SSHFS on the local network
- Unlock a LUKS container in the form of a 30GB sparse file on the device
- `rsync` the files across
- Lock, unmount
The backup is incremental but the container file never changes size, no matter what’s in it. Your data is in two places and always under your physical control. But the key is never stored on the remote device, so you could also do this with a VPS.
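For the curious, the steps above could be sketched as follows. The mount points, mapper name and image name are hypothetical, and one-time setup (formatting the container with LUKS and putting a filesystem in it) is not shown. `cryptsetup` and `mount` need root, and details such as letting root read the SSHFS mount are glossed over.

```python
def create_sparse_container(path, size_bytes):
    """Create a file with a fixed apparent size that uses almost no disk space yet."""
    with open(path, "wb") as handle:
        handle.truncate(size_bytes)


def backup_steps(remote, image_name, source):
    """List the commands for one backup run, in order (all names are hypothetical)."""
    phone, vault = "/mnt/phone", "/mnt/vault"
    return [
        ["sshfs", remote, phone],                                  # mount the device
        ["cryptsetup", "open", f"{phone}/{image_name}", "vault"],  # asks for the key locally
        ["mount", "/dev/mapper/vault", vault],
        ["rsync", "-a", source.rstrip("/") + "/", vault + "/"],    # incremental copy
        ["umount", vault],
        ["cryptsetup", "close", "vault"],                          # lock
        ["fusermount", "-u", phone],                               # unmount the device
    ]
```

Because the container is unlocked on the local machine, the remote side only ever sees an encrypted file of constant size.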
Highly recommended.
Where is the key stored?
Locally.
I use Duplicity to back up my home directory, excluding the Steam and Downloads folders. It is set up to back up weekly to my NAS, mounted over NFS. The NAS has a weekly cron task to upload the backups to pCloud using rclone. I back up several computers this way (2 desktops, 2 laptops, and the NAS as well). The files included in this strategy are essentially my photos, documents and configs. My software installations, games, and media library are not backed up.
I have a synology NAS with all my documents and family photos. I’m using the synology drive app on Linux and synology photo on android.
All of that is backed up on Backblaze
I only really make backups occasionally. I have the configuration files of my systems on my GitHub and Codeberg. The rest, I don’t need; the only things I keep are books and music that I download from the internet, which I have on a 1TB external hard drive.
When I have made a backup for a specific reason, I have done it with rsync. It’s a tool that works quite well and is for the command line.