Open Source Tape Backups (Bareos)

In this post I’m going to take you through some of the background surrounding data backups (tape vs disk vs cloud), as well as an actual tutorial for the option that worked best for one of my clients – BareOS. Before getting into the general discussion of storage media, it’s probably important to note that there is currently some controversy surrounding BareOS vs Bacula.

BareOS vs Bacula
My fellow Linux System Administrators have probably heard of Bacula at some point. It’s old as dirt, and yes, that’s a compliment. Tivoli, NetBackup and many of the other “old guard” products operate on the same principles as Bacula. Bacula became popular with the open-source community for obvious reasons: it was fully open source, it performed the very important task of tape-based backups, and it performed them reliably. Everything from a single tape drive to a massive barcode-reading robotic library works with Bacula. Unfortunately, at some point the Bacula developers decided to split the project into a FOSS tree and a premium / enterprise tree. The argument used for this was to ensure continued attention to the code, as working on it for free was no longer viable. We’ve seen this happen before with MySQL, and it often spells the end of a great product. Up until 2010, Bacula as a FOSS option was dying. Code was still being written for the Enterprise version, but it wasn’t making its way to the FOSS side (predictable). Then Bacula forked, and BareOS was created. BareOS was created with the idea of remaining FOSS, with no “premium” code branch. BareOS developers could make money on services / consulting / stable code branches, but that’s it.

Fast forward to today and BareOS has evolved to be far superior to Bacula in its feature set. BareOS maintains compatibility with Bacula at various levels, but not without sacrificing functionality. I chose to forgo Bacula compatibility, since we use some of the new features. So why does this matter? Well, the fork between Bacula and BareOS has resulted in a lawsuit between Bacula and BareOS. Look here: Why a Lawsuit – Bacula and here: Why a Lawsuit – BareOS. Among other things, Bacula accuses BareOS of stealing enterprise code and using it in BareOS. I obviously have no idea whether this accusation is true. What I can say, however, is that the BareOS community is vibrant and alive. Bacula’s is not. New features are coming out constantly for BareOS and bugs are squashed quickly. In no way can I say the same for Bacula. Add to that the fact that BareOS provides pre-built binaries for Windows and repos for all major OS distributions through the use of the Open Build Service. Other new features are here: BareOS New Features. Given that the BareOS code is fully FOSS, and the community is vibrant, it’s not going away anytime soon as far as I can see.

Tape vs Disk vs Cloud
So what on earth would drive me to use tape-based backups? After all, isn’t that a dead technology? With disk-based systems offering deduplication, and great new innovations such as AWS Storage Gateway, I’m clearly nuts. Well, not so fast. New tools in the arsenal are a great thing, but tape still has a place. As it so happens, this particular client was already in possession of an LTO-5 Superloader, fully populated with tapes. Given the initial investment was already made, tossing it out the window would be incredibly irresponsible. Even if the initial investment hadn’t been made, though, tape offers something that is easy to understand: price. Tapes are cheap, really, really cheap. You can buy a 2.5TB (native) / 6.25TB (compressed) LTO-6 tape for about $80. While you can buy a 2.5-3TB (native) 7200RPM SATA hard drive for around $110 as of this writing, that is not going to get you hot-swap capabilities, nor does it account for the fact that you’ll probably have to buy some over-priced hot-swap tray embedded version for most enterprises. That’s going to run closer to $450 as of this writing.
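To put those prices side by side: $80 for a 2.5TB LTO-6 tape works out to roughly $32 per native TB, the bare 3TB SATA drive at $110 comes in around $37 per TB, and the enterprise hot-swap version at $450 lands near $150 per TB. Even before factoring in compression, the raw cost-per-TB comparison favors tape comfortably.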

It’s not all purely about cost per TB though. De-duplication drastically reduces the amount of data you need to store. Unfortunately products such as the Quantum RDX only support de-duping with Windows Servers as the controlling mechanism, and the price premium on them is significant. Lots of other vendors have deduplication options, EMC for one. These options are often expensive though. Open-Source projects have arisen to bring dedup to Linux, but it’s still early days in this area. I might love innovation generally, but when it comes to backups, it pays to be conservative. If the tape drive wasn’t already purchased though, I would certainly go down this road for exploratory reasons.

What about The Cloud? First off, the cloud doesn’t solve everything. If you speak to an AWS employee, they will often imply you’re stupid for doing anything on your own, and explain how Amazon’s pricing makes them the better option in nearly every case. Well, AWS (and Rackspace / others) are great, actually amazing, additions to the tool-set. As a part of the BareOS setup, I actually use Glacier to store Catalog backups for archival purposes. At $0.01/GB per month, it’s great for archives. S3 RRS (Reduced Redundancy Storage) is also a great thing to have. The previously mentioned Storage Gateway makes it clear that Amazon is trying to be the go-to source for all server storage needs. In many cases, they may be the best option. Amazon knows their stuff, and they drastically reduce the amount of time associated with doing any part of your infrastructure in-house. They also provide month-to-month billing that is directly tied to usage. This makes accounting geeks like me very happy. Sysadmins who haven’t studied management might not understand how useful this is. It’s a serious benefit of the cloud, but that does not mean it’s right for everyone.

I’ve lived in many developing countries where a reliable internet connection isn’t anywhere to be found. In these parts of the world, the cloud just isn’t an option, and this situation applies to the majority of the world. If everything you are doing is in the US, though, this may not be a concern. So putting that aside, the other thing to consider before embracing the cloud for your storage needs is cost. Amazon will tell you that they are always cheaper when you account for all of the “other” costs associated with doing something in-house. This is sometimes wholly accurate, and other times only partially. If you are an IT manager considering outsourcing a component of your infrastructure to the cloud, first consider your current staff’s underutilized cycles. In other words, do you have staff whom you cannot terminate, because they provide help-desk or other necessary services, but who have free hours in the day? If so, efficiency dictates that this time be “discounted” in total cost calculations for a solution. Amazon may still work out to be cheaper, or they may not. If they do, and you are willing to accept someone else handling your data, by all means, go to the cloud. Bear in mind that a solution like BareOS doesn’t integrate well with S3 though (Google it), so something like a deduplicated rsync job to local disks, synced to S3, or the full-fledged virtual Storage Gateway might be safer.
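To make that last suggestion concrete, here is a minimal sketch of the kind of rsync-to-local-disk-then-S3 approach I mean. The paths, bucket name and layout are placeholders rather than anything from this client’s setup, and it assumes rsync and the AWS CLI are installed and configured:

  #!/bin/bash
  # Sketch only: nightly snapshot to local disk, then mirror the snapshot to S3.
  SRC="/srv/data"                      # hypothetical data to protect
  DEST="/backup/$(date +%F)"           # one directory per day on local disk
  LINKDEST="/backup/latest"            # previous snapshot, used for hard-linked "dedup"
  BUCKET="s3://example-backup-bucket"  # placeholder bucket name

  # --link-dest hard-links unchanged files against the previous snapshot,
  # which gives a crude form of deduplication on the local disk.
  rsync -a --delete --link-dest="$LINKDEST" "$SRC/" "$DEST/"
  ln -sfn "$DEST" "$LINKDEST"

  # Push the snapshot to S3; only new/changed files are re-uploaded.
  aws s3 sync "$DEST/" "$BUCKET/$(date +%F)/"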

BareOS in a Multi-platform Environment
First off, RTFM. Google that if you don’t know what it means. Sometimes FOSS products don’t have decent documentation. That is not the case with BareOS. The PDF of the manual is over 300 pages long, and I actually recommend reading all of it. It’s a fantastic manual (credit to Bacula for most of it) that actually takes you through everything you need to test BEFORE going to production. If you actually test your tape / disk / cloud implementation as the manual instructs you to, and run restore tests, you’ll be able to sleep at night knowing that things are working as intended, and you actually understand everything happening. If you don’t want to understand what’s happening under the hood and just want things to “just work”, I’m sorry, but you are in the wrong field. Things always go wrong, and paying for something doesn’t protect you from failure. If you don’t spend the time up-front to learn the nuts and bolts, you’ll spend your career blaming everyone else for your own failings. I can spot a sysadmin / network admin like this a mile away, and it’s simply unacceptable. Go back to school, learn the background, and come back if you aren’t in this just for the money. Also, to the IT managers out there who don’t dive into the nuts and bolts routinely: you could still stand to benefit from reading this manual. It discusses the concepts important to backups: volumes, schedules, retention, etc. I’ve been on both sides, and if you want to manage someone, you need to understand at least a higher-level view of what your employees are doing.

Now that you’ve read the manual... (good job, btw), here are some useful tidbits that may be applicable outside of just this client’s context:

It goes without saying, but if you break something using any of this, I take no responsibility. These parameters work for a very specific context, that context is not your context. You’ve been warned.

Setup Software on BareOS Host:

  • yum install bareos bareos-mysql bareos-storage-tape
  • Run bareos mysql table setup (consult manual)
  • Change configs (NOTE: Block size matters! – 128KB is selected because of errors encountered with larger sizes. This reduces performance but is “safer”; a config sketch follows this list)
  • Add bareos user to cdrom group (for /dev/sg2 access)
  • Add iptables rule for tcp 9101:9103
  • Simply FYI: btape command (speed option) was used to find the optimum block size. Do NOT run this again, it wipes out tapes
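For reference, the block-size setting mentioned above lives in the Device resource of the storage daemon configuration. A rough sketch of the relevant part of /etc/bareos/bareos-sd.conf, with the device name and paths shown as examples rather than this client’s exact values:

  Device {
    Name = Superloader            # example name; use your own Device resource name
    Media Type = LTO-5
    Archive Device = /dev/nst0    # non-rewinding tape device
    Changer Device = /dev/sg2
    AutoChanger = yes
    Automatic Mount = yes
    Maximum Block Size = 131072   # 128KB, per the note above
    Minimum Block Size = 131072
  }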

Setup Software On Client (CentOS):

  • Add repo: wget -O /etc/yum.repos.d/bareos.repo http://download.bareos.org/bareos/release/latest/CentOS_6/bareos.repo
  • yum install bareos-filedaemon
  • Update /etc/bareos/bareos-fd.conf to have the correct info / pass (see the config sketch after this list)
  • service bareos-fd restart
  • chkconfig bareos-fd on
  • Edit /etc/sysconfig/iptables (add port 9102 tcp)
  • service iptables restart
  • Verify the LANG environment variable includes UTF
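The bareos-fd.conf edit referenced above mostly comes down to making the Director resource match what the server expects. A minimal sketch with placeholder names and password (the real values come from the director’s own configuration):

  Director {
    Name = backupserver-dir       # must match the director's Name
    Password = "CHANGE_ME"        # must match the Password in the director's Client resource
  }

  FileDaemon {
    Name = centosclient-fd
    FDport = 9102
  }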

Setup Software On Client (Ubuntu – and Zentyal Server):

  • URL=http://download.bareos.org/bareos/release/latest/xUbuntu_12.04/
  • printf "deb $URL /\n" > /etc/apt/sources.list.d/bareos.list
  • wget -q $URL/Release.key -O- | apt-key add -
  • apt-get update
  • apt-get install bareos-filedaemon
  • Update /etc/bareos/bareos-fd.conf to have server info / pass
  • service bareos-fd start
  • (Ubuntu is wide-open iptables by default, but add entry if necessary, Zentyal must have the entry added through the web interface)
  • Verify the LANG environment variable includes UTF (a quick check is sketched below)
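A quick sanity check for that last item on either Linux client (the exact locale name will vary):

  echo $LANG            # expect something ending in .UTF-8, e.g. en_US.UTF-8
  locale | grep LANG    # same check via the locale utility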

Setup Software On Client (Windows):

  • Download appropriate client: http://download.bareos.org/bareos/release/13.2/windows/
  • During install leave the default until it asks for the director configuration, pull that from the server (as well as the password)
  • Also UNCHECK “compatible” mode
  • The firewall rule install option may make the application crash; if it does, simply re-install with the option unchecked.

Useful bconsole commands:

  • estimate listing – Show a per-job listing of all files that would be backed up if the job were to run, along with the total backup size. This command also acts as a means to make sure client <-> server communication is working.
  • unmount – Ejects the tape from the tape-drive, and puts it back in the appropriate slot. You MUST run this before ejecting magazines.
  • update slots – Reads the barcode information on the tapes after you have changed the tapes in the magazine.
  • mount – Loads the appropriate tape back into the drive and enables the backup server to continue with jobs (after magazine change)
  • list volumes – Show space usage / pool allocation on all tapes
  • list jobs – Show jobs run / size / date information
  • show schedule – Detailed information on when jobs will run (difficult to understand at first)
  • status director – List what jobs are running, and what jobs are in the queue for the next 24-hours
  • status storage – See what’s going on with the tape drive / auto-loader
  • status client – See how far along a client is in a backup, what file it’s transferring, the transfer rate, and many other details.
  • status scheduler – See what jobs are enabled / disabled and what schedule they operate on
  • run – Manually kick off a job
  • restore – Restore files
  • label – Make a blank / wiped tape ready for usage. DO NOT do this unless you are bringing fresh media into the mix. It is not necessary to label the tapes already in use. They will automatically be recycled as appropriate.
  • label slot=X barcodes – Use a barcode reading autochanger to pull the id off the barcode, digitally label the tape, and add it to a pool.
  • cancel – Stop a running job
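Putting a few of these together, a typical pre-flight check from within bconsole might look like the following. The job and client names are illustrative placeholders, not this client’s actual resource names:

  status client=fileserver-fd               # confirm the file daemon is reachable
  estimate job=fileserver-backup listing    # dry-run listing of what the job would back up
  run job=fileserver-backup                 # kick the job off manually
  status director                           # watch it appear in the running / scheduled list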

To Wipe 2 Tapes from a previous backup setup for recycling (when the tape from slot 2 is loaded):

  • btape -v Superloader
  • rewind
  • quit
  • btape -v Superloader
  • weof
  • quit
  • /usr/lib/bareos/scripts/mtx-changer /dev/sg2 unload 2 /dev/st0 0
  • This unloads the tape from the drive and puts it in the OPEN slot 2 (make sure it’s actually open)
  • /usr/lib/bareos/scripts/mtx-changer /dev/sg2 load 1 /dev/st0 0
  • This loads the tape from the OCCUPIED slot 1 (make sure it’s actually occupied)
  • btape -v Superloader
  • rewind
  • quit
  • btape -v Superloader
  • weof
  • quit
  • /usr/lib/bareos/scripts/mtx-changer /dev/sg2 unload 1 /dev/st0 0
  • This unloads the tape from the drive and puts it in the OPEN slot 1 (make sure it’s actually open)
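For what it’s worth, the rewind / weof steps above can also be done with the standard mt utility rather than btape. A rough equivalent for one tape, assuming the same device nodes used throughout this setup:

  mt -f /dev/st0 rewind    # rewind the loaded tape
  mt -f /dev/st0 weof      # write an EOF mark at the start, effectively blanking the tape for BareOS
  /usr/lib/bareos/scripts/mtx-changer /dev/sg2 unload 2 /dev/st0 0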

Policy on changing magazine:

  • unmount (in bconsole)
  • remove left magazine (using button push)
  • swap tapes
  • replace magazine
  • remove right magazine (using button push)
  • swap tapes (LEAVE THE CLEANING TAPE IN ITS SLOT; it starts with CLN)
  • update slots (in bconsole)
  • mount

Policy on adding a new tape to the setup:

  • unmount (in bconsole)
  • remove magazine (with button push)
  • Insert new tape into slot (pay attention which)
  • label slot=[SLOT YOU PUT THE TAPE IN 0-15] barcodes
  • mount

Policy on restoring an individual file or all client files:

  • restore (in bconsole)
  • To restore the most recent versions of all files, select option 5.
  • Select the client
  • To restore all files, once you are dropped into a shell, type mark *
  • To restore an individual file, cd through the tree to the file, then mark [filename]
  • run estimate and count to make sure the number of files / file-size looks right when you are finished marking files
  • NOTE: This will restore all selected files to either /tmp/bareos-restores or C:\tmp\bareos-restores (depending on OS) – to restore somewhere else, you will need to enter mod after selecting done, and change the where parameter.
  • To restore files to the original location, set where to /
  • Type done
  • Confirm your selection
  • This will automatically select the most recent backup, and the associated tape, unload whatever tape is currently loaded, load the correct one, then restore the file to the remote host. If you want to restore files to a different host, consult the manual

Method of Backing up the Catalog (MySQL DB and configuration files):

  • The BareOS server has a backup job that writes the catalog to the Linux-Full-Pool nightly. If the backup server became inoperable, or the BareOS configuration / database was corrupted, restoring anything would become difficult. The current configuration allows recovery from this situation via 2 methods:
    • The “standard” but horrible method is to use the bootstrap files that are emailed nightly in order to locate where on the tape the most recent catalog backup is, pull it off using 3rd-party tools, then restore the MySQL database and configuration files.
    • The Amazon way – To ease restoration in this situation, and to allow an easy view into how configurations / databases have changed over time, there is a script that runs at the end of the Catalog backup.
  • This script packages up a tar file with the catalog components, then uploads it to an Amazon S3 Reduced Redundancy bucket created under our account. This bucket has a “Lifecycle” policy under which all files older than 2 days are transferred to Glacier storage. Thus the past 2 nights of catalog backups are available 24×7, and historical catalog data is available going back up to 2 years. (A rough sketch of the upload script follows this list.)
  • Viewing these backups is as simple as going to http://aws.amazon.com, logging in with the credentials, opening the S3 console, and looking at the appropriate bucket.
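For the curious, the upload script is nothing exotic. A stripped-down sketch of the idea, where the bucket name, dump location and paths are placeholders, and the AWS CLI is assumed to be installed with credentials configured:

  #!/bin/bash
  # Sketch: package the catalog dump and configs, push to an S3 RRS bucket.
  STAMP=$(date +%F)
  DUMP="/var/lib/bareos/bareos.sql"        # catalog dump written by the Catalog backup job
  CONFDIR="/etc/bareos"
  ARCHIVE="/tmp/bareos-catalog-$STAMP.tar.gz"
  BUCKET="s3://example-catalog-backups"    # placeholder bucket name

  tar czf "$ARCHIVE" "$DUMP" "$CONFDIR"
  aws s3 cp "$ARCHIVE" "$BUCKET/$STAMP.tar.gz" --storage-class REDUCED_REDUNDANCY
  rm -f "$ARCHIVE"
  # The bucket's Lifecycle rule (older than 2 days -> Glacier) handles archiving from there.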

To Make a Barcode: