S is for Spanning

Seldom has a journey been so winding and long as ours has been with Canister, Hedge’s archive software that has transformed the industry’s perception of LTO.

Back in 2017, mLogic reached out to us, asking if we could do for LTO what we had done for backups with Hedge. We said “sure, why not” but when we started looking at it, we were amazed by how archaic it all seemed.

But it quickly turned out that LTO is not archaic at all: it’s really high-tech, to be honest. It’s just the software layer that is about as user-friendly as a cornered rattlesnake. That’s exactly the sort of challenge we like to take on, and so in September 2018, we released Canister version 1.0.

Since then, we’ve done 27 updates. That might sound like a lot, but it’s only a few percent of the total number of updates Hedge has pushed out across all our products. Progress on Canister has been slow but steady.

Working with LTO has proven to be so full of caveats, pitfalls, and workarounds that Roelof and I, Hedge’s founders, took up Canister ourselves and have kept it separate from our regular app development process because we wanted to get it into software adulthood first. That obviously was going to take time, and it did.

Today, we declare Canister to be a grown-up: it now has a dedicated team, a Windows version is in development, manufacturers love it, and we’re adding two features we found essential going forward - Spanning and Preflight Checks.

Spanning?

In development and testing for over a year, Canister’s new Spanning capability allows you to archive more data than will fit on a single tape. No more need to cut up a hard drive’s contents into parts that fit a single tape, then archive them one by one to different tapes. Instead, just select everything you need and Canister will keep asking for new tapes until all is done.

At Hedge, we go to great lengths to keep things really simple. If we can't make it simple, we won’t build it. That means a lot of what’s considered a standard feature in the LTO landscape we quickly dismissed as “that’s not our turf”. They’re not features for filmmakers but aimed at enterprises: deduplication, automation, block-level incremental backups, system backups. Canister is not a VEEAM, Retrospect, Yoyotta, or XenData. Different audiences; different needs.

But, time and time again, Spanning kept coming up, even with those users with a straightforward, non-convoluted workflow. Speaking to a lot of customers (mainly because they kept emailing us pleading for the feature to be built), it was clear this was by far their biggest issue, and spanning was the single feature that would enhance their productivity the most.

Foundation

As always, it’s the fundamental details that make a feature work really well. We talked a lot to those users asking us for spanning, focusing on why they preferred us building it to just using a product that was already available. When people prefer not to use existing solutions because they don’t work for them, it’s a sign that the problem is a hard one. Collecting those gripes gave us a good idea about how to build this:

No databases

What stood out was that users asked us for a robust way to do spanning that didn’t rely on a database. Why? Because a database is a form of vendor lock-in. That’s a barrier for your workflow; a database is typically a single manufacturer’s way of organizing the world. You won’t be able to take your tapes to another manufacturer’s system, or it’ll be hard to move the data out of the database. And if it breaks, you’re screwed.

That’s why Spanning relies on Canister’s Catalogs, which are your tapes’ indexes recreated as tiny files on your own computer. That way the file system, the database your OS relies on, does the heavy lifting with zero chance of data corruption or vendor lock-in.

No cut-up files

Another vendor lock-in: in an attempt to fill up tapes 100%, most apps will cut a file in two when it doesn’t fit on a tape. That’s all very well, but you’ll need the original software to put the parts back together. Instead, why not fill the tape with a file that does fit, or, if there isn’t one available, leave a few GB free? That’s not going to break the bank, with LTO’s rock-bottom $/GB price point.
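To make that idea concrete, here is a minimal Python sketch of planning tapes with whole files only. It is an illustration, not Canister’s actual algorithm: the function name `plan_tapes`, the greedy largest-first strategy, and the sizes in the example are all assumptions of ours.

```python
from typing import Dict, List


def plan_tapes(file_sizes: Dict[str, int], tape_capacity: int) -> List[List[str]]:
    """Assign whole files to tapes; a file is never split across tapes."""
    # Largest files first, so smaller ones can fill the remaining gaps.
    pending = sorted(file_sizes, key=lambda name: (-file_sizes[name], name))
    for name in pending:
        if file_sizes[name] > tape_capacity:
            raise ValueError(f"{name} is larger than a single tape")

    tapes: List[List[str]] = []
    while pending:
        free = tape_capacity
        current: List[str] = []
        leftover: List[str] = []
        for name in pending:
            if file_sizes[name] <= free:
                # This file still fits on the current tape.
                current.append(name)
                free -= file_sizes[name]
            else:
                # Keep it whole and try again on the next tape.
                leftover.append(name)
        tapes.append(current)
        pending = leftover
    return tapes


# Hypothetical sizes in GB, with a hypothetical 100 GB tape.
plan = plan_tapes({"a.mov": 70, "b.mov": 50, "c.mov": 40, "d.mov": 30}, 100)
```

In this example the second tape ends up with 10 GB unused, which is exactly the trade-off described above: a little free space in exchange for files that any software can read back.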

Spanned catalogs

There’s no use in spanning data across a group of tapes if you can’t find which tapes belong to a group. Canister solves this in a super simple way: when starting a transfer that won’t fit on a single tape, you are first warned, and then asked to specify a label. That label will show up in your Catalogs as a separate tape, with shortcuts to all tapes that make up your spanned catalog.
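A rough sketch of how such a label could live on the file system, with no database involved, might look like the Python below. The folder layout, the use of symbolic links as “shortcuts”, and the name `link_spanned_catalog` are assumptions for illustration only; they are not Canister’s actual on-disk format.

```python
from pathlib import Path
from typing import List


def link_spanned_catalog(catalog_root: Path, label: str, tape_names: List[str]) -> Path:
    """Create a folder named after the label, with one shortcut per tape catalog."""
    group = catalog_root / label
    group.mkdir(parents=True, exist_ok=True)
    for tape in tape_names:
        target = catalog_root / tape
        if not target.is_dir():
            raise FileNotFoundError(f"no catalog found for tape {tape}")
        shortcut = group / tape
        if not shortcut.is_symlink():
            # Relative link, so the whole catalog folder can be moved or copied.
            shortcut.symlink_to(Path("..") / tape, target_is_directory=True)
    return group
```

Because the result is just folders and links, any tool that can browse a file system can follow a spanned catalog to its tapes.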

The way we built Catalogs paid off immediately. Not having a vendor lock-in meant that the excellent cataloging app Neofinder was able to build an integration for Canister in an hour. That’s how we like to see it - it should be as easy to stop using our products as it is to get started. That’s power to the people.

One spanned catalog, containing two tapes.

Dupe Detection, on steroids

Since day one, Canister has shipped with Duplicate Detection, to ensure you only write to tape what has changed instead of rearchiving all data.

Last year, we upped the ante by adding versioning to Duplicate Detection: if a file exists on tape, but is older, Canister will rename and hide the old version so you get a track record of changes. This robust mechanism is essential to spanning, as LTO transfers have a habit of not going as planned; who hasn’t come back to their Mac only to find there had been a power issue, or a cable dislodged, and a tape in limbo?

With a single tape, that’s annoying, but with a transfer spanning multiple tapes, it’s guaranteed madness. Also, with the time it takes to fill up a tape, we think it should be possible to stop a transfer any moment that suits you, to resume it at any later time, even at a different computer or hardware setup.

Again, this requirement calls for a non-database approach where the tapes are written containing all necessary data: they need to be self-describing.

So, when you, for whatever reason, have to interrupt a spanning transfer, Duplicate Detection ensures you can pick up where you left off. Just select your source - all of it, insert the first tape, and Canister will detect what has already been done and ask for tape 2.

Rinse, repeat, and within seconds your LTO drive will be archiving again to the one tape that got interrupted, and you’ll know for sure that no file is skipped unintentionally.
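The resume logic can be sketched in a few lines of Python. This is a simplified illustration under our own assumptions: a file counts as “already archived” when its path, size, and modification time match an entry in any tape’s catalog, and the names `Fingerprint` and `remaining_files` are hypothetical. Canister’s real Duplicate Detection may compare files differently.

```python
from typing import Dict, List, Tuple

# (size in bytes, modification time) serves as a stand-in fingerprint here.
Fingerprint = Tuple[int, float]


def remaining_files(
    source: Dict[str, Fingerprint],
    tape_catalogs: List[Dict[str, Fingerprint]],
) -> List[str]:
    """Return the source paths that are not yet archived unchanged on any tape."""
    archived = {
        (path, fingerprint)
        for catalog in tape_catalogs
        for path, fingerprint in catalog.items()
    }
    return sorted(
        path
        for path, fingerprint in source.items()
        if (path, fingerprint) not in archived
    )


# Hypothetical interrupted transfer: "b.mov" changed after it was archived,
# and "c.mov" never made it to tape.
source = {"a.mov": (10, 1.0), "b.mov": (20, 2.0), "c.mov": (30, 3.0)}
tapes = [{"a.mov": (10, 1.0)}, {"b.mov": (20, 1.5)}]
todo = remaining_files(source, tapes)
```

Note that the only inputs are the source and the catalogs of the tapes themselves, which is why the transfer can be picked up on a different computer.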

Status report, number one.

Part of the reason spanning took so long to build, test, and polish is that 2021 was the year that everything changed for Mac users: Apple silicon shipped. While LTO users are usually very slow to upgrade, as LTO decks die hard, buying new Macs was suddenly everyone’s favorite pastime.

But that brought massive challenges with it: Apple’s (rightful) security crusade already reared its head with Big Sur on Intel, but with M1 it was in full swing. Apple stopped bundling drivers for RAID, Host Bus Adapters, and other ubiquitous pro video storage, and installing kernel extensions like macFUSE became a major pain in the ass.

On top of that, ATTO, a leading vendor in the Host Bus Adapter industry, decided to release two new product lines, resulting in three different drivers. And that’s where the big catch-22 started: suddenly, everyone with a Thunderbolt LTO drive was baffled, as they didn’t realize using it on Apple silicon meant they needed to install a driver for a device they didn’t know they had.

Mayhem, I assure you. And emails, lots of emails.

So, we put massive effort into writing proper documentation for installing all of that. Still, it was too hard for people who were used to not thinking about the eight (!) moving pieces that make up a working LTO environment.

Meet Preflight Checks: Canister’s dashboard, if you will. When you start Canister, a range of tests and probes are fired off, reporting back information about the state of your system. Incorrectly installed extensions are detected, outdated versions reported, and missing components offered, giving you a good overview of what needs to be done to get up and running - and the confidence that you’re good to continue. Four green lights? Off to the races.

Canister's Preflight Checks
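Conceptually, a dashboard like this boils down to running a list of independent checks and collecting their results. The Python below is a minimal sketch of that pattern, not Canister’s implementation; `CheckResult`, `run_preflight`, and the example checks are hypothetical names, and real checks would probe drivers and extensions rather than return fixed values.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class CheckResult:
    name: str
    ok: bool
    advice: str = ""  # what the user should do when the check fails


Check = Callable[[], CheckResult]


def run_preflight(checks: List[Check]) -> Tuple[bool, List[CheckResult]]:
    """Run every check and report whether all of them passed."""
    results: List[CheckResult] = []
    for check in checks:
        try:
            results.append(check())
        except Exception as error:
            # A check that crashes counts as a failure, not as a pass.
            name = getattr(check, "__name__", "check")
            results.append(CheckResult(name, False, f"check failed to run: {error}"))
    return all(result.ok for result in results), results


# Hypothetical checks standing in for real system probes.
def driver_installed() -> CheckResult:
    return CheckResult("driver", True)


def extension_allowed() -> CheckResult:
    return CheckResult("extension", False, "allow the extension in System Settings")


ready, report = run_preflight([driver_installed, extension_allowed])
```

Every check runs even when an earlier one fails, so the user sees the full list of what needs fixing in one go.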

Preflight Checks is the perfect example of a feature “built for and with” you, our users. We collect every issue you report (in this case, on getting up and running), we envision The One Solution To Rule Them All, build it as a separate app, test the water with all those that reported the issue, improve, polish, and then fold it into the main app. It took us nine months and hundreds of weird edge cases (Missing Allow buttons, anyone?) but today, 99% of you are smooth sailing right out of the gate. That’s an amazing feat, and we couldn’t have done that without you 👏

Updates

Since its inception, Canister has had the same license model as our other apps, where a license is perpetual, cross-platform (if available on more OSs), and comes with a guaranteed year of updates and support.

With Canister, we often found we couldn’t add enough new value within 12 months, so we regularly extended that period. In three and a half years, we’ve only charged for an upgrade once.

Canister 22.1 - this release - is the second. If you bought a license less than a year ago, you’ll receive this massive update for free. Everyone else can upgrade their license for just $149. Of course, you don’t have to upgrade — only do so if spanning is useful to you, or if you like getting up and running fast with Preflight Checks.

So what’s next? Queuing is high on our list, as is concurrent archiving, and as we mentioned, a Windows version is in the works. Then, library support, catalog sharing, Foolcat-y reports, and maybe even transcoding. More ideas? We’re all ears.