Snapshots and Shadows with the VSX

In a few recent posts, I’ve been talking about all the nifty things the Coraid VSX does. Today I want to talk about how useful snapshots are, but first we need to understand thin provisioning (skip the next few paragraphs if you’re already caught up on this).

Until your user writes a block, we don’t care what’s in it. We care about blocks that hold data, of course, but a block that has never been written can sit free. This observation, the fact that we don’t care about a block until it’s written, is the basis of thin provisioning.

If you haven’t written anything, you don’t need it, so why not wait to pick out a block until it’s necessary? Let’s just wait until the user needs to write something, and then we can pick a block from the SRX Media Array to add to our virtual disk. There isn’t a need to set aside physical space for the whole virtual disk up front, because with thin provisioning, we’re only using the physical disk when we need it.
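To make the idea concrete, here is a minimal sketch in Python of allocate-on-first-write. It is a toy model, not how the VSX is implemented: the class name ThinVolume, the shared free-block list, and the in-memory dictionaries are all assumptions made for illustration.

```python
class ThinVolume:
    """Toy thin-provisioned volume (hypothetical, for illustration only).

    A physical block is taken from the pool the first time a virtual
    block is written, never before.
    """

    def __init__(self, virtual_blocks, free_pool):
        self.virtual_blocks = virtual_blocks  # size promised to the user
        self.free_pool = free_pool            # list of free physical block ids
        self.block_map = {}                   # virtual block -> physical block
        self.data = {}                        # physical block -> contents

    def _check(self, vblock):
        if not 0 <= vblock < self.virtual_blocks:
            raise IndexError("block is outside the virtual disk")

    def write(self, vblock, payload):
        self._check(vblock)
        if vblock not in self.block_map:
            # First write to this block: only now do we pick a physical block.
            if not self.free_pool:
                raise RuntimeError("physical pool exhausted; add more drives")
            self.block_map[vblock] = self.free_pool.pop()
        self.data[self.block_map[vblock]] = payload

    def read(self, vblock):
        self._check(vblock)
        pblock = self.block_map.get(vblock)
        # A never-written block has no physical backing; return empty bytes.
        return self.data[pblock] if pblock is not None else b""

    def physical_used(self):
        return len(self.block_map)
```

A 1,000-block virtual disk built this way on a 4-block pool uses no physical blocks until the first write, and only one after writing a single block.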

This allows you to overallocate storage so you appear to have more than you physically have. "Why would anyone want to do this?" you might ask. Well, often it’s because we have an internal customer that insists he needs 10TB, and you know that he isn’t going to use all of it. The last time he had a project, he left most of his drives empty.

With thin provisioning, you can go ahead and give him a virtual 10TB and add more physical drives if it turns out he needs it. Everyone is happy.
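The arithmetic behind overallocation is simple enough to sketch. The helper below is hypothetical (pool_report is not a VSX command, and the numbers in the usage note are made up); it just compares what has been promised with what is physically in use.

```python
def pool_report(physical_tb, volumes):
    """Summarize an overallocated pool.

    volumes is a list of (virtual_tb, used_tb) pairs, one per virtual disk.
    All names and units here are illustrative assumptions.
    """
    promised = sum(virtual for virtual, _ in volumes)
    used = sum(used for _, used in volumes)
    return {
        "promised_tb": promised,
        "used_tb": used,
        "free_tb": physical_tb - used,
        # Greater than 1.0 means more has been promised than physically exists.
        "overcommit_ratio": promised / physical_tb,
    }
```

For example, pool_report(8, [(10, 2), (4, 1)]) describes 8TB of physical disk backing 14TB of promises, with 5TB still free. When free_tb gets low, that is the cue to add drives.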

It turns out that if you can do thin provisioning, there is another useful trick you can do called snapshots.

Snapshots allow us to make a read-only copy of a virtual disk called a clone. Only, we’re not actually making the copy. All we’re doing is keeping track of where the underlying blocks are in the virtual disk.

See, since we allocate blocks as we need them, there is a record of where they are. This means that if we need to make a copy for two different departments, we can do that instantly. Virtually there are two, but physically we’re only using one.
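As a rough sketch of why this is instant: copying the record of where the blocks are is cheap, and no data blocks move. The functions and block numbers below are invented for illustration and say nothing about the VSX’s actual on-disk format.

```python
from types import MappingProxyType


def snapshot(block_map):
    """Return a read-only copy of a virtual -> physical block map.

    Only the map is copied; the data blocks themselves are shared.
    """
    return MappingProxyType(dict(block_map))


def physical_blocks_in_use(*block_maps):
    """Count the distinct physical blocks referenced by any of the maps."""
    return len({pblock for m in block_maps for pblock in m.values()})


# Hypothetical virtual disk with three written blocks.
live = {0: 17, 1: 4, 2: 23}
second_copy = snapshot(live)
```

Here physical_blocks_in_use(live, second_copy) is still 3: two virtual disks, one set of physical blocks.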

Snapshots are also useful for backing up your data. Backups are hard on a running system because a backup walks through the entire file tree. When the data is in use, this is a challenge because you’re stepping through files that are appearing, growing, and being removed. Whole directories might appear. Needless to say, it’s a difficult undertaking.

So rather than backing up a file tree that keeps changing, just hold it steady and say "I just want to keep it as is until I can do my backup, and then I won’t need it anymore." You can do that with a snapshot.

Simply make a snapshot and it’ll be preserved while your original is changing. The VSX does this using a technique called copy-on-write. If you want to change a block on the original that the snapshot also refers to, we just allocate a new block and let that be the replacement in the original. The old block is left untouched and now belongs to the snapshot.
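Here is a small Python sketch of that behavior. It is a simplified model under my own assumptions (the CowVolume class, its in-memory store, and the integer snapshot handles are all hypothetical), not the VSX’s code: a write to a block that a snapshot still refers to lands in a freshly allocated block, so the snapshot keeps seeing the old contents.

```python
import itertools


class CowVolume:
    """Toy volume with snapshots (hypothetical names, illustration only)."""

    def __init__(self):
        self._next_block = itertools.count()  # stand-in for a block allocator
        self.store = {}       # physical block -> contents
        self.live = {}        # virtual block -> physical block (the original)
        self.snapshots = []   # frozen copies of the live map

    def take_snapshot(self):
        """Freeze the current map; returns a handle for later reads."""
        self.snapshots.append(dict(self.live))
        return len(self.snapshots) - 1

    def _held_by_snapshot(self, pblock):
        return any(pblock in snap.values() for snap in self.snapshots)

    def write(self, vblock, payload):
        pblock = self.live.get(vblock)
        if pblock is None or self._held_by_snapshot(pblock):
            # Never overwrite a block a snapshot refers to: allocate a new
            # one and point the live disk at it instead.
            pblock = next(self._next_block)
            self.live[vblock] = pblock
        self.store[pblock] = payload

    def read(self, vblock, snapshot=None):
        block_map = self.live if snapshot is None else self.snapshots[snapshot]
        return self.store[block_map[vblock]]
```

Writing b"old", taking a snapshot, and then writing b"new" to the same block leaves the live disk reading b"new" and the snapshot reading b"old", at the cost of one extra physical block.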

Because we can keep up with changes using snapshots, we can do another backup feature called shadow. Using two VSXs, one on-site and another off-site, we can shadow a snapshot to the remote VSX.

Shadowing gives us an off-site copy of our data while only sending the changes since the last backup.

While you are using the virtual disk, a timer goes off and the VSX takes a snapshot. Slowly over the rest of the day, it copies all the blocks that have changed since the previous snapshot.

At the remote site, there is another VSX receiving all these snapshot updates. When all the changes are documented (the latest snapshot has been completed), the local VSX sends the remote VSX the snapshot across an encrypted TCP/IP channel.
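The core of the idea, sending only what changed between two snapshots, can be sketched in a few lines of Python. This is a toy under stated assumptions: snapshots are modeled as virtual-to-physical block maps, a changed block is assumed to show up as a different physical block (as it would with copy-on-write), the remote VSX is a plain dictionary, and the network and encryption are left out entirely. The function names are mine, not Coraid’s.

```python
def changed_blocks(prev_snap, new_snap):
    """Virtual blocks that are new or point somewhere different in new_snap."""
    return {v: p for v, p in new_snap.items() if prev_snap.get(v) != p}


def shadow(prev_snap, new_snap, local_store, remote_disk):
    """Copy only the changed blocks to the remote copy.

    Returns the number of blocks sent. In a real system the assignment
    below would be a transfer over the wire. Blocks removed since
    prev_snap are not handled in this sketch.
    """
    delta = changed_blocks(prev_snap, new_snap)
    for vblock, pblock in delta.items():
        remote_disk[vblock] = local_store[pblock]
    return len(delta)
```

With a previous snapshot of {0: 0, 1: 1} and a new one of {0: 0, 1: 2, 2: 3}, only virtual blocks 1 and 2 are sent; block 0 is already at the remote site and stays put.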

All of this goes on as you are using your data. There is no need to shut everything down in the middle of the night. Your block-level backup hums along quietly in the background.

The VSX has even more tricks, but I think I’ll save them for another week.
