Tera Incognito

With apologies for a bad pun
I’ve been considering ways of putting together secure, accessible, backed-up storage for our growing video archive. We have an older Dell server, which is fast and adequate for many purposes, but limited in storage (I’m not sure, but perhaps 100 gigabytes on SCSI drives).

Given that we do have that fast server for things that we might wish to stream, it may make sense to explore network attached storage (effectively a drive with an IP address) for the larger issue.

At home, I’ve purchased a LinkStation from Buffalo Technology and have been pleased with it. The same company makes a larger drive called a TeraStation, which you can purchase in sizes up to 1.6 terabytes (4×400 gigabyte drives).

That’s too much to back up, but following the mantra of LOCKSS, where “lots” includes “two,” I’m considering doing the following:

1. Purchasing a TeraStation in either the 1.2 TB or 1.6 TB size.
2. Inducing Fred Morrison to do the same, putting his in East Hall so we’ll each have offsite backups of our stuff.
3. Setting up each of them so that they have backups of the files on the other site. That is, I’d have a backup of my files on Fred’s drive, and he’d have a backup of his on mine.
4. Setting up a RAID configuration on each. Here there are two choices (see the RAID link for an explanation of the possibilities):

a. Setting up each TeraStation under RAID level 5, which uses about 1/4 of the total storage to store a checksum, so that no data is lost even if one of the drives should fail. This would produce what would appear to be a 1.2 terabyte drive, which we could divide between storage and backup of the other site.

b. Setting up two RAID 1 partitions. RAID 1 mirrors everything onto two drives, so each TeraStation would appear to hold two 400 gigabyte drives, for 800 gigabytes in all. This might make it simpler to figure out what goes where, but compared with RAID 5 we would lose 400 gigabytes of storage per TeraStation. Again, either drive of each pair could fail without any loss of data.
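To make the arithmetic behind the two choices concrete, here is a rough sketch in Python. It assumes the 1.6 terabyte model (four 400 gigabyte drives) and ignores formatting overhead; the function name usable_gb is just something made up for illustration.

```python
def usable_gb(level, drives=4, drive_gb=400):
    """Approximate usable space on one unit, ignoring filesystem overhead."""
    if level == 5:
        # RAID 5 gives up one drive's worth of space to the checksum (parity).
        return (drives - 1) * drive_gb
    if level == 1:
        # RAID 1 mirrors drives in pairs, so half the raw space is usable.
        return (drives // 2) * drive_gb
    raise ValueError("only RAID levels 1 and 5 are handled in this sketch")


raid5 = usable_gb(5)  # one volume of 1200 GB
raid1 = usable_gb(1)  # 800 GB, seen as two 400 GB mirrored pairs
print(f"RAID 5: {raid5} GB, RAID 1: {raid1} GB, difference: {raid5 - raid1} GB")
```

On the 1.2 TB model (four 300 gigabyte drives, if I have that right) the same function can be called with drive_gb=300.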

I think writing under RAID level 5 may be slower, because it has to do the parity calculations, but I’m not sure of this. Reading should be, if anything, faster because the data are dispersed across different drives.
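For the curious, the parity calculation is essentially an XOR across the drives. The toy example below is only an illustration of the idea, not how the TeraStation itself is implemented: it treats three short byte strings as the data on three drives, computes a parity block for a fourth, and then rebuilds one “failed” drive from the survivors. The names xor_blocks, data, and parity are invented for the example.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR together equal-length byte blocks, one byte position at a time."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Pretend each byte string is the same stripe on a different drive.
data = [b"video-01", b"video-02", b"video-03"]

# This extra computation on every write is why RAID 5 writes may be slower.
parity = xor_blocks(data)

# Suppose the second drive fails: XOR the survivors with the parity block.
recovered = xor_blocks([data[0], data[2], parity])
print(recovered == data[1])
```

Real RAID 5 rotates the parity block among the drives rather than keeping it on one, but the recovery idea is the same.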

The drives cost a bit under $1 a gigabyte, so this isn’t cheap, but it might be the best solution.

Down the road, we could potentially replace the drives with larger ones, so this might be a long-term solution.

2 Responses to “Tera Incognito”

  1. Chris C. Says:

    I think RAID 5 is slower to write because it’s necessary to calculate and write the checksum to disk. I’ve read that reading is also not particularly fast.

    I think we should figure out exactly how much disk space we’ll need for the next year or two. If we don’t think we will need more than 600 GB now and in the near future, then the simplicity and RW speed of RAID 1 seems appealing.

    If we actually need anywhere near a terabyte of disk space, then the more efficient RAID 5 setup makes sense.

  2. Chris C. Says:

    RAID 5 is fast at reading for the reasons you mentioned. I guess it’s not as fast as some other RAID configurations that are not as common (such as a level 4 array), so that must have been why I thought it was “not particularly fast.”

    The more I think about it, the less important write speed seems for our purposes.
