Okay, maybe the title should be "Disks are not getting faster as a linear function of capacity growth." That just doesn't roll off the tongue, nor is it very catchy!
Unless you live under a rock, you're aware that Seagate and Hitachi are now shipping 1TB SATA hard disks in quantity. You're probably also aware that these disks feature SATA-II 3.0Gb/s disk interfaces. What you probably don't know is that these classes of hard disks use the same disk technologies as the last two generations of Seagate drives, and the last three generations of drives from HGST.
Hard disks rely on heads to read and write blocks on the physical disk platters, then buffer the data in small caches, typically 2MB to 16MB. This cache then puts blocks on the SATA bus to travel to the host - we call data transferred at this step "buffer-to-host". In 2005, the Barracuda line averaged 65MB/s buffer-to-host; in 2007, it averages 72MB/s. In 2005, the largest Barracuda was 250GB; now it's 1TB. What does that mean?
Under the best circumstances, it now takes longer to empty or fill a 1TB disk than a 500GB or 250GB drive. For enterprises moving to disk-based backups, that implies the aggregate throughput of those backups will not grow in step with the capacity of the (larger) disks they deploy. This information is not new - it's the same reason that high-performance, transaction-based environments use lots and lots of small, fast spindles. But in transaction environments, your goal is to achieve more IOPS - executing more small transactions - not more bandwidth, which means transferring streams of larger files and blocks.
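To put numbers on that, here's a quick back-of-the-envelope calculation - a minimal Python sketch using the buffer-to-host figures above. It assumes sustained sequential transfer at the average rate; real workloads will only fare worse.

```python
# Rough fill/empty times for a whole drive at its average buffer-to-host rate.
# Assumes sustained sequential transfer; seeks, filesystem overhead and bus
# contention will make the real numbers worse.

def fill_time_hours(capacity_gb, rate_mb_per_s):
    """Hours to stream an entire drive's capacity at a sustained rate."""
    return (capacity_gb * 1000.0) / rate_mb_per_s / 3600.0

drives = [
    ("2005 Barracuda, 250GB @ 65MB/s", 250, 65),
    ("2007 Barracuda, 1TB @ 72MB/s", 1000, 72),
]

for name, capacity_gb, rate in drives:
    print("%s: %.1f hours" % (name, fill_time_hours(capacity_gb, rate)))

# Output:
# 2005 Barracuda, 250GB @ 65MB/s: 1.1 hours
# 2007 Barracuda, 1TB @ 72MB/s: 3.9 hours
```

Four times the capacity, but only about a 10% bump in transfer rate - so the time to stream the whole drive nearly quadruples.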
With 6Gb/s interface specifications coming in the next year, we're on the cusp of a significant "loss" in performance - that is, a tremendous increase in capacity without an equivalent increase in the transfer rate to and from the disk itself. Users expecting their high-capacity storage environments to scale will be sorely disappointed.
Monday, August 20, 2007
Thursday, August 9, 2007
The folly of "Do It Yourself"
About a year ago, a friend of mine called me in a panic about the storage devices at his company. A power brownout at the office hung both a recycled Compaq server running an open source storage platform called OpenFiler and a Maxtor MaxAttach NAS appliance. It seems that their revenue-generating advertising graphics were on the Maxtor NAS appliance, while the rest of the workflow data was on the OpenFiler box.
No problem, I said - I can help put a new disk in the NAS, and your IT guys can run a restore of both devices. "Backups?!? We didn't back it up because it was a NAS. I thought that was what RAID and everything else was for!"
Ugh. After meeting with the IT staff, I helped them replace the disk in the Maxtor NAS, which promptly rebuilt and recovered. The OpenFiler server, however, suffered corrupt LVM information and we couldn't recover the root volume group in any way, costing them 2TB of unrecoverable data.
What went wrong? How did they get themselves into this predicament? There's nothing wrong with OpenFiler - it's a solid, stable platform. Nor was the Maxtor NAS the culprit - it was actually an award-winning product when it was introduced. The problem lay in the "Do It Yourself" approach the IT team had taken. They focused on creating point solutions to solve the problems at hand, without looking at long-term impacts or objectives. I found out that the "backup" service at the company was a series of NTbackup and rsync scripts that copied data from desktop PCs to the OpenFiler box, while similar scripts on the Macs copied advertising graphics onto the Maxtor NAS (something like the sketch below). No thought was given to real business continuity or data availability - these were both solutions implemented because someone asked for them. Open source software and integrated appliances make it easy for us to buy our way into and out of trouble, unless we take the time to chart a course and pay attention to the turns we make.
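For illustration, each of those scripts amounted to little more than a one-way copy to a single on-site target. Here is a hypothetical Python sketch of that pattern - not their actual script, and all paths and hostnames are made up:

```python
#!/usr/bin/env python
# Hypothetical sketch of an ad-hoc "backup" script of the sort described above.
# It is a one-way copy to a single on-site target: no versioning, no off-site
# copy, no monitoring. If the target box dies, the "backup" dies with it.
import subprocess

SOURCES = ["/Users/designer/Projects", "/Users/designer/Documents"]
TARGET = "openfiler.example.local::backups/designer/"  # made-up rsync target

for src in SOURCES:
    # Mirror the source to the NAS share; --delete means accidental
    # deletions on the desktop propagate to the "backup" as well.
    subprocess.call(["rsync", "-av", "--delete", src, TARGET])
```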
With lessons learned, the company implemented a brand new server to run OpenFiler, configuring one share for important, revenue-generating workflow data and another share for desktop file backups. All files on the revenue share were replicated to a managed hosting provider's content management system, with indexing and versioning, giving them an off-site copy of the data with a clear way to recover from a local disaster. The IT manager crafted a storage strategy to reduce the amount of unprotected data on desktops and laptops, with important business data residing on the (replicated) NAS.
The "Do It Yourself" days of rolling together disparate products, with no thought to overall strategy, have come to an end - our 24/7, non-stop Internet economy won't support it any more.
Wednesday, August 1, 2007
Storage as an asset
Storage
That word means a lot of things to a lot of people. I'll try and talk about what it means to me, in both a personal and professional context.
First and foremost, storage is a strategic asset. Just as electrical power and networking are utilities that enable computing, I see storage in a similar fashion. Storage is the computational element to safeguard data over time. I choose the word 'safeguard' specifically to indicate a level of 'stewardship' for data; to protect it from errors both human and mechanical. I have a bevy of tools at my disposal, subject to the boundaries of cost, performance and reliability.
Second, each storage asset can be described as having some level of cost, performance and reliability, relative to another asset. For example, a laptop hard disk has some level of cost, which may be higher or lower than the cost of a desktop hard disk or an enterprise hard disk in a storage array. Similarly, it may be more or less reliable than the aforementioned types of disk. In my discussions about storage, I tell everyone:
Cheap, Fast or Reliable - pick any two, because you can't get all three.
These three points are only relative between storage solutions of varying configurations and technologies. One set of technologies will always be cheaper, faster or more reliable than another, for a given configuration. We choose two of three points for optimization and see how things go from there.
Finally, storage technologies, to me, are orthogonal to server/platform/OS technologies (for the most part). Many of my colleagues add storage in chunks of servers; if you need 2TB for a project, you buy another server with 2TB of disk on it. I find that highly inefficient, much the same way you'd add more servers with 100Mb interfaces to fill a 1Gb network pipe. As much as possible, storage should be a separate, strategic element that scales independently from computing power. When implemented properly, a storage architecture enhances your computing infrastructure, giving you a dynamic environment that delivers the performance and capacity you need, at a compelling price point. Sometimes, that compelling price point or performance/capacity need will fit inside the server quite easily. I accept that, in small, singular instances. But twenty singular instances later, you'll wish you had that SAN, instead of riding herd on 19 unnecessary boxes....
Welcome!
Hello and welcome!
My job title at the University of Michigan's College of Engineering is Storage Administrator. My responsibilities include managing and maintaining our storage infrastructure, and assisting other units on campus with assessing and developing their own storage infrastructures. I've created this blog to track my thoughts and opinions on storage, server technologies and data center infrastructure. I'm sure this will be an evolving process for me, but hopefully I can capture the essence of things that I see.
Read on!
