On an otherwise slow news day, SearchStorage reported on Oracle's new (free) DirectNFS (DNFS) client for Oracle 11g. The DNFS client bypasses the operating system's NFS client, avoiding the manual testing/tuning/reconfiguration steps normally required. DNFS relies on information generated by the Oracle processes to determine parameters such as read/write buffer size, retransmit times and so forth. The current version includes rudimentary support for multipathing via global namespace (GNS), with expectations to enhance this capability in future releases. This client should be a boon for installations that use NAS gateways, such as NetApp filers, for providing Oracle storage, so it's no wonder the client was developed in conjunction with Network Appliance.
Coupled with NetApp's GNS, Oracle customers can now do high-performance replication and seamless failover across disparate NetApp clusters. A GNS that hosts filers on two different subnets (presumably on opposite ends of the country) can allow Oracle servers in one location to fail over to filers in a different location. Because the failover occurs at the DNFS layer, it stays synchronized with Oracle's management and instrumentation, giving DBAs better awareness of cluster state. Better still, the Oracle cluster can employ automated recovery techniques (such as automated log playback) to recover quickly from a failure - even to the point of knowing when to start caching transactions during a failover. We'll see if this news is enough to renew interest in both companies' products.
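For the curious, Direct NFS reads its configuration from an oranfstab file (placed in $ORACLE_HOME/dbs or /etc); if none exists, it falls back to the operating system's mount table. A minimal sketch of an entry might look like the following - the filer name, addresses and paths here are hypothetical, and the multiple path lines are what give DNFS its rudimentary multipathing:

    server: filer1
    path: 192.168.10.11
    path: 192.168.20.11
    export: /vol/oradata mount: /u02/oradata

DNFS load-balances across the listed paths and retries on a surviving path if one fails, with no changes to the OS mount configuration.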
Thursday, October 4, 2007
Wednesday, October 3, 2007
"Green" DataCenters
My co-worker recently returned from Storage Decisions in New York City. Many of the presentations focused on the concept of the "green" data center and how server and storage products are adapting to those initiatives. Many believe that retooling servers, storage and network devices to be more power- and cooling-friendly will benefit the bottom line through cost savings, reduced environmental impact and so forth.
In practice, however, few of these savings are realized, yet more companies jump on the bandwagon. With regard to storage, several vendors announced products at SD2007 to follow these initiatives. Some have implemented storage arrays that spin down drives when not in use. Others have implemented high-efficiency power supplies to reduce electrical loss and waste heat. A few have even gone so far as to offer tools that let administrators control power consumption directly, exposing the "power profile" configurations recently introduced in Seagate and Hitachi hard disks.
What does this mean for administrators? Nothing good, in my opinion. A hard disk can only report that it's failing while power is applied to the drive. Once a disk spins down, there is no guaranteed way to a) ensure the drive will spin back up or b) learn ahead of time that it won't. If the Google disk failure study indicates anything, it's that silent failures are both possible and probable. SMART couldn't predict every failure even on spinning, active drives - spinning a drive down, which suspends SMART reporting, certainly won't improve prediction rates. Administrators will face more disk failures and higher replacement costs, eating into any cost savings realized.
What does this mean for end users? Nothing good, either. Users will see even more delays as arrays spin drives back up in response to data requests. Delays accessing web images on shopping sites send customers looking elsewhere.
What does this mean for manufacturers? More money. Selling a license to control the green features of an array is a sure-fire cash cow. Since most of the spin-up/spin-down infrastructure already exists in the SCSI command set (START STOP UNIT and friends), it was relatively easy to carry this functionality into the SATA and SAS command sets. Power supplies with only 10% or 20% efficiency improvements over standard units already command 50% premiums on the open market; vendors can inflate that to more than double and realize considerable profit.
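The power-supply arithmetic is easy enough to check for yourself. Here's a back-of-envelope sketch in Python; every figure in it (array draw, efficiencies, electricity rate, supply prices) is a hypothetical assumption, plugged in only to show how the payback period falls out of the math:

    # All figures below are hypothetical assumptions, for illustration only.
    array_draw_watts = 1500.0            # assumed steady draw of a midsize array
    base_efficiency = 0.75               # assumed standard power supply
    green_efficiency = 0.82              # assumed "high-efficiency" supply (~10% better)
    electricity_rate = 0.10              # assumed $ per kWh
    base_psu_cost = 400.0                # assumed street price
    green_psu_cost = base_psu_cost * 2   # the "more than double" premium

    # Wall-plug power needed to deliver the same load at each efficiency
    base_wall = array_draw_watts / base_efficiency
    green_wall = array_draw_watts / green_efficiency
    saved_kwh_per_year = (base_wall - green_wall) * 24 * 365 / 1000.0
    saved_dollars_per_year = saved_kwh_per_year * electricity_rate

    payback_years = (green_psu_cost - base_psu_cost) / saved_dollars_per_year
    print("Annual savings: $%.2f" % saved_dollars_per_year)
    print("Payback period: %.1f years" % payback_years)

Swap in your own numbers; the point is that the answer is driven entirely by the premium you pay and your local power costs, not by the marketing slide.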
In my estimation, green data centers don't offer anything today. Another generation of product advances (particularly in power supplies and heat dissipation) is necessary before green initiatives deliver worthwhile cost savings. Sure, it might make us all feel warm and fuzzy inside, but I can get the same benefit from a nice wool sweater.
Thursday, September 27, 2007
SWC no more?
Keeping with my trend of putting storage companies out of business by buying their products, I've now graduated to storage conferences! Storage World Conference, last held in September 2006, is now a memory. The website is no longer updated and plans for SWC 2008 are in doubt. Previous SWC gatherings were held in conjunction with the Association of Storage Networking Professionals (ASNP), which merged into the Storage Networking User's Group (SNUG) last year, bolstering SNUG's ranks.
Is this a preview of things to come? Has the storage industry begun to shrink? What's it all mean?
First and foremost, the loss of SWC reduces the list of storage-specific conferences to just two - Storage Networking World (SNW), hosted by Computerworld, and Storage Decisions (SD), hosted by TechTarget. Both media groups seem relatively stable, each with a vast array of IT publications in its portfolio. Both SNW and SD run a full calendar of events on both coasts, alternating between fall and spring. With fewer events each year, attracting the sponsor dollars needed to sustain those schedules should get easier.
Moreover, this could be a sign that the industry is self-correcting. Do we really need three storage conferences each season? Maybe. Maybe not.
Monday, September 10, 2007
Deduplication
Many storage vendors have jumped on the deduplication bandwagon, promising users incredible improvements in storage space, data recovery and so on. In many environments, dedupe can both shrink backup times and extend the amount of data retained, by copying only newly created blocks in a backup session. By extending deduplication within and across backup sessions (i.e. saving one copy of word.exe across servers and one copy of jre.dll within each backup session), backups require far less time and fewer resources than ever before.
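Under the hood, most dedupe engines boil down to content fingerprinting: carve the data into chunks, hash each chunk, and store only the chunks you haven't seen before. The toy Python sketch below (fixed-size chunks and an in-memory index - nothing like a production implementation) shows the basic mechanics:

    import hashlib

    CHUNK_SIZE = 64 * 1024  # fixed-size chunks; real products often use variable-size chunking

    def dedupe_store(stream, store, index):
        """Write only previously unseen chunks to 'store'; return the recipe of hashes."""
        recipe = []
        while True:
            chunk = stream.read(CHUNK_SIZE)
            if not chunk:
                break
            digest = hashlib.sha1(chunk).hexdigest()
            if digest not in index:
                index.add(digest)        # first time we've seen this data
                store[digest] = chunk    # store it exactly once
            recipe.append(digest)        # duplicates just add another reference
        return recipe

    # Two backup sessions of mostly identical data end up sharing most chunks:
    # index, store = set(), {}
    # monday = dedupe_store(open("backup_mon.img", "rb"), store, index)
    # tuesday = dedupe_store(open("backup_tue.img", "rb"), store, index)

Everything else - where the hashing runs, whether it happens in-line or after the fact, how the index is protected - is what separates one vendor's product from another's.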
The promise of deduplication and the reality, however, often differ greatly. Dedupe can be performed on the host or at the storage target, either in-line during the backup or post-backup, and different combinations are needed to meet the requirements of different applications. Dedupe performed post-backup may work well for small, flat-file backups, especially across multiple servers. Large databases might need in-line dedupe (which has a tremendous impact on the backup process) to achieve any cost savings, or even to return a viable backup at all. Vendors offer an a la carte menu of products and features, with an equally confusing array of licenses and software add-ons to make it all work.
So what do we, the customers, actually need from all of this? Basically, we're all after shorter backups with greater granularity in smaller amounts of storage. Add in faster recovery times and user self-service restores and you've got the recipe for an impressive backup solution. Now all we need is for a vendor to offer this as a single product - not the morass of loosely coupled software products offered today.
8Gbps FC.... why?
Early this summer, the Fibre Channel Industry Association approved signaling standards for an 8Gbps version of the Fibre Channel protocol. 8Gbps FC switch and HBA products are expected to arrive in early 2008 as test samples, with volume shipments by mid-year. What makes this interesting is that, for the first time, vendors are delivering their wares before the corresponding disk products have even been announced. As of now, Seagate and Hitachi have not said when - or if - they will ship 8Gbps disks. Why does that matter?
For the first time, we might see infrastructures that are effectively faster than the backing store. In the current generation we can claim "end-to-end throughput" - 4Gbps disks attached to 4Gbps arrays on 4Gbps switches and so on. That may no longer be true. Worse yet, there's no evidence to support the belief that the current generation of server hardware can effectively use an 8Gbps "pipe". The same argument was raised with 10Gbps network cards and iSCSI interfaces, where card vendors concede that protocol overhead keeps a server from reaching full line rate. As 10Gbps networking prices fall, it looks like iSCSI may finally offer both lower cost and higher performance than FC.
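Some rough numbers put the "pipe" in perspective. FC at 8Gbps and below uses 8b/10b encoding, so the usable payload works out to roughly 400MB/s per direction for 4Gb FC and roughly 800MB/s for 8Gb FC, with 10Gb Ethernet in the neighborhood of 1GB/s. The quick Python sketch below (assuming a ~70MB/s sustained rate for a single current drive) estimates how many flat-out drive streams it takes to fill each link:

    # Approximate usable rates per direction, in MB/s
    links = {"4Gb FC": 400, "8Gb FC": 800, "10Gb Ethernet": 1000}
    drive_stream = 70.0   # assumed sustained MB/s from one current SATA drive

    for name in sorted(links):
        print("%-14s needs ~%.0f streaming drives to saturate it" % (name, links[name] / drive_stream))

An array full of spindles can deliver that; a single server pushing real-world I/O rarely will - which is exactly the concern.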
InfiniBand, too, is making inroads across the board. Once relegated to HPC deployments, it is quickly becoming a common storage interconnect. Scalable storage clusters from DataDirect and Isilon use IB for storage-to-storage and storage-to-host interconnects. If pricing improves, IB will beat FC on both cost and performance.
8Gbps FC comes to market as virtualization is poised for a meteoric rise; some expect it to become the dominant "operating system" in the data center. Does 8Gbps FC enable virtualization? No more than 2Gbps or 4Gbps did. The claim that 8Gbps reduces the number of ISLs between switches may hold true, but it's too early to tell how much benefit that really delivers.
So the big question I'm left with is: why do we need 8Gbps FC? Aside from the ISL argument, there are no 8Gbps drives available (or currently scheduled), and both InfiniBand and 10Gbps Ethernet offer more bandwidth. Maybe it's finally time for FC to go the way of SCSI and IDE.
Monday, August 20, 2007
Disks are getting slower - it's true!
Okay, maybe the title should be "Disks are not getting faster as a linear function of capacity growth". That just doesn't roll off the tongue, nor is it very catchy!
Unless you live under a rock, you're aware that Seagate and Hitachi are now shipping 1TB SATA hard disks in quantity. You're probably also aware that these disks feature SATA-II 3.0Gb/s disk interfaces. What you probably don't know is that these classes of hard disks use the same disk technologies as the last two generations of Seagate drives, and the last three generations of drives from HGST.
Hard disks rely on heads to read and write blocks on the physical platters, buffering the data in small caches, typically 2MB to 16MB. The cache then puts blocks on the SATA bus headed for the host - data transferred at this step is called "buffer-to-host". In 2005, the Barracuda line averaged 65MB/s buffer-to-host; in 2007, it's 72MB/s. In 2005, the largest Barracuda was 250GB; now it's 1TB. What does that mean?
Under the best circumstances, it now takes far longer to fill or empty a 1TB disk than a 250GB or 500GB drive. For enterprises moving to disk-based backups, that means the aggregate throughput of those backups will not scale with the number of (higher-capacity) disks. This isn't new information - it's the same reason high-performance, transaction-oriented environments use lots and lots of small, fast spindles. But in transaction environments the goal is more IOPS - executing more small transactions - not more bandwidth, i.e. streaming larger files and blocks.
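The arithmetic is simple enough to run yourself. Using the buffer-to-host figures above (and treating GB and MB loosely), a quick Python sketch:

    def hours_to_fill(capacity_gb, rate_mb_per_s):
        # capacity in GB, sustained transfer rate in MB/s
        return capacity_gb * 1000.0 / rate_mb_per_s / 3600.0

    print("250GB @ 65MB/s: %.1f hours" % hours_to_fill(250, 65))    # ~1.1 hours
    print("1TB   @ 72MB/s: %.1f hours" % hours_to_fill(1000, 72))   # ~3.9 hours

Capacity quadrupled; the transfer rate grew about 11%. That gap is the whole story.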
With 6Gb/s interface specifications coming in the next year, we're on the cusp of a significant "loss" in performance - that is, a tremendous increase in capacity without an equivalent increase in transfer rate to and from the disk. Users expecting their high-capacity storage environments to scale will be sorely disappointed.
Thursday, August 9, 2007
The folly of "Do It Yourself"
About a year ago, a friend of mine called me in a panic about the storage devices at his company. A power brownout at the office hung both a recycled Compaq server running the open source storage platform OpenFiler and a Maxtor MaxAttach NAS appliance. It turned out that their revenue-generating advertising graphics were on the Maxtor NAS appliance, while the rest of the workflow data was on the OpenFiler box.
No problem, I said - I can help put a new disk in the NAS, and your IT guys can run a restore on both devices. "Backups?!? We didn't back it up because it was a NAS. I thought that was what RAID and everything else was for!"
Ugh. After meeting with the IT staff, I helped them replace the disk in the Maxtor NAS, which promptly rebuilt and recovered. The OpenFiler server, however, suffered corrupted LVM metadata, and we couldn't recover the root volume group by any means, costing them 2TB of unrecoverable data.
What went wrong? How did they get themselves into this predicament? There's nothing wrong with OpenFiler - it's a solid, stable platform. Nor was the Maxtor NAS the culprit - it was an award-winning product when it was introduced. The problem lay in the IT team's "Do It Yourself" approach. They focused on point solutions to the problems at hand, without looking at long-term impacts or objectives. It turned out that the "backup" service at the company was a series of NTBackup and rsync scripts that copied data from desktop PCs to the OpenFiler box, while similar scripts on the Macs copied advertising graphics onto the Maxtor NAS. No thought was given to real business continuity or data availability - both solutions were implemented simply because someone asked for them. Open source software and integrated appliances make it easy to buy our way into and out of trouble unless we take the time to chart a course and pay attention to the turns we make.
With lessons learned, the company deployed a brand-new server running OpenFiler, configuring one share for important, revenue-generating workflow data and another for desktop file backups. All files on the revenue share are replicated to a managed hosting provider's content management system, with indexing and versioning, giving them an off-site copy of the data and a clear way to recover from a local disaster. The IT manager crafted a storage strategy to reduce the amount of unprotected data on desktops and laptops, with important business data residing on the (replicated) NAS.
The "Do It Yourself" days of rolling together disparate products, with no thought to overall strategy, have come to an end - our 24/7, non-stop Internet economy won't support it any more.
Wednesday, August 1, 2007
Storage as an asset
Storage
That word means a lot of things to a lot of people. I'll try and talk about what it means to me, in both a personal and professional context.
First and foremost, storage is a strategic asset. Just as electrical power and networking are utilities that enable computing, I see storage in a similar light: it is the computational element that safeguards data over time. I choose the word 'safeguard' deliberately to indicate a level of 'stewardship' over data - protecting it from errors both human and mechanical. I have a bevy of tools at my disposal, subject to the boundaries of cost, performance and reliability.
Second, each storage asset can be described by its cost, performance and reliability relative to other assets. For example, a laptop hard disk has some cost, which may be higher or lower than the cost of a desktop hard disk or an enterprise hard disk in a storage array. Similarly, it may be more or less reliable than either of those. In my discussions about storage, I tell everyone:
Cheap, Fast or Reliable - pick any two, because you can't get all three.
These three points are only relative between storage solutions of varying configurations and technologies. One set of technologies will always be cheaper, faster or more reliable than another, for a given configuration. We choose two of three points for optimization and see how things go from there.
Finally, storage technologies are, to me, orthogonal to server/platform/OS technologies (for the most part). Many of my colleagues add storage in chunks of servers: if you need 2TB for a project, you buy another server with 2TB of disk in it. I find that highly inefficient, much the same way you'd find adding more servers with 100Mb interfaces to fill a 1Gb network pipe. As much as possible, storage should be a separate, strategic element that scales independently from computing power. When implemented properly, a storage architecture enhances your computing infrastructure, giving you a dynamic environment that delivers the performance and capacity you need at a compelling price point. Sometimes that compelling price point or performance/capacity need will fit inside a server quite easily. I accept that, in small, singular instances. But twenty singular instances later, you'll wish you had that SAN, instead of riding herd on 19 unnecessary boxes....
Welcome!
Hello and welcome!
My job title at the University of Michigan's College of Engineering is Storage Administrator. My responsibilities include managing and maintaining our storage infrastructure, and assisting other units on campus with assessing and developing their own storage infrastructures. I've created this blog to track my thoughts and opinions on storage, server technologies and data center infrastructure. I'm sure this will be an evolving process for me, but hopefully I can capture the essence of things that I see.
Read on!