There is a growing "performance gap" in computer storage, caused by basic physics and exacerbated by server virtualization. People who run data centers see the problem, and so do their customers. It gets worse every year.
Most storage vendors have no real solution to this problem. Xiotech, with its ISE product, is a notable exception. For this reason (among several others), Xiotech is one of my favorite companies.
The Performance Gap
In a nutshell, the "performance gap" is that as disk-based storage gets less expensive, the rate at which any one customer or user can get at their data gets worse and worse. Capacity goes up and performance goes down, resulting in a performance problem. Seems strange, doesn't it? Doesn't everything about computers get faster, smaller and cheaper? What's this about something getting worse all the time? Well, it's true.
As everyone knows, computers get faster and less expensive. As they get faster, they read and write data more quickly. Storage that is built out of the same electronic "stuff" as the computer (RAM, DRAM, the "main memory" of the computer) gets faster and less expensive at pretty much the same rate as the processors. No problem there. But what about the disks (hard disk drives, HDD's)? Do they get faster and less expensive too?
Well, that's the problem. They do get less expensive -- that's part of why we have iPod's and digital cameras now. But they do not get much faster. Mostly what happens is they hold more data and get physically smaller.
A dozen years ago, you could have bought an HDD that held 10GB. Today, you can buy HDD's that are physically smaller and hold over 1TB. For less money!
Smaller, more capacity, less expensive. What's wrong with that? Imagine that you have 1TB of data. A dozen years ago, you would have stored it (ignoring details like overhead and extra space) on 100 HDD's, each holding 10GB. Today you would only need one HDD -- a 100:1 advantage! That's the good news.
The bad news is that a dozen years ago, you would have had 100 HDD's, each with a head and data channel to read and write your data, while today you would have a single read/write head to access the same amount of data. This is 100 times worse than it was a dozen years ago! It's like Yankee Stadium with the same number of fans, but with every entry gate locked except one; do you think there would be a line at that single gate?
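The arithmetic above can be put into a short Python sketch. The per-drive request rate used here is an assumption for illustration only (the article gives no figure); the point is simply that if each drive serves about the same number of requests per second in both eras, fewer drives means proportionally fewer "doors" to the same data.

```python
import math

# Assumption (not from the article): every drive, old or new, can serve
# roughly the same number of random requests per second.
ASSUMED_REQUESTS_PER_SEC_PER_DRIVE = 100


def drives_needed(data_gb, drive_capacity_gb):
    """Drives required to hold data_gb, ignoring overhead and spare space."""
    return math.ceil(data_gb / drive_capacity_gb)


def aggregate_requests_per_sec(data_gb, drive_capacity_gb,
                               per_drive=ASSUMED_REQUESTS_PER_SEC_PER_DRIVE):
    """Total request rate available across all drives holding the data."""
    return drives_needed(data_gb, drive_capacity_gb) * per_drive


DATA_GB = 1000  # 1TB, using the article's round numbers

then_rate = aggregate_requests_per_sec(DATA_GB, drive_capacity_gb=10)
now_rate = aggregate_requests_per_sec(DATA_GB, drive_capacity_gb=1000)

print(drives_needed(DATA_GB, 10), drives_needed(DATA_GB, 1000))  # 100 1
print(then_rate / now_rate)  # 100.0
```

Whatever per-drive figure you plug in, it cancels out of the ratio: the same 1TB has 100 times fewer access points.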
If banks treated ATM's the way drive vendors treat HDD's, here's what it would be like. The bankers figure they're going to put a certain amount of cash in ATM's for the citizens of, say, New York City. Each year, new ATM's become available that can magically store a lot more cash in less space at lower cost. Of course they go to town, replacing each couple of old ATM's with one double-capacity ATM. They're feeling good about themselves -- they've made the same amount of cash available to their customers while lowering their costs.
Suppose that ten years ago, there were 100 ATM's in NYC. With this incredible technical growth in ATM's, the bankers can make the same amount of cash available using just one ATM -- isn't that great?!
Of course, the problem is obvious. People don't care about how many ATM's it takes -- they care whether there's an ATM near them when they need cash, and whether there's a line of people waiting for access to that ATM. Imagine the impact of reducing the number of ATM's by a factor of 100. The same amount of cash is in the remaining ATM's as before, so the capacity is not reduced, just the number of access points.
That is the performance gap: the same amount of "stuff" is crammed into a tiny fraction of the number of "boxes," but the "doors" to the few remaining boxes are no larger.
Fundamentals: Physics vs. Electronics
Hard Disk Drives (HDD's) contain one or more little platters (kind of like small CD's) that spin in a sealed enclosure. As the platters pass by the read/write heads, data may be written or read. Here is a basic summary of HDD technology, with diagrams.
Every year, vendors manage to make the little platters even smaller, and to make the read/write heads able to handle "bits" that are smaller and smaller. More data gets crammed into less space.
The trouble is that the platter can't spin much faster than it already does. If the data you want is on the other side of the platter from the head, you still have to wait for the platter to spin around -- and that takes the same time as it did when there was 100 times less data on that platter.
In addition, platters (like CD's) put data everywhere, from the small inner part to the larger outer edges. With a single head to read or write, the head still has to be moved to the right place (this is called "seek time"), and that movement isn't much faster than it was years ago.
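To make the two mechanical delays concrete, here is a rough Python sketch. The spin speed and seek time passed in at the bottom are illustrative assumptions, not figures from this article. Notice that drive capacity does not appear anywhere in the calculation, which is exactly the problem.

```python
def avg_rotational_latency_ms(rpm):
    """Average wait for the data to spin under the head.

    On average the platter has to turn half a revolution.
    """
    ms_per_revolution = 60_000 / rpm
    return ms_per_revolution / 2


def random_requests_per_sec(rpm, avg_seek_ms):
    """Rough ceiling on random reads/writes per second for one head.

    Each request pays one seek plus the average rotational wait;
    transfer time is ignored in this simplified model.
    """
    service_ms = avg_seek_ms + avg_rotational_latency_ms(rpm)
    return 1000 / service_ms


# Assumed example values for illustration only.
ASSUMED_RPM = 7200
ASSUMED_SEEK_MS = 8.5

print(round(avg_rotational_latency_ms(ASSUMED_RPM), 2))
print(round(random_requests_per_sec(ASSUMED_RPM, ASSUMED_SEEK_MS), 1))
```

Under this model, the only ways to serve more requests are to spin faster or seek faster, and neither has improved much.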
Finally, when you've got the "ATM" problem or the "Yankee Stadium" problem, things get really bad -- the chances that your data is on the same platter as the data someone else wants at the same time have gotten 100 times worse.
As the electronics of shrinking bits gets better, the performance gap gets worse, because physics doesn't let us move things or spin things way faster, and the simple arithmetic of cramming more data through the same-size door just blocks us.
Server Virtualization
Server virtualization is a major trend in data centers. It basically enables data centers to reduce the number of servers they use by using virtual machines instead of physical machines. With the help of software, applications that used to require dedicated machines can share a smaller number of physical machines.
This is generally a good idea and saves money. Except for this little problem about storage. By putting multiple applications on each server, the number of read/write requests coming out of each server has just gotten larger. Which makes the storage performance even worse. Ughh.
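Here is a rough way to see why stacking applications hurts so much. The sketch below uses a textbook single-queue model (M/M/1), which is my simplification and not something measured on real systems; the request rates are made-up example numbers. It is the line at the ATM, in code.

```python
def avg_response_ms(apps_per_server, requests_per_app_per_sec,
                    drive_service_rate_per_sec):
    """Average time a request spends waiting plus being served.

    Simplified M/M/1 queue: response time = 1 / (service rate - arrival rate).
    Returns infinity once requests arrive faster than the drive can serve them.
    """
    arrival_rate = apps_per_server * requests_per_app_per_sec
    if arrival_rate >= drive_service_rate_per_sec:
        return float("inf")
    return 1000 / (drive_service_rate_per_sec - arrival_rate)


# Assumed example values: each application issues 10 requests per second,
# and the storage behind the server can handle 80 requests per second.
for apps in (1, 4, 7, 8):
    print(apps, avg_response_ms(apps, 10, 80))
```

In this model the response time does not grow in a straight line as you add applications. It climbs slowly at first and then shoots up as the storage approaches saturation.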
Conclusion
Advances in storage technology are paradoxically making storage performance worse. Server virtualization, while generally a good thing, exacerbates the problem. How do you solve the performance gap? There are several major ways.
- Buy enough HDD's to give you the performance you need. This may mean that you are way over-capacity, but who cares? Your users aren't trying to wring your neck because they can't get their data. And the extra money all that costs? Well, that's life.
- Pay 10 to 20 times more per GB and buy solid-state storage (SSD's) -- and change your applications to take advantage of it. Problem solved! The extra money and trouble? Well, that's life.
- Buy ISE's from Xiotech. Buy only the capacity you need. Your users will love you and you won't have to touch your applications to make it work. The extra money and trouble? There is no extra money or trouble. That's life -- the good life, that is.
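The first option above (buy drives for performance rather than capacity) can be sketched as a simple sizing calculation. The workload and per-drive numbers below are hypothetical, chosen only to show how far apart the two answers can be.

```python
import math


def drives_to_buy(data_gb, required_requests_per_sec,
                  drive_capacity_gb, drive_requests_per_sec):
    """Return (drives to buy, drives needed for capacity, drives needed for speed).

    You have to buy whichever number is larger.
    """
    for_capacity = math.ceil(data_gb / drive_capacity_gb)
    for_performance = math.ceil(required_requests_per_sec / drive_requests_per_sec)
    return max(for_capacity, for_performance), for_capacity, for_performance


# Hypothetical workload: 1TB of data that must sustain 2,000 requests/second,
# on 1TB drives assumed to handle about 100 requests/second each.
total, for_capacity, for_performance = drives_to_buy(1000, 2000, 1000, 100)

print(total, for_capacity, for_performance)  # 20 1 20
print(total // for_capacity)  # 20, i.e. 20 times more capacity than the data needs
```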
Am I proud to be associated with Xiotech? You betcha.