Spread it and forget

I'm really quite keen to find out what some of you guys out there are doing, seeing and even recommending.  As for me, I'm starting to see more and more storage setups where the design, if you can call it that, is just to spread your data and workload over as many array resources as possible and hope for the best.

First let me explain what I mean.  Let's assume you have a new storage array that you are going to run some Oracle databases on.  Instead of planning in some workload isolation, where some of your array resources (such as host ports, RAID controllers and even disks) are set aside for different workload types or performance requirements, I'm seeing more and more people opting to simply spread all of their data over as many array resources as possible and assume they will never have to think about it again.

Of course the principle of load balancing across as many array resources as possible is a solid foundation for good performance.  However, it's not the be-all and end-all of a good storage design.  It certainly doesn’t mean that you can remove your brain when managing your array, and that’s what I’m seeing a lot of people doing.

Let me give some examples of what I mean –

Probably one of the most common examples of storage-related load balancing these days is using a host-based volume manager to stripe a logical volume over multiple physical volumes, usually the more the merrier!  So we might stripe our database files over 4 disk groups, which in turn sit behind 4 RAID controllers – good design so far!  However, our storage array only has 4 RAID controllers, so when we also stripe our log files over the same 4 disk groups behind the same 4 RAID controllers, and then later on throw a few filesystems on for good measure, every workload ends up competing for the same disks and controllers – still a good design???
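To make that concrete, here is a minimal sketch of how a simple striped layout maps I/O onto the 4 disk groups from the example.  The 64 KB stripe unit, the offsets and the function names are my own assumptions for illustration, not how any particular volume manager does it.

```python
from collections import Counter

STRIPE_UNIT_KB = 64   # hypothetical stripe unit size
DISK_GROUPS = 4       # one disk group per RAID controller, as in the example

def controllers_touched(start_kb, length_kb):
    """Count how many stripe units of one contiguous I/O land on each controller."""
    hits = Counter()
    first_unit = start_kb // STRIPE_UNIT_KB
    last_unit = (start_kb + length_kb - 1) // STRIPE_UNIT_KB
    for unit in range(first_unit, last_unit + 1):
        hits[unit % DISK_GROUPS] += 1   # plain round-robin striping
    return hits

# A 1 MB sequential scan of a data file...
datafile_scan = controllers_touched(0, 1024)
# ...and a 512 KB sequential log write on a volume striped the same way.
log_write = controllers_touched(4096, 512)

# Controllers that both workloads hit at the same time.
shared = sorted(set(datafile_scan) & set(log_write))
print(shared)   # [0, 1, 2, 3]
```

Every controller ends up serving both the data file scan and the log write, so striping has balanced the load but has not isolated anything.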

We know that disks are not at their happiest when you mix random workloads in with sequential workloads, as this tends to cancel out the disk's natural ability to efficiently service nice long sequential I/O and use its buffers optimally.  And although tagged command queueing allows a disk to reorder its queued work for better head movement, the mix still pushes the disk into situations where it is not operating under ideal circumstances.

Most RAID controllers will also optimise resource usage depending on the type of workload, such as preloading cache with read-ahead data if they are working on sequential I/O.  So constantly asking them to switch between sequential and random workloads doesn’t get the best out of them either.  Once you mix in enough random with your sequential it all starts to look random – and disks don’t like random.
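A rough way to see the "it all starts to look random" effect is to measure how much of a request stream still looks sequential once random requests are mixed in.  The sketch below uses a deliberately naive detector that only compares each request with the one before it; real controllers are smarter and often track several streams, so treat the function names and numbers as assumptions for illustration only.

```python
import random

def sequential_fraction(lbas, io_size=8):
    """Fraction of requests that start exactly where the previous one ended."""
    hits = sum(1 for prev, cur in zip(lbas, lbas[1:]) if cur == prev + io_size)
    return hits / (len(lbas) - 1)

def mixed_stream(n, random_every, io_size=8, disk_blocks=10_000_000, seed=1):
    """One sequential stream, with a random request injected after every
    `random_every` sequential ones (0 means no random requests at all)."""
    rng = random.Random(seed)
    out, next_seq = [], 0
    for i in range(n):
        if random_every and i % (random_every + 1) == random_every:
            # Random requests go to a separate region so they never
            # accidentally line up with the sequential stream.
            out.append(disk_blocks + rng.randrange(0, disk_blocks, io_size))
        else:
            out.append(next_seq)
            next_seq += io_size
    return out

for every in (0, 3, 1):
    print(every, round(sequential_fraction(mixed_stream(1000, every)), 2))
```

With no random requests the stream is fully sequential, one random request in every four roughly halves what the detector sees as sequential, and alternating random with sequential leaves nothing for it to recognise.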

Similar things could be said about load balancing I/O across multiple HBAs in a server.  It might seem like a no-brainer to implement round robin load balancing, or some of the more flashy variations of round robin.  However, many modern arrays run effective learning algorithms that can detect different workload types, and these only work if the I/O patterns aren’t cannibalised before arriving at the array.
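Here is a small sketch of what round robin can do to a perfectly sequential stream as seen from one path.  It assumes the array looks for sequential patterns per path, which varies between arrays, and the function names and the "I/Os per path" knob are hypothetical stand-ins for whatever your multipathing software offers.

```python
def split_round_robin(lbas, paths=2, ios_per_path=1):
    """Deal requests out to paths, switching path after `ios_per_path` requests."""
    per_path = [[] for _ in range(paths)]
    for i, lba in enumerate(lbas):
        per_path[(i // ios_per_path) % paths].append(lba)
    return per_path

def contiguous_fraction(lbas, io_size=8):
    """Fraction of requests on one path that directly follow the previous one."""
    pairs = list(zip(lbas, lbas[1:]))
    return sum(cur == prev + io_size for prev, cur in pairs) / len(pairs)

# 64 back-to-back requests of 8 blocks each: a purely sequential stream.
stream = [i * 8 for i in range(64)]

strict = split_round_robin(stream, paths=2, ios_per_path=1)
batched = split_round_robin(stream, paths=2, ios_per_path=4)

print(contiguous_fraction(strict[0]))             # 0.0
print(round(contiguous_fraction(batched[0]), 2))  # 0.77
```

Switching path on every I/O leaves each path looking like a strided pattern with no contiguous requests at all, while sending a few I/Os down a path before switching keeps most of the sequential shape intact.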

I also see setups where people have been told by their vendor/supplier that their performance will be fine as long as they don’t attach more than 10 hosts per array port.  Then when I get brought in and look at performance stats I see that some ports are sitting doing almost nothing while others are relatively flat out.  A hosts-per-port rule says nothing about how much I/O each of those hosts actually drives.
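As a sketch of the kind of check I mean, the snippet below takes per-port figures and flags ports that carry far more or far less than an even share of the load.  The port names, host counts, IOPS numbers and the 1.5x tolerance are all made up for illustration.

```python
# Hypothetical per-port stats: every port obeys the "max 10 hosts" rule.
ports = {
    "port_0A": {"hosts": 10, "iops": 42000},
    "port_0B": {"hosts": 10, "iops": 1500},
    "port_1A": {"hosts": 8, "iops": 38000},
    "port_1B": {"hosts": 9, "iops": 2500},
}

def load_share(ports):
    """Each port's share of the total IOPS across all ports."""
    total = sum(p["iops"] for p in ports.values())
    return {name: p["iops"] / total for name, p in ports.items()}

def imbalanced(ports, tolerance=1.5):
    """Ports carrying more than `tolerance` times an even share, or less
    than an even share divided by `tolerance`."""
    even = 1 / len(ports)
    return sorted(
        name for name, share in load_share(ports).items()
        if share > even * tolerance or share < even / tolerance
    )

print(imbalanced(ports))
```

Here all four ports pass the hosts-per-port rule and all four get flagged, two for being nearly idle and two for carrying almost all of the work.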

One last one – there are the natural side effects of zoned data recording techniques used on fixed block architecture disks, where there are often up to twice as many sectors on the outer tracks of a disk as on the inner tracks.  Because the disk spins at a constant speed, data placed on the outer tracks can be read more quickly, especially in nice sequential accesses.  I don’t see many places where people even consider this ancient but excellent optimization trick.
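The arithmetic behind this is simple enough to sketch.  Assuming a disk that spins at constant speed and ignoring head and track switch times, the best-case sequential rate of a track is just its sector count times the sector size times revolutions per second.  The sector counts and RPM below are invented round numbers, not figures for any real drive.

```python
def track_rate_mb_s(sectors_per_track, rpm, sector_bytes=512):
    """Best-case sequential transfer rate of a single track, in MB/s."""
    revs_per_second = rpm / 60
    return sectors_per_track * sector_bytes * revs_per_second / 1e6

RPM = 10_000            # hypothetical spindle speed
INNER_SECTORS = 500     # hypothetical sectors per track, innermost zone
OUTER_SECTORS = 1000    # hypothetical sectors per track, outermost zone

inner = track_rate_mb_s(INNER_SECTORS, RPM)
outer = track_rate_mb_s(OUTER_SECTORS, RPM)

print(round(inner, 1), round(outer, 1), round(outer / inner, 1))
```

Twice the sectors per track at the same rotational speed means up to twice the sequential transfer rate, which is why it can pay to put your most throughput-hungry data on the outer zones.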

Good storage design appears to be a dying art!

I’m not slamming this “spread it as wide as possible” approach.  It’s very popular these days, and certainly has its merits.  I for one am quite a big fan of having my LUNs touch as many disks as possible.  I also see that this approach makes it much simpler to manage storage, especially in dynamic environments.  But how can you predict performance in this type of environment?  Surely many of the factors mentioned above must also be considered – I would never just take an array with, for argument's sake, 100 disks installed, spread all of my LUNs across all 100 disks, and assume that because I’m spreading over all disks I’m getting the best out of my array and don’t have to think about anything else.

I must admit that in many of these environments I see, the storage staff are often Unix guys who know a bit about the array's tools and how to present a LUN but lack the important knowledge about the way their arrays work under the hood.

If you've made it this far, thanks and well done.  As always thoughts are welcome – even ones slating my opinions.


PS.  I know that truly sequential workloads are the stuff of fairy tales for most people.  And heck, why should I care anyway – I, and I'm sure "we", get paid to go in and fix bad designs ;-)