Dynamic Provisioning: Notes from the field

I've recently returned from a well-deserved 3-month break from work, and while getting back into the swing of things I decided to check out a couple of storage forums (http://forums.hds.com and http://forums.itrc.hp.com) to see what people were doing and talking about.

On both forums there appeared to be quite a few people looking for advice and further information on the subject of Thin Provisioning as implemented by Hitachi - HDS USP-V and HP XP 24000.  It seems that despite the technology being well past its first birthday, it's still seen as a bit of an unknown, or a dark art.

So I thought I'd write a few posts on here in an attempt to dump my brain and experience on the subject, and maybe get some feedback from others.

I thought I'd use this post as an intro to the subject and set the scene.  Then over the next few weeks I'll cover topics such as V-VOL Groups, the legendary and somewhat mystical 42MB chunk size, best practices, and what is needed for the product to mature.

BTW:  I've decided to use the HDS terminology throughout, but obviously the technology is exactly the same whether you use Hitachi, HDS, HP or Sun.

First things first

So where am I coming from?  I've personally designed, tested and implemented HDP for a couple of companies, including a large UK financial house that is now running the technology on at least two production USP-V (XP 24000) pairs.  They (the financial house) have the technology servicing hosts running a variety of platforms including Windows, VMware and HP-UX.  The technology is implemented and running in conjunction with -

  • External Storage
  • ShadowImage
  • TrueCopy
  • Copy-On-Write

I'm currently working with some good people implementing a production HDP design for a large government project. 

So I know a little about the technology, and I've felt some of the weight of making technical recommendations for, and implementing, a technology that is much hyped but relatively unknown under the hood.

Fractional Reserve Banking Storage

Interestingly, the bank where I implemented HDP were not interested in the over-provisioning features of the technology, at least not for their production systems.  The risk of a "block run" (the storage equivalent of a "bank run") didn't sit well with them, and they couldn't possibly run the risk.  Of course, it's standard practice in today's fractional reserve banking system for the banks to loan out our money and not have enough on hand should we all turn up tomorrow and demand our "full allocation" - dare I say cash over-provisioning :-D
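If you want to put rough numbers on the "block run" risk, here's a back-of-envelope sketch in Python.  Every figure below is made up for illustration - the point is simply the gap between what a Pool has promised and what it physically holds:

    # Back-of-envelope over-provisioning maths (illustrative numbers only).
    pool_physical_tb = 100.0   # real capacity backing the HDP Pool
    subscribed_tb    = 250.0   # total size of all DP-VOLs carved from it
    written_tb       = 80.0    # what hosts have actually written so far

    oversub_ratio = subscribed_tb / pool_physical_tb   # 2.5:1 here
    utilisation   = written_tb / pool_physical_tb      # 80% of real disk used
    shortfall_tb  = subscribed_tb - pool_physical_tb   # the unbacked promises

    print(f"Over-subscription ratio : {oversub_ratio:.1f}:1")
    print(f"Pool utilisation        : {utilisation:.0%}")
    print(f"Capacity shortfall      : {shortfall_tb:.0f} TB")

If every host turned up tomorrow and wrote to its "full allocation", this imaginary Pool would be 150 TB short - hence the bank's nervousness.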

Anyway...... if they didn't want the space-saving benefits on offer, what did they want from the technology?  At this point I should probably point out that they did make use of over-provisioning in their Pre-Production environment - it's OK with the Monopoly money ;-)

Well...... HDP offers at least two other possible benefits that they were interested in -

  1. Simplified LUN management
  2. Potentially better performance

Monkeys

Let's look at point 1 first.  When I say "simplified LUN management", what I'm basically saying is that once you implement HDP, your storage administrators no longer need to worry about (or even think about) Array Groups, Back End Directors, I/O profiles and the like.  The idea is that you do the thinking up front when designing your Pools - i.e. putting enough spindles behind them, balancing them across as many internal components as possible, and so on.  Once you've done that, there is so little to LUN allocation that a monkey could do it.
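To make the "thinking up front" concrete, here's a hypothetical Pool-sizing sketch in Python.  The drive size and both targets are invented numbers, not recommendations:

    import math

    # How many RAID-5 (7+1) Array Groups does a Pool need to meet both a
    # spindle (performance) target and a usable capacity target?
    disk_gb         = 300    # assumed capacity of each physical drive
    data_drives     = 7      # RAID-5 (7+1): 7 data drives + 1 parity per group
    target_spindles = 240    # performance target - spindles win prizes
    target_tb       = 50.0   # usable capacity target for the Pool

    groups_for_spindles = math.ceil(target_spindles / 8)
    groups_for_capacity = math.ceil(target_tb * 1024 / (data_drives * disk_gb))

    groups = max(groups_for_spindles, groups_for_capacity)
    print(f"Array Groups needed : {groups}")      # 30
    print(f"Spindles in Pool    : {groups * 8}")  # 240
    print(f"Usable capacity     : {groups * data_drives * disk_gb / 1024:.1f} TB")

Do that sort of sizing once, up front, and the day-to-day act of carving a DP-VOL out of the Pool really is monkey work.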

For some environments this is great and a godsend.  For others it won't work; some people need the ability to micro-manage their storage a little more than that.  But the beauty of it is that it's flexible enough that you can have HDP Pools as well as standard Array Groups (and hopefully soon SSD) in the same box.

Performance and so-called wide striping

Then there's the performance thing.  While I understand that the wide-striping approach taken by HDP can bring performance benefits - spindles win prizes and all - it doesn't appear to be that simple.  A chunk size of 42MB does not seem to have been chosen with performance as the top priority (I'll go into this in a future post - it's interesting).  And then there's the overhead involved in first-time allocation of chunks (free-page lookups etc.), plus the fact that it's another software product running on the front-end ports - it all takes a toll.
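To give a feel for the 42MB page granularity, here's a quick sketch.  The page size is HDP's; the volume figures are invented, and this is a lower bound - scattered write patterns can consume more pages than the raw byte count suggests:

    import math

    PAGE_MB = 42   # HDP allocation unit (the famous "chunk"/page)

    def pages_for(written_gb: float) -> int:
        """Minimum pages the Pool must allocate to back this much data."""
        return math.ceil(written_gb * 1024 / PAGE_MB)

    dp_vol_gb  = 500    # size presented to the host
    written_gb = 37.3   # what the host has actually written so far

    print(f"Pages consumed so far : {pages_for(written_gb)}")   # 910
    print(f"Pages if volume fills : {pages_for(dp_vol_gb)}")    # 12191
    print(f"Real space used       : {pages_for(written_gb) * PAGE_MB / 1024:.1f} GB")

Each of those first-time page allocations involves a free-page lookup, which is part of the overhead mentioned above.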

Experience shows me that sometimes you get performance improvements, whereas other times it's less clear.

Many eggs in one basket

One thing that has worried me, and I know it worries others, is the feeling of throwing a lot of eggs into one basket.  With the almost universal use of RAID5 (7+1) Array Groups as the building blocks of HDP Pools, then pooling 30, 40 or even more of these Array Groups into a single Pool, and then having a large number of applications spread their data all over this large Pool....... well.  The idea of one of these Array Groups failing is enough to keep one awake at night.  But do we really need to worry so much?
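You can at least frame the worry with simple probability.  Assuming (purely for illustration) that each Array Group independently carries some small annual chance of a data-loss event - say a second drive failing mid-rebuild - the chance of at least one such event grows with the number of groups in the Pool, and a single event now touches data from every application striped across it:

    # Illustrative only - the per-group failure probability below is made up.
    def p_any_group_fails(p_single: float, n_groups: int) -> float:
        """P(at least one of n independent groups fails) = 1 - (1 - p)^n."""
        return 1 - (1 - p_single) ** n_groups

    p = 0.001   # assumed 0.1% chance per Array Group per year
    for n in (1, 10, 30, 40):
        print(f"{n:>2} groups: {p_any_group_fails(p, n):.2%} chance per year")

With 40 groups, that made-up 0.1% becomes roughly 3.9% per year, and the blast radius is the whole Pool rather than a single Array Group.  Whether that justifies losing sleep is a judgement call.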

Is it the future?

That depends.  Yes, I think it probably is the future of enterprise disk.  Longer term, though, the future is clearly SSD.

The bank were looking at HDP as the de facto standard for all future deployments of external storage through the USP-Vs.  I think it's a perfect fit for this and can absolutely see it taking off.

Oh, and let's not forget that although the technology is over a year old, it's still a version 1.x product.  But it seems to be getting a fair amount of attention from the developers at Hitachi.  Modifications and improvements are coming through at a steady pace.  The only thing lacking (well, obviously not the only thing) is that many of these improvements are not well known and don't filter through to customers.  The usual.

Well, that's the boring up-front stuff out of the way.  Next up, either V-VOL Groups or 42MB pages, I haven't decided yet.

Nigel

PS.  I just had a thought (warning, thinking out loud).  The R0 cabinet and its associated Array Groups always seem to leave people wondering what to do with it - unbalancing an otherwise symmetrical design......  So why not slip in an extra BED pair dedicated to R0 and have it set aside for high-performance SSD?  This literally just popped into my head.  Let me know your thoughts on that if you have any.  The only other good use I can think of for it might be as a storage area for the data centre's Christmas decorations :-D