The Full "Extent" of Dynamic Provisioning

Hitachi recently added Zero Page Reclaim functionality to Dynamic Provisioning (DP) on the USP V.  This feature is one of a fairly long list of enhancements on most people's DP wish lists.  Other improvements are hopefully in the pipeline.

In this post I'm going to talk about Zero Page Reclaim, as well as a possible and interesting shift towards the "extent" as the basic unit of the entire array...

Dynamic Provisioning 101 (I'll keep this bit short)

When I talk about Dynamic Provisioning (DP) I am also referring to Thin Provisioning; for the purposes of this post the two terms are interchangeable...

Under the hood of any DP implementation is a basic construct that I will refer to in this post as the "extent".  Each vendor has a different name for it: 3Par calls it a "chunk" and Hitachi calls it a "page".  Not only do most vendors have their own names for their extents, they all have their own ways of implementing them.  One obvious and much-debated difference is the size of each vendor's extent.

This extent, call it what you will, is the basic unit of allocation for all DP implementations that I know of.  Essentially, each time a host writes to a DP (thin) volume, the array will allocate one or more extents to back that write.  As this post is not intended to be a tutorial on DP, I'll leave the theory there and move on to the meat.
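
To make the allocate-on-write idea concrete, here is a minimal sketch in Python.  It is not any vendor's implementation; ThinVolume, EXTENT_SIZE and the free pool structure are hypothetical names used purely for illustration.

```python
# Minimal sketch of allocate-on-write thin provisioning (all names hypothetical).

EXTENT_SIZE = 42 * 1024 * 1024   # Hitachi pages are 42MB; many vendors use far less

class ThinVolume:
    def __init__(self, free_pool):
        self.free_pool = free_pool   # shared pool of unallocated extent IDs
        self.extent_map = {}         # volume extent index -> backing extent ID

    def write(self, offset, length):
        """Back only the regions a host actually writes to."""
        first = offset // EXTENT_SIZE
        last = (offset + length - 1) // EXTENT_SIZE
        for index in range(first, last + 1):
            if index not in self.extent_map:                  # first write here?
                self.extent_map[index] = self.free_pool.pop() # allocate an extent
        # ...the data itself would then be written to the mapped extents...
```

The upshot is that a freshly created thin volume consumes no pool capacity at all until a host starts writing to it.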

Zero Page Reclaim: How it works

As mentioned, extent size varies from vendor to vendor.  Many vendors use an extent size smaller than 1MB, whereas Hitachi has fixed its extent size at 42MB (more of a "book", or even a "collection", than a "page", but such debates are old news now so I won't go into them here).

When an array performs a Zero Page Reclaim operation on a DP volume, it searches the extents allocated to that volume, looking for extents that hold no data.  Or, to be more correct, it searches for extents that contain only zeros.

Sidestep:  Most arrays will zero out a volume when it is RAID formatted, basically writing zeros to all blocks comprising the volume.  This helps the XOR parity calculation, since a stripe of all zeros has known, consistent parity from the outset.

Any extent that the array finds to be comprised entirely of zeros is assumed to be unused, and the association between the extent and the volume is broken.  This places such extents back into the Free Pool, which has obvious capacity-saving benefits and is a useful tool for any thin provisioning array to have up its sleeve.
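
As a rough sketch of what a reclaim pass might look like (again with hypothetical names; a real array does this far more efficiently in microcode):

```python
def zero_page_reclaim(extent_map, free_pool, read_extent):
    """Scan a volume's allocated extents and return all-zero ones to the pool.

    read_extent is assumed to return the raw bytes of a backing extent.
    """
    reclaimed = 0
    for index, extent_id in list(extent_map.items()):
        if not any(read_extent(extent_id)):   # True only if every byte is zero
            del extent_map[index]             # break the volume/extent association
            free_pool.append(extent_id)       # extent goes back to the Free Pool
            reclaimed += 1
    return reclaimed
```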

How useful will this be?

Well, I guess only time will tell, but it's certain to have some benefits.

How useful this functionality proves to be depends on a few factors, two of which are extent size and file system behaviour.

Extent size.  It would seem, at first glance, that the smaller your extent size, the more opportunity there is for reclaiming space.  The theory is that there is more chance of finding extents comprised entirely of zeros if your extents are relatively small.  Conversely, you are less likely to find multiple megabytes' worth of consecutive zeros, as would be required with a larger extent size.  However, it may not turn out to be that simple.  Read on...
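
A toy simulation makes the intuition visible.  Assume a 1GB volume where roughly half of the 1MB blocks are zero, scattered at random, and count how much space differently sized extents could reclaim (the figures and layout are invented purely for illustration):

```python
import random

random.seed(42)
# Model a 1GB volume as 1,024 blocks of 1MB; about half the blocks are all zeros.
block_is_zero = [random.random() < 0.5 for _ in range(1024)]

for extent_mb in (1, 8, 42):
    reclaimable = sum(
        extent_mb
        for start in range(0, 1024 - extent_mb + 1, extent_mb)
        if all(block_is_zero[start:start + extent_mb])   # whole extent zero?
    )
    print(f"{extent_mb:>2}MB extents: ~{reclaimable}MB reclaimable")
```

With zeros scattered like this, the 1MB extents reclaim around half the volume while the 42MB extents reclaim essentially nothing.  Whether real-world data is actually scattered this way is, of course, the open question.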

File System Behaviour.  Many file systems do not zero out space when a file is deleted.  Instead, they take the lazy approach: they simply mark the blocks as free but leave the original data in place.

Now, while this soft delete behaviour may come to our rescue when we want to recover a file we accidentally deleted, it doesn't help us with Zero Page Reclaim.  Put another way, freeing up 150GB of space on a volume does not normally (it depends on your file system...) write zeros and make the space eligible for Zero Page Reclaim.  See the diagram below -

[Diagram: a file is deleted; the file system marks its blocks free but leaves the data in place, so Extent 1 still contains non-zero data]

The above diagram is an oversimplification but illustrates the point.  For Extent 1 to be a candidate for Zero Page Reclaim, the file system would have to hard delete (zero out) all deleted data as shown in the diagram below - 

[Diagram: the same delete with the freed blocks zeroed out, leaving Extent 1 all zeros and eligible for Zero Page Reclaim]
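
In code terms, the difference between the two diagrams is roughly this (a toy contrast with hypothetical names; no mainstream file system hard deletes by default):

```python
def soft_delete(free_bitmap, data, blocks):
    """Typical file system delete: mark blocks free, leave the bytes alone."""
    for b in blocks:
        free_bitmap[b] = True      # space is "free" to the file system...
    # ...but `data` is untouched, so the array still sees non-zero extents

def hard_delete(free_bitmap, data, blocks, block_size):
    """What Zero Page Reclaim needs: the freed blocks zeroed as well.
    (`data` is a bytearray covering the volume.)"""
    for b in blocks:
        free_bitmap[b] = True
        data[b * block_size:(b + 1) * block_size] = bytes(block_size)  # write zeros
```
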
In its current incarnation, and combined with present-day file systems, the best use case for Zero Page Reclaim is expected to be after migrating from classical thick volumes to new thin dynamic volumes.  This works well when you have, for example, a 500GB traditional thick volume to which only 200GB has been written.  When migrating this to a thin volume, you will more than likely be able to reclaim the untouched 300GB at the end of the volume back to the Free Pool.
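
A quick back-of-envelope check of that example, assuming the 200GB of written data sits at the front of the volume and packs neatly into 42MB extents:

```python
EXTENT_MB = 42
volume_mb, written_mb = 500 * 1024, 200 * 1024

allocated = -(-written_mb // EXTENT_MB)                   # ceiling division
reclaimed_gb = (volume_mb - allocated * EXTENT_MB) / 1024
print(f"~{reclaimed_gb:.0f}GB returned to the Free Pool")  # ~300GB
```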

Databases and the like may work differently, depending on your database/application and OS...  If so, a smaller extent size may yield superior benefits here.

Array Maturity or Adolescence

Storagebod recently postulated the idea that the storage array may have reached "functional maturity" and, as a result, the days of major changes in storage array functionality may be behind us, for a while at least (assuming I've understood Storagebod's comments).

Obviously I have no access to the planet-sized brains in the heads of some of the guys who work at the storage vendors, never mind a crystal ball, so I may be way off the mark with this, but I tend to disagree.  Read on...

The Rise of the Extent

Another feature that I want to see added to DP-aware arrays is the ability to migrate data at the extent level.

While there is mileage in the idea that pooling all of your RAID groups together and spreading the allocation of extents across all spindles will help avoid hotspots, the exact mileage will no doubt vary.  The ability to monitor pool performance at the extent level, combined with the ability to migrate extents between spindles within a pool and across pools, would give great performance and management flexibility.

It will be interesting to see what implications extent size will have on the viability and performance of such extent-based migrations.

Migration at the LUN level is fine, but it's cumbersome and clunky.  The ability to move an extent to cooler spindles within a pool, or even from a disk pool to a pool comprised of EFDs (Enterprise Flash Drives), will really add functionality and value.
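
As a sketch of what extent-level tiering could look like, here is a hypothetical rebalance pass that promotes the busiest extents to an EFD pool and demotes the rest.  This is speculation about a feature, not a description of any shipping array:

```python
def rebalance(io_counts, location, efd_extent_capacity):
    """io_counts: extent ID -> recent IO count
       location:  extent ID -> 'disk' or 'efd'
    """
    # The hottest extents, up to what the flash pool can hold, belong on EFD.
    hottest = set(sorted(io_counts, key=io_counts.get,
                         reverse=True)[:efd_extent_capacity])
    for extent in location:
        wanted = 'efd' if extent in hottest else 'disk'
        if location[extent] != wanted:
            location[extent] = wanted   # a real array would also copy the data
```

This is also where extent size bites again: with 42MB extents every promotion moves a lot of data, while sub-MB extents multiply the bookkeeping.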

I'm hoping, and thinking, that SSD/EFD is going to encourage a lot of changes to storage arrays over the next few years.  Today's storage arrays are highly tuned for spinning disk and will no doubt evolve in step with the uptake of SSD/EFD.

Personally I think it's an exciting time to be involved in storage, with lots of great things happening and lots still to be explored.

Nigel