The MySpace storage monster

Being a curious storage person (curious about storage, not a "curious person" ;) ), I’m always keen to know about other people’s storage environments – which vendors they use and how successful their implementations are, especially at the less traditional but big players such as Google, Yahoo and the like. So I was really interested to read this article about storage at MySpace, which I picked up from the storage feeds. I’m sure most would agree that MySpace has to be one of the more interesting setups out there, with some pretty huge and interesting demands.

Setting the scene for the article, Jim Benedetto, VP of technology at MySpace, says that at MySpace they "… live or die by how fast they can serve their content… have had every storage vendor in their data center… today’s storage systems simply can’t handle their needs…". Whoooaa! Talk about a wake-up call for the big iron storage vendors out there. I’m sure they’d all love to have their claws into a massive environment like MySpace. Fortunately for the rest, NetApp are the only ones named and shamed in the article.

Although I know that MySpace is an absolute monster when it comes to storage consumption and demands, I personally see this type of environment becoming more and more common. This new breed of companies serving up online digital content appears to have quite different needs to the more traditional customers of big iron vendors, such as government establishments and banks. Unfortunately the article doesn’t go into the specifics of why the current storage systems and vendors don’t meet their needs. Comparatively speaking, the big iron vendors seem to have solutions tailored "quite" well to these more traditional customers. I imagine companies like MySpace are unbelievably more dynamic than a bank – less change control etc (please hire me!!!!!). As an example, the article says that the MySpace data centre in LA had a pile of flattened cardboard boxes the size of a truck, left over from that week’s server deliveries. Maybe a slight exaggeration, but an impressive image nonetheless.

I was interested to read that in some cases they have 146GB and 73GB disks with as little as 5-10GB of data on them, so that they don’t max out the I/O queues and performance of each disk. This sounds a lot like my recent post titled "When bigger isn’t better", where we talked about big disks being fine in some environments but not when ultra-high performance is demanded. They had considered using smaller, 36GB, disks for their environment (I’m not sure about 3PAR, their preferred vendor, but I know of a couple of big iron vendors who no longer list 36GB disks as supported in their top end arrays). However, they ended up going for the bigger disks and use the spare space for snap backups. They don’t go into detail on this, but I would imagine their performance-demanding active data is placed on the outer tracks of each disk, with the snap stuff on the lower performing inner tracks. That also sounds familiar from when I recently talked about Pillar re-championing this idea of short-stroking a disk and designing a policy-based approach to the technique. We talk about all the good stuff here on rupturedmonkey 8)
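To put some rough numbers on that, here is a back-of-the-envelope sketch in Python. It only uses the disk sizes and per-disk data figures quoted above. The function names and the 1,000GB working set are made up for illustration, not anything from the article or from MySpace.

```python
import math


def used_fraction(data_gb: float, disk_gb: float) -> float:
    """Return the fraction of a disk's capacity occupied by active data."""
    if disk_gb <= 0:
        raise ValueError("disk_gb must be positive")
    if not 0 <= data_gb <= disk_gb:
        raise ValueError("data_gb must be between 0 and disk_gb")
    return data_gb / disk_gb


def spindles_needed(total_data_gb: float, data_per_disk_gb: float) -> int:
    """Return how many disks are needed if each holds data_per_disk_gb."""
    if data_per_disk_gb <= 0:
        raise ValueError("data_per_disk_gb must be positive")
    return math.ceil(total_data_gb / data_per_disk_gb)


if __name__ == "__main__":
    # Disk sizes and per-disk data figures quoted in the article.
    for disk_gb in (73, 146):
        for data_gb in (5, 10):
            pct = 100 * used_fraction(data_gb, disk_gb)
            spare_gb = disk_gb - data_gb
            print(f"{data_gb} GB on a {disk_gb} GB disk: "
                  f"{pct:.1f}% active, {spare_gb} GB spare")

    # Hypothetical 1,000 GB working set (an assumption, not a MySpace figure).
    working_set_gb = 1000
    print(spindles_needed(working_set_gb, 146), "disks if filled to capacity")
    print(spindles_needed(working_set_gb, 10), "disks at 10 GB of active data each")
```

With that made-up 1,000GB working set, filling 146GB disks to capacity takes 7 spindles, while keeping only 10GB of active data on each takes 100. That is the trade being described: lots of spare capacity (handy for snap backups) in exchange for many more spindles sharing the I/O load.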

From my experience the monolithic arrays from the big "3 letter vendors" tend to be quite inflexible compared to their mid-tier cousins. And maybe in the past that wasn’t such a hindrance – after all, the people who needed big arrays tended to be big boring companies. But some of the more modern big storage users appear to need something a little more nimble, without sacrificing the reliability, scalability and performance inherent in the top end monolithic boxes. Will the big storage vendors continue to redevelop and refresh their existing boxes and approaches year on year ad infinitum, all the while heading down the same road without realising that the world around them is changing (in my opinion)??

I’m currently working away from home, so I spend Monday to Friday in rented digs, and I’m finding it increasingly annoying that the TV programmes I have recorded on my Sky+ box (PVR) are not accessible to me while I’m away from home! Why doesn’t my cable/satellite provider offer me a service where I can record my TV while away from home and then access it on the road? I’m sure it won’t be long before they do! The same goes for the data on my laptop. I recently had to borrow my wife’s laptop for a week as my MacBook Pro won’t take a GPRS/3G wireless network card. No probs – except that half of the documents I needed that week were back at home. Why don’t I have my docs stored up in the cloud somewhere, accessible everywhere? Then there’s my iPod – why do I need a newer model with more internal storage? Why can’t I just store ALL my digital stuff out there on the internet and pull down what I want when I want it? I know these types of services are out there and maturing, but imagine how big they will one day be. I now have a warm reassuring feeling that people with storage skills are still going to be needed in 10 years 8)

Surely one day this new breed of digital content companies will become the biggest storage users in the world. One instant benefit I would see from storing all my stuff on the internet is that I would no longer have to keep DVDs, CDs etc out of grabbing reach of my 10-month-old daughter Lily. If all of my content is up there in the cloud then she can’t reach it and scratch it, and I also get more floor and shelf space in my house – instant double win!!

Just to close this post with a teaser of sorts, they also mentioned that they expect MySpace to be totally off tape in the next few months – hmmmm, interesting (assuming they aren’t just saying this to sound cool and cutting edge!?)


PS. Anyone with any more storage info about companies such as Google, Yahoo, Microsoft…… please feel free to post comments telling us whose kit, and what kit, they are using!