Being prepared for the unexpected, and the expected

I’ll start off by saying that I consider myself prepared for most situations, and I tend to think in flow-chart logic: if the result of A is this, move here; if not, move there.  Recently, though, I found myself scrambling after the old SAN that stores our bulk backups took a dump and we lost the controller.

I can’t say this wasn’t expected, but I didn’t have a plan of action ready, which made for a week and a half of long days and nights with my director on my case to get things back up and running.  All I needed was a simple storage target for the bulk backups to push to, but there was also the counterpart of getting those backups to tape for offsite storage.  When this all happened there wasn’t a huge amount of panic, since the best thing you can do is stay cool and collected and think logically: “What is the best option to move forward with?”

For me this was our little Iomega ix4 NAS box, which let me create an SMB share and point the backups there.  It didn’t quite have the performance of our old Fibre Channel box (an IBM DS400), but it did well enough that I thought, hey, this isn’t that bad.  I was able to rerun all the full backups from the weekend and was on my way to having my temporary solution in place.  This is when part two of our backup process came into play: moving the backup files to tape.  I rudely discovered that I could not get Backup Exec to attach to that SMB share and back up the full files.  I probably could have figured out eventually how to get Veritas to see those files, but that didn’t seem like the best use of time for a stopgap while we got a new storage solution in place.

I then went in the direction of creating an iSCSI disk on the ix4 and attaching it to the backup server running Backup Exec.  This was a small challenge in itself: we’re an FC shop and I have had next to no iSCSI experience, but a bit of time on Google helped me muddle through the setup.  I copied the files off the SMB share onto a small USB storage drive, attached the iSCSI drive to the server, and copied the files back.  I thought, well, this is good: now I can see the drive through Veritas, and I’ve always heard how good iSCSI is.  Sweet, here is my solution!  This time around, performance on the iSCSI drive just tanked.  I know iSCSI works and the performance is there, or companies like NetApp wouldn’t be in business anymore, so obviously my configuration wasn’t up to snuff.  With attempt number two in the books, I was no closer to a solid solution that would hold off my management from making a rash purchase I would have to live with for the next five to seven years, and I started to panic a little.

After a week of trying to get this up and running, and staying up to check on performance during the backup window, I was getting really frustrated and knew my time was running short.  The golden rule of IT kept playing in my head: lose data, lose your job.  And I was going on a week with no bulk backups.  I was about to go to bed when my girlfriend reassured me, “You’ll find something soon,” and that is when the light went on in my head.  Earlier that day I had been looking at our production SAN to see how much storage we would need in order to replace it (and move the current production SAN to backups), and I remembered there was a 700GB chunk of storage not being used.  Funny thing was, the DS400 had 700GB of storage we were using for backups… a perfect match.  There should be no performance issue going onto a more powerful SAN, and it’s on the fiber network.  The biggest obstacle would be convincing my director to let me carve out this logical drive for backups, since he had always put his foot down that he wanted backups and production strictly separated.  That night I sent an email stating all the reasons why we should use this as our temporary solution.  He agreed to my idea, and off I went.

Right now that is the solution we’re using, and it seems to be holding up, at least for the time being.  The moral of the story is to think about what equipment you have, how that equipment performs, and how prepared you are for the unexpected, or in my case the expected.  I had assumed the Iomega ix4 would do the job, since I had been using it in my R&D lab and it had worked well there, but it couldn’t handle 15-20 concurrent connections all dumping backup files.  If we didn’t have that 700GB on our production storage, I would probably still be in the weeds.  You can’t always have equipment onsite to counter every problem, but it doesn’t hurt to think about where you are vulnerable, where your single points of failure are, and what you might have to do if those issues ever arise.
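
If you want to put rough numbers on whether a fallback box can carry the backup load, a back-of-the-envelope check is enough to start with.  The sketch below is only that: the function names, the throughput figures, and the 8-hour window are made-up assumptions for illustration, not measurements from my environment.  The only number taken from the story is the 700GB of backup data.

```python
# Rough sanity check: can a fallback storage target absorb the nightly
# backups inside the backup window? All throughput and window values used
# with these functions are hypothetical; measure your own hardware.

def backup_hours(data_gb, throughput_mb_s):
    """Hours needed to write data_gb at a sustained throughput (MB/s)."""
    seconds = (data_gb * 1024) / throughput_mb_s
    return seconds / 3600


def fits_window(data_gb, throughput_mb_s, window_hours):
    """True if the full data set can be written within the window."""
    return backup_hours(data_gb, throughput_mb_s) <= window_hours


if __name__ == "__main__":
    data_gb = 700        # size of the bulk backup set (from the story)
    window_hours = 8     # assumed overnight backup window
    # Assumed sustained write rates under concurrent load, in MB/s.
    candidates = {"small NAS (assumed)": 20, "FC SAN (assumed)": 100}
    for name, rate in candidates.items():
        hours = backup_hours(data_gb, rate)
        verdict = "fits" if fits_window(data_gb, rate, window_hours) else "does not fit"
        print(f"{name}: {hours:.1f} h, {verdict} in a {window_hours} h window")
```

The throughput you plug in should be what the box sustains with all of your backup jobs writing at once, not what it manages with a single copy in the lab.  That was exactly the gap that bit me with the ix4.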
