Channel: Spiceworks Community

Calm after the storm: The birth of a DR solution


This is the 234th article in the Spotlight on IT series. If you'd be interested in writing an article for the series on the subject of backup, security, storage, virtualization, mobile, networking, wireless, cloud and SaaS, or MSPs, PM Eric to get started.

October 29, 2012, is a day I will always remember. That day was a trial for me professionally and the day I found out how well our business continuity plan was designed and tested. This was the date when Hurricane Sandy hit the New Jersey area and devastated the state—and our office in the city of South Kearny.

This location is our main print production hub that runs nearly 24 hours a day and, at times, seven days a week. And, after the storm, this office was down for nearly two months.

Prior to this disaster, the DR plan for our production branches was solid in theory, but in practice the proper tools for off-site backup were not in place to actually make it effective.

The storm

On the day of the storm, the Governor declared a state of emergency, and the office in New Jersey closed almost immediately after opening. We all knew it was serious, but we had never before experienced what was to come. In all honesty, we took the impending storm lightly.

Since our corporate offices are right outside of Boston, I knew I had some time before the storm would reach us, so I went in that day to ensure all the previous night’s backups ran properly.

During this time, I kept in close contact with the branch manager, who was no longer in the office but lived nearby. After noon, his phone went silent and all calls went straight to voicemail. I tried calling the office, but I couldn’t get through. Alerts started coming through from our monitoring system on failed pings and service checks. At this point it was obvious that the T1 was down and that the office had lost power. At 3:30 p.m., my manager called and ordered me to head home for safety. For the remainder of the day, I was in the dark, unable to contact anyone at the Kearny branch.

The next business day, the CEO closed all offices. The day after, I worked remotely, dealing with the outages: making phone calls to Verizon, rerouting lines to managers’ mobile phones, and sorting out remote access for customer service staff so they could handle any incoming email inquiries from customers.

Due to the road conditions and severity of damage, it took the managers in the New Jersey office three days to make it in to the facility. No one was prepared for the scene inside. Everything had been under water—around 3 feet of water from the looks of it. And this was in office space that is elevated 6 feet off the ground!

Once we were able to get our bearings, the senior management team and I met to discuss what needed to be done. All customer work was being rerouted to the production center located in the corporate office, and key staff from the New Jersey facility were being diverted to Woburn, Massachusetts, to assist with the backlog of work.

I began restoring from the backups I had pulled over the WAN to corporate, to get the jobs onto our servers so work could be done. IT gear was driven in from corporate for me to review, and it didn’t look good. All the PCs had been destroyed in the flood. The only equipment that survived was two servers I had moved to the top of the rack during my last visit. After some tweaking, I had the Lotus Notes Job System we use in the New Jersey office up and running.

There was much other work done during this time, and the process was long.

I went down about six weeks after the storm and had the site rewired, redid the network closet from the ground up, and had 10 new workstations and Synology NAS units ready to go for the office. After all that work, the facility still didn’t reopen until the end of January 2013 due to ongoing issues with power (the landlord still had us running off generators).

Presently, I’m pleased to say that the facility is humming right along and is actually in better shape than ever.

The aftermath

Though we were fully functional within a few days of the devastation after moving operations to our Massachusetts hub, the storm served as a reminder to our team that we needed to revisit our disaster recovery plan and see where we needed to add to or improve our ability to recover quickly from anything: floods, fire, or whatever else could be thrown our way.

I set out first to speak with the leaders of key business units to find out what data was crucial for them to be able to continue business after a disaster. I’m a strong believer that IT is not and should not be the group responsible for quantifying what data is critical for a business to be able to operate after a disaster.

With that knowledge in hand, I began my review of solutions available to me. I’d already abandoned tape (I was never a huge fan of the technology as I always felt it could not be trusted). I knew the path I was going to take would be the cloud, but I had requirements.

2.5TB: that’s what I knew I needed to get out of our environment and out to the cloud, quickly and securely. I had no data management software to give me insight into my annual data growth, so I reviewed the systems marked as critical, looked at what had been added over the past year, and estimated growth at less than 1TB a year of truly business-critical data. Not a completely scientific method, but I felt confident I could still grow into this solution.

The solution had to offer an easily accessible local copy of the data, secure transmission, and multiple remote data centers. It also had to be OS agnostic and independent of my current backup solution.

After testing trial runs, speaking with reps, reading reviews, and so on, I decided I would go with the Barracuda 690 Backup appliance, which comes with 4TB of suggested backup storage and a maximum capacity of 8TB. This solution is fully loaded and can send data to the cloud, another site, or a private cloud. It comes with data deduplication, retention policies, bare metal restores and many other options.
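As a rough check on that sizing, here is a minimal back-of-the-envelope sketch in Python. It takes the figures from this article (2.5TB today, growth of at most 1TB a year, 4TB suggested and 8TB maximum on the appliance) and assumes linear growth with no savings from deduplication or retention policies, both of which would change the real numbers. The function name is made up for illustration.

```python
def years_until_full(current_tb, growth_tb_per_year, capacity_tb):
    """Years of headroom left, assuming linear growth and no dedup savings."""
    if growth_tb_per_year <= 0:
        raise ValueError("growth_tb_per_year must be positive")
    # Never report negative headroom if we are already over capacity.
    return max(0.0, (capacity_tb - current_tb) / growth_tb_per_year)


CURRENT_TB = 2.5          # critical data identified with the business units
GROWTH_TB_PER_YEAR = 1.0  # upper-bound estimate, not a measured figure

for label, capacity_tb in (("suggested", 4.0), ("maximum", 8.0)):
    years = years_until_full(CURRENT_TB, GROWTH_TB_PER_YEAR, capacity_tb)
    print(f"{label} capacity ({capacity_tb}TB): about {years} years of headroom")
```

With the worst-case growth estimate, that works out to roughly a year and a half before passing the suggested figure and five and a half years before the hard maximum. Treat it as a sanity check rather than a forecast.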

My current environment is being backed up using Veeam Backup and Replication, connecting to an HP StorageWorks server over a dedicated backup network. We’re also using another cloud provider, Zetta, but only for our Exchange and AD environments. At $195 for 500GB of storage, this solution will stay in place given its low cost.

The new design will be as follows: Veeam will continue to run our backups as scheduled during the course of the week. The Barracuda backup will run on an automated schedule and back up key systems locally, with a full backup on the first run. On subsequent runs, the Barracuda will ensure files are unique and transfer those offsite to their DC; once this is done, the data replicates to a second DC. With this solution, I now have a secondary option if I run into issues with my Veeam backups, along with an option to access data in the cloud from anywhere at any time should the situation call for it.
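To make the "ensure files are unique, then transfer" step concrete, here is a minimal Python sketch of file-level deduplication by content hash. This is only an illustration of the general idea. It is not how the Barracuda appliance works internally (the article doesn't cover that), and the function name and the in-memory file dictionaries are assumptions made for the example.

```python
import hashlib


def select_unique(files, seen_hashes):
    """Return the names of files whose content has not been sent before.

    files: dict mapping file name -> file content (bytes)
    seen_hashes: set of SHA-256 digests already transferred; updated in place
    """
    to_send = []
    for name, data in files.items():
        digest = hashlib.sha256(data).hexdigest()
        if digest not in seen_hashes:
            seen_hashes.add(digest)
            to_send.append(name)
    return to_send


seen = set()

# First run: everything is new, except an exact duplicate of a.txt.
first_run = {"a.txt": b"alpha", "b.txt": b"beta", "copy_of_a.txt": b"alpha"}
print(select_unique(first_run, seen))

# Later run: only the file whose content changed needs to go offsite.
later_run = {"a.txt": b"alpha", "b.txt": b"beta v2"}
print(select_unique(later_run, seen))
```

The point of the design is that after the first full backup, only content that has actually changed crosses the WAN, which matters when the link to the branch is a T1.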

For the senior management team and me, these are all pieces that help show our customers we’re serious about technology, the work we do for them, and that being able to serve them even when a disaster hits is high on our priority list.

One thing I strongly suggest is to never make decisions purely from an IT perspective. You must engage the various business units within your company, discuss what their challenges are or could be, and look into solutions that address those challenges. Through this, a team can be formed based on expertise and not bias to set objectives and goals with firm deadlines for meeting them. In the end, this can only aid the business as a whole to achieve compliance, continuity, and, most importantly, buy-in from key personnel to make the purchases and changes needed to improve your business.

Questions, thoughts, or disaster recovery-related tales you're itching to tell? Chime in in the comments below!

