Business Opportunities in the UC Channel - Part 6

16 Apr 2012

This is the sixth article in an eight-part weekly series leading up to UC Summit 2012, which takes place May 6-9 in La Jolla, CA. See the UC Summit website for more details.

Last week we talked about the challenges of UC route planning and directory integration, and how a centralized UC system differs from a distributed PBX system in this regard. Centralizing infrastructure in data centers has its advantages, but it also has drawbacks. Most obviously, it leaves the customer vulnerable to outages at or near the central site that can affect the entire enterprise. So this week we are going to cover disaster and failover planning.

Whether the outage is caused by a force of nature (hurricanes, tornadoes, ice storms, earthquakes, tsunamis, flooding, etc.) or by an act or omission of man (power outages, IP outages, fires, system failures, misconfiguration, etc.), business continuity plans need to be completely revamped when a UC system is deployed. Previously, if, say, the Accounts Payable system went offline, the AP team would take an early lunch or do something else while waiting for the system to come back. With centralized UC, a system failure can prevent the entire organization from communicating internally as well as externally, and that includes receiving inbound communications.

With a UC deployment, the organization's communications topology will change, as will the dependence of the communications systems on other systems, such as the user directories that we discussed last week. For this reason, a key step in deploying UC is a review of IT systems and interfaces with the goal of understanding their interdependencies, so that the system can be designed to be robust to failure and to use every failover mechanism available. Particular attention should be paid to critical failure points, such as user directories, and to system bottlenecks, such as network edge devices. These systems must be fully redundant with a good failover architecture: ideally an overprovisioned "active-active" pairing, in which two (or more) systems share the load and the survivors can take over the entire load if any one instance fails.
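The overprovisioning requirement for an active-active group can be checked with simple arithmetic. The following is a minimal sketch, not taken from any particular UC product: the function names and the idea of measuring capacity in concurrent sessions are assumptions made for illustration.

```python
from typing import Sequence


def survives_single_failure(capacities: Sequence[float], peak_load: float) -> bool:
    """Return True if the remaining nodes can carry peak_load after any one node fails.

    capacities: per-node capacity (hypothetical unit: concurrent sessions).
    peak_load:  the highest load the whole group is expected to carry.
    """
    if len(capacities) < 2:
        return False  # a single node has no failover partner
    # The worst case is losing the largest node in the group.
    return sum(capacities) - max(capacities) >= peak_load


def max_safe_utilization(node_count: int) -> float:
    """Highest per-node utilization (0.0 to 1.0) that still tolerates one failure.

    Assumes identical nodes with the load spread evenly across them.
    """
    if node_count < 2:
        return 0.0
    return (node_count - 1) / node_count
```

Under these assumptions, an active-active pair must run at no more than 50% utilization per node at peak, since either node may have to carry the whole load alone; a group of four can run at up to 75%.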

No system is resilient to all failures. For example, a software bug that exists in all replicas of the system may cause a "correlated failure," so it is important to have a process in place for restoring a previous working version. Furthermore, while single failures are relatively easy to plan for, multiple failures, such as a simultaneous loss of power and IP connectivity, are normally not survivable locally. Even a data center that has local power generation and redundant IP connections will become unavailable if the entire local area is hit by a natural disaster, and that requires a site failover to a backup data center.

The best UC systems can provide seamless, or near seamless, site failover. UC client devices can be provisioned by centrally managed policy to connect to a second data center should the first become unreachable. Each data center must be overprovisioned so that it can take on the load of an unavailable one. For this to work, the data centers must have been continuously replicating their data and configuration settings to each other up to the point of failure. This is a fairly standard function of an enterprise-class relational database system, which is normally the "back-end" of a good UC system. Going back to the topic of vendor selection: these features and functions are critical and must form part of the system selection criteria if the UC deployment is to be a success.
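As an illustration of policy-driven client failover, the sketch below walks an ordered list of data center endpoints and picks the first one that answers. It is a simplified assumption of how such a policy could behave, not a description of any vendor's client; the host names, the port, and the plain TCP connect used as a health check are all hypothetical.

```python
import socket
from typing import Callable, Optional, Sequence, Tuple

Endpoint = Tuple[str, int]  # (host, port)


def tcp_reachable(endpoint: Endpoint, timeout: float = 2.0) -> bool:
    """Treat an endpoint as reachable if a TCP connection opens within the timeout."""
    try:
        with socket.create_connection(endpoint, timeout=timeout):
            return True
    except OSError:
        return False


def select_data_center(
    policy: Sequence[Endpoint],
    probe: Callable[[Endpoint], bool] = tcp_reachable,
) -> Optional[Endpoint]:
    """Return the first reachable endpoint in policy order, or None if all are down.

    policy: endpoints in order of preference, as pushed by central management.
    probe:  reachability check; injectable so the logic can be tested offline.
    """
    for endpoint in policy:
        if probe(endpoint):
            return endpoint
    return None


# Hypothetical policy: primary data center first, backup second.
EXAMPLE_POLICY = [("uc-dc1.example.com", 5061), ("uc-dc2.example.com", 5061)]
```

A client would call `select_data_center(EXAMPLE_POLICY)` at sign-in and again whenever its current connection drops. A real client would also need to re-register and recover session state, which is where the continuous replication between data centers comes in.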

Like data centers, corporate offices and branch locations are also subject to localized outages that cannot be survived. A good UC system makes provision for this by allowing, at a minimum, internal communications to continue even if contact with every data center is lost, as will happen in the "backhoe cut" scenario where the road crew down the street did not "call before digging." In this situation, the UC system should be capable of failing over to the PSTN for external voice access. However, if the road crew also dug up the PSTN connection, then the business continuity plan should mandate that staff who need to communicate externally either move to another facility or work from home. Both of these options work fine with a UC system.
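The branch-office fallback order described above amounts to a small decision table. The sketch below encodes it; the enum and function names are hypothetical, and a real UC system would make this decision inside its branch survivability component rather than in application code.

```python
from enum import Enum


class ExternalRoute(Enum):
    DATA_CENTER = "route external calls via the data center"
    LOCAL_PSTN = "fail over to the local PSTN connection"
    RELOCATE_STAFF = "no external path: move staff to another facility or home"


def choose_external_route(data_center_reachable: bool, pstn_available: bool) -> ExternalRoute:
    """Pick the external voice path for a branch office.

    Internal calls are assumed to keep working locally in every case;
    only the external path changes.
    """
    if data_center_reachable:
        return ExternalRoute.DATA_CENTER
    if pstn_available:
        return ExternalRoute.LOCAL_PSTN
    # Both links are cut, so the continuity plan takes over.
    return ExternalRoute.RELOCATE_STAFF
```

In the backhoe scenario, `choose_external_route(False, True)` selects the PSTN; if the PSTN line was cut as well, the result is the relocation step that the business continuity plan has to cover.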

Next week's topic, "Network Edge Security, SIP Trunking & Federation," is related to business continuity, so be sure to check back and read that.
