Five 9s Service for UCC - Really?

30 Jul 2013

Introduction

Ahhh, the good old days - TDM consistently worked well, executive management came to expect a 99.999% (five 9s) model (always on except for roughly 5 minutes of downtime per year), bugs and bug fixes were rare, and system reboots were "almost never." What happened to it all? Is 99.99% (four 9s) or 99.9% (three 9s) the new acceptable standard - and did we have to lower our standards for VoIP and UCC?
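
To put those percentages in perspective, here is a minimal sketch of the downtime each one allows per year. The function name and the 365.25-day year are my own assumptions for illustration, not part of any standard or contract language.

```python
# Minutes in an average year (365.25 days is an assumption to account for leap years).
MINUTES_PER_YEAR = 365.25 * 24 * 60


def downtime_minutes_per_year(availability_percent):
    """Return the downtime budget, in minutes per year, for a given availability."""
    return (1 - availability_percent / 100) * MINUTES_PER_YEAR


# Compare the three service levels discussed in this post.
for label, pct in [("three 9s", 99.9), ("four 9s", 99.99), ("five 9s", 99.999)]:
    minutes = downtime_minutes_per_year(pct)
    print(f"{label} ({pct}%): {minutes:.1f} minutes of downtime per year")
```

Under these assumptions, five 9s works out to a little over 5 minutes per year, four 9s to just under an hour, and three 9s to nearly 9 hours. Each 9 you give up costs you roughly ten times the downtime.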

We have been engaged recently by several clients migrating from TDM to VoIP/UCC - the number of end points is in the thousands, and several of these are 24/7 critical communications services environments. To quote one of the stakeholders: "Communications to our organization equates to oxygen for our organization - without it we would not survive." This is a HUGE statement, and one that I am sure we can all relate to.

Over time the emphasis has shifted toward UCC and away from telephony; however, all communications are rooted in telephony as the baseline. Evolving from a five 9s model to something less than that is, at least in my opinion, unacceptable, and there are several ways to mitigate that risk, as this post will point out.

Once you get beyond the hype of all the new features and functionality of next-gen UCC technologies, the simple fact is that if it's not reliable, then really, who cares? And should we really put our organization at risk?

Observations

We have observed several factors that can contribute to the issues at hand:

  1. Some enterprises do not have any Quality of Service (QoS) on their Wide Area Network. In some cases they don't because they are operating on a Metropolitan Area Network (MAN) with lots of bandwidth and minimal latency; however, any broadcast storm anomaly could introduce voice and video quality issues into any private network.
  2. Some channel partners have not risen to the occasion of a total voice, data, and UCC-integrated model. Some have the skill sets and certifications needed but do not have the real-time help desk or NOC services that, in my opinion, should be offered as a part of their services model.
  3. While the market has truly shifted to solving 80% to 90% of all issues remotely and relying less on a presence "on the ground," we still adhere to the model of response times commensurate with major and minor outages already established in the industry.

    Note that there is a recent trend of some manufacturers outsourcing part of their staffing, with "feet on the street" resources provided by subcontractors. This actually makes sense, as long as SLAs are introduced and adhered to by the company that you have a contract with, and by its subcontractors as well.
  4. VARs in some cases qualify for gold status based more on annual sales and less on certifications - so be careful to evaluate the VAR based on capabilities, services, certifications, number of support staff, and annual sales.
  5. Voice and data staffing cultures still clash to this day. We have seen few environments with a cohesive organization whose voice and data skill sets match those of the larger enterprise community.
  6. Some manufacturers have moved toward a lesser, almost self-service support model and have left the onus on the enterprise to backfill the services the manufacturers provided in the past.
  7. There's also another trend taking place: digital phones are slowly reaching end-of-life. The indicator is obvious: digital phones in many cases are now more expensive than IP phones.

So What's an Enterprise To Do?

As the famous Twisted Sister song goes, "We're Not Gonna Take It" - it's time for the enterprise user to stand up and be heard. No longer do you need to compromise on 99.9% or 99.99%. Fortunately, there are channel partners who "get it" and provide additional services to supplement those offered by the manufacturers.

And fortunately, there are several things you can do to avoid such risk and create an environment that is robust and gets as close to five 9s as possible. They include:

  1. Ensure that the VAR that you are purchasing from offers the engineering and process skills to properly install the VoIP/UCC environment.
  2. Avoid the self-service model unless you're willing to invest a lot of money, resources, and internal staff qualified to manage a fully integrated VoIP/UCC environment. Otherwise, consider supplemental VAR services that will provide the support levels you require. Although this sounds like an additional financial investment, in most cases maintenance costs and software subscription costs together are far less than maintenance cost alone under the old TDM umbrella. This affords you the opportunity to increase service levels without necessarily any increase in cost.
  3. Design all points on the network for QoS, including the WAN, switches, and routers, end-to-end. Run a network assessment using network assessment software before going live at any site.
  4. Design and overbuild your UCC solution for redundancy, redundancy, redundancy at the server and WAN levels, and at the public network level, whether SIP trunking, PRIs, or a combination of both. Also, build your environment for as much resiliency as possible, with software license duplication throughout your enterprise, in the event of a core failure. Lastly, leverage as much virtualization as the manufacturer supports, to manage the number of servers being supported in the enterprise environment.
  5. Introduce survivable remotes wherever viable as a back-up strategy (note that some of the latest survivable remotes offer full feature/functionality as if you are operating in a fully connected environment).
  6. Acquire QoS-enabled network management tools to monitor the voice and video-enabled network - include Mean Opinion Score (MOS) as one of the key ways of measuring the health of the network. "Up/down" measuring of the network health simply doesn't cut it.
  7. Hold the VAR you selected to a high service expectation - include SLAs, penalties, and contract outs as a part of the services contract (we have a number of SLAs we use in our practice that we can share with you if you drop me a line via e-mail [email protected]).
  8. Consider NOC services from the selected VAR or other partner - NOC services can clearly hold the vendor accountable to very high standards along with SLAs to protect you and your enterprise.
  9. Clearly define major and minor outages and associated SLAs. For major outages, the baseline should be no greater than four hours; for minor outages, the baseline should be no greater than 24 hours to site, with remote response sooner. In some cases, VARs are now offering time-to-fix and not just time-to-respond (the de facto standard). It is something you should ask for in your next RFP for sure (at least optionally).
  10. Consider an on-site technician as a supplemental service if viable and cost effective (there may be enough work for a tech performing several functions to justify the role, which also gives you a very short response-to-site time when an outage occurs).
  11. Realize and plan for the technology refresh at month 48 of your purchase cycle for all VoIP/UCC server environments.
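
Two of the points above, redundancy (item 4) and MOS scoring (item 6), lend themselves to quick back-of-the-envelope math. The sketch below is illustrative only: the function names and the sample availability figures are hypothetical, the redundancy math assumes failures are independent (which real networks rarely guarantee), and the MOS function uses the R-factor-to-MOS conversion from the ITU-T G.107 E-model as I understand it. Your monitoring tool may calculate MOS differently.

```python
from math import prod


def parallel_availability(unit_availability, copies):
    """Availability of redundant units where any one surviving unit keeps service up.

    Assumes failures are independent, which is optimistic in practice.
    """
    return 1 - (1 - unit_availability) ** copies


def series_availability(availabilities):
    """Availability of a chain where every element must be up for service to work."""
    return prod(availabilities)


def r_factor_to_mos(r):
    """Estimate MOS from an E-model R-factor (conversion formula from ITU-T G.107)."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    return 1 + 0.035 * r + r * (r - 60) * (100 - r) * 7e-6


# Hypothetical example: each server and each WAN link is only three 9s on its own.
server_pair = parallel_availability(0.999, copies=2)
wan_pair = parallel_availability(0.999, copies=2)

# Calls need both the server layer and the WAN layer, so the layers are in series.
end_to_end = series_availability([server_pair, wan_pair])
print(f"Redundant servers plus redundant WAN: {end_to_end:.6%}")

# Hypothetical R-factor reported by a monitoring tool for one call path.
print(f"Estimated MOS for R = 80: {r_factor_to_mos(80):.2f}")
```

The takeaway from the redundancy math is that duplicating three 9s components can, on paper, get you past five 9s end to end, provided the duplicates do not share a common point of failure. The MOS conversion shows why "up/down" is not enough: a link can be up while its R-factor, and therefore its call quality, is sliding.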

Conclusion

Yes, VoIP and UCC technologies are disruptive; however, the benefits to your organization far outweigh the disruption factor. The good old days are here, if you just consider and include many of the talking points in this post as you migrate to a fully loaded UCC environment.
