
Search Results

BCStrategies


  • The Value of Proactive Monitoring (Part 2)

    If your organization uses Microsoft Teams for its collaboration and communications, when Teams is unavailable your business can experience lost revenue, a drop in productivity, and increased IT work (to diagnose and correct the issue). Martello Technologies commissioned EnableUC to develop a model that estimates the positive impact on revenue, productivity, and IT labor of deploying Vantage DX proactive monitoring and enhanced diagnostic tools within a Teams environment. In this second exploration, we explain a model developed to estimate the holistic impact of Teams outages on businesses of various sizes and configurations. (For more background information see part 1: Determining the Value of Proactive Monitoring.)

Key Takeaways

➡️ Proactive monitoring can reduce lost revenue and productivity due to Teams issues and can reduce overall IT labor costs.
➡️ For an organization with 1,000 users, proactive monitoring has the potential to save $700K per year in lost revenue, productivity, and IT labor costs.
➡️ Organizations with 10,000 or more users can expect millions of dollars of potential savings if proactive monitoring and enhanced diagnostics are deployed.

What Impacts Teams?

In developing our model, we identified 11 categories of issues that created outages or service degradation for Teams. Each category has a probability of occurring, a scope (how broad the impact is), and a potential for mitigation with proactive monitoring. Based on our collective expertise, discussions with IT professionals and Microsoft MVPs (Most Valuable Professionals), along with online research, here’s how we rated each category. Combined, these factors degrade Teams service an estimated 1.8% of the time for one or more users. Unless you operate 7 x 24, not all of these outages will occur during your organization’s working hours; the model accounts for this.
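To make the working-hours adjustment concrete, here is a minimal Python sketch. It is not the model described in this article: it assumes degradation is spread evenly across the week, and the function name and the 8-hour, 5-day schedule are illustrative assumptions. Only the 1.8% figure comes from the text above, and it refers to time when one or more users are affected, not necessarily everyone.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours, ignoring leap years


def degraded_working_hours(degraded_fraction, work_hours_per_day, work_days_per_week):
    """Estimate yearly hours of degraded service that overlap working hours.

    Assumption (not from the article): degradation is spread evenly
    across all hours of the week.
    """
    total_degraded_hours = degraded_fraction * HOURS_PER_YEAR
    working_share = (work_hours_per_day * work_days_per_week) / (24 * 7)
    return total_degraded_hours * working_share


# 1.8% is the article's estimate; the two schedules are hypothetical examples.
around_the_clock = degraded_working_hours(0.018, 24, 7)
office_hours = degraded_working_hours(0.018, 8, 5)

print(f"7 x 24 operation: {around_the_clock:.1f} degraded hours per year")
print(f"8 x 5 operation:  {office_hours:.1f} degraded hours per year")
```

Under these assumptions a 7 x 24 operation is exposed to roughly 158 degraded hours per year, while an 8 x 5 operation is exposed to about 38.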
For each of the identified 11 issue categories, we estimated the percentage of issues that could be mitigated with proactive monitoring, ranging from 0% to 90% depending on the source of the issue. Typically, mitigation strategies would include:

Detect and correct: Synthetic transactions, used as part of proactive monitoring, often alert IT to issues before they impact end users; for example, a misconfiguration that causes an outage before or after working hours. In this case, IT may be able to diagnose and correct the issue before the start of the next work cycle.

Detect and communicate: Proactive monitoring may note a broad or location-specific issue. Some issues may be outside IT’s ability to correct (for instance, a Microsoft Teams or supporting service issue, such as the recent incident referred to as MO941162; a power failure; or a physical cable cut). In these cases, IT can communicate the outage and suggest alternatives: for example, rescheduling a meeting if Teams is not available, using an alternative meeting solution (many larger organizations maintain some Zoom or Webex licenses for this exact scenario), or, if an issue is impacting a specific office, working from home, a coffee shop, or another company location.

Of course, for mitigation strategies to be effective, some pre-work is required. This can include training users on alternatives (for instance, making sure everyone knows how to “hot spot” if their home network or an office network is impacted) and preparing communications in advance for specific types of issues (e.g. office closures due to weather, power, or physical infrastructure issues).

Some issues, predominantly individual hardware and software issues, are difficult to prevent, and so the approach is to react efficiently.
This also involves pre-work such as stocking spare devices and components, and having a tested process to “swap” out components, or in some cases entire laptops, while preserving data and configuration settings. For some organizations this could also include having “loaner” laptops that can be used while a full replacement is being arranged.

Based on our research, our model uses the following default values, which can be modified for specific cases …

*While proactive monitoring can help mitigate many issues, in our assessment, end-user errors or issues caused by not understanding how to use Microsoft Teams effectively can best be mitigated through enhanced initial and ongoing end-user training.

Impact on Revenue

If Teams is your communication and collaboration platform, when Teams is not available your sales pipeline is not advancing. This doesn’t mean that sales are necessarily lost; rather, our model assumes that revenue is deferred during a Teams outage, either because “closing” sales are postponed or because invoicing and collecting revenue are delayed. Our model looks at annual revenue and assesses the potential reduction in “deferred” revenue when proactive monitoring is implemented.

Impact on Productivity

For organizations that use Teams, it is often central to communications, collaboration, and workflows. This means that when Teams is impacted, productivity across your organization is impacted. Our model calculates the lost productivity associated with Teams outages, based on average employee costs. We then estimate the productivity savings that a reduction in outages via proactive monitoring would deliver.

Impact on IT Staffing

More incidents mean more IT staff to investigate and resolve issues, which leads to higher labor costs. Proactive monitoring helps reduce the number of incidents, and advanced diagnostic tools (which often provide a more comprehensive view than the built-in Teams reports) help resolve issues more quickly.
In the previous article, we detailed the operational model used to compare the IT labor required with and without proactive monitoring. These labor calculations were included in this second, more holistic model.

This Too Shall Pass

Communication when an issue is detected is important, as it allows users to make alternative plans that reduce the impact of a Teams-related issue. Equally important, but often overlooked, is the ability to communicate that an issue has been resolved, so that normal workflows can be resumed. Agents or appliances deployed at various locations to execute synthetic transactions that simulate user activity serve a second important purpose: detecting when systems are once again functioning normally. This allows groups of users to resume normal business processes as efficiently as possible, which minimizes lost revenue and productivity.

Results

Taking into consideration all the above, our second holistic model projects the following for several different sized organizations. For all scenarios we assumed the organization had annual sales of $100K per user (used to calculate revenue impact) and that the average fully loaded salary cost was $120K per employee (used to calculate productivity impact). The model allows you to modify these assumptions to match your specific situation.

With 1,000 users working in the office 3 out of 5 days (a common hybrid arrangement), proactive monitoring could deliver potential savings of over $700K.

As the number of users increases, proactive monitoring has a larger potential impact. (Other variables such as the number of locations, the percentage of the day when the business operates, and annual revenue can have a significant impact.)

As organization size approaches ten thousand users, projected savings with proactive monitoring exceed $1 million.
The complete model takes into consideration other factors including the number of desk phones and room systems deployed, the number of time zones operated in, the outage time needed to create an incident (defaults to 10 minutes), the percentage of users who raise tickets when an issue occurs (defaults to 16%), etc.

Conclusion

Using reasonable assumptions related to the availability, impact, and operational effort to manage a Microsoft Teams environment, proactive monitoring and enhanced diagnostic tools can provide hundreds of thousands of dollars of potential savings for organizations with as few as 1,000 users. As organization size and complexity increase, so too do potential savings. Larger organizations with 10,000 or more users can expect millions of dollars of potential savings if proactive monitoring and enhanced diagnostics are deployed.

Additional Information

Details related to IT labor estimates are more fully explained in the previous article: https://www.linkedin.com/pulse/determining-value-proactive-monitoring-kevin-kieller-ntrpc

In this LinkedIn Live discussion, I discussed proactive monitoring with my colleague and 10-time Microsoft MVP Dino Caputo: https://www.linkedin.com/events/7265361226208018432/about/

Notes on Model Development

Input for the model was based on our collective expertise, discussions with IT professionals who are responsible for managing Teams environments and with Microsoft MVPs (Most Valuable Professionals), along with online research. Our research and model development occurred without any input or influence from Martello, and we only shared the results when completed.
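For readers who want to experiment with the revenue and productivity assumptions from the Results section, the following Python sketch shows one way such an estimate could be structured. It is a simplified illustration, not the model described in this article, and it will not reproduce the article's figures. The $100K revenue per user and $120K cost per employee defaults come from the text; `affected_share` and `mitigated_share` are hypothetical inputs, and IT labor is left out entirely.

```python
from dataclasses import dataclass


@dataclass
class Org:
    users: int
    revenue_per_user: float = 100_000.0   # article default: annual sales per user
    cost_per_employee: float = 120_000.0  # article default: fully loaded salary


def annual_impact(org, degraded_fraction, affected_share, mitigated_share):
    """Return (exposure, potential_savings) in dollars per year.

    degraded_fraction: share of working time with degraded service
    affected_share:    hypothetical average share of users hit by an issue
    mitigated_share:   hypothetical share of impact avoided by monitoring
    """
    # Deferred revenue and lost productivity both scale with user count.
    at_risk = org.users * (org.revenue_per_user + org.cost_per_employee)
    exposure = at_risk * degraded_fraction * affected_share
    return exposure, exposure * mitigated_share


# Illustrative inputs only: 1.8% degradation, a quarter of users affected
# on average, and 60% of the impact mitigated.
exposure, savings = annual_impact(Org(users=1_000), 0.018, 0.25, 0.60)
print(f"Exposure: ${exposure:,.0f}  Potential savings: ${savings:,.0f}")
```

With these illustrative inputs the sketch yields an exposure of about $990K and potential savings of about $594K. Changing `affected_share` moves the result proportionally, which is why scope matters so much in any model of this kind.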

  • Determining the Value of Proactive Monitoring

    System downtime has a cost to any organization. If your organization uses Microsoft Teams for its collaboration and communications, when Teams is unavailable your business is impacted. The challenge is separating headline-grabbing estimates, “… downtime can eclipse $5 million an hour in certain scenarios…” (Forbes Technology Council, April 10, 2024), often associated with only the largest organizations, from reasonable estimates for medium and large organizations of various sizes. To address this challenge, Martello Technologies commissioned EnableUC to independently develop a model that estimates the impact of deploying Vantage DX proactive monitoring and enhanced diagnostic tools within a Teams environment, for organizations of different sizes and configurations. Our research and model development occurred without any input or influence from Martello, and we only shared the results when completed.

In completing this project, we ended up building two models (we are overachievers 😊): one that focused on the operational costs to support a Teams environment and another that looked more broadly at operational, productivity, and revenue impacts. This article discusses our first model, which calculated the difference proactive monitoring and enhanced diagnostic tools can make to operational costs.

Key Takeaways

➡️ 60% of issues could potentially be mitigated with proactive monitoring.
➡️ For an organization with 1,000 users, proactive monitoring is likely to halve IT support labor required.
➡️ An organization with over 10,000 users should expect to reduce required staffing by 70% if proactive monitoring and enhanced diagnostics are deployed.

Building a Model

To assess the advantages of an enhanced monitoring and issue diagnosing toolset, we developed an operational model loosely based on the Microsoft Operations Framework (MOF).
While Microsoft has shifted its focus to a tool-based approach they call the Microsoft Operations Management Suite (OMS), MOF provides a structured, life-cycle-based approach and serves as a good foundational model for IT service management. We extended MOF using a series of “runbooks” we have developed over the years for various organizations that have implemented Microsoft Teams. The result was a clearly defined series of daily, weekly, monthly, and annual tasks required to successfully operate any Microsoft Teams environment.

Task Effort Estimates

Based on our collective expertise, discussions with IT professionals who are responsible for managing Teams environments and with Microsoft MVPs (Most Valuable Professionals), along with online research, we assigned effort estimates to each of the identified Teams management tasks. We then estimated the number of issues and tickets that would be generated, based on hands-on experience and research. Understanding the number of tickets generated is critical because a significant portion of daily IT time is typically allocated to addressing tickets.

We identified 11 categories of issues that created outages or service degradation. (Categories included core services issues, supporting service issues, hardware and software issues, human error, loss of power, etc. We will explore these categories in detail in a follow-up article.) Collectively, these items degrade Teams service 1.8% of the time, for one or more users. Unless you operate 7 x 24, not all of these outages will occur during your organization’s working hours; the model accounts for this.

Additional assumptions built into the model (which can be configured) include:

    • Expect 1 incident per day for every 1,000 physical phones deployed.
    • Expect 1 incident per day for every 50 Microsoft Teams Rooms.
    • An issue or outage needs to last 10 minutes in order to potentially create a ticket. For instance, if a momentary “blip” occurs while trying to join a meeting, most users simply retry a few times.
    • On average, 16% of users raise a ticket when an incident/issue occurs.

The Impact of Proactive Monitoring

Proactive monitoring reduces the number of user-impacting incidents because it allows IT teams to correct issues quickly, potentially before users are impacted, or, when an issue can’t be quickly corrected, allows IT to communicate alternatives. For example, if a network issue is impacting a location, users can be advised to work from home, a coffee shop, or another nearby location. If Teams, or a supporting service (e.g. authentication), is experiencing an issue, users can be alerted that they should use a backup UC solution, or their mobile phones, for an upcoming meeting.

For each of the identified 11 issue categories, we estimated the percentage of issues that could be mitigated with proactive monitoring, ranging from 0% to 90% depending on the source of the issue. In total, our model indicates that up to 60% of issues could be mitigated with proactive monitoring.

Implementing Proactive Monitoring

To proactively monitor a Microsoft Teams environment, synthetic transactions and agents or appliances are key tools. Here’s a breakdown of how they work and their benefits:

Synthetic Transactions

Synthetic transactions simulate user activities to test and monitor the performance and availability of Microsoft Teams services. These transactions are pre-scripted actions that mimic real user interactions, such as:

    • Joining a Teams meeting
    • Sending a message
    • Sharing a file
    • Scheduling a meeting

By continuously running these synthetic transactions, IT teams can detect issues before they impact actual users. This proactive approach helps identify performance bottlenecks, service outages, and other problems early on.

Agents or Appliances

To execute synthetic transactions, organizations deploy agents or appliances at various locations.
These agents can be software-based or hardware devices that perform the following functions:

    • Monitoring Performance: Agents simulate user activities and measure the response times and success rates of these actions.
    • Collecting Data: They gather detailed metrics on network performance, application responsiveness, and service availability.
    • Alerting and Reporting: When an issue is detected, agents can trigger alerts and generate reports, providing IT teams with actionable insights.

Enhanced Diagnostics

Proactive monitoring can reduce issues, but it cannot eliminate every issue or the corresponding tickets that users raise. As such, our model takes into account how enhanced diagnostics can reduce the time required to identify a root cause and address a particular issue.

Microsoft continues to improve the built-in diagnostic reports, most recently deprecating the Call Quality Dashboard (CQD) in favor of Power BI Quality of Experience Report (QER) templates. However, both CQD and QER reports can be data rich and information poor. They provide lots of technical details but overwhelm all but the most skilled IT professionals.

Additionally, the Microsoft reports don’t provide much detail outside the Microsoft environment. Local network and ISP details are not fully captured using the Microsoft built-in reports. For organizations using Direct Routing, session border controller (SBC) details and carrier SIP trunk details are incomplete. For customers using Operator Connect, key carrier or network service provider details are sparse.

We believe that enhanced third-party diagnostic tools can reduce the time taken to resolve a particular incident from an average of 30 minutes to 15 minutes. Put another way, a typical support engineer can handle an average of 20 tickets per day with the built-in tools and an average of 30 tickets per day with an enhanced set of tools.
Note that these tickets-per-day averages assume some tickets are more straightforward moves, adds, or changes and do not require root cause analysis.

Results

Taking into consideration all of the above, here is what the model indicates for several different sized organizations.

For organizations smaller than approximately 200 users, you typically require at least one person whether or not proactive monitoring or enhanced diagnostic tools are deployed. Once you reach approximately 250 users, you can invest in more people or use better tools to reduce overall labor costs.

With 1,000 users working in the office 3 out of 5 days (a common hybrid arrangement), the potential labor savings are significant, as proactive monitoring reduces the number of tickets that require investigating and speeds up the time to resolution for issues that can’t be mitigated.

Scenario: 1,000 users in 2 locations

As the number of users increases, proactive monitoring has a larger potential impact.

Scenario: 2,500 users in 5 locations

Scenario: 10,000 users in 20 locations

The complete model takes into consideration other factors including the number of desk phones and room systems deployed, the number of locations, the number of time zones operated in, etc.

Conclusion

Using reasonable assumptions related to operational management of a Microsoft Teams environment, for most organizations with 200 or more people, proactive monitoring and enhanced diagnostic tools can provide a significant return on investment by reducing the amount of support labor required. For organizations with over 1,000 users, proactive monitoring can halve the amount of IT support labor required. Larger organizations with over 10,000 users can expect proactive monitoring to reduce support labor by two-thirds.

This is only part of the story because outages also impact productivity and revenue generation for an organization.
We will explore these broader impacts in a follow-up article that will dive into the details of the second model we developed as part of this project.
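
The ticket and staffing assumptions listed in this article can be combined into a rough back-of-the-envelope calculation. The Python sketch below is one possible arrangement, not the operational model itself. The incident rates (1 per 1,000 phones and 1 per 50 Teams Rooms per day), the 60% mitigation figure, and the 20 versus 30 tickets per engineer per day come from the text; `user_tickets` is a hypothetical input, applying the 60% reduction uniformly to every ticket source is a simplification, and rounding up to whole engineers with a minimum of one is an assumption.

```python
import math


def daily_tickets(phones, rooms, user_tickets, mitigated_share=0.0):
    """Estimate tickets per day from devices plus user-raised tickets."""
    # Article assumptions: 1 incident per 1,000 phones and per 50 rooms, daily.
    device_incidents = phones / 1000 + rooms / 50
    return (device_incidents + user_tickets) * (1 - mitigated_share)


def engineers_needed(tickets_per_day, tickets_per_engineer):
    """Round up to whole engineers, with at least one person on support."""
    return max(1, math.ceil(tickets_per_day / tickets_per_engineer))


# Hypothetical environment: 500 desk phones, 50 rooms, 40 user tickets a day.
baseline = engineers_needed(daily_tickets(500, 50, 40), tickets_per_engineer=20)
monitored = engineers_needed(
    daily_tickets(500, 50, 40, mitigated_share=0.60), tickets_per_engineer=30
)
print(f"Built-in tools: {baseline} engineers; with monitoring: {monitored}")
```

For this hypothetical environment the sketch suggests three engineers with the built-in tools and one with proactive monitoring and enhanced diagnostics. The minimum of one engineer mirrors the observation that small organizations need at least one person either way.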

  • Teams Reporting: Evolving But Still Gaps

    Microsoft has consistently worked to improve quality and usage reporting associated with Lync, then Skype for Business, then Skype Online, and now Teams. There have been significant advances over the past nine years, but gaps and opportunities for improvement remain. Let’s first acknowledge the significant advancements Microsoft has made. Then we can examine the current gaps and opportunities to improve.

The History

The Call Quality Dashboard (CQD) was originally released as a free add-on for Skype for Business Server, the on-premises version of Microsoft’s chat, meeting, and communications server. In 2015 Microsoft released a version of CQD that worked with Skype for Business Online (the platform that would eventually morph into Teams). In 2016, version 2.0 of CQD provided access to 6 months of data and expanded reporting beyond audio quality to include video and app sharing information. The year 2017 brought further updates to CQD that added a reliability issue report focused on call setup issues. This was also the year Teams launched.

The combined Teams and Skype for Business admin center launched in 2019 and integrated the Call Quality Dashboard (although it really was just a menu link to the CQD portal).

A significant number of CQD updates were launched in 2019 under the “Advanced CQD” banner. Call data was now updated within 30 minutes (labeled “near real-time data”) as opposed to taking over 24 hours. The ability to drill down within reports, even to the user level, was provided, along with several new reports.

After years of improving CQD, Microsoft pivoted in 2020, bringing call quality data into Power BI (business intelligence) with the release of the first version of the Quality of Experience Report (QER) templates.

Current State

The latest version of the Power BI QER templates, version 8, is available here and a detailed listing of the various Power BI QER reports can be found here.
Recently Microsoft has deprecated the original CQD portal, adding a banner that directs users to use Power BI.

The current series of QER Power BI templates is packaged into five different templates, each with many reports:

    • QER.pbit is the main template, with over 20 reports focused on identifying Teams meeting and calling issues.
    • QER MTR.pbit provides reports focused on Microsoft Teams Rooms.
    • QER PS.pbit is a template optimized to analyze Microsoft Teams Phone System deployments.
    • CQD Teams Auto Attendant & Call Queue Historical Report.pbit includes three reports related to auto attendant, call queue, and agent usage.
    • CQD Teams Usage Report.pbit details how users in your organization are using Teams.

Current Gaps

Despite the significant number of changes and the large number of reports available through the admin center, the Teams admin center, and Power BI, there are several gaps between the current state and the ideal state:

1. Too much data, too few insights

The goal of analytics is to provide actionable insights, that is, to highlight issues you can take corrective action to address. The current reports still too often provide interesting visuals that don’t point IT professionals towards specific issues.

2. Inability to compare groups

The ability to compare quality, reliability, usage, adoption, and user satisfaction across different geographical, functional, and facility groups is one of the most powerful mechanisms to identify potential issues. While some existing Teams reports allow you to group results based on IP address, they lack the ability to track “VIPs” or other functional groups.

3. Too many “good” calls

CQD uses a very specific formula to classify calls as “poor”. The rules are too rigid: when multiple parameters are near their thresholds, users often report that the call was poor even though it is marked as good.
Specifically, CQD only marks a call as poor if Packet Utilization is > 500 packets and one or more of its classification conditions are met (see the CQD Stream Classification reference below).

4. Lacking a complete view

The CQD and Power BI reports do not have the ability to pull data from on-premises session border controllers (SBCs) or other network devices, which means you have an incomplete view of what may be causing issues. For organizations using Operator Connect or Direct Routing as a Service (DRaaS) this becomes even more challenging, as they don’t have access to details that can help identify the likely source of an issue.

Filling the Gaps

I recently had a detailed discussion with representatives from VOSS that focused on how they address the issues related to the built-in Teams reports for their customers. I came away from our discussion understanding that VOSS Insights was focused on addressing several significant Teams reporting limitations:

1. Focusing on actionable insights

According to VOSS, the name of its reporting product, “Insights”, speaks to the intent for the VOSS toolset to provide actionable intelligence into your complete UC estate. Customized dashboards can readily compare different user groupings. Customized dashboards can be complemented by intelligent alerting: the ability to group and summarize alerts as opposed to overwhelming IT pros with a barrage of alerts during an incident. Beyond providing actionable insights and alerting, in some cases the VOSS tools can initiate automated remedial action, known as self-healing. This can reduce the burden on the operations team and help to resolve certain issues more quickly.

2. Delivering multi-platform reporting

While many organizations have standardized on Microsoft O365 and Teams, lots still use other UC&C platforms for specific use cases. VOSS provides a “single pane of glass” even if you use multiple UC&C tools, so you can view and manage the full UC stack from a single point of control.
Understandably, Microsoft reporting does not (and likely will not) provide this capability.

3. Providing a more complete “big picture”

VOSS Insights incorporates the traditional CQD data along with proactive synthetic testing data, detailed data from SBCs, and network layer data such as NetFlow to provide an in-depth insight into the UC stack, helping to ensure better UC observability. This more complete picture can help shorten resolution time and reduce finger-pointing between teams (or providers).

4. Helping optimize cost

VOSS Insights can help analyze usage and optimize capacity and licensing data to ensure you are delivering communications and collaboration capabilities as cost-effectively as possible. Additionally, by ingesting facility information, including power consumption data, customized Insights dashboards can assist in delivering better overall asset management.

Information is Key

The built-in Teams reports have certainly evolved, and no doubt will continue to improve. However, the Microsoft approach often provides lots of reports, all with an overwhelming amount of data and limited information. Based on my discussions with VOSS, their toolset starts where the Microsoft reports end and focuses on providing actionable insights. For those responsible for delivering consistent, reliable, cost-effective communications and collaboration, this combination is worth investigating.

References:

VOSS site: https://www.voss-solutions.com/
VOSS Insights product details: https://www.voss-solutions.com/offerings/voss-insights/
CQD Stream Classification: https://learn.microsoft.com/en-us/microsoftteams/stream-classification-in-call-quality-dashboard
Power BI Quality of Experience Reporting: https://learn.microsoft.com/en-us/microsoftteams/cqd-power-bi-query-templates
RFC 3550: https://datatracker.ietf.org/doc/html/rfc3550
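
The “Too many good calls” gap described earlier can be illustrated with a small Python sketch of a threshold-based audio classifier. This is a simplified illustration, not Microsoft's implementation. The threshold values are listed as recalled from the CQD Stream Classification page referenced above and should be verified there before use; the metric key names are hypothetical and are not CQD column names.

```python
# Assumed thresholds; verify against Microsoft's CQD stream classification docs.
AUDIO_POOR_THRESHOLDS = {
    "audio_degradation_avg": 1.0,
    "round_trip_ms": 500,
    "packet_loss_rate": 0.1,
    "jitter_ms": 30,
    "ratio_concealed_samples_avg": 0.07,
}
MIN_PACKET_UTILIZATION = 500


def is_poor_audio_stream(metrics, packet_utilization):
    """Return True only if enough packets were sent and any limit is exceeded."""
    if packet_utilization <= MIN_PACKET_UTILIZATION:
        return False
    return any(
        metrics.get(name, 0) > limit
        for name, limit in AUDIO_POOR_THRESHOLDS.items()
    )


# Every metric sits just under its limit, so the stream is not marked poor.
near_threshold = {
    "audio_degradation_avg": 0.95,
    "round_trip_ms": 480,
    "packet_loss_rate": 0.09,
    "jitter_ms": 29,
    "ratio_concealed_samples_avg": 0.065,
}
# The same stream with jitter nudged over its limit is marked poor.
one_over = dict(near_threshold, jitter_ms=31)

print(is_poor_audio_stream(near_threshold, packet_utilization=2000))
print(is_poor_audio_stream(one_over, packet_utilization=2000))
```

The first stream has every metric close to its limit, which a user would likely experience as a poor call, yet a rule of this shape does not flag it. That is the rigidity the article describes.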
