PagerDuty Inc.

10/08/2024 | News release | Distributed by Public on 10/08/2024 06:44

Being Operationally Mature Can Save You Millions

On July 19th, a widespread technical failure crippled operations across industries, resulting in lost revenue, wasted operating costs, and damaged customer trust. For businesses that had built trust by providing reliable and resilient services, this had both an immediate and a lasting impact.

We estimate that the July 19th outage ('Outage') cost our customers billions of dollars, with hourly downtime costs in the millions for some companies.¹ Unfortunately, the consequences of the Outage were not always measured in mere hours, as the residual impact reverberated for many days after the main event.

The impact was not equal across companies. More operationally mature companies recovered more quickly and experienced 60% less business impact than their peers.² Our data concurs: operationally mature customers not only responded more quickly and efficiently to the Outage, with mean time to acknowledge (MTTA) up to 30% faster, but also proactively remediated residual issues before they arose. And by leaning more heavily into the PagerDuty Operations Cloud (with noise reduction to focus the team's efforts, and automation to streamline and orchestrate the overall response strategy), teams were able to get to resolution more than 60% faster than their peers, according to our data, empowering them to quickly return to their normal course of work. This translates to millions in potential savings from just one event, and it establishes a reputation for resilience and reliability in the eyes of their customers.

As companies move forward from this experience, it is critical that they evaluate whether they're prepared for the next event. In an interconnected world that relies on technical infrastructure that is both aging and becoming increasingly complex with advanced technologies like generative AI, it is not a matter of 'if' but 'when' the next outage happens.

Preparedness is an investment that will not materialize on its own during a crisis like the Outage. Instead, it requires companies to prioritize ongoing investment in operational maturity, including their operational platform, processes, and people on the front line.

Operational platform
During an outage, companies need a platform they can trust and rely on to be up and running when the rest of the world is not operational. The PagerDuty Operations Cloud is that best-in-class platform with unrivaled reliability.

On July 19th, the Operations Cloud demonstrated resilience, with our data showing that despite a sharp increase in transactional volume over the norm (Incident Workflows up 1400%), the Operations Cloud performed well within its service level agreements. This allowed PagerDuty to play a crucial role in helping our customers identify and resolve time-critical problems so they could get back online as quickly as possible and minimize the financial and reputational impact to their business.

People and process
With the PagerDuty Operational Maturity Model, customers can easily assess their current level of maturity as well as view top recommendations for improvement driven by peer-based benchmarking.³ The Operational Maturity Model contains key categories such as people management, noise reduction, and automation to help customers understand how prepared their teams are to manage incidents and time-critical work efficiently. This makes operational maturity at scale across tens or hundreds of teams a seamless part of everyday operations so organizations are always on their front foot.

Outside of the response effort, a critical part of the Operational Maturity Model is continuous learning. More mature organizations don't leave experiences like the Outage, or even incidents with far less news coverage, in the rear-view mirror without review and analysis. They leverage analytical insights and post-incident audits to identify areas of strength and resilience in their operations, as well as opportunities to mature. The PagerDuty Operations Cloud includes these analytics and post-incident review capabilities, which can help companies improve their operational maturity and set themselves apart from their peers.

Get started with the PagerDuty Operations Cloud today, or learn more about how you can increase your organization's maturity with our Operational Maturity Model.

1 Numbers calculated based on PagerDuty customers who had an increase of more than 100% in high-urgency incidents, the relative magnitude of the increase, the time that these customers remained in their elevated incident response state, and their total annual revenue and operating expenses.

2 These numbers were calculated by comparing customers who met or exceeded their 180-day average mean time to resolution during the Outage with all other customers.

3 Recommendations based on companies in similar industries and of similar size.