Splunk Inc.

10/29/2024 | News release | Distributed by Public on 10/29/2024 22:34

The Observability Center of Excellence, Part II: From Pitch to Formation

Welcome back to our Observability Center of Excellence (CoE) series! In this second article, we'll dive into the steps to help you form your very own Observability CoE team. Haven't had a chance to check out the first blog? All good... Please take a quick look at "An introduction to the Observability CoE," as I will reference its content throughout this article.

TLDR recap: The Observability CoE positions you to build a leading observability practice by addressing the challenges and impacts of legacy monitoring practices. The table below summarizes what the Observability CoE helps with:

Challenges
  • Tools fragmentation
  • Low confidence in alerting systems
  • Reactive Monitoring/Response
  • Focus on tech domain, not service health
  • No Measurement framework

Impact
  • Increased Costs
  • Reputation damage
  • Prolonged outage calls
  • Tools sprawl and visibility inconsistencies
  • Business & IT Blind Spots

Still with me? Great! Now that you're on board, it's time to generate some excitement within your organization and build your CoE team. In this article, I'll give you the tools to spark that enthusiasm, secure executive buy-in, and start recruiting the right people to make this happen.

Your Internal CoE Pitch

Introduction to the Deck

Here's a quick rundown of what's coming next: We're about to dive into some key concepts you'll need to grasp in order to get buy-in and start building your Observability CoE. To help kickstart the process, I've included a SAMPLE PRESENTATION that covers each concept we'll discuss. Feel free to apply your favorite theme, make tweaks, and tailor it to align with your organization's vibe. At the end of this process, you'll have a deck ready to deliver to decision makers so you can get what you need to build out a CoE.

Before pitching and securing buy-in for your Observability CoE, make sure you first understand the key concepts covered in this article, as they're what I used to build the deck. These are the building blocks for gaining executive support and forming your team. As you review the content, focus on how each section aligns with your organization's specific goals or pain points. Relating the content to your business needs will help you present it in a way that resonates with leadership, aligns with strategic initiatives, and sets a strong foundation for your CoE.

Breaking Down the Pitch: Key Sections

Introduction of the Observability CoE

We've already covered the challenges that the Observability CoE addresses in depth in the previous blog post, "An Introduction to the Observability CoE," and in the TLDR recap above.

It's important that you understand the common challenges organizations face with legacy monitoring practices, the resulting impacts on business operations, and how the Observability CoE addresses these issues.

Better yet, think about examples where some of these challenges and impacts resonate with your business. For example, "Do you trust the alerts in your environment? How many alert emails hit your inbox a day?" or "Have you had instances where customers reported issues before your team even knew about them?" Tell stories about specific problems that your business has suffered because of insufficient observability. Use data on the customer and employee impact, if you have it.

By asking these questions and connecting them to your organization's specific pain points, you make the need clear right away before you start to ask for resources.

Demystifying Observability (as needed)

Observability can feel like an overloaded and misunderstood IT buzzword, but it's essential to level-set and break it down into bite-sized, vendor-agnostic terms. This section clears the air, giving your organization a balanced understanding of observability that will resonate with both executives and the technical leaders who will form your CoE. Here are a few topics you may want to run by them:

  • Observability vs. Monitoring: Monitoring tells you when something is wrong, but observability gives you the context needed to understand why things are breaking down, enabling faster resolution and proactive insights.
  • Pillars of Observability (MELT): The core pillars (Metrics, Events, Logs, and Traces) are the building blocks of any observability practice. Each provides a different perspective into your infrastructure, applications, and business services, helping to paint the complete picture.
  • Critical Capabilities: These were already covered in depth in the first blog, including infrastructure monitoring, APM, DEM, centralized log management, AIOps, and event management. Each capability plays a role in ensuring full-stack visibility.
  • The Observability Practice: The organization, utilization, and ultimate value of these capabilities form your organization's observability practice. The Observability CoE should be positioned to ensure that these pieces are optimized, standardized, and governed, ensuring a structured approach that drives business value.

The Dream Team: Roll Call

Next, let's talk about the key people who need to be involved in your Observability CoE. Below is an overview of the various roles to consider, along with common titles and some "pro-tips" for each role as you build the CoE.

As you assemble this team, remember to keep it lean (a "two-pizza team"), focused, and agile, with just enough members to accomplish your objectives. This team should include a strong mix of IT resources, such as support, IT operations, SRE/DevOps, application development, and those responsible for critical IT processes.

Keep in mind, team composition may vary depending on the type of organization you're in. For example, a shared services model, business unit IT, or more advanced SRE setups may require different mixes of skill sets and roles within the CoE.

It's important to strike a balance between CoE work and team members' existing responsibilities. One way to achieve this is by attaching the CoE to high-priority initiatives already underway in the organization. Early on, as you form your CoE, the focus will be on clarifying priorities, setting scope, and ensuring that observability inputs/outputs are positioned to be ingrained in related and dependent processes. Over time, the CoE will transition into delivering iterative value, driving your observability strategy forward while balancing the team's day-to-day roles.

Here's a table summarizing who should be involved and their responsibilities:

[Link]

Sample CoE Focus Areas and Key Initiatives

Once your Observability CoE gets rolling, the value it can bring to your organization is practically limitless. Below are some sample initiatives the CoE might take on, all tied to the key areas we covered in the first blog: governance, standards, tools, business alignment, and measurable impact.

The goal of this section is to give you practical examples of things that both your executive sponsor and CoE team members could agree are currently missing and need to be fixed. This is the art of the possible, laying out what can be achieved through focused CoE initiatives.

These examples are just the start. Pick the ones that resonate with your team, or better yet, make the selection itself an early win for the CoE by reviewing the options together and choosing what fits your organization. Be sure to ask those you're pitching to whether they can think of any additional initiatives. And stay tuned, because in future posts, we'll dive into the nuts and bolts of how to execute on some of these core use cases.

Some of these may sound buzzwordy. Make sure you use the right language for your audience and your company. If you want to learn more about these topics, I'll be writing about them in greater detail later.

Governance, Standards, and Best Practices
  • Develop Repeatable Standards and Baselines
  • Observability-as-a-Service Request and Fulfillment Offerings
  • Develop a Repeatable Observability Framework for New and Legacy Workloads
  • Creating a Tier-Wise Observability Offering

Business Alignment
  • Identifying an Observability-as-a-Service Offering
  • Connecting Observability with Critical IT Initiatives
  • Leveraging Observability Vendors
  • Educating the Organization on Observability
  • Shifting Observability Left in the SDLC

Measurable Impact
  • Measuring Observability KPIs (agent saturation, alert-to-incident ratio, tools license utilization)
  • Assessing Organizational Observability Maturity
  • Quarterly "State of Observability" business report

Tools
  • Tools utilization/saturation
  • Tools audit
  • Driving Tool Adoption
  • Reducing Tool Sprawl
  • Champion Observability Solution POCs
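
To make the KPIs named under Measurable Impact a little more concrete, here is a minimal Python sketch of how a team might compute them from quarterly counts. The article does not define these KPIs, so the formulas, the field names, and the sample numbers below are all assumptions for illustration; adapt them to however your organization actually counts hosts, alerts, incidents, and licenses.

```python
from dataclasses import dataclass


@dataclass
class ObservabilitySnapshot:
    """Hypothetical quarterly counts; all field names are illustrative."""
    hosts_total: int          # hosts in scope for observability
    hosts_with_agent: int     # hosts reporting through an agent
    alerts_fired: int         # alerts raised during the period
    incidents_opened: int     # incidents opened during the period
    licenses_purchased: int   # tool licenses paid for
    licenses_in_use: int      # licenses actively used


def ratio(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def kpis(s: ObservabilitySnapshot) -> dict[str, float]:
    """Compute the three KPIs under the assumed definitions above."""
    return {
        # Share of in-scope hosts that have an agent installed.
        "agent_saturation": ratio(s.hosts_with_agent, s.hosts_total),
        # Alerts fired per incident opened; a high value may hint at noise.
        "alert_to_incident_ratio": ratio(s.alerts_fired, s.incidents_opened),
        # Share of purchased licenses that are actually in use.
        "license_utilization": ratio(s.licenses_in_use, s.licenses_purchased),
    }


# Made-up sample numbers, not data from any real environment.
snapshot = ObservabilitySnapshot(
    hosts_total=200,
    hosts_with_agent=150,
    alerts_fired=1200,
    incidents_opened=40,
    licenses_purchased=500,
    licenses_in_use=350,
)
report = kpis(snapshot)
for name, value in report.items():
    print(f"{name}: {value:.2f}")
```

Tracking the same three numbers every quarter is one way to feed the "State of Observability" report with a trend rather than a one-off snapshot.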

Meeting Format and Cadence

Getting your Observability CoE off the ground requires commitment, but the goal is not to overload your team with additional meetings. Start small, find your rhythm, and gradually increase as the CoE begins to deliver real value.

Begin with weekly or bi-weekly meetings, keeping them around 30-45 minutes. These should focus on the essentials: what has been completed, what's next, and anything blocking progress. A key task in the early meetings will be prioritizing the focus areas and/or initiatives (examples above) the team will tackle first. Given the list of objectives or tasks (e.g., tools audit, synthetic monitoring for critical workflows), it's important to determine which actions deliver the most immediate value and align closely with existing business goals. This helps ensure that the CoE doesn't feel like an extra task on top of an already busy IT schedule.
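
If it helps to structure that prioritization conversation, one option is a simple weighted score. The sketch below is only an illustration: the 1-to-5 scales, the equal weights, the idea of dividing by effort, and the sample scores are all assumptions rather than anything prescribed by this article, and the initiative names are borrowed from the examples above.

```python
from dataclasses import dataclass


@dataclass
class Initiative:
    """A candidate CoE initiative scored by the team (hypothetical scales)."""
    name: str
    immediate_value: int      # 1 (low) to 5 (high), the team's own estimate
    business_alignment: int   # 1 (low) to 5 (high)
    effort: int               # 1 (small) to 5 (large)


def score(item: Initiative,
          value_weight: float = 0.5,
          alignment_weight: float = 0.5) -> float:
    """Weighted benefit divided by effort; higher means 'do this sooner'."""
    benefit = (value_weight * item.immediate_value
               + alignment_weight * item.business_alignment)
    return benefit / item.effort


# Placeholder scores for illustration only.
backlog = [
    Initiative("Tools audit", immediate_value=4, business_alignment=4, effort=2),
    Initiative("Synthetic monitoring for critical workflows",
               immediate_value=5, business_alignment=5, effort=4),
    Initiative('Quarterly "State of Observability" report',
               immediate_value=2, business_alignment=3, effort=3),
]

ranked = sorted(backlog, key=score, reverse=True)
for item in ranked:
    print(f"{score(item):.2f}  {item.name}")
```

The numbers matter less than the discussion they force: scoring as a group tends to surface disagreements about value and effort early, before the CoE commits to a first initiative.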

As the CoE progresses, staying closely connected with your executive sponsor is crucial. Regular communication helps maintain alignment with business priorities and keeps leadership engaged. Make sure to conduct quarterly reviews to showcase progress, celebrate wins, and adjust where necessary. These reviews should involve your sponsor to ensure that the CoE's outputs remain relevant and impactful.

Remember, the goal is to let the CoE grow at a pace that delivers results without overwhelming the team. Start small, prioritize effectively, and stay connected to leadership. I've included a sample CoE meeting agenda and a quarterly update slide in the sample deck for reference.

Wrap-Up: Next Steps and Call to Action

By now, you've seen just how important an Observability CoE is for your organization. The next step? Generating excitement and securing buy-in from both your executive sponsor and future CoE members. Use the content from this blog and the reference deck as your guide to spark that momentum.

Take the deck, update it with your organization's specific context, and grab some time on your proposed sponsor's calendar. Secure buy-in, form your team, and start getting those meetings on the books to set your CoE in motion.

This is just the beginning. We'll be rolling out additional content soon, with more guidance on specific CoE initiatives and how to make them a reality in your organization. Check back here for more.

If you're passionate about learning more about observability, I'd encourage you to check out my teammates' Observability content on Splunk's Community blog and watch some of our latest videos on YouTube (Splunk Observability for Engineers).