In the world of IT operations, “observability” is a concept that’s been around for some time. Having been in IT operations for more than 30 years, I can say that, even before anyone called it “observability,” we were in effect examining ways to achieve the same ends.
While definitions can vary, in essence, observability is about gaining an understanding of a given system by tracking and analyzing its external outputs. When done right, observability should tell you two things:
What’s going on now
How to make improvements that yield better future results
Getting back to basics: 4 steps to gain true observability
Despite all the purported potential of observability, many IT leaders I speak with still don’t feel like they’re really getting what was promised. Over time, the concept of observability has grown highly complex, unnecessarily so in my opinion. I think a lot of teams could realize the power of observability by taking a simpler approach. Here are four key steps to get started.
#1. Don’t collect more data, collect the right data
Today, large enterprises are awash in massive volumes of data points being collected by various monitoring systems. In time, enormous databases have emerged. Invariably, when issues arise, operators don’t know where to start; no one has time to look at all the data captured. Further, the reality is that many of these data points have no bearing on what really matters: the user experience.
Start by taking a comprehensive inventory of all the data being collected, particularly with respect to application performance. Look at what needs to be captured and, just as importantly, what doesn’t.
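One way to make such an inventory concrete is to record, for each metric, where it comes from, what consumes it, and whether it reflects the user experience. The following is only a minimal sketch: the `Metric` fields, the `triage` helper, and the sample metric names are all hypothetical and stand in for whatever your monitoring tools actually export.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metric:
    # All fields are illustrative assumptions, not a real tool's schema.
    name: str
    source: str                # monitoring tool that collects the metric
    used_by: tuple = ()        # dashboards or alerts that reference it
    user_facing: bool = False  # does it reflect the user experience?


def triage(metrics):
    """Split an inventory into metrics to keep and candidates to review."""
    keep, review = [], []
    for m in metrics:
        # Keep anything tied to the user experience or actually consumed.
        if m.user_facing or m.used_by:
            keep.append(m)
        else:
            review.append(m)
    return keep, review


# Hypothetical sample inventory.
inventory = [
    Metric("checkout_latency_ms", "apm", ("checkout-dashboard",), True),
    Metric("switch_port_7_crc_errors", "netmon"),
    Metric("db_pool_wait_ms", "apm", ("db-alerts",)),
]

keep, review = triage(inventory)
print("keep:", [m.name for m in keep])
print("review:", [m.name for m in review])
```

Metrics that land in the review list are not necessarily worthless; they are simply the ones nobody has shown a need for yet, which makes them the natural place to start trimming.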
#2. Do an honest assessment of approaches and tools
Next, do an objective assessment of current tools and approaches. See how you can rationalize the disparate tools being used. By reducing the number of tools that need to be procured, implemented, and supported, teams can realize a number of benefits, including reducing costs, administrative overhead, and monitoring “noise.”
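A simple starting point for rationalizing tools is to map each one to the areas it covers and look for tools whose coverage is entirely duplicated by another. The sketch below assumes a hand-built coverage map; the tool names, coverage areas, and the `redundant_tools` helper are hypothetical.

```python
def redundant_tools(coverage):
    """Map each tool to another tool that covers everything it does.

    `coverage` maps a tool name to the set of areas it monitors.
    A tool is flagged only when its areas are a proper subset of
    another tool's areas; tools with identical coverage are not flagged.
    """
    redundant = {}
    for tool, areas in coverage.items():
        for other, other_areas in coverage.items():
            if tool != other and areas < other_areas:  # proper subset
                redundant[tool] = other
                break
    return redundant


# Hypothetical coverage map for four monitoring tools.
coverage = {
    "tool_a": {"network", "apps", "databases"},
    "tool_b": {"network"},
    "tool_c": {"apps", "databases"},
    "tool_d": {"mainframe"},
}

result = redundant_tools(coverage)
print(result)
```

Overlap on paper is only a prompt for the honest assessment described above: a flagged tool may still earn its place through depth or usability that a coverage map cannot capture.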
#3. Ensure you have the right team
As you go through this process, it is also important to look at the people who analyze the data.
In recent years, as environments have continued to evolve, I’ve often seen a disconnect between the data being collected and the expertise of the people reviewing it. For example, a team of network engineers will be tasked with assessing increasing volumes of application-centric data.
In establishing a site reliability engineering (SRE) function at one organization, we were able to build a team with top talent from across key technology domains. We had network engineers, Java programmers, database administrators, mainframe specialists, and more, all of them among the company’s top experts in their areas.
Having the right expertise in the right place was invaluable. With this team, we could make significant strides in terms of establishing observability and gaining actionable insights for improvement. Most critically, we had the expertise needed to execute those insights.
#4. Break down silos
True observability isn’t achieved with monitoring silos. Ultimately, this isn’t simply about networks or apps—it’s about holistic observability that reveals what the user experience is really like. All different aspects need to be looked at together to make decisions that are best for the user experience and the business.
Within many enterprise IT shops today, the promise of observability isn’t being fully realized. By stripping out the data and tools that aren’t creating value, and taking a holistic, user-centric approach, teams can establish effective, powerful observability.
The post Back to basics: Keys to taking a pragmatic approach to observability appeared first on CIO.