Datadog is a monitoring and analytics platform that offers real-time insight into your entire technology landscape. Yet machine identities, as important as they are to that landscape, are often left out of monitoring efforts. That is now starting to change.
A big differentiator for Datadog is that it integrates with just about every tool out there, which gives operators real-time visibility into everything in their technology ecosystem. Datadog monitors infrastructure and network security, and also handles application performance monitoring, log management, incident management, and a few other things. Part of the draw is that a Site Reliability Engineer (SRE) gets information from many different sources that need to be correlated, and Datadog provides a single pane of glass for looking into the environment.
Until recently, easy access to machine identities was not part of the Datadog metric set. That has changed thanks to our friends at New Context, who returned to the Machine Identity Management Development Fund with a Venafi-Datadog integration that gives SREs clear visibility of machine identities within their Datadog dashboard.
In this installment of our regular developer interview series, I speak with Michael McClanahan from New Context, the security innovator for highly regulated industries such as energy, telecommunications, finance, and government, to learn more about SREs and the valuable integration the team has built for Venafi Platform and Datadog.
Bridget: Let’s start with some background on Site Reliability Engineers (SREs).
Michael: An SRE is somebody who is essentially a software engineer at heart. SREs have a software engineering background, but they also understand operational tasks, server management, networking, and lots of security components. They essentially hate doing things manually, so they seek automation. The point of an SRE is not just to develop a product, but to look at the whole life cycle of a product as well as its support. That's generally where SREs differ from a typical IT organization, where you have your developers on one side and your operators on the other, not to mention network and security teams. An SRE team is supposed to be all-encompassing and perform across all of those functions.
With information coming in from all of these different sources, it's the SRE's job to make sure that those inputs all look like expected data. If they discover an anomaly, that would be the trigger to go investigate further. Or, if they have already built an automated response, the anomaly would trigger it.
Bridget: How does the Venafi-Datadog integration help SREs do their job?
Michael: If you're an SRE, chances are you're using something like Datadog already. What the Venafi-Datadog integration does is pull in information on machine identities for SREs who are responsible for maintaining some sort of “certificates as a service” within the organization, or for maintaining the certificate authorities used by the enterprise. In either case, SREs need to ensure that development teams can keep getting certificates when they need them.
SREs would use this integration to ensure that the Venafi platform is up and running, that it is working as expected, and that requests are flowing through it as they typically do on a normal day. The Venafi-Datadog integration essentially solves the challenge of giving SREs visibility into the automation used to collect TLS certificate data.
Bridget: SREs are probably on the front line for any certificate-related outages, then!
Michael: An SRE team is definitely responsible for the availability of an application, which means ensuring the product is up and running. But the reality is that certificate outages happen all the time. That's another thing the Venafi-Datadog integration brings SREs: visibility into their certificate landscape, so they know whether certificates are up to date or about to expire. Armed with that data, SREs no longer have to worry about being in the headlines when a whole service goes down because a certificate nobody knew about has expired.
As a matter of fact, I just saw a news headline on that, and it didn't seem that long ago that the company had an entire service outage that ultimately turned out to be a certificate issue. It's a problem that every IT organization has experienced at some point. For that reason, the biggest value-add in my opinion, at least for Datadog, is getting that single pane of visibility into everything that impacts availability. And now with the Venafi integration, you get visibility into machine identities, which is critical for maintaining your certificate landscape under your entire machine identity management program.
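To make the "up to date or about to expire" check concrete, here is a minimal Python sketch of the kind of logic that could sit behind such a dashboard signal. It is not the integration's code: the `CertRecord` shape, the function names, and the 30-day warning window are all assumptions made for illustration, and a real setup would read certificate data from its own inventory.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class CertRecord:
    # Hypothetical record shape for illustration only; this is not
    # the Venafi or Datadog data model.
    common_name: str
    not_after: datetime  # expiry timestamp, timezone-aware


def days_remaining(cert: CertRecord, now: datetime) -> int:
    """Days until expiry, rounded down (negative once the cert has expired)."""
    return (cert.not_after - now).days


def expiry_status(cert: CertRecord, now: datetime, warn_days: int = 30) -> str:
    """Classify a certificate as 'expired', 'expiring', or 'ok'.

    warn_days is an assumed threshold; pick one that matches how long
    renewal takes in your organization.
    """
    remaining = days_remaining(cert, now)
    if remaining < 0:
        return "expired"
    if remaining <= warn_days:
        return "expiring"
    return "ok"


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    inventory = [
        CertRecord("api.example.com", now + timedelta(days=12)),
        CertRecord("legacy.example.com", now - timedelta(days=3)),
        CertRecord("www.example.com", now + timedelta(days=200)),
    ]
    for cert in inventory:
        print(cert.common_name, expiry_status(cert, now))
```

The status string, or the raw number of days remaining, is the sort of value an SRE might alert on so that a soon-to-expire certificate triggers a renewal rather than an outage.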
The Venafi-Datadog integration is available now. You can learn more about the machine identity management metrics available in Datadog from the Venafi Marketplace. And stay tuned for future interviews with Machine Identity Management Development Fund recipients.
This blog features solutions from the ever-growing Venafi Ecosystem, where industry leaders are building and collaborating to protect more machine identities across organizations like yours. Learn more about how the Venafi Technology Network is evolving above and beyond just technical integrations.