Is DevOps Bullshit? An Interview With Cory O’Daniel, Author Of The “DevOps Is Bullshit” Blog Post
Has DevOps really worked?
DevOps was conceived as a way to bring developers and operations together, but has it really worked? In this episode, Dave Williams sits down with Cory O’Daniel to discuss his controversial blog post “DevOps Is Bullshit.” Find out why Cory sees platform engineering as a solution. Discover its potential as an antidote to the unmet promises and inherent challenges within the current DevOps landscape. Tune in now!
I'm here with Cory O'Daniel, CEO of Massdriver. Thanks for taking the time. I appreciate it.
Thanks for having me.
I want to talk about your article on the Massdriver blog. You criticized the current state of DevOps.
That's me.
You claim it's an unholy beast of division and tunnel vision. Can you elaborate on your personal experience that led you to those thoughts?
I think those are two very different thoughts that I put together because they sound good together, but I'll start with the whole division. The whole initial concept of DevOps was always to bring together developers and operations in this culture. What's funny is you can hop on Hacker News and you'll find people talking about how, “We did that well.” The reality is most companies haven't.
We have this thing that we say time and time again where it's like, “Tools won't solve the problem. It's a cultural problem. We have to fix it.” We've been at this for many years and most companies still haven't figured out DevOps, so much so that we have DevOps teams now. We've rebranded these operations engineers.
I feel like you do have this division, and these are people that can get along well around the water cooler. But when my product manager wants me to finish a feature, I end up in this weird world of TicketOps where I have to ask somebody else to do some work for me so that I can do my job. That's where the division is. You still see this in a lot of companies, and you see the animosity building up between the teams, which makes that cultural DevOps movement even harder to achieve because the operations teams are understaffed.
It's funny when you say DevOps teams. I don't know that a DevOps team was ever part of that original idea. It was like, “With tooling one person can be both.”
That felt like the initial pitch. Over the years, I feel like there's been a little bit of revisionist history as to what it originally was, but it made sense at a very specific point in time. DevOps was plausible before the cloud grew to the size that it is. If you rewind to 2008 or 2010, it was hard to drive to a data center, rack something, go back to the office and write code. As we started getting VMs, Slicehost and stuff like that, where I could provision a piece of compute infrastructure and run my application on it, it made sense.
It was possible because we didn't have the weight of compliance on us. We had very simple machines that we ran very small applications on. In a world with a global audience and people demanding high uptime, even for non-critical services, it becomes difficult for an engineer looking at the 200 services of AWS to handle all that and keep it in their head while trying to make sure that their product manager's happy and they're delivering the features that the business needs.
It's interesting you talk about that complexity. What role do you see platform engineering play in mending that loop or solving this problem?
This is probably another mildly hot opinion, but one of the things that I've always felt was problematic in software development as a whole is that you do have software developers who have a specialty. You'll meet that person who's worked at eCommerce company after eCommerce company. I think that's great. When we start to think about platform engineering, it's very similar. There's a domain of knowledge that needs to be understood there for that person to do their job well and that's the cloud. One of the important things for somebody who's interested in getting into platform engineering is having that deep knowledge of infrastructure, cloud services and application development, and treating that domain of expertise as your product.
The question I have is how hard that is going to be to find in the market. What are some of the challenges that companies are going to face as they take that on?
It's hard to find. I think that's why you're seeing a proliferation of different services, us included, that help people get there quicker by offering this buy option. One of the difficulties of getting there is that if you look at the Stack Overflow survey over the past few years, the number of people with cloud operations experience relative to the number of software developers has gone down. A few years ago, it was 10%. You could hit that 1 in 10, 1 operations person for every 10 engineers. In 2024, it's 6%.
It's not that we are losing those people. Some are probably retiring. We're making engineers faster. We're not just getting engineers out of universities. We're getting them out of bootcamps and there are great engineers coming out of there. We have more people joining the industry. We have more companies that are becoming software companies and traditional old companies that are now moving their services to the cloud.
We're getting a lot more people and a lot more companies on the cloud. Cloud operations is still very much on-the-job training. The hard part is getting that expertise, and when you look at operations engineers, there's a good subset of those people that aren't software developers. They've got great knowledge of how networks and Kubernetes work. They've worked in data centers but they might not be software engineers. Finding the right talent is definitely going to be hard because a lot of it is in FAANG.
Finding the right engineers that can build this platform is going to be difficult. One of the things we need to do as an industry is to be a bit less opaque about operations. We're not going to get there unless we get these people. That means getting the engineers you're bringing on working with operations engineers early, so they can start to share some of this knowledge of how these systems work, rather than having a team of, let's say, three ops people that have all this knowledge while everybody else is asking for services through tickets. We've got to share that information and get some of this on-the-job training a bit earlier. I don't think we're going to see it come out of schools. It's very vendor-specific. I can't teach you AWS at a university and then you graduate four years later and all the services have changed.
Expanding on that, it's going to be a challenge to get to platform engineering. Over the last decades, everybody's bought into DevOps. They're not exactly there either. What's the core difference? If you're halfway into DevOps, how do you get to platform engineering? Why would you even want to go to platform engineering? What are the advantages?
The number one thing for you as an operations engineer is that burnout. We do have a higher burnout rate than software engineers and our turnover is much faster. One of the key questions there is: are you working at a company you want to be at long term? Do you love the product? If so, you need to communicate to your team how you can start to add value as an operations engineer. As I say in the article, you don't see a lot of product managers running around high-fiving people because they built a fast autoscaler. A lot of the work that we do as operations engineers isn't visible until something's broken. We need to be able to communicate how we can add value to the business. Numbers rule. We have to back it out to cash.
How much time are your engineers spending fiddling with this? How backlogged is the average engineer waiting for your smaller team of operations folk to get them the things that they need? How comfortable are we in compliance? How far are we behind on our advance of dashboards? Gather those things and say, “If we start to take some time to address some of the debt that we have and consolidate some of these systems that we've built, we can start to move our way towards platform engineering.”
It's not going to be something you do tomorrow. It's something you have to build on over time. You've got to start dedicating that operations team to building and figuring out how to build a product, not just solving tickets for people. That's going to require software engineers on that team. It's going to require possibly a product manager or at least somebody acting in that role.
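To make "back it out to cash" concrete, here is a minimal sketch of the arithmetic, assuming you have measured how long engineers sit blocked on operations tickets. The function name and every input number below are hypothetical placeholders, not figures from the interview or the article.

```python
def weekly_ticket_wait_cost(engineers: int,
                            hours_blocked_per_engineer: float,
                            hourly_rate: float) -> float:
    """Rough weekly cost of engineers waiting on operations tickets.

    All three inputs are values you would measure yourself;
    nothing here comes from real data.
    """
    return engineers * hours_blocked_per_engineer * hourly_rate


# Placeholder inputs: 40 engineers, each blocked 3 hours a week,
# at a loaded cost of 100 per hour.
cost = weekly_ticket_wait_cost(engineers=40,
                               hours_blocked_per_engineer=3,
                               hourly_rate=100)
print(cost)  # 12000
```

A single number like this is crude, but it gives the business something to weigh against the cost of dedicating people to a platform.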
It sounds like speed is at the heart of the advantage you get from switching to this platform engineering mindset, moving from the ad hoc scripting and automation of things to some self-service. Can you share some insights on how organizations can practically begin to move into that?
One of the key things, and I think a lot of organizations are unfortunately missing this nowadays, is starting to lean into IaC very early. It's going to be impossible to get to platform engineering with some scripts that set up a cluster or with ClickOps. We need to be able to create declarative definitions of our infrastructure, whether it's Terraform, Pulumi, etc., and start to think about the use cases of this business. How many teams do I have? How many different clusters am I running?
What are the similarities between these teams? What are things that are important for us to start to codify that they shouldn't have to worry about? If you are an organization that has to worry about CCPA, HIPAA or GDPR, get a lot of that baked into those infrastructure bundles where you can. That is the first step: starting to codify those things that are painful for engineers and are frequently requested of operations engineers.
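As a rough illustration of "baking compliance into a bundle," here is a minimal Python sketch. It is not Terraform, Pulumi or Massdriver code; `DatabaseBundle`, `make_database` and the settings in `COMPLIANCE_DEFAULTS` are hypothetical names and values chosen only to show the shape of the idea: the platform team codifies the rules once, and application teams declare what they need.

```python
from dataclasses import dataclass, field

# Hypothetical baselines a platform team might codify once.
# The specific settings are illustrative, not legal guidance.
COMPLIANCE_DEFAULTS = {
    "HIPAA": {"encrypted_at_rest": True, "audit_logging": True},
    "GDPR": {"encrypted_at_rest": True, "region": "eu-west-1"},
}


@dataclass
class DatabaseBundle:
    """Declarative description of a database a team has asked for."""
    name: str
    team: str
    settings: dict = field(default_factory=dict)


def make_database(name: str, team: str, regimes: list) -> DatabaseBundle:
    """Build a bundle with the compliance settings already applied."""
    settings = {
        "encrypted_at_rest": False,
        "audit_logging": False,
        "region": "us-east-1",
    }
    for regime in regimes:
        # An unknown regime raises KeyError rather than being silently skipped.
        settings.update(COMPLIANCE_DEFAULTS[regime])
    return DatabaseBundle(name=name, team=team, settings=settings)


db = make_database("orders", "checkout", ["HIPAA"])
print(db.settings["encrypted_at_rest"])  # True
```

The application team only names the database and the regimes that apply; the encryption and logging decisions live in one place that operations owns.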
In the Microsoft For Startups blog, you touched upon the insatiable hunger for software and the rapidly expanding software engineering workforce. How does platform engineering change the trajectory of those software engineers?
The funny thing with platform engineering is that my argument in that article, the first article of that series, was that DevOps is b******. I think that’s where we are nowadays. It most definitely is. I think the original idea made sense. If we can execute platform engineering well, DevOps ceases to be b******. The idea of “you build it, you run it” is where we want to get to. Our application developers understand the services they need. They understand how Kafka works, Postgres, their applications and business domains. That is build it and run it. They should be able to get that Kafka instance or some SNS and SQS queues or EventBridge or whatever it is. If they can get that quickly and focus on their business, they are doing DevOps. They just happen to have this platform internally. I think that is one of the keys there.
As those engineers start to practice DevOps, what I think changes about their careers as the engineers on the platform is they get to start to hone their craft more. They get to focus more on the business and adding value to that business rather than having these interruptions all day long when they're trying to figure out how to do a DevOps task or change a config or fix a state file in a Terraform pipeline, things that add zero value to the business but are necessary. They'll be able to focus on only the things that are necessary.
There's the context switching of the DevOps mindset, especially for the single developer who's doing both: switching from Terraform back to your app, back to Terraform, back to your app, to your observability software, and so on. Is there a world where the same developer is working on the platform and the product at the same time, or are these two separate tracks?
I feel like they've probably got to be two separate teams or at least divided in effort. If you're trying to deliver business value you have to be focusing on the product, the product that the business is selling, not the product that you're selling to your own engineers. I think that clear delineation is important. That will allow your developers who are building products to focus on that without blurring the lines. I think one of the things that's hard about the way that we've done DevOps is that as software developers, we want abstractions. We want these nice interfaces between things, but in DevOps, it's very easy to blur the line of, “Am I in infrastructure? Am I in the application?”
You obviously know it because of your code, but you'll see people shift infrastructure concerns into their applications. If you can create these interfaces where these things connect together and you don't have the ability to blur those lines, I think that's important. You'll see the same thing in business. You don't want to have your eCommerce concerns starting to blur into your platform. You want it to be solely about delivering software quickly in a secure, compliant way with the minimal overhead possible for your engineers. That clear delineation is very important.
It brings up an interesting point: the big difference between platform engineering and DevOps is that product mindset. In your experience, have you encountered issues with the prioritization problem when you're not thinking in that product mindset? In DevOps, it's a list of security tasks that are all high priority versus, “I know the needs of my customer and I need to meet those needs.”
When dealing with priorities, particularly in a platform, there are ways for a team to prioritize their work. Most importantly, it is trying to figure out the biggest component that can shift the most complexity out of the hands of your developers, trying to find that core system or core pattern that exists across your 2, 3 or 4 teams and get that codified and out of their way as quickly as possible.
The catch here is you also have to balance that with security and compliance. That's one of the things that, in a lot of companies, is not getting as much attention as it should. Getting the security and compliance into your platform earlier is important. I think it's about trying to figure out how to prioritize what's going to be more efficient for your developers in their day-to-day versus what security and compliance things we have cut corners on because either we didn't know or we didn't have time or it didn't matter. Get those into the platform early, because those security and compliance constraints can affect the way that you're doing deployments down the line. It can have a pretty big impact. Those are important cornerstones of good platform engineering that people should prioritize alongside the work that saves engineers the most time.
The most shift left you could ever be on security and compliance is in the product process, where you're weighing the time saved doing something against the amount of money lost if we have a breach.
It's funny, we're always saying, “Shift left.” We're shifting a lot of the accountability left but not the expertise, even looking at the number of people who have deep security experience. Going back to the Stack Overflow survey, it's something around 3% to 6% of engineers who have deep security experience. Let's say you get somebody on your team who has deep security experience and you've got three other teams that are delivering software.
What's the most effective way of getting that person in and increasing the security of your business? Is it giving them to your eCommerce API team, or is it taking that person over to your ML team, or is it getting that person plugged into your platform engineering team where they can act as a force multiplier for security, implementing those security details in the platform and delivery pipelines or maybe doing security scanning as software is going out, rather than having them work with the specific team where you think the biggest problem is?
That's an awesome vision of the future. I want to wrap up with your general thoughts about what the world looks like when we succeed in making the shift from DevOps to platform engineering.
We don't have enough operations and security engineers on the planet to operate and secure all the software that we're putting in the cloud. There are a lot of teams that we've met while doing sales at Massdriver that say, “Security is not a revenue driver.” Security is not a revenue driver for anybody but a security company. How do we get more stable, secure and compliant systems? That's the future that I'm most concerned with. Everything is moving online. I think that platform engineering is the way that we get there. Operations and security are very much on-the-job training.
You can do a little bit of security training at the university, but they're not teaching you how to do deep penetration testing. You can do a little bit of operations at university, maybe learning about Docker, but they're not going to teach you about 400 different cloud services and how to weave them all together. We don't have enough of these people. If we can magnify the experience of these people through a platform, whether that's a platform that you buy, a platform that you build internally or a platform that's open source through something like CNCF, that is the future. More software faster, more secure software, more compliant software and engineers that are happy in their jobs, solving business problems and not dealing with the tedious day-to-day of what we now consider DevOps.
A world where software eats the world and doesn't s*** all over it. I love it.
That would be fantastic, but we're s***** a lot.
I appreciate you taking the time. It was great to talk to you here.
Thanks for having me and remember, DevOps is still b******.