Episode 50

Published on:

10th Jun 2026

Continuous Integration at Agentic Velocity with CircleCI’s Rob Zuber

When code gets cheaper to produce, feedback becomes the limiting factor - CI, reviews, and the handoffs between tools can quietly slow everything down.

Rob Zuber breaks down what platform engineers are seeing as teams adopt AI-assisted development: more branch builds, new failure modes, and growing pressure to shorten the loop between “change made” and “change validated.” He focuses on how CI can evolve from a human-first dashboard into a system that agents can interact with directly through APIs, CLIs, and MCP-style interfaces - so fixes can happen faster and with less waiting on manual triage.

Along the way, Rob and Cory dig into practical questions engineering leaders are wrestling with: how PR review becomes the next major bottleneck, what “agent experience” means in a delivery pipeline, why speed isn’t only about faster compute (it’s also about doing less unnecessary work), and how teams can share learnings so “agentic velocity” doesn’t only benefit a few power users.

If you’re building or running the systems that ship software, this is a clear look at where CI fits in an AI-accelerated workflow, and what needs to change to keep delivery safe, fast, and sustainable.

Guest: Rob Zuber, Chief Technology Officer at CircleCI

Rob Zuber is a 20-year veteran of software startups, a four-time founder, and three-time CTO. Since joining CircleCI, Rob has seen the company through its Series F funding and delivered on product innovation at scale while leading a team of 300+ engineers who are distributed around the globe.

CircleCI, Website

CircleCI, LinkedIn

CircleCI, GitHub

Transcript
Cory:

Welcome back to the Platform Engineering podcast. I'm your host, Cory O'Daniel. Today on the show, I'm joined by Rob Zuber, CTO at CircleCI.

Rob's been in the software space for three decades, been a CTO three times over, and you've been at CircleCI for the past eleven years?

Rob:

Yep.

Cory:

Welcome to the show. CI, that is... that's where all of our compute used to go.

Rob:

Yeah. Yeah. Well, first of all, thanks for having me. Excited to be here.

Yeah, it used to be, and it's kind of... it's back in a new and different way. You know, I think people are trying to go as fast as they can. It turns out when you go as fast as you can, you actually want to know if your stuff's any good. So we're reimagining what it is that we do, but we're right in the thick of it now, for sure.

Cory:

Yeah. I feel like there's a lot of discussion... I mean, there's a ton of discussion online about AI in general, but, like, I feel like a lot of what I see on LinkedIn and Reddit is people talking about, like, how do the fundamental things that we've kind of built on for the past twenty, thirty years, like change in this new world? And I feel like CI is going to be at the heart of a lot of it.

So, super excited to talk to you today and see what you all are experiencing and what you're seeing on the other side of the world.

Rob:

Yeah.

Cory:

From us, who's just kind of pushing Docker images up.

But yeah, so, I mean, like, you know, software is changing a fair amount. We're getting a lot... a fair bit.

Rob:

Yeah, just a bit.

Cory:

Open a couple more PRs.

What has... I guess internally at CircleCI, what is the load like?

I know GitHub's talking about how much more changes and pull requests and PRs they're seeing. Are you all seeing the same type of just sheer velocity change from the amount of code changes getting pushed through?

Rob:

Yeah, absolutely. So we're seeing a significant increase in volume... handling it well from a reliability perspective. And then what we are seeing is a mix change, if that makes sense.

Like, you know, we think about branch builds where people are checking to see if the thing they're currently working on is working as expected, and then mainline, trunk, whatever you want to call it, meaning they've actually merged and they're trying to get something out to production. And we've seen shifts in the mix, like kind of more branch builds. I mean, of course, it's distributed across customers differently.

But in a lot of cases, more branch builds, meaning there's activity, but maybe not as much of an increase in main builds, meaning people are seeing, you know, lower success rate. Right.

So the work of their agent is maybe okay, but they're using CI and that branch build as a means to check that it's good and they're finding out that it's not.

And so we're investing a huge amount of energy right now in trying to shorten that feedback cycle, give people things that they can connect directly to their agents so that they can get a shorter cycle basically, and know that their work is good and basically not just good or bad, but give the feedback directly to the agent to say, "Actually change this, change this, change this before we even sort of go through the process or the ceremony of sort of PR and beyond."

And so, yeah, absolutely, we're seeing that volume and what we're seeing as a result is folks trying to find ways to tune the whole process so that it all doesn't just back up at kind of like PR and then release.

Cory:

Yeah, I feel like it's a spot of the stack where it's at least, I feel like twofold going to be one of the places that slows us down. A) just the build times, right? CI takes time to run. We're spending time pushing code up and now we're pushing way more PRs up than we used to. I mean, I think last week I shipped like seventy PRs or something like that to the terror of my team, who is on the other side. The other thing that's going to slow us down a bit, I feel like in this adoption journey, is the review. The humans in the loop looking at everything after it's been pushed up.

But it does seem like one of those places where it's going to be pretty critical, as more people start to adopt these workflows, that CI works well and fast and it kind of keeps you in the loop with what's being changed at an extremely high velocity. Which we haven't seen before.

What is changing? I mean, besides just the sheer volume. But what are, I guess, customers and what are engineers requesting differently about CI than maybe what they were looking for a year or two ago?

Rob:

Yeah, I think so. So there's a whole category around speed, which we'll talk about... I'll get into it in a second.

And then there's what I would call agent experience, which is like I'm actually operating with the other tools in my tool chain by letting the agent do that, right? So the interfaces... like, an agent's not going to a web UI to figure out what went wrong. So we're exposing a lot more direct capabilities through tools that we could, you know, we would also use as humans, like CLIs and APIs, but also an MCP server, like those sorts of interfaces that allow the agent to kind of run its own loop and say, "Okay, I pushed a thing, I'm watching CI, CI failed, I've got the logs, I know what happened. I'm pushing again." Like basically allowing people to get further away from that process. I mean, we'll come back to your whole thing about humans and where they get involved.
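The loop Rob describes here (push, watch CI, read the logs, push again) can be sketched in a few lines. This is only an illustration under stated assumptions: `CIClient`, `FakeCI`, `fake_agent`, and `agent_ci_loop` are hypothetical names invented for this sketch, not CircleCI's API, CLI, or MCP server, and the in-memory fake exists only so the example runs on its own.

```python
from dataclasses import dataclass
from typing import Callable, Protocol


@dataclass
class BuildResult:
    passed: bool
    logs: str


class CIClient(Protocol):
    """Hypothetical interface standing in for an API, CLI, or MCP tool."""

    def run_build(self, commit: str) -> BuildResult: ...


def agent_ci_loop(
    ci: CIClient,
    first_commit: str,
    propose_fix: Callable[[str, str], str],
    max_attempts: int = 3,
) -> tuple[bool, list[str]]:
    """Push, watch CI, read the logs on failure, push again.

    propose_fix(commit, logs) is the agent step and returns a new commit id.
    Returns (succeeded, commits_tried).
    """
    commit = first_commit
    tried: list[str] = []
    for _ in range(max_attempts):
        tried.append(commit)
        result = ci.run_build(commit)
        if result.passed:
            return True, tried
        # Feed the failure logs straight back to the agent, no human triage.
        commit = propose_fix(commit, result.logs)
    return False, tried


class FakeCI:
    """In-memory stand-in so the sketch runs without a real CI service."""

    def __init__(self, good_commits: set[str]):
        self.good_commits = good_commits

    def run_build(self, commit: str) -> BuildResult:
        if commit in self.good_commits:
            return BuildResult(True, "")
        return BuildResult(False, f"lint failed on {commit}")


def fake_agent(commit: str, logs: str) -> str:
    # A real agent would edit code based on the logs; this only renames the commit.
    return f"{commit}+fix"
```

Calling `agent_ci_loop(FakeCI({"abc+fix"}), "abc", fake_agent)` walks the loop once through a failure and once through a pass. The `max_attempts` cap is a design choice of the sketch, so an agent that cannot fix the build hands control back instead of pushing forever.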

But then on the speed front, there's like absolutely raw compute... you know, we spent a lot of time over the years optimizing parallelism and getting sort of wall clock time down, not just compute time... but then there are also some, you know, some shortcuts, right?

Like you could think about doing the same thing as fast as possible, right? How do I take all these steps and get really fast compute, optimize caching, manage parallelism, like all the things that we think about. But then you could think about doing less work, right?

So when we talk about giving feedback directly to the agent, first we isolate the parts that are going to be really valuable to the agent in that step. Right. Certainly these days when I push something and it fails linting in CI, I'm like, "How did this happen? How did I get this far down the process? Oh, now I gotta like, you know, force or like, you know, amend my commit, force push, all these kinds of things."

It's just like a lot of overhead that is... yes, like, the individual step might take time, but it's more that like it comes back to me as a human, maybe I went to get coffee, now I come back and like, you know, a whole bunch of time has passed. So really pushing all these things down to the agent. Again, isolating the things that really matter, doing those super, super quick, right? With kind of live standby compute that, you know, we can run that check in thirty seconds, not your ten minute sort of build time. And then things like isolating specific tests that we know are the ones that will be impacted by a change and running those, maybe prioritizing them, running them first, only running those on a branch. Like things that are not just what is faster the same... I guess the same thing, like at a faster linear speed... but how can we make nonlinear improvements?
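The "doing less work" idea, running the tests impacted by a change first, might look roughly like this. It is a sketch under assumptions, not CircleCI's implementation: `select_tests` and the dependency map are hypothetical, and a real system would have to derive the map from coverage data or build graphs. Tests with no known dependencies are treated as impacted, which is a deliberately conservative choice.

```python
def select_tests(
    changed_files: set[str],
    test_deps: dict[str, set[str]],
    all_tests: list[str],
) -> tuple[list[str], list[str]]:
    """Split tests into (impacted, rest) for a given set of changed files.

    test_deps maps a test name to the source files it is known to exercise.
    A test with no entry is treated as impacted, since nothing is known about it.
    """
    impacted: list[str] = []
    rest: list[str] = []
    for test in all_tests:
        deps = test_deps.get(test)
        if deps is None or deps & changed_files:
            impacted.append(test)
        else:
            rest.append(test)
    return impacted, rest


# Hypothetical dependency map, purely for illustration.
deps = {
    "test_auth.py": {"auth.py", "db.py"},
    "test_billing.py": {"billing.py"},
}
tests = ["test_auth.py", "test_billing.py", "test_new.py"]
impacted, rest = select_tests({"db.py"}, deps, tests)
```

On a branch build, only `impacted` would run (or run first); the `rest` list could be deferred to the mainline build.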

Because people, everything else is getting sped up before that in a way that makes... you know, a ten minute build time used to feel great for a lot of people. And now people are like, "What am I supposed to do for ten minutes?" Right? I mean, they spin up another agent, another agent, another agent. Like whether they multi, you know, they multitask and get lost.

And then on the human side... and this is like we're kind of... we have touch points all through the process. Can we give you enough validation and enough confidence in the change to reduce the human involvement? Because it is like... you know, I was joking actually just earlier today with some folks on my team that what I really want just to start is like estimated reading time. You know, like I look at a pile of PRs and I'm like, "What can I carve out time for?" You know, like on Medium or something, it's like this one's sixty seconds and this one might take you twenty minutes. Like just that would be great. But then, you know what we're layering on top of that is this change is a critical part of the system.

This change was made by someone who maybe doesn't know this part of the system very well. This might take a little bit more. This feels like something. Honestly, you could auto merge this, right?
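A toy version of the signals Rob lists (estimated reading time, whether the change touches a critical area, whether the author knows that area, and whether it could be auto-merged) could be sketched as below. Every number and name here is an assumption made up for illustration: `CRITICAL_PREFIXES`, `LINES_PER_MINUTE`, the 20-line threshold, and the labels are not from the episode or from any product.

```python
from dataclasses import dataclass

# Hypothetical settings; a real team would tune or learn these.
CRITICAL_PREFIXES = ("billing/", "auth/")
LINES_PER_MINUTE = 50
SMALL_CHANGE_LINES = 20


@dataclass
class PullRequest:
    changed_lines: int
    touched_paths: list[str]
    author_prior_commits_in_paths: int


def triage(pr: PullRequest) -> dict[str, object]:
    """Return an estimated review time in minutes and a review label."""
    minutes = max(1, round(pr.changed_lines / LINES_PER_MINUTE))
    critical = any(p.startswith(CRITICAL_PREFIXES) for p in pr.touched_paths)
    unfamiliar = pr.author_prior_commits_in_paths == 0

    # Critical code or an author new to the area: budget extra time.
    if critical or unfamiliar:
        minutes *= 2

    if critical:
        label = "careful review"
    elif not unfamiliar and pr.changed_lines <= SMALL_CHANGE_LINES:
        label = "auto-merge candidate"
    else:
        label = "standard review"
    return {"minutes": minutes, "label": label}


small = triage(PullRequest(10, ["docs/readme.md"], 5))
risky = triage(PullRequest(400, ["billing/invoice.py"], 0))
```

The point of the sketch is the shape of the output: a reviewer scanning a queue sees a time estimate and a label per PR rather than a raw count of changed files.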

I think we're going to move away from kind of this big grandiose ceremony around the PR and into systems that are designed to move at the speed. The PR has always kind of been like a laborious gate.

Talk to anyone about developer efficiency and you end up having a conversation about PR review time, lag, and then from everyone it's this heavy exhale, like "ugh." Right.

So it doesn't feel like it's the thing that was even tuned to be fast when we were just typing code. And now, you know, you're trying to go ten, a hundred times faster. Something has to be different about that. And so we're investing again in, within that automation, how can we give you the strongest signal to either shorten or even like reduce the number of checks that you have to make?

Cory:

Yeah, I think that's actually pretty interesting. There's a couple of things you said there, like just the intelligence in the commit or pull request itself.

I think that with the number of PRs I still review a day, I do think that there would be a significant life improvement just to seeing like this one's going to take a minute to review. It's like, "Okay, let me get Joe happy and approve these quick ones for him really quick." Versus that one's going to take 25 minutes... because sometimes you'll see a bunch of file changes, but the file changes do not correlate to the amount of time it takes to process and understand a pull request. That was interesting.

The idea of being able to, and I know you can do this today with labels and whatnot, but again, it's one of those laborious tasks to do as a bag of meat. But to be able to tag parts of your system or even repos and say that this is a critical versus a non critical component and treating your reviews and PRs differently, that is something that's hard to do for many orgs today.

While it would be very beneficial even without AI, I could see that that would be... like to be able to look at a bigger system and be like, "I don't have to worry about this as much because it's not hitting a core part of the stack." And like being able to identify that, that would be pretty awesome.

Rob:

Yeah.

As a platform that sits in the middle of software delivery, we have a pretty good historical understanding of what changes in your system lead to difficult outcomes. Let's put it that way. Right.

Like we know when things break downstream and so have a sense that like, oh, this is a place that you should probably feel pretty good about making a change. This is a place I would put a little bit of extra effort into. Right. There's known side effects.

Historically this has led to, you know, those like tightly coupled places where everybody changes one thing and forgets another and then we're like, "Oh yeah, that incident happened here, et cetera, et cetera." So like, I think there's... I guess intelligence begets intelligence in a way.

Like we've applied intelligence to a part of the process and now we have a lot of opportunity to apply it elsewhere, which is really the only way we're going to be able to speed up. Now, I'm saying... let's call it artificial intelligence. We've applied humans to the process before, but again it's sort of this, like, intuition.

I find this really interesting in general, right. Like we have humans who have a very unevenly distributed mental model of our system.

Whether it's who in the organization is really trustworthy and who we should double check their work or is just growing or whatever, and where in the system things are a little sketch, and I don't feel super comfortable that the test coverage here, even though the number is high, really truly gives me confidence that I can put this thing out, that this thing's, you know, super robust. And it's like some people know it, some people don't, some people, whatever.

And I think what we're really seeing now is the ability to extract that and have it applied more universally, right? Like as people build out skills and context and whatever with their organization, to be able to say, cool, this change... like, regardless of who created it and who's looking at it, this system has all of that knowledge or can gain all of that knowledge and retain it in a useful way if we manage it effectively.

I think it was like the late nineties, we spent all this time talking about expert systems, which was like... you remember that?

And now we finally built them and all they are is a pile of markdown and a repo. But it's like we took everyone's knowledge and we put it somewhere we actually are able to apply it.

Like that was a fool's errand in the late nineties, right?

And now it's just like, it's just kind of happening because people are writing stuff down or they work on something and they're like, "Hey, Claude (or whoever/whatever tool), can you write this down for me and make a record of it?" And as that stuff gets more centralized, right, we're able to extract and consolidate these little pockets of understanding that were just, you know, "if this person happens to be the person that reads your PR, it'll go okay." Which is not great. Organizational design was never designed. It's just what happened.

Cory:

Yeah, the code owner, it's just like, yeah, Cory owns it. If he... it's going to be fine. It's like, yeah, maybe, maybe not. Yeah, no, that is...

I feel like there's going to be a lot of really interesting changes to the way that we write and deliver software. I mean, I know even in my own practice, like, I'm a very TDD oriented engineer and I'm a person who opens a PR very early.

I love seeing a failing build. I love seeing them go green. I just open and let 'em rip.

But I've done a lot of engineering myself just around my workflow that I've had to rethink my own development workflow itself as I started to adopt AI tools. So a thing that I used to do a lot was I don't love the idea of a linter failing my build on a CI run.

I don't mind seeing a test fail that I'm working on. Great. That's what's expected. I want my teammates to see that test failing, so they can see where I'm at in the process.

But I've always leaned into the pre commits and whatnot. It's like, hey, get most of this stuff. Just make sure it's good before you even waste anybody's time.

And now I'm just like, there's way too many agents running for them all to be running all my pre commit tooling locally. That is CI's problem now, right?

And it's just like... it's something that I had spent a lot of time, like, trying to be a good steward of CI. I'm just like, commodity compute, let that all run over there. Like, I don't have time for this thing to run locally anymore.

But it is funny because it's like that, that ten minutes that you're waiting, then you're like, oh, I guess I start up another agent. Like, that's exactly what happened to me.

I was like, when I first started, I was like, I'm only going to do one at a time so I don't drive myself insane with a thousand PRs. And then the first time you're staring, you're like, shit, I don't have anything to do for ten minutes.

You're like, I could write that spec in ten minutes.

Rob:

Yeah.

Cory:

And then all of a sudden you got two. And then all of a sudden you got, I think eight right now running in the background.

So as far as like your own team's concerned, like, what are some of the things that, you know, you've got a, you got a big team, you got a few hundred engineers right around.

Rob:

About a hundred, yeah.

Cory:

A hundred.

So, what are you seeing, not in the CI process itself falling apart, but, like, what are you seeing starting to break just for larger engineering teams that are starting to adopt these tools?

Rob:

There's a lot of things to think about there. And certainly I'm working also with a very specific team as well as the broader organization.

And in that specific team, one of our goals or mandates was like, let's run as fast as we can and see what breaks. And yeah, it took an hour before we were like, well, this is never going to work.

And it was PR reviews, like two of us sitting across the table from each other hammering out PRs. And we're like, okay, when are we going to stop and review each other's PRs?

And then I looked at them and I was like, how am I going to get through this? We're going to be here for the rest of the week. Right. And so then we started building tooling to review PRs, right? Like how do we...

And we actually built something, I mean, speaking of the people you trust, we built a thing that extracts kind of the most common and most trusted reviewers from the history of a repo, turns that into context and applies it to the next round of reviews.

So it says, okay, well we know someone's going to comment on this and on this and on this and like how many senior or staff or you know, whatever the top level of your org is, engineers, open up a PR and go, seriously, again with this mistake. Like, we've made this mistake a hundred times. We've talked about it. Like speaking of the linter, right?

And why would I put this in front of someone else and use their time? We know what all the first round of review comments are going to be, so let's just have them happen locally, right?

Like basically we, within our agents, right, we have skills that are like execute this type of review, you know, before it ever gets committed and pushed.

So that by the time we're asking a human to look at it, I mean, honestly, they're also pulling some little local CLI and running some review over it, saying pull out the things that I care about kind of thing. So like trying to optimize a lot of that stuff.

So review is probably not surprising, but it's pretty easy to pull some stuff together that will, that will optimize that a lot.
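The reviewer-history idea Rob mentions, turning the most frequent reviewers' recurring comments into context for a pre-push review, can be approximated with a small sketch. This is not the tool his team built: `build_review_context`, the `(reviewer, comment)` input shape, and the thresholds are assumptions for illustration, and "most trusted" is reduced here to "most frequent", which is a simplification.

```python
from collections import Counter


def build_review_context(
    history: list[tuple[str, str]],
    top_n: int = 2,
    min_repeats: int = 2,
) -> str:
    """Turn past (reviewer, comment) pairs into a checklist for an agent.

    Keeps only comments from the top_n most frequent reviewers, and only
    comments those reviewers have made at least min_repeats times.
    """
    reviewer_counts = Counter(reviewer for reviewer, _ in history)
    top_reviewers = {name for name, _ in reviewer_counts.most_common(top_n)}

    # Normalise case and whitespace so near-identical comments group together.
    recurring = Counter(
        comment.strip().lower()
        for reviewer, comment in history
        if reviewer in top_reviewers
    )
    rules = [text for text, n in recurring.most_common() if n >= min_repeats]

    lines = ["Check for these recurring review comments:"]
    lines += [f"- {rule}" for rule in rules]
    return "\n".join(lines)


# Made-up history, purely for illustration.
history = [
    ("ana", "Add a test for the error path"),
    ("ana", "add a test for the error path"),
    ("ana", "Avoid global state"),
    ("ben", "Avoid global state"),
    ("ben", "avoid global state "),
    ("cy", "nit: rename"),
]
context = build_review_context(history)
```

The resulting string could be handed to a local review step so the predictable first-round comments are raised before a human is asked to look.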

And then, you know, I would say a hot topic in every circle that I travel in at the moment, engineering leaders, and CircleCI is no different, is token consumption. And what's interesting there is efficiency, right?

Like the way that some people use LLMs versus the way that other people use LLMs, just very, very different. And as a result, right? Like there's a lot of discussion of leaderboards and token maxing and all this kind of stuff.

Like I don't really care about that. I would prefer if people aren't just trying to burn tokens to make it look like they're doing work.

But for folks that are using them in good faith, like we see really big differences in sort of outcomes relative to kind of inputs and outputs, right? And so trying to help people understand how can I use this effectively, where was all this energy getting used, right?

And then can we tool around that? Can we build little CLIs or skills or whatever's, you know, whatever's the right thing to get you to an outcome faster?

Because yes, that's like token cost is real, but also just as a human, you're trying to get something done and someone else has figured out how to get the LLM to get that thing done much faster than you seem to be able to get it done. And that's, that's no shade on people. Like, we're all learning this toolkit together. So how do we bring those things together?

How do we create sort of shared repos of skills and stuff like that internally to again give people the boost of other people's learning? Like, I think that's... if I were to summarize all that, like, it started with token costs, but really it's fractured learning, right?

Because I can't just, you know, go on to Amazon and buy the book on how to be awesome at using whatever today's coding agent is, because by tomorrow it's going to be a different one or like a new version and optimized for different things. New model versions drop. And like your prompts that used to be great are now a little, you know, not working very well.

So there's, there's all kinds of those different pockets again of like fractured learning.

Ironically, I was talking earlier about how we're bringing all our knowledge together and so this is a new area that's just like we're learning so quickly. So that... bringing everyone along is hard, right? Like even... I'll say just 100 people, right?

That's a decent sized engineering org, but I know people in engineering orgs with 35,000 engineers. I can't even imagine what that's like, let alone how you try to educate them all on how to be really effective with this kind of tooling.

So, I don't... like, we have the standard problems, right? Cycle times of reviews, you know, we optimize a lot of it. But it's really, how do we do that then consistently across the org.

Not that everything is like cookie cutter stamped, but just without leaving people behind who are sort of like struggling or...

Cory:

Yeah, yeah, no, I feel like even at 100 engineers, that's gotta be tough. I don't know how. I'll be very curious how these larger, larger, larger organizations handle it.

But like we're a fairly small team and you know, as we started leaning into it, like it was just, it just felt like all of a sudden there was forty employees one day. Like it was just like the amount of PRs. It was like, oh, two people on the team have figured it out. And then it was like, we got to get... now the rest of the team has to figure it out. Or like, the value extraction's off, right?

Like, if two people, you know what I'm saying, are like, are outputting 40x, that sucks for everybody else that has to review that code. Right now, their job is reviewing code, not writing code, which, that's the worst place to be.

Like, I haven't figured out how to use this tool yet, and my reward is reviewing everybody else's output because I haven't caught up.

Host read ad:

Ops teams, you're probably used to doing all the heavy lifting when it comes to infrastructure as code wrangling root modules, CI/CD scripts and Terraform, just to keep things moving along. What if your developers could just diagram what they want and you still got all the control and visibility you need?

That's exactly what Massdriver does. Ops teams upload your trusted infrastructure as code modules to our registry. Your developers, they don't have to touch Terraform, build root modules, or even copy a single line of CI/CD scripts. They just diagram their cloud infrastructure. Massdriver pulls the modules and deploys exactly what's on their canvas. The result?

It's still managed as code, but with complete audit trails, rollbacks, preview environments and cost controls. You'll see exactly who's using what, where and what resources they're producing, all without the chaos. Stop doing twice the work.

Start making Infrastructure as Code simpler with Massdriver. Learn more at Massdriver.cloud.

Cory:

I was talking to my co founder about this. Before he'd leaned into it, before he kind of like got the hang of it, he's like, "It feels like I, you know, don't really... like, what value am I adding to the team? Like, you guys move so fast without me," right? And then it's just like...

Then you see him, like, start to catch up, and then all of a sudden it's just the volume of code is there, right? And it's just like they're all kind of working on the guardrails and catching stuff and making sure that good code quality is going in.

And it's like once you get to that part and you start to feel it humming, it does feel really good. But then it's that next step of when we bring on our next teammate. What does interviewing look like, right?

What does bringing them into the fold look like?

Where all of a sudden it's like, you know, you think about five years ago, six years ago, I feel like our number one goal in, like, bringing on a new hire was like, can we get them to put something into production on day one? Like, could you imagine sitting down at a job and you're like, oh, let's go take a look at how many GitHub PRs there are.

And it's like, oh, there's six thousand that were open today. Like the, just the sheer intimidation of joining a team that's kind of figuring it out could be, yeah...

Rob:

It feels like in a way it's an interesting description. Like you're trying, you know, you're trying to jump onto a moving freight train kind of thing.

But you could also, like, your speed is so enhanced, right? Like my ability to drop into a code base I've never looked at and be like, how does this work? Tell me what the key pieces are.

How would I implement this thing? And just, you know, grepping, grepping, grepping, whatever. Like what I used to do to figure out a code base.

Oh, let me pop open this file, look up this file. Oh, maybe here's a... here, this looks interesting. I wonder if I can find a reference to this function somewhere. Right?

It all gets done for me and then I get a little written description and it's probably 80% right, 90% right. It's close enough for me to start working. So like that idea that I've been in that exact same place, right?

Like, how do we get someone to ship on the first day?

And it's like, well, let's find this really simple isolated bug that we know they could fix in about ten minutes because the rest will be setting up their laptop and figuring out how to push and making sure all the access is right. And then everyone's standing around and waiting to get paged. That whole thing has shifted.

Cory:

Right.

Rob:

It doesn't have to be this one liner change anymore. It's like, I don't know, I could probably refactor this whole part of the code base in about twenty minutes.

We can grab lunch and then we'll push it to production, like. And so I think it's accelerated on both sides of that in a way that I think is particularly interesting.

And then kind of on your interviewing front, I think even more broadly, right to your point of you and your co founder kind of working on this stuff together.

I think as an engineering leader, the only way for me, everyone's mileage may vary, whatever, but the only way for me to like, reason about that problem is to do some of it, right? Like, I don't, I can't say, oh, well, I need you to show up and do this in an interview or we should look for these traits, right?

As I'm sort of, like, thinking through something with, like, directors or whatever, because I don't know. I don't know what it's like to do the job anymore unless I do it right?

Like, I spent 30 years or whatever the number is in this industry, like, living off of my expertise, the expertise that I built over all that time.

And then one morning, like, the way that I did the job was irrelevant, you know, and to be like, okay, everybody, I need you to do this today, or whatever. Not that that's how we operate, but, you know, to generalize and make it simple.

I was like, I don't actually know what I need anyone to do anymore. So how am I supposed to help them even ask good questions, Help them understand how they could think about the problem differently?

Because we're all going through this together, right? And so I think it's, for me, again, the way that my brain operates, like, I need to do some of it.

It's not all I do is sit around and write code at this point, but I have to do some of it just to understand the problems, to understand. And then also, I mean, for me, our customers are all going through it.

And so to think about the product, to go talk to my customers and have a real conversation about what they're grappling with, right? To not have done a bunch of this work myself would. Would kind of be ludicrous.

Cory:

What's interesting, too, is, like, I feel like we're quickly approaching... I feel like a handful of problems around, like, novices in the space. I feel like we're gonna have almost like a... just an extremely wide gap between people like yourself with 30 years of engineering experience that have built systems, that understand... you know, like, you've written the code, you've built the systems. And now we've hit juniors that are coming in. They're like, I haven't done that.

I've read a book. I haven't put anything in prod yet.

And I know how to use an LLM to generate some code, but I don't know what a healthy, good production system looks like. I feel like that gap is going to be a real interesting place for us to kind of train and make sure that we don't have a ticking time bomb ten years out when some of us start retiring. Hopefully, please, God, let me retire in ten years. I hope they don't have a tick... So I feel like there's that with the novices, but then there's this whole other... that's slightly terrifying.

It seems like a ticking time bomb of people that we definitely need to be able to pass off experience to, such that the machines aren't just doing all the things blindly. But then the other side, which we're also experiencing, is the non software developers that are writing software now, right?

And you see a lot of this on the Twitters and the Blueskys where it's like a solo founder: "I got an MBA and I built some software now, magically." That's fun.

But I think the scarier version of that is I have a thirty-five thousand employee company and the SDR has magicked software into existence on his machine, right? And it's like, oh, we went from we all run our software in the cloud to SDRs are just making software on their laptops now, right?

And now our entire organization's laptops are potential software deployment targets.

Rob:

Yeah, right.

Cory:

And I think, I think these two things are happening at the same time, which I think is going to be very interesting for us as an industry to solve. Like, how do we make sure that these juniors are getting this knowledge from seniors? Like, how are we apprenticing people?

And then at the same time these complete novices are just like, ah, my laptop is a production machine now. And it's like, oh, shit.

Rob:

Yeah, yeah. I mean, I think the latter. I definitely want to come back to the juniors.

The latter is an interesting one because, I mean, I think of, like, Microsoft Access and Excel, right? Like, Excel is still, like, the dominant no-code programming platform. I mean, it's got bits of code or whatever in it, right?

But, oh yeah, probably on the planet. Or maybe it's Google Sheets at some point, but, like, you know...

I'm a big fan of Wardley Mapping, which is probably super nerdy here, but, like, if you go all the way to the left, in, like, Genesis, and just look at who's doing what, totally bespoke, like, totally for themselves. And then, is there enough of that to indicate there's a product opportunity, right?

Like if I ran an IT department inside a company, I'd be going and looking at what everyone was building in Excel and say, oh my goodness, can I solve at least this problem for you? Right?

And so people having these tools in their hands and expressing, because they're terrible at telling you what they want, but if they build it, you can be like, oh, please don't run that, but I'll replace it for you in about three hours at this point, right?

Because your IT team or whoever can be like, oh, we could knock one of those out and it'll be secure and it'll be integrated into SSO and all these other things that we need. I think it's a huge opportunity if you kind of treat it right.

And this is, like, shadow IT as a, I don't know if that's a domain, but as a concept, has been around for as long as I've been in this industry and always stems from trying to prevent people from solving their own problems. Yeah, right. We make things too hard to do, so they go find a way to go around it, right?

Cory:

Oh, yeah, baby.

Rob:

And, you know, you get corporate documents sent on Yahoo email or whatever. Like, I'm definitely dating myself here. But, like, because, oh, the limit is too low, right?

Everyone's trying to clamp down the Exchange server, so we're just using Gmail to send each other, like, really important critical documents or whatever. Of course, now everybody uses Gmail anyway, like, those things, right? And why does everyone use Gmail for their companies now?

Because it solved the problem instead of trying to, like, clamp it down. Right. So I think looking at how people are expressing their needs is really helpful.

If you ignore it, then you're going to end up with, yeah, like, connectors out to production systems and whatever. So also don't let your SDRs have connectivity to production. That seems like a good thing to prevent. Right? Like, find a way. So whatever.

We'll manage through it. On the junior front, I'm so torn. Right. Even the cloud. So I mean, I've been around long enough that I had to fly to Virginia like everybody else.

We all saw each other in Dulles Airport and we drove down to our respective data centers and racked boxes and crimped cables and all this kind of stuff. And I happily don't do that anymore. But I know what a data center looks like on the inside. Right?

But I work with engineers who, like, the cloud is this magical thing where compute magically appears. And there are sets of problems, right?

There are classes of problems that occur because of the physics of data centers that you can reason about as a software engineer if you either have studied enough or have been there, right?

And you're like, oh, well, obviously if these things are this far apart and this thing needs to get this from disk instead of whatever, like the magic stops and the pain starts. Right? And so I think that we go through that same transition. Right.

But the number of cases where that actually occurs seems to get fewer and fewer over time.

Cory:

Right.

Rob:

Because people build better and better abstractions in the cloud. Right. Higher level services that we can then build on that, that abstract away some of the challenging concepts. Right.

And whether that's, like, using RDS instead of deploying my own database in the cloud, or even Aurora, like, a level above (primarily AWS, I'll be a little AWS-centric on the tooling), but, like, a layer above that. Or Spanner, you know, shout out to GCP. Like, I'm building layers of abstraction so that the people using them don't have to reason about the underlying pieces anymore.

Cory:

Ideally, yeah.

Rob:

Which means I reduce the number of failure modes where I'm like, wow, I actually need to understand... You know, we definitely had a major incident type where the way that EBS snapshots got restored impacted the behavior of our software and impacted our service.

But like, I could tell you about that one time in 11 years, you know what I mean? Like, it's not that important that I understand the details. So today we're in this super hybrid mode, right?

Like AI is generating code, but we're looking at it, we're understanding it, we're not totally sure that the AI wrote the right thing, et cetera, et cetera. But then the question is, what tools can we build?

It might not just be better LLMs or better models or better training, but tooling around that, right? You mentioned TDD. People are super into harnesses now. And, like, to me that's, like, TDD in a bash script kind of thing.

Like, how do I put this into a process where the outcome is guaranteed or much higher probability of being good so that as a junior, what I'm learning is the skill of building harnesses, not the skill of like understanding all this underlying software. And I don't know what time horizon we're on to get there, but I'm hopeful, right?
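To make the harness idea concrete, here is a minimal sketch in Python of a loop that only accepts generated code once it passes a fixed set of checks. Everything in it is an illustrative assumption rather than anything described in the episode: `run_harness`, `first_try`, `second_try`, and `CHECKS` are hypothetical names, and the two hand-written functions stand in for successive model outputs, where a real harness would call a model and run a real test suite.

```python
from typing import Callable, Iterable, Optional

Candidate = Callable[[int, int], int]
Check = tuple[tuple[int, int], int]  # ((arg_a, arg_b), expected_result)


def run_harness(candidates: Iterable[Candidate],
                checks: list[Check],
                max_attempts: int = 3) -> Optional[Candidate]:
    """Return the first candidate that passes every check, or None."""
    for attempt, candidate in enumerate(candidates, start=1):
        if attempt > max_attempts:
            break
        try:
            if all(candidate(*args) == expected for args, expected in checks):
                return candidate
        except Exception:
            # A candidate that crashes counts as a failed attempt.
            continue
    return None


# Hypothetical stand-ins for two successive generated implementations of "add".
def first_try(a: int, b: int) -> int:
    return a - b  # wrong on purpose


def second_try(a: int, b: int) -> int:
    return a + b


CHECKS: list[Check] = [((1, 2), 3), ((0, 0), 0), ((-1, 1), 0)]

accepted = run_harness([first_try, second_try], CHECKS)
print(accepted.__name__ if accepted else "no candidate passed")
```

The point of the sketch is that the checks, not the person reading the code, decide what gets accepted, which is the skill of "building harnesses" in miniature.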

Because I think if all we do is just, like, the people that did it before are the only people that can do it now, like, we'll have long, illustrious careers, right into, like, our retirement years.

But that's not what we want for the world, it's not what I want for the world. Whereas I think we really need to rethink what it is we're trying to teach people, and, like, projecting forward into a world we don't understand, if I'm totally honest, to say the skills you're going to need a year from now are these, so focus on these. Like, we're guessing, right?

I mean the, the rate of change is beyond anything most of us I think can comprehend to say I really need you to invest in like memory management. Right. Or whatever. Like whatever sort of like lower level concept of, of computing that just people are not going to have to think about anymore.

Cory:

Yeah.

And it's rough too, because, like, I even have, like, a friend who just switched careers, like, two years ago into computer science, and he's just like, what should I be focusing on? I'm like, I have no idea. Because, like, the thing you might be learning might be solved in, like, six months. Right.

Which is just hard. It's like, I mean the fundamentals, the, the fundamentals and how to give a good code review.

I don't know, it, it does, it does seem like there'll be quite a big, quite a big knowledge gap. But yeah, I think that, you know, educating and training people is going to be one of our bigger challenges. But yes, it is, it is a wild one.

Just to think like it is, it is at a clip that we have not been at before.

Rob:

Yeah. The best I can come up with is, like, welcome them in and bring them along for the ride, you know? Like, again, I'm working with a small team as part of also, like, our larger organization, and we have a couple pretty junior folks on that team. They're super open to learning. No one is saying, this is not how we've done it before, like, that's not going to work. You know what I mean?

They just have no reason to be skeptical because they're just like so excited about learning whatever, which is amazing. Great enthusiasm and learning things that like, you know, they're just at a point in their career where they're super absorptive.

I don't know what the right word is, but, like, just willing to take on new information. And it's not challenging a core set of beliefs. And so they're super productive and super excited to have the opportunity, all that.

So, you know, I think it's like, you know, one option is to worry and try to plan how we're going to train people. I guarantee we'll be wrong. So the other option is just like, let's go on this ride together and we'll figure it out.

Cory:

Yeah. What do you think happens to open source over the next few years? I know there's been a few, like, "open source is dead" kind of takes.

But thinking about, I wouldn't say when I was a junior, but I was originally a PHP engineer and then I was introduced to this wonderful thing called Ruby on Rails 1.0 a long, long time ago. And I think there was a joke around, like, Ruby 3, where, like, the next version of Ruby would just have tab completion for SaaSes.

I don't know if you remember that joke. It was just like it just got to the point where it's like it could generate so much. Yeah, right.

There's just like the Rails generators were beautiful. You could just, you could whip code out in no time, right? But like, you know, going back to being a junior engineer, like doing development, right?

Like even senior engineers, like we're obviously reaching for dependencies constantly, right?

In our software, whether it's, we're grabbing some service and throwing it into a Kubernetes cluster, whether it's, I'm grabbing a new library, right? There's been a lot of takes that, like, open source is dead, it just doesn't know it yet.

And I feel like between that and the amount of agentic attacks, the supply chain attacks, et cetera, like there is something interesting happening in the open source space.

Like, you know, as a CTO of a company that runs CI and is dealing with builds, but also, you know, somebody who's writing software and has to take dependencies or has to make decisions about taking dependencies.

Like how are you viewing the world of open source and the value you get out of an open source library or piece of software in this era where it's so much more likely, I guess, that there will be agentic attacks or that there will be supply chain attacks. And how are you guys kind of thinking of open source in this world?

Rob:

Yeah, I think there's, I mean, there's clearly market factors at play, right? And you've named a couple of them. The security of open source, the maintainability of open source.

We recently had an opportunity to meet some folks who run a registry because their consumption, right, we're talking about consumption increasing, their consumption has gone up to a point. And most of them are like not for profit, right? The consumption of just downloading constantly out of builds.

But even like some of these data companies, their entire pipeline just pulls everything back out of the registries on every run, right? And they're like, we can't. You're using it, we're paying for it. Like we can't actually pay for this thing anymore.

How are we going to rebuild the entire model of package registries, right? That's a real thing that's happening in our market right now. And I totally respect that. I have no idea what else these people should do.

Like running infrastructure is expensive, right? And when everyone's, you know, build volume goes up 100x or whatever.

And again, it's not just builds but like everywhere this stuff is getting pulled. So there are a huge number of forces at play just in volume, in exposure. Right.

Like again, if you think of who's building a lot of these open source packages, it's like one person did it as like a little pet project for a minute and now they're holding up the Internet, you know, with their spare time sort of thing. And so I think, I don't believe as a software engineer that I don't need your open source package because I can just write it myself.

Like, Claude will spit me out a replacement, right? Because now I've got a thing that's never been tested in production, barely, you know, may not even be a good implementation based on how I got it out of Claude or whatever.

Like I put a lot of value on sort of production hardened software, whether it's a third party service, whether it's an open source package, whatever it might be. Also put a lot of value on focusing on your core domain, right?

Like we feel like we have infinite time because we're going so much faster, but our backlogs are still full. Right. Like we always have more things that we can do for our customers. Our customers are trying to move faster. The world is changing.

I need to adapt to that. Like, am I going to sit down and write a payment platform? 100% no. Like, no way, right?

There's so much complexity in that domain that I know nothing about. So now, am I going to download left-pad, right? Like, the butt of every open source package dependency joke? Like, no, it's two lines of code, right?
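For a sense of scale, a left-pad helper really is only a couple of lines. The sketch below is a hypothetical Python version (`left_pad` is an illustrative name, not the npm package's actual code), and it assumes `fill` is a single character. Python's built-in `str.rjust` already does the same job.

```python
def left_pad(text: str, width: int, fill: str = " ") -> str:
    """Pad `text` on the left with `fill` until it is at least `width` characters long."""
    return fill * max(0, width - len(text)) + text


print(left_pad("42", 5, "0"))  # prints 00042
```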

Like I think the pendulum swung a little bit in some ecosystems too far to the, like I can just get another package, right? And when there's thousands of dependencies in my project, how am I ever going to know that one of them got taken over?

Like someone, you know, table flipped and handed over the keys to someone they've never met and that person is only there so they can drop calls to a C2 system into your life? Like that's terrifying, right? So I think, I think those market dynamics will combine to have more stable, reliable components.

Sorry, fewer, larger, stable, reliable components is how I would think about it. Again, I don't build most of these things, so who knows?

But I think that super long tail of, like, oh, look, someone on their afternoon implemented this thing that I'm thinking about implementing, let me just download that, I have no idea who that person is or whether they still work on it. I think that goes away. But I think this "SaaS is dead, I can build everything myself. IT systems are dead, I can build them all myself."

Sure, you can take the things that your SDRs are building and running on their laptop, but is that how you want to run your organization? Like, I think there's going to be, you know, we'll swing too far to one side for a second and then we'll be like, we're maintaining a ton of stuff that is not our core business and it's taking all of our time when we could have just bought it from someone else. And we're trying to manage internal systems again that we could have just bought from someone else.

Like, I think people's pricing models will change. The long tail will maybe disappear. Like, it'll shift and we'll find a new balance, but it's not going to zero in my mind.

I just don't see how that happens.

Cory:

Yeah, we've been doing a lot of, like, trying to assess, like, what is. And I feel like this is one of those things. It's like engineering teams should have been doing this all along. But, like, when do we have time?

You know, like, let's look, really look at the worthwhileness of a dependency. But we started to definitely do it more and it's come down more to like, how closely does it tie to our core business domain?

And it's like, would it make sense to own this, versus is this just a thing that we're going to have to maintain for no purpose? But here's a really interesting one. I think it was the Chainguard CEO who said OSS is dead, it just doesn't know it yet.

And I was like, that's a weird one. That's a weird one to hear, that OSS is dead. I was like, oh my gosh, I hope it's not dead, because I still want free labor from other people.

Rob:

Yeah, I mean, I think the economics change, right?

Yeah, the economics... and, like, build versus buy, or however you think about acquiring, you know, leveraging dependencies or whatever, is all economic decisions. And the economics have changed significantly, right? Like, the economics of building have changed significantly.

I don't know that we figured out how to apply LLMs or Gen AI or whatever to the economics of operating and maintaining, which is the part that we already blew that off in build versus buy decisions. Like, as engineers, we were always terrible and we're like, oh, no problem. I could just bang that out on a weekend.

Like, every company's CI platform is a bash script that someone wrote on a weekend. Right. And then suddenly there's three hundred engineers saying, why is your bash script broken? And they're like, I'm supposed to be on vacation. I hate my job.

We're like, why did you build that? Right? Like, and just repeat. Like, you know, copy and repeat for every sort of internal system.

But yeah, I think the economics will change and, to your point, you'll make a better, like, a clearer decision. Is this something we should really own? Is it worth the risk and exposure of taking it from someone else?

Like, the math will change, but it'll still be there. I don't think it goes to zero.

Cory:

Well, I know we're coming up on time. I want to be respectful of your day.

I would just love to know, you know, before we go, like, what is the most concerning shift that you have with the way that we're writing software? Like, what do you think is going to be one of the biggest problems we have to face coming up?

Rob:

That's a big one.

Cory:

Sorry.

Rob:

No, no, it's all good. I think, you know, I think we've covered a few of them.

Cory:

You guys are, like, right at the... I feel like you're at the heart of, like, so much of where we're going to see the change.

I feel like you have a perspective that most folks don't.

Rob:

Yeah, absolutely. Although I think the things that we think about a lot, I'm not so worried about. I just think they'll happen. Right. Like, we'll optimize agent flows, we'll remove humans. We'll figure out how to measure the quality of software without having to sit and read every line.

Like, I just think that's almost inevitable, and it's a question of who does it and how well they do it and what shape it takes. Yeah, and we're, you know, we're very excited about that, but it doesn't worry me. It's just interesting.

What worries me, you know, is how we all manage cost. I think there's efficiency things that we haven't quite figured out, although, like, in industrialization generally, we're good at sorting out efficiency once we decide there's value in something. So I think, honestly, like, the junior thing and educating and growing people, like, again, it's a little inevitable. We'll figure it out.

But the thing, like, if I'm going to worry about something, it's going to be impact on people, right?

And these transitions, like, it would be easy to not create opportunities for junior folks because they don't have the tools to navigate a complex transition like this. Right.

It would be easy to underinvest in training them because we don't know what you're going to need to know, so just figure it out for yourself kind of thing. And when I said, like, I don't know how to train you, but I'll bring you along, like, that takes. It's like apprenticeship, right?

Like, which is kind of how I became a software engineer. I didn't study it. I just went into a company where someone took me under their wing and was like, this is how we build software.

I was like, what a great experience. Right? So if we can offer that to people, then I think that's great. But I don't know that, as an industry, we're great at that.

You know what I mean? And so I think that's going to take some real motivation from some specific people who are driven to really grow folks.

And I think the open questions, like, when we face open questions as humans, sometimes we get stuck in indecision rather than saying, I don't know what the answer is, but I'm going to lead this charge and I'm going to take these folks under my wing and we're all going to be great at this. Right? So if, if I could worry about something, it would be, you know, that kind of human impact.

Cory:

Yeah.

I think that, you know, one of the things I've been thinking a lot about is just, like, what does the value extraction of AI look like to a business versus the labor in the business? And I think, like, that right there is one of those places where you can kind of, like, you know, it's all numbers. It's numbers on two sides of a ledger.

Like, you can move them around a little bit, but I think like the value that you can shift back to labor and not just the org getting value from AI is in that apprenticeship. I think that's one of the places that we really probably could focus.

A lot of the value that we're getting back as engineers is like, how do we spend more of this time? Right. We're producing a lot more code, we're producing more revenue, right? With all the new code, we're producing more revenue.

Definitely questionable for some folks. Right. But, like, now that we're making more, we're getting more done.

The backlog is not getting shorter for some reason, but we're getting more stuff done. Like, how do we spend more time on the people in our orgs?

I think that, yeah, I think that apprenticeship could definitely be one of the places we could shift value extraction that would be beneficial to teammates and not just the org.

Rob:

Yeah, well, the answer, by the way, is we have Gen AI to create tickets in backlogs also. Right? So, like, it wasn't just engineering that got to take advantage of this amazing new capability. Right.

And so we're, like, sneakily delivering tickets and, like, PMs are sitting there typing out, like... And I mean, speaking of humans, just how all those roles even shift is super interesting. Again, not worried. But I think it'll be interesting, to your point of, you know, how do we bring everyone along. And it'll be different.

Like, I'm not gonna pretend everyone's just doing the same job and they're doing it faster. Right. And it'll be bumpy. And I just hope that we invest in it and in making sure that we do this in a sensible way.

Cory:

Yeah. Awesome. Well, Rob, thanks so much for coming on the show today. It was awesome to get to talk to you. Where can people find you online?

Rob:

LinkedIn's probably the easiest place. I'm easy to find there, so I'll just leave it at that instead of trying to rhyme off a bunch.

I also do have my own podcast called "The Confident Commit", if anyone wants to check it out.

Cory:

Ooh, yeah, put that in the show notes.

Rob:

Nice.

Cory:

Awesome. Well, thanks so much for the time. Really appreciate it.

Rob:

Yeah, thanks for having me, Cory. This has been awesome.

About the Podcast

Platform Engineering Podcast
The Platform Engineering Podcast is a show about the real work of building and running internal platforms — hosted by Cory O’Daniel, longtime infrastructure and software engineer, and CEO/cofounder of Massdriver.

Each episode features candid conversations with the engineers, leads, and builders shaping platform engineering today. Topics range from org structure and team ownership to infrastructure design, developer experience, and the tradeoffs behind every “it depends.”

Cory brings two decades of experience building platforms — and now spends his time thinking about how teams scale infrastructure without creating bottlenecks or burning out ops. This podcast isn’t about trends. It’s about how platform engineering actually works inside real companies.

Whether you're deep into Terraform/OpenTofu modules, building golden paths, or just trying to keep your platform from becoming a dumpster fire — you’ll probably find something useful here.