Safer Agent Automation with GitHub Agentic Workflows
Don: Say something's gone wrong in
your repository, you don't wanna have
to get up in the morning, and say,
oh my God, something's gone wrong.
Should I have coffee or
should I investigate it?
You shouldn't have to make that
trade-off because the agent should
have already investigated it for you.
So you should be able to just go to
your breakfast, have your coffee, read
the report, read the analysis, and
it says, here's a possible fix for it.
Here's actually a pull request for it.
And you go, oh yeah man,
uh, we, we gotta fix that.
Let's get that in.
It should be there, ready for you.
The agent should be proactive
and should be immersed in a
world of cooperative agents.
Bret: Welcome to the Agentic DevOps
podcast, and I am your host, Bret Fisher,
back with another episode about one of my
most exciting things that I'm working on
this year, the project that I am adopting
on GitHub, and I am trying to dig into
the weeds as much as possible because
I think that GitHub is onto something.
Specifically GitHub Next and Microsoft
Research, which are both the research arms
of both those organizations, and they're
working together to evolve GitHub Actions
to what it would be if it was AI native.
What would it be if AI was there in a
safe and reproducible way that we could
sandbox and protect in a very
detailed and heavily scrutinized way?
And it's not what you think.
It is not simply just adding
LLM prompts into GitHub Actions,
which you could do today, and
you've been able to do for years.
It's not simply just adding Claude
Code as a step in your GitHub Action.
Those are things that already existed.
But out of GitHub Next last year,
we heard about the early alpha beta
release of something called Agentic
Workflows, which is technically what I
would call a feature of GitHub Actions,
but it's a whole website now with tons
of examples, a team working on it.
And when you really dig into the details
of what this is, I think this is the
only way we should be doing anything
with an AI inside of GitHub Actions.
If we are prompting in GitHub
Actions, we should be using
this tool, in my opinion.
As I dig more into this, I think about
the recent Claude Code security
concern that we had, and all
of the security concerns we've had
around GitHub Actions lately, related
not to the supply chain
per se, but specifically to workflows
that are getting prompt
injected through untrusted prompt input.
That is one of the biggest risks anywhere
we put a model, whether it's in a
chatbot or it's in our automation or
it's in front of some of our systems.
If someone can put untrusted text in a
place that somehow doesn't get verified
by a trusted member of our team before
it goes to a model, that's a risky
place to be, and that's part of our job.
Platform engineers, DevOps,
security engineers, like we're all
very concerned about that, right?
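As a rough illustration of that kind of gate, here is a minimal Python sketch. It is not how GitHub Agentic Workflows does it. The Comment class and text_for_model function are hypothetical names, and treating OWNER, MEMBER, and COLLABORATOR (author association values that GitHub reports on issues and comments) as "trusted" is an assumption made only for this example.

```python
from dataclasses import dataclass

# Assumption for this sketch: these author associations count as trusted.
TRUSTED_ASSOCIATIONS = {"OWNER", "MEMBER", "COLLABORATOR"}


@dataclass
class Comment:
    """Hypothetical stand-in for an issue or PR comment."""
    author_association: str
    body: str


def text_for_model(comment, approved_by_maintainer=False):
    """Return the comment body only if it may be handed to a model, else None.

    Untrusted text passes only after a trusted team member has approved it.
    """
    if comment.author_association in TRUSTED_ASSOCIATIONS or approved_by_maintainer:
        return comment.body
    return None
```

The point of the design is that untrusted text never reaches the prompt by default; a human has to opt it in.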
We're maybe a little bit trepidatious
on what we should be doing with these
things today, which is automating more
of our systems because we don't wanna be
that person on the team that gets caught
putting risky AI stuff into automation.
'Cause as we know, automation
can make good things really easy.
It can reduce toil, but it can also
automate the bad things if we're not
careful, and you can sometimes automate
yourself all the way into an outage.
So we have on the show this
time Don from GitHub Next.
That's their research arm full of PhDs
and experts that are trying to figure
out tooling for the future of GitHub.
Personally, I just love GitHub Next.
I'm always on their website,
githubnext.com, and looking at what
they're working on because to me it's
like reading the tea leaves of where the
big money is researching the future
of the software development life cycle.
And we also have Peli from Microsoft
Research, who's also on the team building
this product and using it daily, which
we get into exactly how they use it.
And I'm really excited because I feel
like this system has a lot of rigor
to it, which is what I'm looking for.
And when I'm thinking about
implementing AI anywhere in my
automation, especially when it comes
to CI/CD automation, possibly anything
around my code repos, I'm wanting
that to be as secure as possible.
So I look to make my steps deterministic
with traditional programming, and then
only as a last resort when I need a
judgment, or maybe a consolidation
of text or a summary of text, that's
when I consider putting in models.
And I used to just attach Claude Code,
maybe build a workflow around that,
or attach Codex or Copilot as a step,
but this is something totally different.
You use command line tools to generate
something that is a very long workflow.
You use CLI tools to create a lock file
to make sure that that doesn't change.
You establish rigor around making
sure that the prompts and the
things that you're trying to create
with AI are trustworthy and can be
sandboxed and protected properly.
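To make the lock-file idea concrete, here is a minimal Python sketch of one generic way to detect that a reviewed prompt has changed: record a fingerprint when it is approved, and compare later. This is an analogy only, not the actual mechanism the Agentic Workflows CLI uses; the function names fingerprint and is_unchanged are hypothetical.

```python
import hashlib


def fingerprint(text: str) -> str:
    """Return a SHA-256 hex digest of the workflow text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def is_unchanged(current_text: str, locked_fingerprint: str) -> bool:
    """True if the text still matches the fingerprint recorded at review time."""
    return fingerprint(current_text) == locked_fingerprint
```

Anything that edits the prompt after review, even by one character, makes the check fail, which is the property a lock file is meant to give you.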
And after the depths that I've
gone into since this show that we're
gonna get into in a second, I feel
like I can now say this is the only way
I'm gonna be doing AI in GitHub Actions.
So welcome to the show, Don
and Peli, and let's get into
it.
Bret: Don, tell us who you
are and how you got here.
Don: Bret, thank you so much.
The introduction is through our
mutual friend, Ben, who works as a
product director at GitHub Actions,
and yeah, we are having a blast.
I work at GitHub Next, and I am now
working on, I guess, what we'd call
AI agentic DevOps, so continuous
AI, agentic software, automated
agentic software engineering,
and lots of associated topics.
And yeah, I've got a long background in
programming language design
and runtime design, and I've done a lot
of product delivery over the years
in Microsoft and other places.
A lot of DevOps along the way, and
yeah, I've been at GitHub Next a few
years now, and I absolutely love it.
We've got a charter to investigate
the frontier of software development,
and there's never been a more exciting
time to be doing that.
A time where things are so nascent,
and things are in formation and in
change and turning upside down,
and you're really able to
radically rethink a lot of things.
I mean, a lot of it turns
your head upside down, just
how much things are changing.
But, you know, I'm
enjoying that very much.
Bret: It's a good moment to be here, I think.
I keep reminding myself, I don't know, a
couple times a week it feels like, to
just reflect on the fact that you're in
the middle of this, and that when
you look back on it, these are the stories
we will tell, so be present.
Because when I look back at the cloud and
the PC, the mainframe-to-PC migration, and
all the large infrastructure
evolutions that I've been a part of,
you know, when you're in the middle
of it... I was too young to
realize that it was of significance.
And as I've grown older, I've started
to see the patterns, and so I can
sort of feel like, this is it.
This is a thing.
This is exciting.
This will not be like this forever,
and we will look back on
this as, wow, that was crazy.
So, yeah.
Welcome.
I'm glad to have you here.
Welcome.
I'm glad to have you here.
Peli: My name is Peli de Halleux.
I'm an engineer in Microsoft
Research in a group called
Research in Software Engineering.
We're also very interested in things
like verification, testing, all
this kind of stuff over the years.
And I had been looking at LLM automation
back in the days before agents, and
got in contact with Don in GitHub Next.
And we decided to work
together on this idea of continuous AI.
That kind of started this project,
and that was a year ago, roughly.
But we had been working with
Don from a distance from Microsoft
on various projects.
We've been around for a while.
So that's the intro.
I've been working on developer
tools for professionals, but
also for kids, for a while.
I've built K-12 coding platforms.
Bret: Nice.
Yeah.
Is there an analogy between a K-12
learning platform and agent harnesses?
Peli: It is, it is.
There's a lot of things that people
don't realize in the way it's designed.
don't realize in the way it's designed.
It is designed as a sandbox,
just like when I built the coding
infrastructure for Minecraft.
So when kids learn to code in
Minecraft, there are some design
patterns that are applicable.
You know, these agents
are finicky little monsters.
So we gotta talk a lot about
sandbox design and API design and
things we do under the hood to make
it more reliable when people say,
"Oh, agents aren't reliable."
So we do a lot of work under
the hood to remove that.
But also we talk a lot about safety.
'Cause when you have a real sandbox, you
can just experiment. You can build
castles without destroying the world.
So a lot of that is baked in.
But it's made for professionals, it's
made for DevOps, it's made for people
who are Actions users and stuff.
But yeah, actually,
I didn't realize I was designing a
system like that, and then after the fact
it's like, whoa, this kind of feels like
I've rebuilt a system that I've built
so many times for that environment.
Bret: Okay.
So it became apparent
after it was happening.
And I should back up, because I think
we got connected because I heard
about Agentic Workflows last year,
before GitHub Universe.
And it was like early beta
or maybe even before that.
And I think I was requesting early
access to get in because it was right up
my alley, and I had no idea what it was,
and just knew that at the time I was
building a GitHub Actions course. I still
am, recording videos week by week.
And I've been a big GitHub Actions fan
for, gosh, at least five or six years,
since before we had reusable workflows
and a lot of the niceties we have
nowadays, 'cause I really kind of saw
that as the evolution of everyone's
CI, almost what Jenkins represented
in the industry for so long, where it
was sort of the default, and I was so
anxious for another winner to
replace Jenkins 'cause I was tired
of maintaining the infrastructure.
And then, you know, got onto GitHub
Actions, saw the advantages of having it
built right into where our code storage
was and where everything else was.
And that just really fit, I
think with a lot of my clients
and a lot of my community.
And so over the years,
GitHub Next, which we should
probably talk about what that is,
I've always looked at GitHub Next, you
know, a couple times a year to try to
see the tea leaves that you all
are reading and figure out what's
next for GitHub, what's the exciting
thing that might come to fruition.
Whether it's a font
package, which is my favorite font
that I use everywhere today, or some
feature in the UI of GitHub that
was considered a new experiment in
how we represent the information.
And it's been fun to watch that.
'Cause you feel like you're
kind of seeing the skunk works,
or the deep think
of an organization, in public.
And that's been a really cool thing that
I don't think a lot of people know about.
Obviously Microsoft Research is the
same thing that's been around for
decades, but it was cool
to see that so close to the coding
platforms that we're all using every day.
So last year, if I'm setting
this up correctly, there was an
announcement around GitHub Agentic
Workflows as an idea, and then
you announced it at Universe.
And since then, I feel like it's
been off to the races: incredible
documentation, slide decks that are
available that you all are putting out.
And there's a lot of information that
I think has to come along with this,
because it's not an obvious evolution
necessarily. It's not just
one little feature added to a workflow.
So who wants to take up the mantle
of giving the elevator pitch for
the agentic workflow concept?
Don: I mean, I'll just run through
the kind of three principles, right?
One is the idea of agentic
repository automation.
You know, when people
talk about agents, I think there's been
something missing in the conversation,
which is that you want
agents that are proactive.
That's what I think of
when I think of an agent.
I don't know where the idea
came in, around the world,
that an agent was something you invoked
from your chat session, right?
I don't know who invented that.
Because when I think of the
word agent, I think of something
that kind of is there around me.
It does stuff.
It knows what's
happening in the world around me.
And in the context of a repository,
that means it kind of knows what's
happening in the repository.
And when I think about things like, I
want to refactor my code so that all
the files are under a certain size, I
wanna check my error messages to
check the language is age-appropriate
for a particular target audience,
or hundreds and hundreds of other
things I wanna do in my repository,
those are things I
wanna do continuously.
They're not something I
wanna just do once, okay?
I don't wanna have to stand up
every day and kind of say, oh
my, we had some error messages,
gotta reestablish
that kind of principle
that the error messages are all
in good shape or whatever.
You know, a GitHub repository
fundamentally is a continuous, growing,
evolving, collaborative kind of space.
And it's got a history that's very
interesting and long, from wherever
it came from, and it's got a future.
And anything
that happens in the repository
has to be established to
happen on a continuous basis.
It's gotta be able to be adaptive
to the change that's happening in
the repository.
If you're gonna have AI working
in a pull request, that is also a
continuous kind of object.
You know, it's gotta kind of go with you
on the journey all the way through.
The repository's a journey.
And to me
that's the heart of what GitHub is.
That's just central.
That's why it's the place where
everybody comes to work together.
It's why continuous integration
and continuous deployment
make sense there.
So if you're gonna do agentic working, you
wanna establish it on a continuous basis.
We all know the examples
here: continuous documentation,
continuous code improvement,
continuous fault analysis,
proactive fault analysis.
Say something's gone wrong in
your repository. You don't wanna have
to wait, you don't wanna have to
get up in the morning and say,
oh my God, something's gone wrong.
Should I have coffee or
should I investigate it?
Bret: Right.
Don: You shouldn't have to make that
trade-off, because the agent should
have already investigated it for you.
'Cause the flow to
investigate a fault in CI or a
fault in your website or whatever,
wherever the fault signals are coming
from, the flows are encodable.
You know all the steps you wanna do.
You've probably written them out in your
documentation and so on.
So you should be able to just go to
your breakfast, have your coffee,
read the report, read the analysis,
and, you know, it says,
here's a possible fix for it.
Here's actually a pull request for it.
And you go, oh yeah, man,
we gotta fix that.
Let's get that in.
Right?
So it should be there, ready for you.
The agent should be proactive
and should be immersed in a
world of cooperative agents.
Don: And when you think about
that kind of vision,
that's quite close to the idea of
assistants in your repository,
sort of a virtual team in a way.
They're not humans.
Okay.
I don't like the
anthropomorphization, but...
Peli: Yeah, it's hard
not to, though.
Don: It's hard.
Bret: Yeah.
Don: Yeah.
And there's so many times
in my working life I've wanted
assistants in various repositories.
You know, I don't know how to
do really good engineering
in various dimensions in every
repository I have to touch.
So, bringing in this
proactive assistance.
And so, when you think
about it, where are we?
Let's make that real.
We're sitting in 2025,
and let's make that real.
What's the ideal experience?
Let's platform-fit
that onto GitHub as a concept.
And you go, Actions, actually. Yeah.
Actions has got a lot of trouble.
Yeah.
You know, you start thinking,
well, we automatically generate
some big complex YAML, which kind
of is like compilation of some
intent or something like that.
And then you realize,
actually, no, I just wanna run
those amazing coding agents, which
also appeared around that time.
I just wanna run them in Actions
on a kind of continuous basis.
Okay.
And what would be the ideal?
Let's make that simple.
Let's make that as simple as possible.
That would be: check in a markdown
file, which describes your prompts,
give it some triggers like in Actions,
and it runs.
Everything just works.
That's the core idea of
GitHub Agentic Workflows: to
capture the simplicity, the
beautiful simplicity, of GitHub Actions
YAML. You know, people have got all
sorts of complaints about GitHub
Actions and YAML and so on, but
it's got a massive advantage, which
is, it is really damn simple to use.
You just check in a file
and you have magic, right?
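The shape Don describes, a markdown file that carries some trigger settings plus a natural-language prompt, can be sketched in Python roughly as follows. This is a simplified illustration, not the real Agentic Workflows compiler: the split_workflow function and the "trigger" and "agent" keys are hypothetical, and the settings block is parsed as flat key: value pairs rather than full YAML.

```python
EXAMPLE_WORKFLOW = """---
trigger: daily
agent: claude
---
Review open issues and suggest labels.
"""


def split_workflow(text):
    """Split a markdown workflow into (settings, prompt).

    The settings block is whatever sits between the first two '---' lines.
    Raises ValueError if the opening delimiter has no matching close.
    """
    lines = [line.rstrip() for line in text.strip().splitlines()]
    if not lines or lines[0] != "---":
        # No settings block at all: the whole file is the prompt.
        return {}, text.strip()
    end = lines.index("---", 1)  # position of the closing delimiter
    settings = {}
    for line in lines[1:end]:
        if ":" in line:
            key, value = line.split(":", 1)
            settings[key.strip()] = value.strip()
    prompt = "\n".join(lines[end + 1:]).strip()
    return settings, prompt
```

Calling split_workflow(EXAMPLE_WORKFLOW) separates the two halves: the settings say when and with what to run, and the prompt is what the agent is asked to do.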
Peli: Yeah.
Don: And that's existing
YAML. And one way that people put
it, I think Ben puts it this way, is
that Actions is a way for average
developers, repository maintainers, to
get access to some cloud resources.
They get access to cloud compute,
networking, and some storage
in the context of a repo.
And they don't have to go ask the
security team. A permission grant
has been given by the company that
decides to let you use GitHub,
or your open source repositories.
You get this amazing playground,
and it's not just a... it's
actually like a factory space, right?
It's got storage, it's
got network, it's got so much compute, and
these days we're bringing
coding agents into that kind of factory.
And so that's the simplicity of it.
You're running through the
add wizard kind of thing here,
which is helping you set up your
tokens as well, in the video.
And it says, would you
also like to trigger the workflow?
And then bang, you've got your first
repository automation running, and
that thing I think is set up to run daily.
You can change it to run
weekly, on different kinds of schedules,
different triggers and so on.
And bang, you've just
brought the most
powerful coding tools
onto your factory floor.
And the potential is limitless.
Now you have got all the power to automate
everything: creative things,
analytical things, problem-solving
things, code improvement things.
Everything is set up
in the GitHub repository now.
Okay.
So that's the vision: repository
automation with the coding
agents you know and love.
You can use Claude.
You can use Copilot CLI,
you can use Gemini CLI.
And I think we're adding some more,
Codex as well.
And we will take
proposals for new ones as well.
So that's it.
Repository automations with the
coding agents you know and love,
safely, with strong guardrails
(we'll get onto that), in GitHub Actions.
That's the formula.
Nice and simple.
Bret: Nice.
That's a good elevator pitch.
One of the challenges, I think,
if we think of
the entire software life cycle on GitHub,
and I've always experienced this with every
piece of software I've implemented as sort
of a DevOps person, is that anything
I put in, even if its goal as a product
or a tool is to automate something,
like even putting in GitHub Actions
as a thing to help me automate,
before we had agents,
there always was an additional
cost of toil that was added on
top of that, that I could not escape.
And when I teach GitHub Actions,
you know, obviously there's
the day zero creation of these
YAMLs, but the day two stuff is arguably
even more important, because it rarely
gets discussed in getting
started guides and stuff like that.
And so over the years, GitHub Actions
has added functionality, you know,
reusable workflows and templates and
the GitHub repo opportunities there.
And there's been a certain level
of trying to manage this giant beast that
we've all created when you have hundreds
of repos, and now that means possibly
thousands of YAML files to manage.
And, you know, I don't
like using the phrase "at scale." I think
it's way overused, and a lot of
us aren't even dealing with scale.
We're just small teams.
But I always felt like there was this
challenge of, oh yeah, we're gonna
implement this CI tool for you, but you
probably need someone's, you know,
half a day a week, maybe, or more,
to fix the broken workflows,
to approve the
PRs for the action upgrades, to make
sure that Dependabot is configured
correctly with all the latest stuff.
And so there's a lot of
this sort of hidden toil.
And I was gonna throw the question to you:
do either one of you see a future
where there might be
a possibility here that
we can add features or functionality
or solve problems without
adding additional toil? Or
maybe that toil might be managing
the agents, like updating the
agent skills or the configuration,
and that becomes the new human
toil layer that we have to maintain.
Or do we think that
the gain is far exceeding
the toil involved
with that?
What do you think about that?
Peli: So essentially no,
we compile down to Actions.
Yeah.
So our markdown becomes an action.
So we inherit all the toil. Today
we have all the toil that you have to do.
However, we're working closely
with the GitHub Actions team,
and we're looking at that toil.
We're looking at the sources of that,
and we're looking for solutions to get
rid of these, you know, the requirement
that, oh, there's no way to push an
action over an entire org, you know?
You always have to push
files, things like that.
It's like, can we make this better?
So we're looking at these
problems where you start to have
hundreds and thousands of repos and
everything becomes a scale problem.
So there is definitely work being done there
for the Actions product itself to be able
to fix those.
Bret: Oh, you mean at sort of a
deterministic level, like something
that's just a
deterministic feature of Actions, to be just
better at this.
Peli: Yeah. But in general,
aside from that, we can go and
attack any toil and automate it, anything.
Because now we
have the ultimate reasoning hammer
that we can just point at a problem,
just go systematically on every
repo, do some reasoning, right?
You can take, in fact, any deterministic
tool and wrap it with an
agent and point it at any repo
and, you know, hope for the best.
It might give you a result.
But let's say your Dependabot thing, it's
always slightly different for every repo.
But now you have an agent that's actually
pretty good at dealing with that.
So you can go and run campaigns
over your repos and fix compliance
issues and things like that, where
a deterministic tool would fail.
These agents are able to go and do that, and I
think that's what you meant by toil.
But we also inherit
the limitations of Actions today.
We're working on that,
but, you know, that takes time.
But it's on our radar.
Don: Yeah.
Don: I got a different take on the toil.
It does take time, it costs something,
to create automations.
It costs you personal time,
and there's ongoing maintenance time,
to create existing GitHub Actions
YAML.
A couple of things kind of change a bit.
One is that because we're
dealing with coding agents, it's
possible to create extremely,
you might think of it as ambiguous,
but it's actually
general, workflows.
So one
of my favorite workflows is to
do with repository maintenance,
and it's called Repo Assist.
And it's a multitask
workflow, and
it's got like 11 tasks, I think, and
it kind of rolls the dice each day
to say, hey, what am I going to do today?
Okay.
Because you don't
want it to do all of them,
because it would, you know, only do the
first three. So it just rolls the dice,
just to kind of get
things nice and balanced.
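The dice roll can be sketched in a few lines of Python. This is only an illustration of the idea, not Repo Assist itself, which expresses it in a prompt: the task names are placeholders, picking three per run is an arbitrary choice for the example, and seeding the generator from the date (so a rerun on the same day picks the same tasks) is an added assumption.

```python
import random
from datetime import date

# Placeholder names standing in for the roughly 11 tasks mentioned.
TASKS = [f"task-{n}" for n in range(1, 12)]


def pick_tasks(day, tasks=TASKS, count=3):
    """Pick `count` distinct tasks at random for the given day.

    Random selection keeps the tail of the list from being starved,
    which is what would happen if each run always took the first few.
    """
    rng = random.Random(day.toordinal())  # same day, same pick
    return rng.sample(tasks, count)
```

Over many days every task gets its turn, without any run trying to do all of them.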
One of them is, for instance, to
label issues, just to make sure
all the issues are labeled. It can check
the most recent issues. It can look at a
backlog and just make sure all the labels
are in order. You don't have to write
that out algorithmically, right?
You just kind of tell it the end state
that you want and maybe some hints about
how to get there, about what the journey
is it has to make, and, you know,
pretty much what I just said, really.
And it'll work out all
the rest of the details.
You know, in the sense that
that's the prompting that will be
running each day, or every third,
fourth, fifth day, or whenever it runs.
And it sorts out the rest, in any
repository, for any set of labels, for any
language. You don't have to
work in English, you could
be working in something else.
So whereas in a traditional setting you
might have had to configure the exact
labels to use, configure
the kind of heuristics
to use to label issues and so on,
in this setting it can be
made to work with everything.
And that's really golden.
'Cause that means you have these
very generic workflows, which can
be used in many different settings.
I mean, other tasks in that repo
are things like, work through the
issues and just analyze them one by one.
Not all in one run.
Just do a bit of that frontier work,
and do an in-depth investigation,
do a reproduction, and give some advice
about what to do about this issue.
And it's sort of what you'd
do with a coding agent today.
You might check it out locally and
investigate the kind of thing, but
it's all done for you, proactively.
And again, it's very general, right?
It's the sort of thing
we could never have programmed up
two or three years ago, because it's
amazing, you know, the things
these agents do.
But it's also very general, and that
means the burden is low. I use Repo Assist
in, I think, 12 different repos.
They're all different, and I haven't had
to change it really between any of them.
So you can have these very generic,
powerful tools which help you
make progress, expressed at
the right level: low enough that it
actually kinda knows what it's doing,
it doesn't just do something canned,
but general enough that it's really
applicable to everything.
Bret: I'm starting to see enough
teams that are tiptoeing into agents
in their CI, essentially, right?
Somehow, somewhere sticking a model in
and writing a prompt in some fashion to
a model, whatever we wanna call that.
And I think a lot
of the teams that I'm working with
see the first experience as sort of
a checkbox feature in GitHub.
Like they might turn on the PR review
agent for Copilot, and it's sort of an
on/off thing, or you can opt in on
each PR, and it just becomes a feature.
So it doesn't really feel, I mean, even
though it's providing automation, and
obviously it does, you know, it puts
comments in the PR, it does all these
things, it's not something that they
have to implement themselves, right?
They're not writing a YAML file
necessarily on day one of that.
So I feel like that's the first phase.
And the second phase is where
they, you know, see
everybody else putting Claude Code into a
CI run. I notice
nowadays it's actually getting pretty
rare to look at the
contributors to a repo
and not see that Claude logo.
It's always there, it seems like, nowadays.
People are sort of figuring out how they
can either write code or review code.
For me, that
wasn't the most interesting part.
As someone who had to maintain the
CI, I was always looking for things
that would, you know, automatically
troubleshoot a failed check, right?
And try to provide an automated...
'cause I'm that person who's usually
responsible when the checks fail,
because the dev team's gonna push back and
say, yeah, there's a configuration issue.
It's not our fault, blah, blah, blah.
And so we're gonna have
that back and forth there.
And I've started to describe
this, because people aren't even really
sure where to start, and I love that
the Agentic Workflows website, the
documentation, really is starting to
categorize these things into certain
areas where you're finding success
and you're seeing the good metrics
coming out of those results.
But I've often tried to describe it
to them as: just find a place where
there's huge human judgment involved,
where you've previously had to involve a
human, but it wasn't a deliberate gate.
Because a lot of people get nervous
about, I think, the idea of AI in their
CI, because
one of their first thoughts
might be, well, I don't want it to
automatically deploy to production.
And to me that's like, of course.
Yeah.
That's probably the last thing I'm going
to automate with any sort of
model. You know, that is, to me and
for a lot of teams, a manual
gate we intentionally put there, and
so we're enforcing a human stop point.
But there's so much other stuff, like
you mentioned, labels, automatically
labeling, you know, maybe
automatically approving low-risk patch
releases of Dependabot updates or something.
Like, there's probably some
low hanging fruit there.
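The "low-risk patch release" check is the kind of step that can stay deterministic, with no model involved. A minimal Python sketch, assuming plain three-part MAJOR.MINOR.PATCH version strings (pre-release tags and build metadata are not handled); parse_version and is_patch_bump are hypothetical helper names, not part of Dependabot or GitHub Actions.

```python
def parse_version(version):
    """Parse 'MAJOR.MINOR.PATCH' (optionally prefixed with 'v') into a tuple of ints."""
    parts = version.lstrip("v").split(".")
    if len(parts) != 3:
        raise ValueError(f"expected MAJOR.MINOR.PATCH, got {version!r}")
    return tuple(int(p) for p in parts)


def is_patch_bump(old, new):
    """True only when major and minor are unchanged and patch increases."""
    old_v = parse_version(old)
    new_v = parse_version(new)
    return old_v[:2] == new_v[:2] and new_v[2] > old_v[2]
```

For example, is_patch_bump("1.4.2", "1.4.3") is True, while a move to 1.5.0 or 2.0.0 is False and would still go to a human.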
I certainly have been a part of teams
where we've had to implement that
cross-repo auto labeling, or even just,
back when we didn't have it,
synchronizing label names across
repos. You know, just a lot of sort
of silly stuff that maybe just wasn't
a feature in the product yet, and we
were backfilling it with some sort
of manual automation. That feels
like the ripe opportunities
for people getting started.
But do you frame it that way:
look for opportunities where there's
human judgment that we have today, but
it wasn't because we enforced a human
to get involved, we just didn't have a way
to automate that with a for-each loop?
Is that how you
frame it?
Don: Yeah, I personally like to begin
with a chat about what are the
problems people are having in the repo.
Like what are the actual
struggles that are happening?
Okay.
So in the case of Repo Assist,
you know, having that chat with
myself, the problem is we've got
an issue backlog of 200 issues that
go back years and years and years.
And every time I come to this
repo as a maintainer, I don't
know what to do with them.
I don't wanna close them out,
'cause there's value there.
I know there are bugs there, and I
don't want to leave bugs that people
have found lying around this software.
So my problem is one of
the burden of being a maintainer,
the guilt in a way. You
know, I haven't got on top of it.
I'd love to get on top of the repo.
I'd love to get
that issue count down below
100, below 50, below 10.
So the flow is designed to help me solve
that problem and reach where I wanna be.
So I kind of like to have the
discussions about like, what
are you trying to achieve here?
What are your goals?
What are your quality goals?
You're after, uh, you know,
is performance the top thing?
Is that your problem?
Is quality and crunching out
the bugs the problem? Is it integration
and kind of cross-repo kind of working?
That's absolutely in a lot of
settings, a ki- kind of problem.
Making things regular across
multiple repos is a good example.
Uh, so once you start the
conversation about, tell me
what, what's making life painful?
What's Yeah …causing you to lose sleep,
then, what causes you to disengage from
the repo or, whatever I don't know, Peli,
How, how do you begin conversations?
Peli: I'm doing extreme agentic
development in the way that I'm
at, what, 400 plus PRs a week?
exclusively through, I mean, a-
agentic workflows is written with
agents and the challenge is how do you
create quality software doing that?
Yeah.
And this involves many, many agents,
uh, looking at the generated code,
cleaning it, adding tests, extracting
specs, generating tests from specs.
So there's an entire.
Intricate set of agents that are running
and that are powered by agentic workflows
that are running in the repo and that are,
look, you know, it's really exploring,
okay, what is, so we can, we can generate
code at an incredible rate now, but we've
always known that creating a feature
was a tiny piece of the equation.
You had to do test plans, you had to
design features, research, you had
to write documentation, maintain it.
All that stuff is ripe for automation.
and where does the human fit in?
Where does the engineer come
in and, you know, say no.
In my stats, I think I refuse
20% of the PRs.
There's still quite a bit of
engineering involvement. I do maybe
three interventions per PR.
so there's still quite a bit of steering
from the, from me, but I have a lot
of tools that are agentic workflows
that are optimizing very specific
angle of a code base, like reducing
code duplicates, fixing linter issues.
And you know, I mean these are
kind of sound practices that
software engineers have been doing.
Um, Yeah.
and then we go down the rabbit
hole and we're like, oh, well.
We always fix something.
So now we extract linters.
So we infer linters from
our, our own practices.
your devs are always fixing the same bugs.
Maybe it should be a linter.
then once you have a linter, you
have something that is very scalable
in terms of compute and cost.
And then there's
something we discovered with the
agents: they're very meta. Agents can
help, agents can generate agents.
And agentic workflows
are an instance of that.
They, the workflows are really
specialized tools in my mind.
You know, they use agents and
everything's an agent, so we
have to put different names.
But I build tools, uh, and one of
the unblockers of this experiment
is that I can go from an idea to a
tool in five minutes that is running
in the CI and giving me a result.
The first version will be trash, you
know, it will not work, it will crash.
And then there's a self-reporting loop, and
in two or three iterations you have something
that is actually creating value.
and then, you know, add more loops.
You get, you start optimizing
and, and saving tokens.
But …that's essentially the
key thing is zero friction from
your idea to an automation that's
bringing value to your project.
That's paramount.
Yeah.
Don: Yeah, the repository as the
kinda agent foundry, the agent host,
the place where you can just create
and deploy your automation and, um,
get it to do everything over, you
know, the fabric of the repository.
It can create issues, it can add issue
comments, it can read existing issues.
It can create pull requests, it
can add to existing pull requests.
Those are the kind of, uh,
discussions of the other elements
of the information fabric.
But that's, and it can read the
security reports, it can look at the
actions, it can look at the CI runs.
And so that's the fabric information
fabric you're working over.
And, um, it, it makes you look at a lot
of the stuff that people are doing with
agent harnesses differently because
you don't really need to think about
where you put your to-do list.
Let's just put it in an issue, right?
You don't need to think, where do you
put the output, the analysis of, say
you've got an agent, which is kind of,
uh, checking performance every, night,
different dimensions of performance,
and running through them, uh, uh, or
checking your, get your, your, your
getting started guide, reading your
docs, and kind of running through the
kind of getting started material and
making sure it's simple and making
sure everything kind of works right.
Those basic kind of
walkthrough kind of things.
Where does it put its output?
I don't
wanna have to think about deploying
this to some agent foundry and
some other platform, right?
It's just gonna run in GitHub and
it's gonna write its output to GitHub.
It's gonna create an issue or add
a comment to an existing issue, right?
Uh, or it's just gonna create a pull
request and fix the the thing directly.
Depends on the design that you want.
And, th- so it's that it's, you mentioned
it earlier with GitHub Actions about
how having it right there next to
your code is just a, a great thing.
And, uh, it's the same with this.
It's like right there on the information
fabric we all know and use it knows how
to use the GitHub information fabric.
really well.
It's all about issues and how
to query, them, how to search,
about pull requests and so on.
Uh, so, um, I love that.
I lo- I love that it's operating in
my home, in my factory, my place
where we get work done together.
And Yeah …I love, love that I
can dig back to the actions log
and see exactly what happened.
I love Right … when the
agents use the Actions log to
work out what went wrong, right?
Peli: Yeah.
The inherent every time you come
back, yeah, every time you come
back to the graph, to GitHub,
the engineer can intervene.
The human is back in the loop.
Right.
The agent does some
computation, creates an issue.
Uh,
Don: th- That's right.
It gives that natural place for
the human to be in the loop.
Yeah.
And, and that's, that can be
so confusing when you kind of
disassociate from the factory.
you take it to the outside, from the repo.
Uh, and you wonder where's the, h-
where's the human gonna be l- in the
loop in this kind of stuff, right.
I'm drawing out.
And the answer's really simple right?
it's in the pull request.
Right?
It's okay if the, uh, the, i- yeah.
Okay.
First of all, its actions:
what it does to the set of issues
has to be really tightly constrained.
And that's where we can start
to talk about guardrails.
You know, it's not allowed
to delete every issue.
It's not allowed to close every issue.
It's not allowed to comment on
every issue or randomly kind of
write ha ha ha all over the place.
It's got really strong limits
over what it's, what it can do.
It can add a comment to a
single issue, for example.
That's a super strong limit.
Uh, okay.
But it can do that automatically.
It's not gonna boil the ocean,
or whatever the expression is, when
you let it act on the issues section.
so it's kind of n- it's a big enormous
scratch pad for these agents to work on,
right?
Uh, and, th- and that again,
it makes it clarifying.
'cause you know, it's
not writing to the repo.
It's not like writing a to do or goals.md
or tasks.md into the repo, like a lot
of people are doing, it's very tempting
to use the repo as the scratch pad.
Uh, But it's not gonna do that.
Uh, uh, So it's, it's, we've got our
scratch pad that is issues and it's
got its way of proposing forward action
in the world, which is pull requests.
And, uh, or you can also propose
an issue or propose a pull
request, an actual concrete change.
It's very common.
We get the, uh, agentic workflows
to create issues instead of
going straight to a pull request.
Because again, it, g- it's, it kind
of divided into a world where there's,
it, is it, it's roughly a, like,
is it gonna take one kind of check?
Yes.
Do it thing in the pull request, or
does it kind of …offer choices?
If it's offering choices,
then you'd better create an
issue first, because the human
really needs some guidance.
The agent really needs
some guidance about Right.
What's the next step?
But in that, the human in
the loop is really simple.
It is just digesting what the agent
does in the issue space, the comments
it adds, the options it gives, and acting
in the pull request space to make
an actual, uh, actual change progress
forward in, in the code towards the goals
that, that everyone's pursuing together.
I love that.
I lo- I love that I know what
human in the loop m- means.
I love that.
It's just very clarifying,
and it's also something I can trust.
If all it's gonna do is create a
pull request, and just one of them,
it's almost certainly
going to be useful, what it creates.
So at first like, that's great, but
it's certainly not, you know, I can
trust, I can sleep well at night
that the thing is doing things in a
positive direction or certainly not
in a significantly bad direction.
So guardrails and security,
super, super important.
I'm very worried about something you said
earlier, Bret, just to be controversial.
You were saying, run Claude
Code or these, uh, coding agent
CLIs directly in GitHub Actions.
I understand the temptation of that
and, uh, in fact, the origins of GitHub
agentic workflows are in that kind of
space, but it's really dangerous, right?
The- We call that kind of running
naked, We're kind of Mm-hmm …running
without a security architecture or
rolling your own security architecture
or attempting to redo the analysis
of the security architecture
every time, for every workflow you do.
It's really easy to make mistakes.
And you're being exposed to inputs.
You you've got the world's
most powerful coding tools.
And frankly, they can be used for good
and they can be used for bad, right?
And they are running in GitHub Actions
with potentially access to secrets under
the direction of arbitrary
people walking up to the
repo, of people feeding in information.
Okay?
So it's a little bit like you
hire an amazing team of people.
They're sitting in an office and you're
allowing people to walk in through the
front security gate and just feed them
notes about what to do under the door.
And they read them.
They go, oh yeah, I'll do that.
You know- And they think those are
"I'll just do
Bret: it" … just as
important as the boss's notes.
Yeah.
And they'll do
Don: any They'll just… You know, I
mean, yeah, you can try and box it and
try and tell it's not so important.
And of course you do all those things,
but it's still, you kind of want the
security guard at the desk, and you
want the security guard you particularly
want, I mean, you want any outward
action, any write action from that
team, any act external action in the
world, any information sent outside.
if you're, if you're worried
about private information leaking.
you want, it's gotta have a security
architecture, you've gotta, you, you,
must, if you're gonna run automated
coding agents in the context of anything
sensitive whatsoever, you must have
a security architecture full stop.
That's Do not leave home without that.
in any serious way.
And- And that's what GitHub-
the security landscape
Peli: is different than
what people think is secure.
Mm-hmm.
It is not just running in a container.
Right.
There's a lot more threat.
These things are intelligent.
Don: Yeah.
So for instance, running in a
container, but in our model, we
have some very strong guarantees.
Uh, And, we really care about security.
not just because we're trying to stop
people getting work done or anything, but
because we think the better the guardrails
you have, the faster you can run.
Okay?
The faster the automation can go,
the better the train tracks are.
The faster a train can go, the
more you can ramp it up, ramp
up the TGV up to high speed.
because you trust the rails you're on and
nothing's gonna go wrong along the way.
And that's what, that's how it actually
works with GitHub agentic workflows.
Now, let's just run through some of
those guardrailing kind of things.
Perhaps one of the really big ones is that
the agentic step, the actual coding agent,
the reasoning the… runs read-only, okay.
With just one Mm …narrow safe
output that it's allowed to make.
Okay.
And that is a really
strong thing, All right?
That it, we are not giving these
things write access to MCPs.
Uh, in our MCP docs, we say, if you're
gonna add in extra MCPs, you should not
give them write access
to the external world.
You should design these safe outputs
instead, which are very tightly
controll- controlled handover points.
And when you really digest
that, it's running read-only
without access to any secrets.
'cause we go through our gateway
to access, uh, inside the
container and got the ga- gateway.
Those two together, huge relief.
There's no chance it's gonna
leak all your repository secrets.
Right, Uh- right.
'cause it never had access
to
Bret: begin with.
Yeah.
It
Don: never had access.
And it's got a firewall around
it for network access as well.
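
The guardrail Don describes here, a read-only agent whose only write path is one narrow safe output, can be sketched roughly as follows. This is a minimal illustration in Python, not how GitHub Agentic Workflows implements it: the `ALLOWED` policy, the `ProposedOutput` type, and `filter_safe_outputs` are all hypothetical names invented for this sketch.

```python
from dataclasses import dataclass

# Hypothetical policy: the write actions the workflow author allowed and how
# many of each. Illustrative names only, not the real gh-aw schema.
ALLOWED = {"add_comment": 1}


@dataclass(frozen=True)
class ProposedOutput:
    kind: str      # e.g. "add_comment" or "close_issue"
    target: int    # issue or pull request number
    body: str = ""


def filter_safe_outputs(proposed, allowed=ALLOWED):
    """Split the agent's proposed writes into (accepted, rejected).

    The agent never holds a write token; it only emits proposals. A separate
    deterministic step would apply whatever passes this filter.
    """
    used = {}
    accepted, rejected = [], []
    for out in proposed:
        limit = allowed.get(out.kind, 0)  # unknown kinds get a limit of zero
        if used.get(out.kind, 0) < limit:
            used[out.kind] = used.get(out.kind, 0) + 1
            accepted.append(out)
        else:
            rejected.append(out)
    return accepted, rejected


# A prompt-injected agent tries to close several issues and spam comments.
proposals = [ProposedOutput("close_issue", n) for n in range(1, 4)]
proposals += [
    ProposedOutput("add_comment", 7, "Triage notes"),
    ProposedOutput("add_comment", 8, "ha ha ha"),
]
accepted, rejected = filter_safe_outputs(proposals)
print(len(accepted), len(rejected))
```

Even if the agent's goal has been rewritten by an injected title, the only thing that reaches the repository in this sketch is the single comment the policy allows.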
Peli: Yeah.
There's
Don: still concerns about there,
but they're known concerns.
you can calibrate what's
going on in that landscape.
And, uh, and I, it's that allows me to
sleep well at night when these things run.
When we first started running coding
agents and actions, I was worried that we
were creating actions that were gonna be
hackable, com, subvertible, compromisable,
and, and you do get those people.
Uh, and, but it's this very tight
containerization, read-only, no access
to secrets and, um, and, and very narrow
scope of action in the safe output.
And I, t- together these allow me, I, I'm
a guy who worries, Worried about things.
as we should,
And I think that
Bret: will- as we should.
I think that's been a hot topic.
In fact, I'm doing, uh, some workshops and
I'm actually speaking at a conference this
summer around GitHub Actions security.
And I've also got a, a little
plug for a open source tool I'm
about to release called, uh, GASA,
GitHub Action security assessment.
That really just kind of takes the
top 10 things that I see, or the top
dozen that teams I work with or, uh,
you know, and I've been studying a
lot of these GitHub action, uh, supply
chain attacks and trying to understand
where the core misconfigurations are.
'cause that's really what
we're talking about Mm.
-a lot in almost all these cases Yeah.
It's really
Don: just where's the problem
in this seven-step attack.
Bret: Yeah.
Don: where did that go wrong?
Yeah.
Right.
Bret: What, what step in there
can we secure immediately and, and
without Mm. consequence or, you know,
usually without breaking anything.
And so that's one of the, I, I, I
mean- I'm trying to build a little
tool that helps people discover that.
'cause a lot of these things are just
like Mm. features of the platform that
people don't thoroughly Mm. understand
that event in a GitHub action workflow
or that particular Mm. checkbox in
the security settings of Mm. GitHub
Actions in their repo settings, that
they just, they left it by default
'cause they didn't understand it.
They don't know the caveats.
And, and so I've, go ahead.
I was gonna set you up real quick Yeah.
for this and just say, Okay.
Go, go ahead.
Yeah.
Like, what if people aren't fully aware
of what's what we're talking about here?
If you just add like a step in a
workflow and maybe your workflow
today, your workflow before didn't
exist, and you add a new workflow to.
Take, to assess an issue, right?
Like you they- Issue triage.
you need, you need some automation around
the title, and maybe for it to, you're
thinking, I wanna have a model intuitively
s- select the label based on what it
sees in the description, and the title.
And so we've seen some m- disastrous
often cases where people just, they add
a step to a workflow, they put a prompt
in there that basically says, please read
through this, and then decide the label.
And here's my GitHub, uh, you know,
workflow, my action token out of the
gate by default, which may or may not
have full privileges to the entire repo.
and you think that "Well, I'm giving
it a prompt to just change the issue
or just uh, to just change the, label
of an issue or, maybe add a comment or
something at most, but they don't really
understand the effect or the causality of
what they just did and prompt injection.
So can you maybe set that up, as like,
the problem with that and how this helps
prevent that, obviously you said like
removing secret access and all that.
but- I, I can take on
Peli: on that.
Yeah.
looking at the token you have
with the agent, let's say
you have to write an issue.
So you're granting issues write. It can
create, delete, update any issues in
your repo at an insane rate, right?
And the title might be a prompt injection
that changes the goal of the agent.
Suddenly the goal, the agent has been
reprogrammed to close all your issues.
Or to implant malware to all your
issues or, you know, but basically
there's nothing preventing the agent
from saying, oh sure, let me go and,
you know, I'll curl 500 times and
pass the token and do it, or so.
So that's the danger that these
agents are extremely powerful.
Uh, and the security posture.
And the security construct we
have, you know, when we say
read-only, it's at the token level.
It is a deterministic guarantee
by the GitHub Actions platform.
It is not based on any kind of prompting.
This is a strong guarantee that you
basically adhere to because you're
using the permissions object in Actions.
The same thing where we say
there's no secrets in a container.
Then you kind of, some level of
trust containers and so forth.
So our security story is based on
DevOps primitives that are well-known
in industry: containers, permission
scopes and things like that.
And that is the deterministic
secure box we built.
Yeah.
So we don't rely on agentic.
we, we have some agentic protection,
but as much as possible, we wanna
build a box that is deterministically,
provably safe to some extent, right?
Some guarantees.
Don: Vibe security, Yeah.
Yeah …that's what we call it.
So when you, um, when you use GitHub
agentic workflows, you write this
markdown and you write some front matter.
Okay.
It's, it's, it's lovely, right?
The front matter looks a lot like
Actions YAML.
It will be fa- familiar to people.
There's some differences, but, uh,
it's pretty, pretty familiar territory.
Um, and you, you run this step
called, uh, gh aw compile, okay?
And it can produce the .lock.yml, which
is the YAML that actually runs and that,
that compile actually, I, I, I wish
we'd chosen a different word for that.
And we might make it a synonym,
which would be something like harden.
Okay?
Mm-hmm.
'cause what you're actually
doing is taking that prompting
and you're kind of hardening it.
You're m- you're putting a
security architecture around it.
You're saying, I'm gonna run
that prompting, I'm gonna
run it in in a coding agent.
But that's gonna be, you know, we're
gonna create the YAML, which puts it in
a nice box and gives it a nice, secure
thing and puts in a threat detection
step too, as just an extra step.
And I think that's the right kind
of model to use is what I want an
automated tool, which will just make
me feel good, safe about running
that, uh, about r- uh, running that
coding agent in GitHub Actions.
And we run that YAML
through several checkers as well:
actionlint, and uh, zizmor, and,
uh, there's another one as well.
Probably- Poutine … Poutine.
and, um, Poutine, Runner
Peli: guard.
Don: Okay,
And
Bret: so we run- Ooh,
uh, Spell that for me.
'cause I don't know about that one.
Peli: P-O, P-O-U, you know,
like the food, like the Canadian
Bret: fries, like the
gravy Cheese and the fries.
I love, yeah, the gravy.
Yep.
Yeah.
Okay.
I'm already in.
Sign me up.
Yeah.
Don: Yeah.
so we, run those tools, uh,
they caught some things in the
YAML, uh, in our hardened, YAML.
And, um, so there's a
lot of goodness there.
And there's something you can hand off
to the, it means your security team,
if you're doing this in an enterprise,
can actually check that YAML as well.
They get to see what's going on.
They get to see the full
security architecture and check
it matches up with what we say.
They get to see the exact
container settings and its firewall
settings and what's mounted and
what, what's not, and so on.
Uh, and yeah, so yeah,
the guardrails are good.
And I love the security architecture
and the confidence it gives me to
run fast, uh, with agentic automation.
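
One check a security team might run against the compiled workflow, as a rough sketch: confirm that the job running the agent grants no write scopes. The example assumes the job's `permissions` block has already been parsed into a plain mapping of scope to level (`read`, `write`, or `none`). The job contents and the `write_scopes` helper are made up for illustration.

```python
def write_scopes(permissions):
    """Return the sorted scopes that grant write access in a permissions mapping."""
    return sorted(scope for scope, level in permissions.items() if level == "write")


# Hypothetical parsed `permissions:` blocks from two jobs.
agent_job = {"contents": "read", "issues": "read", "pull-requests": "read"}
risky_job = {"contents": "write", "issues": "write", "pull-requests": "read"}

print(write_scopes(agent_job))
print(write_scopes(risky_job))
```

An empty list for the agent job is the property being claimed in the conversation: the reasoning step runs with a read-only token, so there is nothing for an injected prompt to abuse at the token level.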
Peli: Yeah.
So back to, building educational runtimes
and, you know, you think about velocity,
what this sandbox gives you, in
the strict mode, is a guarantee
that you're not gonna leak secrets.
So you're gonna have a read-only
token, you know, the
agent won't see your secrets, and
you know
precisely where the agent's gonna
be able to mutate the world, right?
Because all the writes are transactional
and then we validate them.
There's a layered set
of guarantees, and you're in full
control saying, "I will allow you
to do one issue," as Don said, or "I
will allow you to open a PR."
Right?
So from a practitioner's point of view.
These are guar- deterministic guarantees.
Then this allows you to go wild on
the prompt side and do bash star and
do YOLO because you're gonna YOLO
inside of the container, not on your
dev box where all the secrets are.
Um, you're gonna YOLO in a container
that has no secrets that, you
know, you can't escape unless
you break out of the container.
But, that is insanely empowering.
because now you can try things
without second-guessing everything.
You can try things faster, more tools,
without taking down the whole house.
That will take you from trying
things very carefully with AI to
actually go, go, go, go, go much
faster because you have safety.
So people underestimate the fact that,
To,
Don: to give an example, uh, on,
on, on it means you're gonna write
prompting of things like, okay,
agent, work out the test coverage
in this repository and improve it.
Okay?
Find the big holes and assess the value,
uh, and choose the highest value bits
and fill in improve the test coverage.
That means it's actually going
to in- possibly install tools
to be taking test coverage.
It's gonna be working out the command
line invocations to kind of do that.
you know, stuff, if this is a C
thing, you know, the impossible
stuff of ever taking coverage of
a C repo or something like that.
And, uh, and it's gonna
be reading the files.
Uh, and it's gonna be, uh, it's just
gonna be doing everything right.
Yeah.
And it means
you can keep your prompt general
and the agent will use the full
power of the software engineering
toolkit that it's got, uh, available.
And one of the magical things as well,
you're running in GitHub Actions and,
people take those VMs for granted,
but, and the system side of what's
built there, 'cause that's a Sure.
I don't know how big are the images now?
Like hundreds of gigabytes or something?
Terabytes.
Oh, really?
there's a lot of secret
ingredients There's- …that
Peli: make action an amazing platform.
Yeah.
Don: Yeah.
Mm. So that means every time the agent,
every time your agentic workflows, or
in fact your YAML workflows are kind
of waking up, uh, or, or running.
They're running with all the world's
software engineering tools, kind of very,
very efficiently available, in well-known
install locations and all sorts of things.
Yes.
And, uh, that gives them a super
powerful, um, I mean, they're just,
they're just incredible what they can do.
Peli: So, for example, if you think
about the actions and the features you
have, you know, every run is recorded.
Every run you can store artifacts.
You can, You have APIs to read them.
So we store the agent session, the
sessions that are sitting on your dev
box individually, they're kind of lost.
We store them.
So we analyze them, we optimize
them, then we debug that.
Uh, so the automated agentic workflows
are primed to be optimizable,
debuggable because we have full history.
You run things five times, you realize
you always do the same MCP calls.
What happens?
You tell the agent, move
these MCP calls to steps.
And guess what?
Because we're action, we can
do a mix of deterministic, just
good old steps and agentic.
So let's say your agent does, you
know, give me the the 10 first pull
requests, and it does the GitHub MCP.
That is an agentic step.
It eats a bunch of tokens.
But now you move that into a gh pr call
as a step, drop it into a JSON file
and let the agent do jq in bash on that.
Suddenly your agent is eating
10 fewer turns, 20 fewer turns.
Mm. you've moved the dial.
That's one of the interesting, these are
all interesting things we've discovered
in actions that being able to pull the
dial between deterministic and agentic.
Because in your CI we're 100%
deterministic historically, And
people have gone 100% agentic.
But the truth is Yeah …it's
gonna be in the middle.
And the more you are.
Yeah.
The more you're deterministic,
the cheaper it is.
The most, powerful- And the less…
Don: Yes.
they are, uh, that's right.
So, and that's why we love being in the
actions ecosystem, because those steps
can also use the full existing, GitHub
Actions, everything in that ecosystem.
Yeah.
And, uh, and that's,
that's really powerful.
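
The move Peli describes, taking a repeated agentic MCP call and turning it into a plain step that drops a JSON file for the agent to filter, might look something like this sketch. The sample data is made up so the example runs on its own, and `ready_prs` is a hypothetical helper standing in for the `jq` filter the agent would run.

```python
import json
import tempfile
from pathlib import Path

# In a real workflow a deterministic step would write this file, for example
# with `gh pr list --limit 10 --json number,title,isDraft > prs.json`.
# The records below are invented for illustration.
SAMPLE = [
    {"number": 101, "title": "Fix flaky test", "isDraft": False},
    {"number": 102, "title": "WIP: new parser", "isDraft": True},
    {"number": 103, "title": "Bump dependency", "isDraft": False},
]


def ready_prs(path):
    """Roughly `jq '[.[] | select(.isDraft | not) | .number]' prs.json`."""
    prs = json.loads(Path(path).read_text())
    return [pr["number"] for pr in prs if not pr["isDraft"]]


with tempfile.TemporaryDirectory() as tmp:
    prs_file = Path(tmp) / "prs.json"
    prs_file.write_text(json.dumps(SAMPLE))  # stands in for the gh step
    result = ready_prs(prs_file)

print(result)
```

The fetch costs no tokens because it is an ordinary step; the agent only spends turns on the part that needs judgment.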
Peli: You can see this gradient coming
right in your CI, you know, 100% CI, 100%
deterministic, but now you start squeezing
in a bit of agentic as much as you want.
Maybe just a little reasoning at
the end of your test run, you know?
Yeah,
Don: yeah.
I do like to point out to those,
uh, people who are really big
on, like, CI, and CD need to be
deterministic, and we absolutely
have to kind of nail that in order to
'cause it is That's absolutely true.
we don't challenge that.
Okay.
We say there's a, a third area,
a new area, like a third leg of a
stool that we didn't know was there.
Right, which is like Continuous AI.
Right.
Which includes some of these subjective
steps and it's got different properties,
but the continuity is a big part.
The automation is what is and the kind
of always aligning with the state of the
repo as it is today, which is what we
kind of mean by continuity, continuous.
Um- the claim
Peli: is bolder.
The claim, you need CI and
deterministic and fast.
This is how you're gonna tame the agents.
The agents are, you know, little monsters.
Don: Absolutely.
It's the ultimate guide.
This is the
Peli: golden era of CI.
Don: Yeah.
Peli: Yeah.
The teams
that don't have CI will
not benefit from this
boost.
Bret: Yeah.
I've often thought or described it to
people that, uh, are asking me like,
you know, where I get started and I,
I talked to them about that, you know,
you're probably not gonna rewrite your
Docker build workflow to be agentic.
that's probably not the area of focus.
I would imagine that a lot of these,
the majority of these are n- net new
workflows or expanding Absolutely.
An existing workflow to do
things it couldn't previously do.
Not rewriting my reliable test runs.
I mean, maybe there's an AI that's gonna
help me automate parallelism and all that,
but it's separate from maybe the run.
itself.
Yeah.
And It's
Don: really, really important.
We get that deeply because the way one
of my friends put it is like the CI/CD.
And in fact, GitHub in many ways is
where the grownups are in the software
development process, right?
That that's, Yeah …you know, we're all
going crazy about doing our coding agents
and YOLOing on our local machines and
you know, whatever kind of cra But in the
end, the place where you build confidence
in an organization, where you feel you
can deploy, that's in your CI/CD.
and we've got to keep that
grown-up mentality about quality.
And we want, you know, when we talk
about code improvement, we we're talking
about proposed code improvements, which
have to get through the gates, right?
Yeah.
We're not talking about things that they
should come with test improvements, right?
That kind of match that just like
you'd expect from any pull request.
So we absolutely have to keep CI/CD being
the place where the grownups are and keep
the mentality that goes along with that.
Uh, as well as having a more
flexible idea about automation.
And we wanna really
empower the DevOps people.
This is like, what it's all about is like
we feel there's just this missing piece
of the puzzle in the AI story, which is
where we empower the people who run the
repositories to use AI to their benefit.
And we know some people are suffering
in the open source world from
AI coming in from third parties.
Right.
We wanna empower people.
So the maintainers and the people
who create the repos decide What
automation runs in their repositories
for what goals, under what c- cost,
trade-offs, what, you know, what
quality trade-offs and everything.
They're the ones who can balance
those things in the context of the
business goals or the open source
goals that they kind of have.
And, uh, yeah, empower them and, and
not, not, don't just make them suffer
and kind of the recipients of Make them.
Yeah.
the- And they'll see so
many new uses for it.
That's one of the things, this is job
creation all over the place in the sense
there's so often, so many opportunities to
do work we could never have done before.
Performance optimization.
is a good one There's, there's-
this is literal job or at least
work creation because the people up
close to the repos are the ones who
know what, where the suffering is.
Where the, where the
unenforced invariants are, the unenforced
quality, uh, the, the opportunities for
improvement, which were never explored.
Uh, and the, the legacy code, which
can actually be brought back alive and
actually serve a, a, a role going forward
or transition to a new system or whatever.
Peli: Yeah.
Don: So many opportunities for
work, once you get into the
right mindset and, uh, yeah.
It's, it's a golden age for DevOps people,
uh, who, um, to create There's lots
to learn, but it's, it's a golden age.
Bret: Yeah.
D- I, I have often, The more I've
understood the mindset behind your
creation of agentic workflows and how,
you know, like my first realization
was if I start looking at this lock
file, most of this is deterministic.
Yeah.
it's relatively long.
It's not, it doesn't look anything like
something I would write in a GitHub
action workflow, but it is really
just a, mostly a framework around, at
least when I first got started around
controlling and protecting and guiding
it, it is to this very small part
that's actually a model prompt, Yeah,
Don: yeah.
It's, it's, yes, it's, that's right.
There's somewhere in the middle there's
a invocation of a coding agent, but
you put all this apparatus around it
to say, we were gonna make that safe.
We're gonna make that guardrail.
And we, and yeah.
Go on, Peli.
Peli: Yeah.
There is something new.
Actually, you know, if you look at
agentic workflows, it's, it's a big YAML.
I mean, by this time, we support
a lot of features, but you
don't even edit this yourself.
I mean, at least for on
the, in the ideation phase.
And, you know, until you reach
your 90% done, this is gone.
This is done through an agent.
You don't actually have, you
have to come in with your intent,
what you're trying to achieve.
And we, we've done a lot of research
and we've done this, we have this prompt
that is gonna try to generate the best
agentic workflow for you as a starter,
but there's also a completely new
experience where as an automator, you use
an agent to design that automation, and
then you can fine-tune, you know, and,
and run the compiler deterministically.
But this will get you from 0 to 80%
without actually having
to read the documentation.
'Cause you come in and, you
know, you name your scenarios and
you know the keywords, you know:
issues, PRs, build workflow, run tests.
The agent has access to your AGENTS.md,
the agent has access to all your actions.
So if you already have CI/CD,
the agent can read your CI build,
figure out how you build your
software, how you run your tests.
And one thing that is great about Actions
is that it is baked into the LLMs.
LLMs today know Actions very well.
They know how to write the YAML, they
know the entire schema, they know how
to refactor it, because people have been
blogging about Actions forever.
Uh, so, you
know, this format is actually
designed to be close to what the
agent would expect, because then you
get this magic where it just knows
it; there's no fine-tuning needed.
You can tell it to refactor
the prompt into steps.
And it's like, sure.
And
it knows the ecosystem.
It's gonna go and pull in the right
custom actions, you know, to
use actions/github-script, or to
do checkout and all this stuff.
So that is also part of the magic here:
not only are we leveraging a platform,
but we're leveraging the fact that the
platform is already trained into the model.
Yeah.
You don't need to load a
skill to learn Actions.
It is already in there. This
is not some new product.
Bret: Yes.
Don: Bret- I- uh, ca- can I just
share, uh, my screen briefly?
Oh yeah, sure.
Yeah, I, just wanna, just wanna show one,
one thing to kind of get across why Yeah.
why I'm so excited by this.
uh- yeah,
let
Bret: me, um,
Don: All right hold
Bret: on a second.
let me pull that in.
I don't have that button
on my stream deck.
One second.
You can see it.
I can see it, but I need to put it on
the screen for, everyone else to see it.
I need to give it a guest place.
And then, let's
Don: see.
There we go.
All right, brilliant.
I, so I just wanna briefly mention
this, this particular workflow.
This is a, uh, this is a workflow.
You just install one of these
in your repo and it kind of
helps you maintain the repo.
Okay.
This is the thing I
mentioned before, Repo Assist.
It's
really simple getting started,
and this is kind of how
it works. There's a diagram here: it
kind of selects a couple of tasks.
It reads the memory, and these
are the different tasks. It
might do issue labeling for you.
It might do an issue investigation
and the other things.
And you can configure this and
you can edit, you can say, add
a new task to do this or this.
And, you know, and it kind of just
works on a daily rhythm or hourly
rhythm or whatever rhythm you want.
And I've
written that up in a kind of blog post,
and you can kind of see how it works.
But I just wanted to share
this really, which is this
report we've written on
the impact of using Repo Assist.
So if you kind of look
at what's on the screen,
you can probably guess where we started
to use Repo Assist in this particular
repository. There's a number of issues
that were open in the repository.
So this was a pretty much dormant
repository, but with a backlog, right?
You know, I as a
maintainer sort of stopped engaging with
this because each of
these issues would've taken me sort of
a night, probably, in the traditional
way, to kind of reengage with the issue.
And even if I was doing it manually
with a coding agent, it would've
taken me significant time, 20 minutes,
30 minutes, an hour for each issue.
Okay.
And instead you've got the automated
AI effectively burning through
the backlog, commenting on it,
making pull requests for it, and
like actively, proactively kind of
solving all of that backlog.
And I mean, boy, it allowed me either
to close out the backlog or actually fix
the backlog. Some of this was
feature requests as well.
So it actually kind of took the
repository forward as well, implementing
features, and got three major new
versions of this component
out as open source releases.
And, that repository
is now in a good state.
Repo Assist continues to run.
It's
now running sort of on a weekly
kind of basis, as a cost control.
And if any new input comes into
the repo, new issues,
it will start to do
its kind of magic.
It will look after that for me.
Uh, and of course I'm still in control.
The human's still in the loop, but
you can just see the very dramatic
difference it makes, from software
with bugs to software that is actually
maintainable and fully usable.
And it's not just one repo.
Here's another repo.
The same workflow.
Uh, here's another one.
This is a different maintainer.
So it's not just me; other
maintainers are picking it up,
with a slightly different percent.
Uh, but you know, after a month
of sort of this thing ticking away,
the repo's in excellent shape.
Uh, here's another one.
This is something I co-maintain
with somebody else. It's
a slightly different trajectory.
And here's another one where
there were good reasons to leave a lot
of feature suggestions lying
around the repo
as it quiesces at the end.
Yeah.
So, um, super,
super happy with how
this is going. This one, the
maintainer actually only comes back now and then.
He said he didn't wanna
work on this full time,
didn't wanna crunch a whole lot away,
was just kind of happy just to
kind of come back to it every
few months, and kind of this graph
will keep going down step by step.
So the, the report we're looking
at is, uh, just to bring it up to
the top, is the impact of automated
repository maintenance assistance.
And you wrote a blog about this, right?
On our GitHub Next site.
And yes, there is a blog
about Repo Assist in general, and,
just grabbing this, this is a link
to our new report from GitHub Next.
It's on our GitHub Next site as well.
Yeah, so check that out.
Uh, where was that report?
Um…
Bret: Yeah.
Uh, I think to me one of the most
exciting things about all of this,
and we've been hinting
at this the whole time, is that,
as a CI maintainer, as someone
who's in what I call the middle gray area
of the software development life cycle,
where it's post-commit of the developer,
but it's pre-production running...
Everything in that middle has been,
like, I can remember, if we
go back to even 2018, I can distinctly
remember at DockerCon and at KubeCon,
we were talking about what was the next
wave of innovation, because we felt like
the container ecosystem had matured, and
we kind of knew what that looked like
and how to move things around as images.
And that was all well-defined.
And at the time, we were all
talking about the CI platform
as the next piece of innovation.
And there was all this discussion around
different startups that were getting
funding because that was gonna be the
next opportunity for innovation.
And there were experiments.
You know, GitHub Actions
workflows were probably part of that.
Like, that was a part of that wave.
It was awesome, but also
didn't seem to always fulfill the
promise of what we were trying to
innovate on and reinvent in the CI space.
But I feel like we're
finally at this moment where
I might just be able to do
all the things I always wanted
to do to fix the platform.
You know, the maintenance, the
toil, the backlog, the endless
backlog of things that needed to be
optimized or locked down or scanned
or improved, like documentation.
And management always tended
to, you know, focus on the feature set.
You know, those of us in DevOps
are always trying to help
them understand that there's
more than just adding features.
We need to maintain the system.
SREs are a thing now, so we all
get this, you know, we at least get
someone in production that's helping
to optimize the production information.
But I feel like the CI platform is still
this sort of, uh, if redheaded stepchild
is a thing we still say like, it feels
like the thing that still doesn't get the
love and nurturing that it always needed.
And, I mean, I've lost
count of the number of places going...
Don: Yeah. In my view, it's
the center of the factory.
It is the software factory
where all the grown-up stuff happens.
So much of the forward progress
happens beyond the, yeah,
maybe feature development, which
might be done by agents locally.
Or Peli actually does a whole lot
in the CI. It's just everything.
He wants a feature implemented, he just
writes an issue, or, yeah,
everything comes through the CI system.
Amazing.
You know, the software
factory is real.
And
Bret: so you're not prompting, you're
issuing? You're issue prompting?
Uh, no.
Peli: I skip the issue.
Uh, I set a prompt
directly, but a lot of it
is agents that create issues.
So the issue is a work queue.
Oh, okay.
Right.
Mostly.
So a part of the work, which is
maintenance and code improvement,
uh, or documentation updates, would
be produced daily by, by workers.
But, you know, what you're saying is right.
You know, up until now you could
have a sloppy CI and, for example, you
could rely on a good dev team; you'd kind
of trust that, you know, they would do
the right thing, you know, on quality.
This is not true anymore.
The only way you're
gonna leverage these agents is to have
a very, very tight CI with a very good
test suite, and not just one test suite.
You need tests of the tests,
integration tests, fuzz tests.
I mean, you name it.
Yeah.
Because you need to triangulate them
so that they cannot escape that box.
And then once you
have that, you can get the boost.
The boost is the cloud. On your dev box,
there's only so many eyes and so
many terminals you guys can handle.
So, I don't know, if you're
a spider, you get eight eyes and
you can maybe do 64 terminals. In the
cloud, I can easily run hundreds.
Okay.
So let me... There's,
there's like no comparison.
Bret: Yeah.
Uh, let me ask real quick, 'cause
I think one of the things I love
about this, or I, I try to optimize
on this show is to change behavior.
Give people an insight that will actually
cause them to do something different
rather than just executive overviews.
Not that we've been doing that, we've
been digging in the weeds, but
I'm actually very curious.
When you're on the forefront, can
you talk through what your
activity looks like when you want
to create something new with the CI?
So you're, you're prompting the
LLM on your local harness, right?
It sounds like it's creating
the issue on your behalf.
You've got a bunch of automation
running in the background.
Are you asking it to like develop the
PR and then you're gonna wait for it
to tell you when the checks are ready?
Like are you even going to GitHub
or are you like harness first?
Like, talk through that a little bit
so that we can get an idea.
Okay.
I'm,
Peli: a bit extreme.
Sure.
Um- That's what I want, I want,
Bret: I want the red pill all the way.
Peli: down.
So first of all, it's fully async.
Okay.
There is no discussion on
my part with the agent.
I fire and forget through the
GitHub cloud agent most of the time.
So, you know, you go either to github.com
and you do new agentic session.
I mostly use my phone.
So I do that from the iOS app.
So I'm not waiting.
That means I can have five to
10 agents running at all times.
And it's just like playing
chess on multiple boards.
Now, they take time.
So, fire and forget. What do you do between
the agents? You think about your future.
You have more time to think, or you
talk to people. In fact, you have more
time to talk to people because the
agents are doing the work.
How is this gonna change your work?
So.
That's a very big one.
People are into the tokens, you
know, they look at the tokens
flowing down and it becomes a slot
machine, and they get addicted.
They get headaches and stuff.
They're tired.
Yeah.
Don't, you know;
just schedule the work,
let it churn for a while, come back.
So that's one big thing.
Now, when I determine that there's
a pattern, that I'm doing the same
stuff all the time, or there's
something where I'm thinking, in my head as an
automator, whoa, I could do that again,
you know,
there's some value to that.
So the most obvious one, one of the
first ones we wrote, was like, this
agent creates a lot of duplicate code.
And it did it in kind
of a very subtle way.
It would rewrite string functions,
like string start, you know,
string split, but in different ways.
So I had this idea. I was like, okay,
maybe we look at
the function titles and we ask the
agent to bucketize them by intent,
because the body of the function may
be completely different, but the
intent of the function is the same.
So you open your phone and you say, "Create
me a daily agentic workflow that uses some
LSP to list all the functions, or regex
to list all the functions in the code base.
Bucketize them by intent, pick the
biggest bucket, and now generate
a prompt that says, remove all
these duplicates." That's it.
That's all you have to do.
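A minimal sketch of the deterministic half of that idea, in Python: list function names with a regex and group them into buckets. Everything here is an assumption for illustration. The regex only covers simple Go-style and Python-style definitions, and `intent_key` is a crude token heuristic standing in for the step where, in the workflow Peli describes, the agent itself judges intent.

```python
import re
from collections import defaultdict

# Hypothetical pattern: simple Go-style ("func Name(") and
# Python-style ("def name(") definitions, with an optional Go receiver.
FUNC_RE = re.compile(
    r"^\s*(?:func|def)\s+(?:\([^)]*\)\s*)?([A-Za-z_]\w*)", re.MULTILINE
)


def list_functions(source: str) -> list[str]:
    """Return the function names found in a source string."""
    return FUNC_RE.findall(source)


def intent_key(name: str) -> str:
    """Crude stand-in for 'intent': the sorted lowercase word tokens of a name.

    stringStartsWith, StartsWithString and starts_with_string all map to
    the same key. A real workflow would ask the agent to judge intent.
    """
    tokens = re.findall(r"[A-Z]?[a-z0-9]+|[A-Z]+(?![a-z])", name)
    return " ".join(sorted(token.lower() for token in tokens))


def bucketize(names: list[str]) -> dict[str, list[str]]:
    """Group function names by their intent key."""
    buckets: dict[str, list[str]] = defaultdict(list)
    for name in names:
        buckets[intent_key(name)].append(name)
    return dict(buckets)


def biggest_bucket(buckets: dict[str, list[str]]) -> tuple[str, list[str]]:
    """Pick the bucket with the most members (the likeliest duplicates)."""
    return max(buckets.items(), key=lambda item: len(item[1]))
```

The names in the biggest bucket would then be pasted into the generated prompt that asks the agent to remove the duplicates.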
Wait five minutes, you
get an agentic workflow
as a PR. You review the PR,
you look at the safe outputs,
which are safe by default.
This is your first version.
Probably not efficient, but it's gonna
run; it's gonna burn a lot of tokens.
It doesn't have the
right MCP, you know, it's not
optimized, but it kind of works.
Yeah.
And you're like, conceptually,
this is real value.
Then, you know, we iterate and so forth.
And this concept of, I've got something
that annoys me in my code base
and I can get goodness, I can get slight
improvement in a human-consumable
way, which is basically the daily
newspaper concept, is insanely good.
First of all, it's super fun.
You feel good because, like, "Whoo,
you know, my code is better now."
It's also a new way to handle the
agent, because upstream that means
you don't need the perfect PR.
You can work in a feature branch.
You can go faster by
not doing the 17 nits,
because you have cleaners now.
You have tools that are
looking for patterns that are
known to happen with agents.
You know, they're gonna happen.
by the way, humans were
terrible at coding too.
We forgot that.
But, uh, so now you clean
everything, human or non-human,
you clean all the patterns.
And, I mean,
by now we have, what,
27 or 50 running.
So we also have a summarizer
that looks at them.
I don't have time to look at them.
I have something that
mines them. We track them.
But it always started from: there is
something I'm doing all the time and I
wonder if the agent could actually help.
And this is important: I don't know,
at the moment I'm writing the agentic
workflow, whether it's gonna work or not,
but my experiment costs
me five minutes, 10 minutes. So I'm not
investing three months to build a static
analysis tool like we used to. It's like,
hey, maybe this works. And some of them
are just terrible or just too expensive,
you know,
but a lot of them
are, like, surprisingly good.
Then you put in the right MCPs and so forth.
Uh, then we optimize.
By optimize,
I mean, think of an agentic
workflow as a concretized plan.
You've done /plan, you paid
for Opus, you burned a lot of tokens.
You've got a really good plan.
That one is now set in stone
in your agentic workflow.
That means you can go for
a lower model to implement.
Mm-hmm.
And then you can start splitting
into submodels and everything.
Lower your tokens.
And all of these are hyper-specialized
tools because the clearer the
goal, the better the agent is gonna be.
So these are all kind of intuitions
that we've built, that we've kind of
measured, that we use. So it's like
a plethora, like an insane amount of
very, very specific tools.
Yeah.
Linters are a good example, you know, this
kind of tool we've built over the years.
Yeah, I love... Linters have rules.
Bret: My favorite thing, and I think
I might even have it in my global agents
file on my machine, is always, always lint
at the end of every edit you make,
or at the end of every run of edits
you make, because I don't even wanna
look at what your
output is unless it's passed linters.
And I think at the minimum, like you
mentioned, actionlint and
zizmor, yeah, these are like
table stakes for me, for GitHub workflows.
Uh, I was just curious real quick.
Yeah.
What does
optimization look like?
Is that just improving the prompt,
it, uh, when you're So many
Peli: things?
Yeah.
So you wanna have the same
performance, the same reasoning,
at a lower cost, right?
That's really the end game.
You don't wanna degrade your performance,
but also you
cannot just pay for Opus all the time.
So there's a lot of tricks in the bag.
And we talked about moving
turns into the steps side of
the action: pre-computing data.
Also, this grounds the agent.
You do the computation, you cook
some Python, you give it the pre-fed
computation with all the aggregates and
say, now reason on this and don't do any
computation, don't make up numbers, right?
So that's a trick too.
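A small sketch of that pre-computation trick, in Python. The record fields (`duration_s`, `conclusion`) and the prompt wording are hypothetical; the point is only that the numbers are computed deterministically in a step and then handed to the agent as fixed input.

```python
import json
from statistics import mean


def aggregate_runs(runs: list[dict]) -> dict:
    """Compute the aggregates up front so the agent never has to.

    Each run is assumed (hypothetically) to look like
    {"duration_s": 123, "conclusion": "success"}.
    """
    durations = [run["duration_s"] for run in runs]
    failed = [run for run in runs if run["conclusion"] != "success"]
    return {
        "total_runs": len(runs),
        "failed_runs": len(failed),
        "failure_rate": round(len(failed) / len(runs), 3) if runs else 0.0,
        "mean_duration_s": round(mean(durations), 1) if durations else 0.0,
    }


def build_prompt(aggregates: dict) -> str:
    """Embed the precomputed numbers and tell the agent to reason, not compute."""
    return (
        "Here are precomputed CI aggregates as JSON:\n"
        f"{json.dumps(aggregates, sort_keys=True)}\n"
        "Reason only over these numbers. "
        "Do not recompute them and do not invent new figures."
    )
```

In a workflow, `aggregate_runs` would live in an ordinary step and only the output of `build_prompt` would reach the model.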
But then other things
are using small models.
So, splitting a monolithic prompt into
a prompt plus small models, typically.
For example, if your task says, go and
summarize files, the file summary can be
done by a small agent, a subagent,
which then comes back with the summary.
Right. Now you've actually gone from a 6X
model, Sonnet style, you know, 6X, to 0.3.
You've dropped your cost per token by 20X.
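The arithmetic behind that 20X, as a short Python sketch. The 6 and 0.3 multipliers are the figures quoted above; `share_on_small` is a hypothetical parameter added here to show that the full 20X only applies to the tokens actually moved to the small model.

```python
def cost_ratio(big_multiplier: float, small_multiplier: float) -> float:
    """How many times cheaper a token is on the small model."""
    return big_multiplier / small_multiplier


def blended_multiplier(
    share_on_small: float,
    big_multiplier: float = 6.0,
    small_multiplier: float = 0.3,
) -> float:
    """Average per-token multiplier when only part of the work moves.

    share_on_small is the fraction of tokens handled by the small model
    (an illustrative assumption, between 0.0 and 1.0).
    """
    return (1.0 - share_on_small) * big_multiplier + share_on_small * small_multiplier
```

For example, if 80% of the tokens move to the subagent, the blended multiplier is 0.2 × 6 + 0.8 × 0.3 = 1.44, roughly a 4X saving overall rather than 20X.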
Um, and of course, you know,
there's prompting, and we have
A/B testing in the platform.
So when you run at scale over thousands of
repos, you can start doing campaigns and
field-test prompt improvements and measure
in a scientific way, in a reasoned way.
Just like, you know, think of these
as really websites, right,
at the scale you're going
to run these agents.
My belief is that it's not really an eval
thing. It's more like a website where you
do A/B testing on features: you're going to
do A/B testing on prompts or models and so
forth, and measure, as you're spending the
money, whether they work or not.
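One way to picture A/B testing prompts across many repos, as a Python sketch. This is not the platform's mechanism, which the conversation does not describe; the variant names, the hashing scheme, and what counts as "success" (say, a merged PR) are all assumptions.

```python
import hashlib
from collections import Counter
from typing import Iterable


def assign_variant(repo: str, variants: tuple[str, ...] = ("prompt_a", "prompt_b")) -> str:
    """Deterministically assign a repo to a prompt variant.

    Hashing the repo name keeps the assignment stable across runs,
    so the same repo always sees the same prompt.
    """
    digest = hashlib.sha256(repo.encode("utf-8")).digest()
    return variants[digest[0] % len(variants)]


def success_rates(outcomes: Iterable[tuple[str, bool]]) -> dict[str, float]:
    """Compute the success rate per variant from (variant, succeeded) pairs."""
    totals: Counter = Counter()
    wins: Counter = Counter()
    for variant, succeeded in outcomes:
        totals[variant] += 1
        wins[variant] += int(succeeded)
    return {variant: wins[variant] / totals[variant] for variant in totals}
```

With enough runs per variant, comparing the two rates gives the website-style measurement described above without building a separate eval suite.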
It's really hard to build evals
when you have hundreds of agents and
everything's moving all the time.
Right.
Uh, but- That's
that's
Bret: a lot of eval runs in addition
to the agent runs themselves.
Yeah.
But A/B
Peli: testing is a proven way of
fielding, uh, improvements and measuring.
Don: Yeah.
And in terms of, you know, the
people on the call watching this,
you know, one of the big behavior
changes and mindset changes is,
like, what's your future work
gonna be in an enterprise, right?
It's not just running over one repo, but
you're gonna be the agentic maestro, or a
team of agentic maestros, who are able to
do things at scale across tens, hundreds
of repos in, say, a major
organization. That might be about, like,
applying security patches, or
making judgment calls about the impact
of what it means to roll things out.
It might be making things
more regular across those repos.
It might be improving
the tests, assessing the
test coverage across the repos.
You know, if I was the CTO of a major
company, and all companies are software
companies these days, I'd kind of want
a summary report of, like,
what are all the repos we have?
What's the status of them all?
How do they
cluster together?
What
technologies do we actually use?
And not just based on what we think
we use, but actually assessing
what we actually depend on.
And so there's so much that's even on
information reporting, kind of
going up, even before you get
to kind of improving all of
those different repositories.
Yeah, I think, you
know, I've been worried for a long
time about some of these. You know,
I'm in London and there are a
whole lot of investment banks down
the road. I've worked with some of
them over the years, and I actually
am really worried about some of them.
You know, I'm worried
about their software.
They tell me they've got
20,000 production systems or
something like that, right?
And it's just, like,
it's insane.
The software
legacy debt that they have across
those systems is just huge.
Luckily for them, the tools have now
come along, which can deal with that.
But they need agentic maestros to come
up with the workflows to get a grip
on that software complexity
through summarization and action and
all sorts of other things, along various
dimensions of working.
So we're not just talking one repo,
we're talking the whole agentic
organization and how that actually
maps down to actually working
with real software artifacts.
It's not just something in theory
or something. Some of these
automation platforms like, uh, you
know, Asana and n8n and the other
ones, they're much more on the
kind of information working side,
like working with the HR systems or
your ERP systems or whatever.
This makes it very concrete to me.
Like, I now know what the agentic
enterprise means for the whole
software side of the enterprise.
Yeah, it's very real.
There's a
lot of work to be done
to actually crank the handle on
that. And auth: a lot of this becomes
auth constrained, for example, who's
allowed to do this stuff, right?
Yeah.
The ideal job in this world is where you
have maximal auth and you are trusted.
And that is what the maestro really is, the,
uh, the grand wizard of the enterprise.
That's the ideal job to have
in this kind of situation.
You've got lots of power, lots of
tokens to spend, you can find
out what the actual business value
work to be done actually is,
and you can actually make it happen,
not just shout from the sidelines
across some organizational divide.
So yeah, you want to be part of the grow-
If you're looking for a new job or
a new career direction,
you want to be leading the conversation,
the agentic software, automated software
conversation in your whole company.
However, that is
Bret: the continuous AI czar.
maybe we'll workshop the name.
Or the group of czars.
Don: title, I mean Yeah, The group.
I mean, it doesn't have to
be a single czar, but yeah.
The maestros, the thought
leaders in the company, the people who
see this in multidimensional ways, who
aren't just nutty and
evangelistic, they're grown up about it,
but they can use repository automation
at scale for positive action across the
whole org.
Peli: It will be the catalyst for a
reorganization of software production.
The way we build software will change
because we will design new processes
between agents and humans where,
you know, we've been doing the pull
request for a while, the agents are
kind of, you know, kicked in the
door and starting to shake things up.
But there will be new ways to
build software.
And, you know, we are seeing, in
a way we're experimenting with, all
these new kinds of flow, information
flow and production flow, that involve
agents and humans. But we do it fast.
We have the means to do it safely.
Yeah.
Don: Yeah, the one I'm
currently using is a kind
of software factory image.
I've used that kind of
terminology quite a lot.
And, uh, when you said,
like, what's the action,
how do I
start my design process?
At the moment it's about saying, let's
build a factory where there's
actually lots of inputs flowing in.
They might be issues of some
kind, but they might not be just
issues in a maintenance sense.
It might be, like, in the case of
GitHub, we have automated tools which
find problems in GitHub, problems
in the logic of how we use our databases,
N+1 problems and similar things.
So you're kind of going to get these
to flow in, and then you've got a
whole automation, a mixture of
human and agent. The factory is a place
where both agents and humans work.
Crucially.
Okay.
And When you think of it as a
factory, then things can get
blocked at the human point.
Like the, uh, even when you get all
the automation set up, which is really,
really great, you, it can still get
blocked by overwhelming the human
with too much kind of generation.
And you can either scale that back
or you can increase the quality.
There may be good reasons why
they're doing that, or you can
actually turn off the whole
factory because it's not actually
serving the humans' needs properly.
Okay.
So the aim of the agentic
maestro is to design
that agent-human factory
and make it flow, make it work.
'cause when it does flow, you get those
really dramatic results on quality.
Bret: Awesome.
I feel like this has been
a good discussion around agentic
workflows, because
it's not a feature that we see in the UI
yet, so I feel like it's still really
early days in terms of getting everyone
to be aware that this thing exists
and how to go about implementing it.
So I'm excited about talking more about
it, and especially now that I feel
like you're giving me more reasons
to pay more attention to it, because
I'm realizing that even though
I've dove into some of it and
implemented some of it, I'm
still a babe in the woods.
I'm a babe in the woods
right now, but, um, so, so
are
Don: we.
Yeah.
So are we, yeah.
I wanted to ask you.
It's an exciting
Bret: time the last question or the
last topic before we wrap this up.
You're both... one of you
is GitHub Next, one
of you is Microsoft Research.
You're basically both
already thinking years out.
My assumption is that
there's things that are coming.
So, like, this is all brand new to us.
I'm sure it's still very new to you.
Like, what is the thing that's
coming, not that it's ever replacing
this, but, like, what are you
excited about for the rest of the year?
Is it finding
more places that this can operate, or like
sussing out the real value of where these
workflows are running, like
maybe making the top five
list or the top 10 implementation list?
Like, where is it that you're
looking to in the short term?
take this.
Anyone wanna I'll let
Don: you first?
I'll
Peli: let you... We're
still in technical preview.
Yeah.
So in a sense,
we're still in first gear.
So very excited to see where
the product is gonna go, and we're
gonna really be able to go out and
try it. And we have intuitions about
what's gonna happen when it scales.
We haven't really done it.
Yeah.
And very excited to actually learn.
Everything's fine when you
have one repo, 10 repos.
But we are very much looking forward
to 1000, 10,000, 100,000 scenarios.
And now looking at large scale,
agentics at all these scales, economies
of scale are gonna happen through that.
personally I think it's
a golden era of CI.
I mean, if there's one thing out of
this discussion is yeah, stay in CI,
it's gonna get good because everybody's
gonna turn to you and say, how do I
run my agents in your, in your CI?
how do I do more agent stuff and
you know, and do all these, I heard
this and I wanna do it in your
CI and they're gonna turn to you.
Don: Yeah.
Peli: And they, they're
they're also, they're,
Don: They're also gonna
turn up and say, hey, can I use
OpenClaw inside the enterprise?
You know?
'Cause I wanna automate
doing
my PRs inside the repository.
And it's just like, you know, maybe
you could go learn GitHub agentic
workflows, because that's
actually, like, a
pretty safe way of doing automation.
Right.
And they're gonna turn up
with... we're seeing
lots of other... We already kind of
touched on the, like, run the naked
coding agents kind of approach.
And it's like, the answer to that is
go use GitHub agentic workflows, right?
'Cause that's
got a security architecture.
There'll
be other options as well.
There'll be other security architectures.
Yeah.
But there's an answer to
a question that's there.
And we chose a continuous
AI framing because
we wanted to create an
industry-neutral term, just
like you've created Agentic DevOps.
And they're more or less two
very closely affinitized terms, which is great,
and they're questions with
a natural answer, which
is GitHub Actions and GitHub.
From a product
development perspective, I'm very
happy to have made those contributions
and with where we've landed with all of that.
Uh, in terms of looking forward,
there's all the kind of rollout of
this kind of agentic workflows at scale
or agentic working in the enterprise.
And that's gonna take years to roll.
the enterprise turns slowly,
development teams turn slowly.
They've got their own opinions
and their own skilling.
As I said,
as much as I use the agentic maestro
thing, the agentic maestro also listens
to the developers, listens very
closely to them, because they're the
ones up close to the coalface who
know how to maximize, they merge
the
Peli: PRs that are not,
Don: and they probably
merge the PRs or not.
If you're really lucky as an agentic
maestro, you get to merge PRs too.
It's like, it's, it's
good to have that power.
Uh, but I'll
leave you with one thing
that's a little bit further out,
which is, when we put
these workflows together, previously
often they were simulating what
we'd imagine a human to do, like
test improvement or test coverage.
They're doing one thing, and you can
imagine getting someone in to improve
your tests and assigning them that job.
But nowadays we can actually get them
to use multiple kinds of tools and
methodologies all at the same time.
And so if we look at, say, performance
improvement, for example,
this thing not only knows how to do
the profiling runs and how to write the
benchmarking kind of tools and how to
do garbage collection optimization.
It can also go read the
assembly code, right?
You know, which
none of us can do, right?
We can't interpret that, and
it makes good sense of that.
And so, um, if
you really needed to squeeze that last
2 or 3 or 5% of performance
out of some, say, Go-based system, and
this might apply to, say, GitHub or
something, then you could actually
set the agents also adding
new optimizations to the Go compiler.
Okay?
Like, it can take a private copy
of the Go compiler and
make its
own bespoke compiler patches to that to
actually improve the register allocation.
Uh, and that's a multi-skilling thing
where you could never find a single
person who had all
those skills across the board.
Right.
Performance optimization,
it's full
of those kinds of problems.
Right?
So, you know, the people who actually know
how to make their .NET or Java garbage
collectors and the memory hierarchies
work, so that everything flows
really, really nicely in those systems,
they're really rare.
The people who can do that, the agents
kind of know how to do that kind of work,
where you can encode it.
So these kinds of multi-skilling
flows are super interesting.
Uh, I think
they're a bit beyond the frontier
of what we imagine these AI systems doing,
'cause they're kind of like little
teams of people, or little teams of
cooperating agents, all taking...
It's not just critique
or different roles.
It's actually entirely different
compatible sorts of skills, which kind
of compose together really nicely.
Yeah.
So that's one idea.
Bret: All right.
I think, besides that, there
is another potential tagline for this,
'cause it sounds like I've
got five different options
for the title or the tagline.
It could be, GitHub Actions
is the OpenClaw for grownups.
Dude, I gotta think of all
the buzzwords I gotta put in there.
Peli: I
gotta have the word agent somewhere
in there.
That's your show.
Don: Absolutely.
Bret: It's got cron,
we've got memory.
It can learn over time.
Don: The addition of memory to
these GitHub Agentic Workflows
makes a huge, huge difference to that.
'cause now they can do research.
The first task they do when they
haven't done it before is they can
go research your code base and
actually work out how to do
all those in-depth kind of
engineering things, and
kind of keep their own private notes
on that, and update those notes.
Amazing.
Peli: We have the Actions cache.
I mean, we mount memory on Actions
caches, on the repo, on comments, in the wiki.
Plenty of places to store.
And then, you know, you've got
these long-running loops. We have a practical
outer loop.
Think of the auto researcher ref loop,
not just for days, but weeks, months, because
it's mounted on top of an action run.
It saves its data into a branch,
then the action restarts on that
branch, and then it keeps going.
So if you think about all your inner loops
that are running for, let's say, a
day, now you have the outer loop of that.
That's gonna run for a month.
So you can point it at an
entire code base and say, convert
this stuff into something else.
Don: Yeah.
But, wow, OpenClaw for grownups.
Yeah.
No, automation is an incredible thing.
Very empowering.
And there's just a lot you
can do with it, but make it safe.
Make a guardrail. And, yeah,
we're very happy to help provide
a basis for doing this at
scale in the enterprise and with
Peli: We are a weird
open source project.
We take bugs as specs.
But we've closed so far,
I saw the number today, 633
community bugs since we shipped.
So if you're using Agentic Workflows
and you find something,
run our agents on your workflow
and tell it to file an issue.
And, yeah, we've been
running as fast as we can to
answer the needs of practitioners.
Most of the bugs we get are from
professional CI/CD engineers.
And these are really deep GitHub
Actions features that we didn't know.
Bret: Yeah.
Right.
Okay.
Like the esoteric
edge cases
of GitHub Actions, when someone knows
every little nook and cranny of the…
Peli: ARC runner on GHES
with something, something.
We've been looking at that, and
you know, there's this thing where
the sandbox is closed by design,
but we also have the hooks for the
pros, for the people who know to
go and plug in the stuff they need.
GitHub Apps, custom jobs, custom steps,
a HashiCorp step to get your secrets.
All this stuff is basically, you
know, we wanna fully leverage
the platform. And the problem,
when you build a box that doesn't have the
escape hatch for the pros, is that people,
you know, you need to get
stuff done, so you turn off the security.
Bret: Bad habits.
Yeah.
Peli: So we designed it so that you don't
have to turn off security to get into
these enterprise scenarios that are
very complex CI/CD, with multiple apps,
multiple security, multiple tokens.
And, you know, the sample
we see on the landing page becomes way
more complex when you start factoring
in real-life CI/CD constraints.
But that is because we build on
top of Actions and we're just an increment.
We inherit all that goodness.
We're in all the ecosystem.
So, I mean, the call to
action is: try it out.
If there's something you don't like,
we've been running fast and
responding. We only have
a backlog of 30 issues and we've
closed 630, so we're on it.
Bret: That's awesome.
Peli: Yeah.
Bret: Well, yeah.
Your agents are on it.
You're on top of the agents.
Peli: No, we don't take PRs.
People run the agent on their side.
There's a
full investigation on their
run with their secrets.
Then it's anonymized, and then
the agent files a generic issue.
And that's just built in. Also, it's
a crazy way to do all software.
Bret: You're right, it's built in.
It's a built-in feedback loop.
All right, so this has been awesome.
I'm so glad to have both of you here.
I'm very excited about the future of this.
I've been a heads-down GitHub Actions
guy for a long time, and I feel
like this is my whole new thing.
And this episode has convinced me that
I should have spent a lot more time
the last three months on it than I did.
You've convinced me, and I'm excited
to get into the weeds of this.
Where can people find both of you?
Don: Certainly, if you wanna
contribute ideas to the design of
GitHub Agentic Workflows, or even just
feedback on using it or examples of using
it, there's the repo. You can make
your pitch, give us good feedback, and
find us. Make a pitch for a new feature,
a new coding agent, or some new thing
you think should be supported.
And it's all open source, so you can
kind of see it all and work it
all out even before it gets to us.
And ship on Mondays.
Peli: You ship on Mondays?
Yeah.
Don: And, uh, for me, you
can find me on LinkedIn.
I do a lot of posting on LinkedIn
and my blog as well, which,
Peli: Yeah, I'm mostly on GitHub, and
you can find me on LinkedIn.
But please file an issue with your idea
and why you think we should have it.
With the agentic plan, you know, burn
some Opus tokens on making your claim.
A lot of them were one-shot in BB Kitcode.
Yeah.
Awesome.
And the turnaround, we've got some
three-hour turnaround sometimes if you're
at the right moment in time between
filing the issue and getting a release.
This is agentic speed.
Bret: Yeah.
I've been playing around
with GitHub Mobile more and more.
And so the fact that that's
your workflow, that's your process,
is convincing me that I need to lean
into more of that, because it's
a habit that I don't have yet.
Like, I don't have the muscle
memory to go, I have an idea,
let me jump into my GitHub app.
So I need to break that.
For everyone listening, it's
github.com/github/gh-aw. Obviously
there's a bunch of websites.
You can go to githubnext.com
to see all the exciting stuff
coming out of the research.
Basically, it's
a whole list of smart people that I
basically wanna invite, all of
them, on the show for some things.
I think I just saw the presentation
from AI Engineer
Linden, the new
team-based agentic harness mindset.
I am like, that is my next thing.
So I saw that demo and thought, this
is exactly what I've been missing
and what my teams probably want,
a replacement for Slack
and all these other things.
So I'm very excited about that one.
And I immediately
went to sign up for the beta.
I don't know if she realized that there
might be a whole lot of signups, but
when I shared it out, I got a bunch of
responses from people going, oh, yes.
You know, not just yes,
but hell yes on that.
So I'm excited to see how these two areas
merge, and I can have the prompt
crowdsourced from my humans, and
we perfect the AI output of possibly
how these GitHub Actions workflows
are gonna be created by an agent.
Don: Bret, thank you for having us on.
It's been a lot of fun.
Bret AI July 2025: Thanks for joining
us, and I'll see you in the next episode.
Creators and Guests