June 23, 2025
Being a software engineer right now is crazy. The day-to-day tasks required of our profession are fundamentally changing, and we're living in that moment of change. Your grandkid will show you their first piece of software and they'll gasp when you lean in on your hover-chair and tell them stories of typing each individual character of a program by hand.
Right now we're still at a point where skeptics will try and say "this is terrible" while the true believer at the next desk over claims English is the only programming language that matters. We can squint and see a magical future but the road from here to there is hazy. So I've been thinking:
What does it take to get fully autonomous software engineering agents?
The first question that comes to mind is:
Wait, what is a "fully autonomous software engineering agent"?
Defining autonomy is tricky. This question is really asking: what do we actually want these agents to do? I'm of the opinion that an agent is, and ultimately will be, a powerful tool wielded by an intrepid human. The best tools are extensions of the craftsperson and allow them to focus on their expression.
That means in the utopian agentic future, you (the hero!) spend your time on matters of exploration and taste. You're guiding and creating and responding to what is created. In this world, bugs are due to ambiguity. The agent will make reasonable, well-informed but sometimes incorrect assumptions about ambiguous situations. These moments will either feel like "oops I forgot to mention that thing in my head" or like finding an edge case you hadn't considered but now realize is important.
On the other hand, a non-autonomous agent requires continual coaxing. It gets stuck on something it can't quite figure out, or needs you to point out things it really should have been able to figure out on its own. In today's world this looks like spending your time picking the right task size so you don't overwhelm the model, or making sure you've tagged every relevant file and piece of context ahead of time.
My mental framework breaks the puzzle of fully autonomous coding into four pieces. We're currently doing the collective engineering work to jiggle the pieces together and make things work.
Intelligence
The autonomous agent needs to be able to plan multi-step changes and write high-quality code. The current frontier models are really good at writing code, but they still have lots of headroom. Today's models start to fail over long and complex changes, which is why we see a focus on improving the use of long context windows.
Who is working on this → OpenAI, Anthropic
Environment
The human engineer, the agent engineer, and production should have identical environments so that the engineer's work and testing translate cleanly to prod.
People and agents develop by generating hypotheses and testing them, making mistakes and fixing them. They need an environment they can quickly explore and safely operate in before settling on a solution. That environment needs to be realistic enough that they can be confident in the change; a rough sketch of one such setup is below.
Who is working on this → Your company's devtools and DevOps teams that manage the various dev, staging, and prod environments. These are the people tasked with moving the org forward and capturing the productivity gains from AI. Also worth noting is Mechanize, who are working on a similar but not identical problem.
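To make that concrete, here's a rough sketch of the pattern I mean, assuming a Docker-based setup. The image name, paths, and test command are placeholders, but the idea is that the agent gets a disposable workspace built from the same image prod uses, so it can test hypotheses safely without a human babysitting each run.

```python
import subprocess

# Placeholder image name. In practice this would be the same base image
# that production and CI build from, so a green run here actually means something.
IMAGE = "registry.example.com/myapp-dev:latest"

def run_in_sandbox(repo_path: str, command: list[str], timeout: int = 300) -> subprocess.CompletedProcess:
    """Run a command inside a throwaway container mounted over the repo.

    The container is removed afterwards (--rm) and has no network access,
    so the agent can explore, break things, and retry without touching
    anything outside its workspace.
    """
    return subprocess.run(
        [
            "docker", "run", "--rm",
            "--network", "none",
            "-v", f"{repo_path}:/workspace",
            "-w", "/workspace",
            IMAGE,
            *command,
        ],
        capture_output=True,
        text=True,
        timeout=timeout,
    )

# Example: let the agent check whether its proposed change passes the tests.
result = run_in_sandbox("/path/to/repo", ["pytest", "-q"])
print(result.returncode, result.stdout[-2000:])
```

The specifics will vary wildly between orgs; the property that matters is that the agent's sandbox and prod are built from the same definition.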
Control
The autonomous agent needs a control surface that allows extremely precise action while also moving smoothly between levels of abstraction.
The control surface is how the human engineer expresses intent to the coding agent, reviews the work product, and provides feedback. This is where the flow state happens, which means this is the tool that engineers will love.
Current IDEs are built to get engineers into flow while doing fine, precise work. As agent mode and background agents grow in popularity, our tools will adapt to let engineers stay in that hyper-focused, productive state while orchestrating a team of agents.
Who is working on this → Claude Code, Codex, Cursor, Windsurf, Devin and many others.
Context
The autonomous agent needs to be able to look up all the information a human engineer would look up.
Writing software includes some amount of time spent simply looking up relevant facts. Sometimes these facts are publicly accessible and sometimes they are unique to an organization. No matter where the facts come from, whenever a human engineer has to manually gather and share facts with the agent, it breaks the feeling of flow and autonomy; a sketch of what self-serve lookup could look like is below.
Who is working on this → The MCP community and dedicated search servers like Ref or Exa.
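To sketch what self-serve lookup could look like (the names here are entirely hypothetical, and a real setup might wire this up through MCP servers or a hosted search API), the shape is an agent that routes its own questions to the right source mid-task instead of waiting for a human to paste the answer in:

```python
from dataclasses import dataclass
from typing import Callable

# A lookup source takes a query string and returns text the agent can read.
LookupFn = Callable[[str], str]

@dataclass
class ContextBroker:
    """Routes an agent's questions to whichever source can answer them,
    so the human never has to gather and paste facts by hand."""
    sources: dict[str, LookupFn]

    def lookup(self, source: str, query: str) -> str:
        if source not in self.sources:
            raise KeyError(f"No context source named {source!r}")
        return self.sources[source](query)

# Placeholder implementations; real ones would call actual search services.
def search_public_docs(query: str) -> str:
    return f"(top public docs result for {query!r})"

def search_internal_wiki(query: str) -> str:
    return f"(internal design notes matching {query!r})"

broker = ContextBroker(sources={
    "docs": search_public_docs,
    "wiki": search_internal_wiki,
})

# Mid-task, the agent decides what it needs to know and asks for it itself.
print(broker.lookup("docs", "retry semantics of the payments SDK"))
```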
The fun part of writing these sorts of things is making predictions. Here are a couple.
Environment will be the first problem that feels completely solved, but there will be no single solution.
Having a realistic environment the agent can work in autonomously, one that translates cleanly to production, is a problem we've already been solving for developers. And yet every devtools team at every org does things a little differently based on the needs of their org and product. There will likely be common atoms and patterns like containers and CI, but little standardization beyond that.
Complete Context will feel increasingly important as adoption of coding agents increases.
Right now, for most engineers, Context feels least critical because it's not always necessary. For example, LLMs can write simple scripts or self-contained changes without external information. However, I predict that as coding agents become more commonplace, failures due to lacking context will become more obvious and this piece will feel more important. This is why I'm building Ref.
P1. Consider any type of knowledge work. What are the types of intelligence, environment, control and context needed for an agent to do that reliably?
P2. Consider any task you do on a computer. What are the types of intelligence, environment, control and context needed for an agent to do that reliably?