Long-Running, Complex Development with Agents

Learning to use coding agents effectively tends to involve a lot of wasted time first. Here’s how to skip some of that.

The Journey

Most developers follow a predictable path when adopting coding agents:

Stage	Approach	Result
Naive	“Build me X”	Looks right, works wrong
Detailed	“Build X with requirements Y, Z”	Better, but still inconsistent
Planning	“Plan first, then execute”	Much better, but hits limits
Persistent	“Plan → Tasks → Subagents”	Scales to complex projects

The Naive Approach

The first prompt to a coding agent like Claude Code usually looks something like this:

“Build this complex app, it’s like existing app but different because of things”

The result usually looks like what was asked for. Right up until you actually try to use it. On closer inspection: AI slop. What follows is a long, frustrating back and forth, especially once compactions start.

Adding Requirements

If you’re still invested at this point, more reading follows. The common advice: be more specific, provide more details and requirements.

“Build this fancy new app that has all sorts of requirements.”

Results improve a bit: less AI slop, more functionality, and sometimes even the result you were looking for. But it still requires back and forth. Things are missing, weird choices were made, shortcuts were taken. Then you try to add a feature, fix a bug, or expand on the existing project, only to have it go wrong when using a prompt like:

“Add complex feature to this project with all of these requirements.”

At this point it’s a roll of the dice whether you end up with AI slop, duplicate code, way more files than necessary, and generally a mess. This isn’t necessarily the coding agent going wrong; often it comes down to how it was prompted. Specifying lots of requirements doesn’t mean the agent will approach the problem the way you would. You might do research upfront before starting. The agent just starts.

Enter Planning

Assuming no table-flipping occurs, more reading follows (much like you’re doing now). Eventually you come across folks talking about “planning” first, then executing. Prompts tend to take on this shape:

“You will be building this fancy new app (or complex feature) that has all sorts of requirements. Before writing any code, create a plan, we will iterate on it, and then write it to a markdown file.”

A slight variation on this, especially for feature development, is to ask the agent to research the existing codebase or problem before making the plan.

Once the plan is generated and reviewed, you tell the agent to:

“Execute the plan (by reading the markdown document)”.

The coding agent then happily starts coding everything laid out in the document. More often than not, what pops out the other side is better: less slop, fewer shortcuts, something that might actually be maintainable.

The Compaction Problem

Warning (Context Window Limits)

Under the hood, tools like Claude Code are turning the plan in the markdown file into an internal todo list and then checking off items as they complete them. This works well until it doesn’t. Sufficiently large or complex problems may contain a LOT of tasks on the todo list.

A single agent working through a long todo list burns through context fast. Reading files, writing files, making tool calls, tracking progress: it all adds up. When the context window fills up, tools like Claude Code compact: summarize what was happening, reset the window, inject the summary back in.

The problem is that everything gets compressed: which tasks were in progress, tool call results, current status. All of it reduced to a summary. The summary inevitably loses nuance: which tasks were half-done, what edge cases were discovered, why that one file was left untouched. While there are efforts in the compaction prompt to retain the important parts, it’s by no means perfect.
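To get a feel for how quickly this adds up, here is a back-of-the-envelope sketch. Every number in it (window size, summary size, per-task cost) is a made-up assumption for illustration, not a measurement of Claude Code or any other tool.

```python
def count_compactions(task_costs, window=200_000, summary_size=10_000):
    """Count how often one long-lived context would need compacting.

    window and summary_size are illustrative token counts, not real limits.
    """
    used = 0
    compactions = 0
    for cost in task_costs:
        if used + cost > window:
            # Window is full: the history is replaced by a summary.
            used = summary_size
            compactions += 1
        used += cost
    return compactions


# Hypothetical plan: 40 tasks, each costing about 30k tokens of
# file reads, file writes, and tool calls.
tasks = [30_000] * 40

# One agent working through the whole list in a single context.
single_agent = count_compactions(tasks)

# One fresh context per task: each task is counted on its own.
per_task = sum(count_compactions([cost]) for cost in tasks)

print(single_agent, per_task)
```

With these made-up numbers the single agent compacts six times over the plan, and each compaction is a chance to lose track of half-done work. A fresh context per task never hits the limit at all.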

Another challenge: the agent often stops to ask if it should continue, breaking the flow.

The Solution: Persistent Tasks + Subagents

The fix: one more step in the planning process, combined with on-the-fly subagent tasks.

The extra step is to turn the todo list that Claude would generate into something persistent rather than ephemeral.

Tip (qask)

One tool that works well for this is qask. There are other variations on this concept like taskmaster, GitHub issues, etc. However, all of those come with a lot of extra features that distract from the crux of the problem: making the todo list less ephemeral.

The idea behind qask is relatively simple. It’s an MCP server that exposes only 5 tools:

Function	Description
`create_task()`	Create a task (type: task/bug/todo/note, priority: low/medium/high/critical)
`get_task()`	Get task by ID with full description
`update_task()`	Update task fields (status: not_started/in_progress/completed)
`list_tasks()`	List/filter tasks (returns metadata only for token efficiency)
`delete_task()`	Move task to trash

Each task gets its own JSON file. A root-level JSON file lists the title, ID, and file path for each task. All of the tasks are stored in the project’s “qask” folder (which is determined by where you run the claude command with the MCP server active). There’s no dependency ordering; that’s sorted out by the agent.
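To make the storage idea concrete, here is a minimal sketch of a file-backed task list with the same five operations. This is not qask’s actual code: the `TaskStore` class, the `index.json` file name, the `trash` folder, the ID format, and the exact field names are all assumptions made for illustration.

```python
import json
import tempfile
import uuid
from pathlib import Path


class TaskStore:
    """Toy task list: one JSON file per task plus a root-level index."""

    def __init__(self, root):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)
        self.index_path = self.root / "index.json"  # hypothetical name
        if not self.index_path.exists():
            self._write_index([])

    def _task_path(self, task_id):
        return self.root / f"{task_id}.json"

    def _read_index(self):
        return json.loads(self.index_path.read_text())

    def _write_index(self, entries):
        self.index_path.write_text(json.dumps(entries, indent=2))

    def create_task(self, title, description, task_type="task", priority="medium"):
        task_id = uuid.uuid4().hex[:8]
        task = {"id": task_id, "title": title, "description": description,
                "type": task_type, "priority": priority, "status": "not_started"}
        self._task_path(task_id).write_text(json.dumps(task, indent=2))
        entry = {"id": task_id, "title": title, "path": f"{task_id}.json"}
        self._write_index(self._read_index() + [entry])
        return task

    def get_task(self, task_id):
        return json.loads(self._task_path(task_id).read_text())

    def update_task(self, task_id, **fields):
        task = {**self.get_task(task_id), **fields}
        self._task_path(task_id).write_text(json.dumps(task, indent=2))
        # Keep the title in the index in sync with the task file.
        index = self._read_index()
        for entry in index:
            if entry["id"] == task_id:
                entry["title"] = task["title"]
        self._write_index(index)
        return task

    def list_tasks(self, status=None):
        # Return metadata only; the full description stays on disk.
        tasks = [self.get_task(entry["id"]) for entry in self._read_index()]
        return [{"id": t["id"], "title": t["title"],
                 "status": t["status"], "priority": t["priority"]}
                for t in tasks if status is None or t["status"] == status]

    def delete_task(self, task_id):
        # Move the file to a trash folder instead of deleting it outright.
        trash = self.root / "trash"
        trash.mkdir(exist_ok=True)
        self._task_path(task_id).rename(trash / f"{task_id}.json")
        self._write_index([e for e in self._read_index() if e["id"] != task_id])


store = TaskStore(Path(tempfile.mkdtemp()) / "qask")
task = store.create_task("Add login form", "Full description lives in the task file.",
                         priority="high")
store.update_task(task["id"], status="in_progress")
print(store.list_tasks(status="in_progress"))
```

The point of the design is that nothing here lives in the agent’s context: after a compaction, or in a brand-new session, the list can be read back from disk exactly as it was left.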

With the todo list now stored as tasks on disk (rather than managed internally by Claude), the next step is to use subagents to complete them. The advantage of subagents is twofold:

Each subagent starts with a clean context window, focused on one task.
The main agent (the one responsible for interacting with the user) just coordinates: pick the next task from qask, launch a subagent for it, repeat.
- A nice side effect: the main agent saves context and can tailor each subagent to the specific task.
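The coordinator’s side of this can be sketched in a few lines. The helper names (`pick_next`, `build_brief`) and the pick-by-priority rule are assumptions for illustration; in practice the agent decides the order itself, and the brief is whatever prompt the main agent writes for the subagent.

```python
# Lower number means "do this first". The ordering rule is an assumption.
PRIORITY_ORDER = {"critical": 0, "high": 1, "medium": 2, "low": 3}


def pick_next(tasks):
    """Return the highest-priority task that has not been started, or None."""
    pending = [t for t in tasks if t["status"] == "not_started"]
    if not pending:
        return None
    return min(pending, key=lambda t: PRIORITY_ORDER[t["priority"]])


def build_brief(task):
    """Build the only text a subagent sees: one task, no earlier history."""
    return (
        f"You are working on task {task['id']}: {task['title']}\n\n"
        f"{task['description']}\n\n"
        "Complete only this task, then report what you changed."
    )


tasks = [
    {"id": "t1", "title": "Polish README", "description": "Tidy the intro.",
     "priority": "low", "status": "not_started"},
    {"id": "t2", "title": "Add login form", "description": "Email and password.",
     "priority": "high", "status": "not_started"},
    {"id": "t3", "title": "Set up repo", "description": "Already done.",
     "priority": "critical", "status": "completed"},
]

next_task = pick_next(tasks)
brief = build_brief(next_task)
print(brief)
```

Notice what the brief leaves out: the other tasks, the earlier tool calls, the back and forth with the user. That omission is the clean context window.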

Putting It All Together

What this looks like in practice:

1. “You will be building this fancy new app (or complex feature) that has all sorts of requirements. Before writing any code, create a plan, we will iterate on it, and then write it to a markdown file.”
2. Iterate on the plan.
3. “Turn the plan into a series of small, completable steps that always make forward progress on the plan.”
4. (Optionally) Review the tasks that were created. qask has an Electron app to make this easier.
5. “/orchestrate”: for each task, dispatch an implementor subagent to do the work, then a reviewer subagent to verify. If the reviewer returns NEEDS_CHANGES, re-dispatch the implementor with feedback. Once APPROVED, move to the next task. Repeat until done.
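The loop behind the orchestrate step can be sketched as plain control flow. This is an illustration, not the actual command: `implement` and `review` stand in for subagent dispatches, the stub versions below are fake, and `max_rounds` is an added safety cap that the workflow above does not mention.

```python
def orchestrate(tasks, implement, review, max_rounds=3):
    """Run an implement/review loop over every unfinished task.

    implement(task, feedback) returns a description of the work done.
    review(task, work) returns (verdict, feedback), where verdict is
    "APPROVED" or "NEEDS_CHANGES".
    Returns a log of (task_id, verdict) pairs.
    """
    log = []
    for task in tasks:
        if task["status"] == "completed":
            continue
        task["status"] = "in_progress"
        feedback = None
        for _ in range(max_rounds):
            work = implement(task, feedback)
            verdict, feedback = review(task, work)
            log.append((task["id"], verdict))
            if verdict == "APPROVED":
                task["status"] = "completed"
                break
        # If the cap is hit, the task stays in_progress for a human to check.
    return log


# Stub subagents for demonstration only.
def fake_implement(task, feedback):
    return f"{task['title']} (revised)" if feedback else task["title"]


def fake_review(task, work):
    if work.endswith("(revised)"):
        return "APPROVED", None
    return "NEEDS_CHANGES", "add tests"


tasks = [
    {"id": "t1", "title": "Add login form", "status": "not_started"},
    {"id": "t2", "title": "Set up repo", "status": "completed"},
]
log = orchestrate(tasks, fake_implement, fake_review)
print(log)
```

Because task status lives in the task list rather than in the loop, an interrupted run can be restarted and will skip whatever is already completed.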

Tip (Custom Commands)

/orchestrate is a custom slash command that encapsulates this workflow. You can create your own to codify patterns that work for you.

Once you press enter on step 5, it kicks off a fairly long execution loop. Execution only stops when permission is needed or all the tasks in qask are complete.

Summary (Key Takeaways)

Don’t just prompt and pray — planning upfront produces better results
Make tasks persistent — external storage (qask, GitHub issues) survives compaction
Use subagents — fresh context windows for each task, main agent just coordinates
Iterate on the plan — the plan document is your contract with the agent

What started as “build me an app” becomes a workflow that actually holds together.