Loop Engineering Explained: A New AI Programming Paradigm for Automating the Software Development Loop

From Manual Prompting to System Architecture: A New Paradigm in AI Programming

In June 2026, a pivotal conceptual shift occurred in the AI programming landscape. On June 7, developer Peter Steinberger publicly stated: “You should not be prompting the programming Agent anymore. You should be designing the loop system that prompts the Agent.” A few days later, Boris Cherny, the creator of Claude Code at Anthropic, echoed the sentiment, noting that his work had turned into “writing loops” that prompt the AI and decide on the next action. This concept, dubbed “Loop Engineering,” shifts the engineer's role from executing tasks directly to architecting the automated systems that execute them.

A look back at the evolution of AI programming tools reveals a clear path of abstraction:

2023: Code Autocompletion, where AI acts as an assistant and humans are in the lead.
2024: Prompt-Driven Generation, where humans guide AI to complete coding tasks through prompt engineering.
2025: Parallel Agent Collaboration, where humans act as project managers, coordinating multiple AI sessions.
2026: Loop Engineering, where humans design and maintain an autonomous workflow that includes AI agents.

Boris Cherny’s work provides a compelling case study: in the Claude Code project he leads, 100% of the code in 259 merged pull requests (PRs) was written by AI. This suggests that, given a well-built loop system, AI can handle large-scale coding tasks with little direct human authorship.

Dissecting Loop Engineering: The Six Core Technical Components

A fully functional Loop Engineering system is not a simple script but is composed of six interconnected core components that collectively ensure its autonomy, stability, and effectiveness.

1. Automation: The System’s Trigger and Heartbeat

Automation is the starting point of the loop system. The system must be triggered automatically, not manually. Triggers can be time-based (e.g., a cron job at 9 AM daily) or event-based (e.g., a CI/CD pipeline failure, a new GitHub Issue). This mechanism guarantees the system’s autonomy, distinguishing it from one-off manual scripts. In existing tools, Claude Code’s /loop command and Codex’s Automations feature embody this idea of automating the “task discovery” and “task initiation” processes.
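The two trigger types described above can be sketched as a single decision function. This is a minimal illustration, not how Claude Code or Codex implement their features: the Event class, the event names, and should_trigger are all hypothetical, and a real system would more likely rely on cron or webhooks than on polling.

```python
from dataclasses import dataclass
from datetime import datetime, time
from typing import Optional


@dataclass(frozen=True)
class Event:
    """A hypothetical external event, e.g. delivered by a CI webhook."""
    kind: str      # e.g. "ci_failed", "issue_opened" (assumed names)
    payload: str   # free-form description handed to the agent


# Event kinds that should start a loop run (assumption for this sketch).
TRIGGERING_EVENTS = {"ci_failed", "issue_opened"}

# Time-based trigger: once per day at 09:00, as in the cron example.
DAILY_RUN_AT = time(hour=9, minute=0)


def should_trigger(now: datetime,
                   last_run: Optional[datetime],
                   event: Optional[Event]) -> bool:
    """Return True if the loop should start a run right now."""
    # Event-based trigger: fire immediately on a relevant event.
    if event is not None and event.kind in TRIGGERING_EVENTS:
        return True
    # Time-based trigger: fire once the scheduled time has passed,
    # provided no run has happened yet today.
    due_today = now.time() >= DAILY_RUN_AT
    already_ran_today = last_run is not None and last_run.date() == now.date()
    return due_today and not already_ran_today
```

Keeping the decision in a pure function makes the trigger policy easy to test without waiting for a real clock or a real CI failure.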

2. Work Trees: Isolated Environments for Parallel Tasks

A significant technical challenge when multiple AI agents work in parallel is conflicting file edits. If two agents modify the same file in the same working directory at once, one can overwrite the other or leave the code in a broken state. The solution is to give each agent or task its own workspace. Git's worktree feature (git worktree) is a natural fit: it allows multiple independent working directories to be checked out from the same repository. Each agent makes changes in its own worktree and then merges them back into the main branch, which keeps parallel tasks isolated from one another. Production experience shows that temporary worktrees that are never cleaned up pile up quickly and become a common source of disorder, so cleanup should be part of the loop itself.
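One way to make worktree cleanup automatic is to wrap each task in a create/work/remove sequence. The sketch below assumes a branch naming scheme (agent/<task_id>) and a directory layout that are purely illustrative; the git worktree add, remove, and --force options are standard Git, but the helper functions are hypothetical.

```python
import subprocess
from pathlib import Path
from typing import Callable, List


def worktree_add_cmd(task_id: str, root: Path, base: str = "main") -> List[str]:
    """Build: git worktree add -b agent/<task_id> <root>/<task_id> <base>."""
    return ["git", "worktree", "add", "-b", f"agent/{task_id}",
            str(root / task_id), base]


def worktree_remove_cmd(task_id: str, root: Path) -> List[str]:
    """Build: git worktree remove --force <root>/<task_id>.

    --force discards uncommitted changes in the worktree; anything the
    agent committed stays on the agent/<task_id> branch.
    """
    return ["git", "worktree", "remove", "--force", str(root / task_id)]


def run_in_worktree(task_id: str, root: Path,
                    work: Callable[[Path], None], base: str = "main") -> None:
    """Create an isolated worktree, run the task in it, and always clean up.

    Must be called from inside a Git repository that has a `base` branch.
    """
    subprocess.run(worktree_add_cmd(task_id, root, base), check=True)
    try:
        work(root / task_id)   # the agent edits and commits only inside this directory
    finally:
        # Cleanup runs even if the task fails, so stale worktrees do not pile up.
        subprocess.run(worktree_remove_cmd(task_id, root), check=True)
```

Merging the agent/<task_id> branch back into the main branch is deliberately left out here; that step is where review and checks would normally sit.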

3. Skill Library: Persistent Project Knowledge for Agents

A newly initiated AI agent often lacks project-specific context, such as coding standards, build commands, commit message formats, or design patterns. It would be extremely inefficient to provide this information repeatedly in prompts for every task. A Skill Library stores this project knowledge in a structured, persistent way. The AI agent can then call a specific skill by name when needed, rather than relying on a massive and hard-to-maintain initial prompt. The Anthropic team, for instance, maintains a CLAUDE.md file of about 2500 tokens that documents common error patterns, code styles, and architectural decisions, which is continuously updated as the project evolves.
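The idea of calling a skill by name can be illustrated with a small registry. This is a sketch under assumptions: the Skill and SkillLibrary classes are hypothetical and do not reflect how any particular tool stores its skills. The point is that only a one-line index goes into every prompt, while the full text of a skill is loaded on demand.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Skill:
    name: str
    description: str   # one line, cheap enough to include in every prompt
    body: str          # full instructions, loaded only when requested


class SkillLibrary:
    """In-memory store of project knowledge, addressed by skill name."""

    def __init__(self) -> None:
        self._skills: Dict[str, Skill] = {}

    def register(self, skill: Skill) -> None:
        self._skills[skill.name] = skill

    def index(self) -> str:
        """Short listing of all skills, sorted by name, for the base prompt."""
        ordered = sorted(self._skills.values(), key=lambda s: s.name)
        return "\n".join(f"- {s.name}: {s.description}" for s in ordered)

    def load(self, name: str) -> str:
        """Return the full text of one skill, or raise KeyError if unknown."""
        if name not in self._skills:
            raise KeyError(f"unknown skill: {name}")
        return self._skills[name].body
```

In practice the bodies would live in version-controlled files next to the code, so that the knowledge evolves with the project.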

4. Connectors: The Interface for External Toolchains

An AI agent that can only read and write local files has very limited applications. A truly powerful loop system needs to interact with the outside world—for example, updating a task status in Jira, sending a notification in Slack, querying a database, or submitting a pull request to GitHub. Connectors provide standardized interfaces for these interactions. Designs like the Model Context Protocol (MCP) aim to give AI agents the ability to call external tools (APIs), allowing them to integrate deeply into existing development workflows rather than just being isolated code generators.
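A connector layer can be pictured as a registry that maps tool names to callables and always hands a structured result back to the agent. The sketch below is not the MCP API; ConnectorRegistry, the tool names, and the in-memory stand-ins for a chat service and an issue tracker are all hypothetical, and no network calls are made.

```python
from typing import Any, Callable, Dict, List


class ConnectorRegistry:
    """Gives the loop one uniform way to reach external systems."""

    def __init__(self) -> None:
        self._tools: Dict[str, Callable[..., Any]] = {}

    def register(self, name: str, fn: Callable[..., Any]) -> None:
        self._tools[name] = fn

    def call(self, name: str, **kwargs: Any) -> Dict[str, Any]:
        """Invoke a tool and return a result dict the agent can read."""
        tool = self._tools.get(name)
        if tool is None:
            return {"ok": False, "error": f"unknown tool: {name}"}
        try:
            return {"ok": True, "result": tool(**kwargs)}
        except Exception as exc:
            # Report the failure to the agent instead of crashing the loop.
            return {"ok": False, "error": str(exc)}


# In-memory stand-ins for real services (assumptions for this sketch).
sent_messages: List[str] = []
ticket_status: Dict[str, str] = {"PROJ-1": "open"}


def notify(channel: str, text: str) -> str:
    sent_messages.append(f"#{channel}: {text}")
    return "sent"


def set_status(ticket: str, status: str) -> str:
    if ticket not in ticket_status:
        raise ValueError(f"no such ticket: {ticket}")
    ticket_status[ticket] = status
    return status


registry = ConnectorRegistry()
registry.register("chat.notify", notify)
registry.register("tracker.set_status", set_status)
```

Returning errors as data rather than raising them lets the agent decide whether to retry, pick another tool, or escalate to a human.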

5. Sub-Agents: Applying the “Separation of Concerns” Principle

A crucial but often overlooked principle in loop design is the “separation of concerns,” meaning the agent that generates code (the “maker”) and the agent that verifies it (the “checker”) must be independent. Models tend to be biased in favor of their own output, which makes it hard for them to spot their own logical errors or flaws. The best practice is therefore to set up a separate checker agent (ideally using a different underlying model) that evaluates the maker’s output against independent criteria (e.g., unit tests, static analysis, code style checks). Issues found by the checker are fed back to the maker as new instructions, forming a “generate-verify-correct” closed loop that repeats until the output passes all checks.
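The generate-verify-correct cycle can be sketched as a small function that takes the maker and the checker as separate callables. Everything here is illustrative: in a real system the maker would call a model and the checker would run tests or static analysis, whereas the toy_maker and toy_checker below are deterministic stubs. The max_rounds cap is an added assumption so the loop cannot run forever.

```python
from typing import Callable, List, Optional, Tuple

Maker = Callable[[str, List[str]], str]   # (task, feedback) -> candidate code
Checker = Callable[[str], List[str]]      # candidate -> problems (empty list = pass)


def generate_verify_correct(task: str, maker: Maker, checker: Checker,
                            max_rounds: int = 3) -> Tuple[Optional[str], int]:
    """Alternate maker and checker until the checker passes or rounds run out.

    Returns (accepted_candidate, rounds_used); the candidate is None on failure.
    """
    feedback: List[str] = []
    for round_no in range(1, max_rounds + 1):
        candidate = maker(task, feedback)
        problems = checker(candidate)
        if not problems:
            return candidate, round_no
        feedback = problems   # the checker's findings become the next instructions
    return None, max_rounds


# Stub maker: wrong on the first try, correct once it receives feedback.
def toy_maker(task: str, feedback: List[str]) -> str:
    op = "+" if feedback else "-"
    return f"def add(a, b):\n    return a {op} b\n"


# Stub checker: judges by running the code, not by reading it.
def toy_checker(code: str) -> List[str]:
    namespace: dict = {}
    exec(code, namespace)
    if namespace["add"](2, 3) != 5:
        return ["add(2, 3) should return 5"]
    return []
```

With these stubs, generate_verify_correct("write add", toy_maker, toy_checker) accepts the second candidate; a maker that never improves would hit the round limit and return None instead.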

6. Persistent Memory: State Management Across Sessions

Large Language Models are inherently stateless; they cannot retain memory between separate sessions. This means a long-running loop task, such as processing a backlog of issues, must store its intermediate state outside the model’s context. This persistent memory can be a simple Markdown file (like TODO.md), a project management board (like a Linear board), or a knowledge graph. The loop reads the current state upon startup and writes back the updated state upon completion. This external memory mechanism ensures task continuity, allowing the loop system to resume from where it left off even after interruptions, enabling autonomous work over days or weeks.
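The read-on-startup, write-on-completion pattern can be shown with a TODO.md file that uses Markdown checkboxes. The checkbox format and the helper names are assumptions made for this sketch; a board or database would follow the same shape.

```python
from pathlib import Path
from typing import Callable, Optional

OPEN, DONE = "- [ ] ", "- [x] "


def next_open_task(todo: Path) -> Optional[str]:
    """Read the state file and return the first unchecked task, if any."""
    for line in todo.read_text(encoding="utf-8").splitlines():
        if line.startswith(OPEN):
            return line[len(OPEN):]
    return None


def mark_done(todo: Path, task: str) -> None:
    """Check off the first open line matching the task and write the file back."""
    lines = todo.read_text(encoding="utf-8").splitlines()
    for i, line in enumerate(lines):
        if line == OPEN + task:
            lines[i] = DONE + task
            break
    todo.write_text("\n".join(lines) + "\n", encoding="utf-8")


def run_one_session(todo: Path, handle: Callable[[str], None]) -> Optional[str]:
    """One stateless session: load state, do one task, persist the result."""
    task = next_open_task(todo)
    if task is None:
        return None            # backlog is empty, nothing to do
    handle(task)               # if this raises, the task stays open for the next run
    mark_done(todo, task)
    return task
```

Because the state is only updated after handle succeeds, an interrupted session leaves the task unchecked and the next session picks it up again.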

Practical Challenges and Risk Management

While Loop Engineering holds great promise, its practical application comes with three major risks that require careful management.

Runaway Costs: A loop without a clear termination condition will run indefinitely, leading to a sharp increase in token consumption and compute costs. The solution is to design quantifiable exit conditions, such as “all unit tests pass and code style checks show no errors,” before launching the loop. It is advisable to start testing with a longer trigger cycle to understand the cost model before accelerating iteration.
False Sense of Completion: An unsupervised system might mark a task as “complete” without performing any substantive validation, which is why an independent checker agent is necessary. The checker’s core responsibility is to distinguish a result that merely “looks right” from one that “is right,” so that what gets delivered has actually been verified.
Silent Degradation of the Codebase: The speed at which a loop system merges code can far exceed human review capacity. Over time, even if all automated tests pass, the development team may gradually lose control over the actual state of their codebase. Automated tests can only ensure code conforms to predefined rules; they cannot replace human judgment on system architecture, business logic, and long-term maintainability. Therefore, conducting spot checks or full reviews of critical AI-merged code is a necessary step to maintain a healthy codebase.
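The first risk, runaway cost, lends itself to a simple guard: combine the quantifiable exit condition with hard caps on iterations and tokens. The sketch below is an illustration under assumptions; LoopBudget and run_loop are hypothetical names, and step stands in for one pass of the loop that reports whether the exit condition was met and how many tokens it consumed.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class LoopBudget:
    max_iterations: int
    max_tokens: int
    iterations: int = 0
    tokens: int = 0

    def record(self, tokens_used: int) -> None:
        self.iterations += 1
        self.tokens += tokens_used

    def exhausted(self) -> bool:
        return (self.iterations >= self.max_iterations
                or self.tokens >= self.max_tokens)


def run_loop(step: Callable[[], Tuple[bool, int]], budget: LoopBudget) -> str:
    """Run step() until it reports success or the budget is used up.

    step() returns (done, tokens_used), where done means the exit condition
    (e.g. all tests pass and the linter is clean) has been met.
    """
    while not budget.exhausted():
        done, tokens_used = step()
        budget.record(tokens_used)
        if done:
            return "done"
    return "budget_exhausted"
```

The budget is checked before each pass, so the last pass can overshoot the token cap by at most one step; a stricter variant would estimate the cost of a pass before starting it.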

Implications for the Future of Engineering

The rise of Loop Engineering signals a shift in the value proposition of software engineers. In the future, the most effective engineers will not be those who write the most code, but those who are best at designing, building, and maintaining reliable AI agent systems. This does not mean coding skills will become obsolete, but it does emphasize a higher level of abstraction: the ability to automate and systematize the very act of “prompting AI.”

In this new paradigm, human engineering judgment becomes even more critical. Deciding when to delegate and when to intervene, designing reasonable validation standards, and taking ultimate responsibility for the entire system’s output all require engineers to have stronger systems thinking and architectural skills. This is not technology replacing humans, but technology amplifying human intellect and judgment to a new level.