The AI coding partner: a generational shift in product engineering

Posted by Gareth Evans . Apr 14.25

We’ve recently witnessed a fundamental change in how software is engineered. Large Language Models (LLMs) are reshaping the landscape of product development in ways that extend beyond code completion and assistance. This shift isn’t just about productivity gains – it’s a new way of conceptualising, designing and building software products.

The traditional approach to LLMs in development has been limited. Many engineers use AI coding assistants as glorified search engines or code generators, treating them as passive tools rather than autonomous agents. Michael Feathers, in his GOTO 2024 talk, notes that we need a new mental model for working with LLMs – what he calls a ‘surfacing model’.

This ‘surfacing model’ visualises LLMs as vast networks of interconnected concepts, where mentioning a term in your prompt ‘pulls’ that concept and its associated ideas toward you. The model explains why context management is crucial, why precise terminology yields better results and why the beginning and end of a conversation are remembered better than the middle.

Understanding this makes prompting more strategic: we can deliberately surface relevant concept clusters, create custom terms that act as shortcuts to complex ideas, recognise when context has become polluted and know when to start a fresh session rather than continue in a cluttered conversation. By viewing LLMs as conceptual spaces rather than black boxes, we can navigate their capabilities, limitations and idiosyncrasies more effectively and produce more consistent, accurate and creative outputs.

The key insight is that LLMs aren’t just tools for generating code – they’re collaborative thinking partners that can help us explore design spaces and alternatives.

From implementation to specification

The most significant shift we are currently witnessing is the move from implementation-focused development to specification-driven development. Geoffrey Huntley, writing about how engineers use Cursor, observes that “The internal implementation of an application matters less now. As long as your tests pass and the LLM implements the technical steering lessons defined in your ‘stdlib’, then that’s all that matters”.

This represents a dramatic inversion of traditional software development. Instead of manually writing code to implement specifications, engineers are now defining specifications and letting AI generate implementations. Engineers then verify, test and refine based on results.
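
To make this concrete, here is a minimal sketch of an executable specification – an acceptance test written before any implementation exists. The pricing module and apply_discount function are hypothetical, invented purely for illustration. The engineer defines the expected behaviour; the AI is then asked to generate an implementation that makes these tests pass:

```python
# spec_discounts.py - an executable specification written before the code exists.
# The AI generates apply_discount() so these expectations hold; the engineer
# verifies, tests and refines rather than hand-writing the implementation.
import pytest

from pricing import apply_discount  # hypothetical module the AI will implement


def test_discount_reduces_price_by_percentage():
    assert apply_discount(price=100.0, percent=10) == pytest.approx(90.0)


def test_discount_never_produces_a_negative_price():
    assert apply_discount(price=5.0, percent=150) == 0.0


def test_invalid_percentage_is_rejected():
    with pytest.raises(ValueError):
        apply_discount(price=100.0, percent=-5)
```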

Building your AI toolkit: the standard library or ‘stdlib’ approach

A powerful emerging technique is what Huntley calls the ‘stdlib’ approach – creating a standard library of prompting rules and composing them together like Unix pipes (which connect the output of one command to the input of another). Rather than treating each AI interaction as isolated, engineers can develop reusable patterns to steer AI behaviour.

In essence, the stdlib approach involves:

  • Creating a repository of prompting rules: Rather than approaching AI with one-off ‘implement X’ type requests, you can build a collection of rules that define how the AI should behave, what technical standards to follow and how to handle specific scenarios.
  • Composing rules together: These rules work like Unix pipes where you can chain them together to create complex behaviours from simple components.
  • Storing rules systematically: Huntley recommends keeping these rules in a .cursor/rules directory, with each rule in its own file using the .mdc extension.
  • Learning from mistakes: When the AI makes an error and you correct it, you update or create a new rule to prevent that specific error from happening again.
  • Programmatic control: Rules can include conditional logic (IF-THIS-THEN-THAT) and trigger automated actions like adding license headers or making git commits.

A key insight from Huntley is that you’re not just using the AI as a coding assistant – you’re effectively programming the AI’s implementation behaviour itself. He explains: “Instead of approaching Cursor from the angle of ‘implement XYZ of code’ you should instead be thinking of building out a ‘stdlib’ of prompting rules and then composing them together”.
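
To make this more tangible, here is an illustrative sketch of what a single stdlib rule might look like, stored as something like .cursor/rules/service-conventions.mdc. The file name, frontmatter fields and rule wording are examples we’ve invented for this post rather than taken from Huntley’s writing – check your tool’s documentation for the exact format it expects:

```
---
description: Conventions for service-layer code
globs: src/services/**/*.py
---

- Every new service function must have a corresponding unit test.
- Use the shared logger from src/lib/logging; never print directly to stdout.
- IF a file is created or modified THEN run the test suite and, when it passes,
  create a git commit with a conventional commit message.
```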

This approach creates a feedback loop where each interaction improves future interactions, gradually building a customised AI assistant that understands your codebase, follows your conventions and avoids repeating past mistakes.

This technique allows engineers to achieve consistent results by programming the LLM’s behaviour – not just requesting outputs.

Pair programming with an AI driver

Traditional pair programming involves a ‘driver’ who writes the code and a ‘navigator’ who reviews and directs at a higher level. With LLMs, the AI takes on the driver role and handles the actual typing of code, while the human evolves into a full-time navigator.

This evolution changes the nature of the collaboration:

  • Focus on the what, not the how: Engineers can concentrate on defining what needs to be built – the business requirements, constraints and acceptance criteria – rather than the mechanical details of implementation.
  • Faster feedback loops: Unlike traditional pair programming where feedback comes after code is written, this model offers immediate generation and iteration. The human partner can assess multiple approaches far faster than if each had to be coded by hand.
  • Amplified expertise: The AI partner brings pattern recognition from millions of codebases, which is great for divergent thinking and exploring a solution space. The human brings domain knowledge and critical judgment. Once a preferred solution is identified, the stdlib rules can be used to reshape the implementation so that it meets the team’s standards. This combination of divergent solution generation and fast feedback can produce solutions neither human nor AI would arrive at independently.
  • Reduced cognitive load: The mental burden of syntax details, API specifics and boilerplate code shifts to the AI, allowing engineers to maintain focus on architectural decisions, business logic and steering emergent design.
  • Continuous learning: As engineers define specifications and review implementations, both parties learn – the human develops better specification skills, while the AI (through the stdlib) learns the team’s preferences and standards.

This partnership represents a reallocation of human cognitive resources away from syntax and implementation details toward higher-level design thinking, architectural decisions and business value alignment – potentially unlocking developer productivity and satisfaction while maintaining or improving code quality.

The projection technique

Another powerful approach Feathers discusses is what he calls ‘projections’ – viewing code through different lenses to gain new insights. For example:

  • Converting code to mathematical notation to highlight computational patterns
  • Translating between programming languages to see alternative implementations
  • Visualising code as state machines or other diagrams

These projections provide different perspectives on the same code, helping engineers uncover insights that might not be apparent in the original form.
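
As a small illustration of a mathematical projection (the function below is invented for this post), viewing an imperative loop through a mathematical lens can expose a closed form that the original code hides:

```python
def compound(principal: float, rate: float, years: int) -> float:
    """Grow a principal by a fixed rate, one year at a time."""
    total = principal
    for _ in range(years):
        total *= 1 + rate
    return total

# Projected into mathematical notation, the loop collapses to a single expression:
#
#     total = principal * (1 + rate) ** years
#
# Seen through this lens, the code suggests both a simpler implementation
# and clearer property-style tests.
```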

Multi-agent development

Another potentially revolutionary change is the emergence of multi-agent development workflows. Huntley describes a process where multiple AI instances work concurrently on different aspects of a system – “You can launch multiple sessions of Cursor concurrently and ask each copy to chew on src/ui and src/core concurrently”.

By decomposing specifications into domains and assigning different agents to different domains, teams may be able to parallelise development in ways previously impossible.

The ideation-validation loop

Both Feathers and Huntley highlight the importance of establishing an effective ideation-validation loop:

  • Use LLMs for broad ideation and generation of alternatives
  • Implement rigorous validation through testing, compiler feedback and human review
  • Feed validation results back to the LLM to improve future generations
  • Encode successful patterns as rules in your stdlib

This loop ensures quality while maintaining the creative advantages of AI-assisted development.
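
A minimal sketch of the validation half of this loop, assuming a pytest-based test suite: run the tests, capture any failures and carry that output back into the next prompt or session (the file name here is illustrative):

```python
# validate.py - run the test suite and collect the results to feed back to the LLM.
import subprocess
from pathlib import Path


def run_validation() -> str:
    """Run pytest and return its combined output for use in the next prompt."""
    result = subprocess.run(
        ["pytest", "-q", "--maxfail=5"],
        capture_output=True,
        text=True,
    )
    return result.stdout + result.stderr


if __name__ == "__main__":
    output = run_validation()
    # Persist the results so they can be pasted into (or read by) the next session.
    Path("validation_feedback.txt").write_text(output)
    print(output)
```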

Implications for engineering teams

The AI partner is reshaping how engineering teams operate: the most valuable skills are shifting from implementation expertise to writing specifications and prompts, creating effective validation mechanisms and building reusable prompting patterns. As Huntley predicts, developer tooling is moving away from traditional IDEs toward reviewing PRs from agents that work through backlogs autonomously and in parallel, with engineers becoming AI orchestrators who apply more sophisticated validation techniques rather than authoring code line by line.

Quality assurance is becoming deeply integrated into the development process itself: layered test automation becomes essential, programming languages with strong type systems and helpful compiler errors gain an advantage, and automated security and compliance checks are incorporated directly into feedback loops.

Evolution of validation techniques

This shift will demand more sophisticated automated validation approaches. Fitness functions – automated tests that continuously verify a system’s adherence to architectural and design principles – provide objective measurement of whether the AI-generated code meets non-functional requirements such as performance, security and maintainability. Engineers now define these fitness functions as part of their specifications, creating guardrails that steer AI implementations toward desired architectural outcomes without micromanaging implementation details.
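
As a sketch of what such a guardrail can look like (the package layout and the layering rule are hypothetical), a fitness function is just an ordinary test that fails whenever a change – human- or AI-authored – violates an architectural constraint:

```python
# test_architecture.py - a fitness function guarding a layering rule:
# the domain package must not import from the infrastructure package.
import ast
from pathlib import Path

DOMAIN_DIR = Path("src/domain")        # hypothetical project layout
FORBIDDEN_PREFIX = "infrastructure"


def imported_modules(source: str) -> set[str]:
    """Return every module name imported by the given source code."""
    tree = ast.parse(source)
    names: set[str] = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            names.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            names.add(node.module)
    return names


def test_domain_does_not_depend_on_infrastructure():
    for path in DOMAIN_DIR.rglob("*.py"):
        imports = imported_modules(path.read_text())
        offending = {m for m in imports if m.startswith(FORBIDDEN_PREFIX)}
        assert not offending, f"{path} imports {offending}"
```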

Layered test automation becomes increasingly crucial in this new workflow, with engineers establishing validation rules expressed as tests that protect against regressions at multiple levels. Fast-executing, property-based tests can verify the mathematical properties of functions, expanding to component tests validating business rules, integration tests confirming subsystem interactions and end-to-end tests that validate complete user journeys. This layered approach creates a safety net that allows engineers to more confidently delegate implementation to AI partners while maintaining quality standards.
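
For the fastest layer, a property-based test might look like the following sketch using the Hypothesis library, reusing the hypothetical apply_discount function from earlier. Rather than checking a handful of examples, it verifies a property – the discounted price never goes negative and never exceeds the original – across many generated inputs:

```python
from hypothesis import given, strategies as st

from pricing import apply_discount  # hypothetical module under test


@given(
    price=st.floats(min_value=0, max_value=1e6, allow_nan=False),
    percent=st.floats(min_value=0, max_value=100, allow_nan=False),
)
def test_discounted_price_stays_within_bounds(price, percent):
    result = apply_discount(price=price, percent=percent)
    assert 0 <= result <= price
```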

Contract testing may also gain more prominence as teams define explicit interfaces between system components that AI agents must respect. By clearly specifying the contracts that define how components interact, engineers create boundaries that AI agents can work within simultaneously without breaking the overall system architecture. These contracts could become part of the specification language, allowing different parts of systems to evolve independently while maintaining compatibility.
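
A lightweight sketch of the idea, without committing to a particular contract-testing framework: the consumer’s expectations are captured as a schema, and a provider-side test verifies response shapes against it, so an agent changing the provider cannot silently break the consumer (the endpoint, fields and schema below are illustrative):

```python
# test_orders_contract.py - provider-side check against a consumer-defined contract.
from jsonschema import validate

# Contract published by the consumer team: the shape of GET /orders/{id}.
ORDER_CONTRACT = {
    "type": "object",
    "required": ["id", "status", "total"],
    "properties": {
        "id": {"type": "string"},
        "status": {"type": "string", "enum": ["pending", "paid", "shipped"]},
        "total": {"type": "number", "minimum": 0},
    },
}


def test_order_response_honours_consumer_contract():
    # In a real suite this response would come from the provider service;
    # here a representative payload stands in for it.
    response = {"id": "ord-123", "status": "paid", "total": 42.5}
    validate(instance=response, schema=ORDER_CONTRACT)
```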

We’re witnessing an exciting shift in how software is engineered as LLMs are embraced as collaborative partners rather than passive tools, leading to productivity improvements that were previously not possible.

As Feathers reminds us, “Humans are in the loop – the AI suggests, we decide”. But the loop is widening to encompass more possibilities than ever before. The most successful teams will be those that develop effective patterns for steering AI capabilities toward their specific engineering goals.

The challenge now is not just learning to use these tools, but reimagining our approach to software development in light of their capabilities in bringing ideas to life.

Gareth Evans

Co-founder of HYPR, our chief engineer and solutions expert and one of the first fully-certified SAFe® Programme Consultant Trainers (SPCT). Above all, Gareth is a fantastic technology mentor to our team.
