After several months of experimenting with AI-assisted development, I’ve finally settled into a workflow that feels natural, efficient, and actually sustainable. Like many developers exploring this space, I started with enthusiasm, tried every tool and methodology I could find, and eventually circled back to something simpler and more aligned with how I actually think and work.
This post is a reflection on that journey and a detailed look at the setup I’ve landed on. If you’re working with AI coding assistants and feeling overwhelmed by the options, or if you’re curious about what a practical, day-to-day AI development workflow looks like beyond the hype, this might resonate with you.
The Foundation: Windows, WSL2, and VSCode
Let me start with the basics. I work in a Windows environment, but like many modern developers, I’ve found that WSL2 (Windows Subsystem for Linux) gives me the best of both worlds. I can use Windows for my daily computing needs while having access to a full Linux environment for development. This setup has become incredibly stable over the years, and it handles the demands of AI-assisted development without breaking a sweat.
VSCode serves as my primary editor, which probably doesn’t surprise anyone. It’s become the de facto standard for a reason: the extension ecosystem is unmatched, the performance is solid, and the integration with WSL2 is seamless. I can edit files in my Linux filesystem with the full power of a native Windows application, and everything just works.
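For anyone setting this up fresh, the whole foundation comes down to a couple of commands. A minimal sketch, assuming a reasonably current Windows install (flags can differ between Windows versions):

```bash
# In an elevated PowerShell on Windows: install WSL2 with Ubuntu as the distro
wsl --install -d Ubuntu

# Later, from inside the WSL shell: open a project in VSCode,
# which connects through the Remote - WSL integration automatically
cd ~/projects/my-app   # hypothetical project path
code .
```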
But here’s where my setup diverges from what you might expect: despite using VSCode, I don’t primarily rely on its AI extensions. Instead, I’ve gravitated toward using the Codex CLI for the bulk of my AI-assisted coding work.
Why the CLI? Understanding My Tool Choice
When I first started exploring AI coding assistants, I naturally installed the VSCode extensions that everyone was talking about. They’re convenient, they’re integrated, and they work reasonably well for many use cases. But after several weeks of use, I found myself increasingly drawn to the Codex CLI instead.
The CLI interface is simply more advanced and capable for the kind of work I do. It gives me more granular control over context, better handling of multi-file operations, and more sophisticated reasoning about my codebase. When I’m working on a complex feature that spans multiple files and requires understanding of intricate dependencies, the CLI’s ability to maintain that context and reason about it holistically makes a significant difference.
There’s also something to be said for the mental model of using a CLI tool. It creates a clearer separation between “I’m writing code” and “I’m collaborating with AI on code.” With an inline extension, these boundaries blur in ways that sometimes made me feel like I was fighting the tool rather than working with it. The CLI feels more like pair programming with a colleague: there’s a conversation happening, proposals being made, and I’m clearly the one deciding what gets implemented.
That said, VSCode remains essential to my workflow. I use it for editing, debugging, and all the traditional IDE tasks. The Codex CLI and VSCode complement each other beautifully: each does what it’s best at, and I switch between them fluidly throughout my day.
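In practice, the split is unglamorous: VSCode open on the project, and the Codex CLI running in the integrated terminal beside it. A typical session start looks something like this, though the exact invocation may differ depending on your Codex CLI version:

```bash
# From VSCode's integrated terminal, at the project root
codex

# Or start the session with an initial prompt instead of an empty chat
codex "Walk me through the auth module and propose a refactor for token refresh"
```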
The Workflow: From Idea to Implementation
Over the past month, my workflow has crystallized into four distinct phases. Each one leverages AI in different ways, and understanding when and how to use AI in each phase has been key to maintaining both productivity and code quality.
Phase 1: Brainstorming and Ideation
Every feature or project starts with brainstorming, and this is where AI has proven surprisingly valuable. I used to keep these sessions entirely in my head or in scattered notebook entries, but now I use AI as a sounding board for initial ideas.
The beauty of this phase is its low stakes. I’m not committing to code or architecture decisions yet; I’m just exploring possibilities. I’ll describe what I’m trying to build, the constraints I’m working within, and the problems I’m trying to solve. The AI helps me think through edge cases I might not have considered, suggests alternative approaches, and sometimes points out potential issues before I’ve written a single line of code.
What I appreciate most about using AI for brainstorming is that it’s infinitely patient with half-formed thoughts. I can ramble, contradict myself, change direction mid-sentence, and the AI simply rolls with it. It’s like having a colleague who never gets frustrated when you’re still figuring out what you’re trying to say.
This phase is purely conversational. I’m not generating code, I’m not making commitments, I’m just thinking out loud with something that can think back at me. By the end of a good brainstorming session, I usually have a much clearer picture of what I actually want to build and a few different approaches to consider.
Phase 2: Planning and Task Definition
Once I have a clear direction, I move into planning. This is where my workflow has evolved the most over the past month, and where I learned some important lessons about the difference between what sounds good in theory and what actually works for me in practice.
Initially, I tried using formal specification tools like SpecKit and BMAD. These tools promise structured, comprehensive planning with detailed specifications and clear acceptance criteria. In theory, they sound perfect: who wouldn’t want complete, unambiguous specs for their features?
In practice, I found them far too verbose for my needs. I’d spend an enormous amount of time writing specifications that felt more like documentation for an enterprise project than planning for actual implementation. The overhead was killing my momentum, and I’d finish a planning session feeling drained rather than energized about building.
So I simplified. Radically.
Now I use either backlog.md or vibe-kanban, depending on the project. What drew me to these tools is their philosophy: they’re full-featured task management systems, complete with kanban boards, milestone tracking, and task organization, but built on markdown and designed to stay out of your way rather than impose rigid workflows.
Backlog.md is particularly interesting because it gives you a proper task management system while keeping everything in readable, editable markdown files. I can organize tasks into sprints or milestones, move items between stages (backlog, in progress, done), add priorities and tags, and even track time estimates. But unlike heavyweight project management tools, I’m never locked into a particular structure. If I want to reorganize my milestones, create custom statuses, or completely restructure how I’m tracking work, I can do it instantly because it’s all just markdown under the hood.
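To make that concrete, here’s roughly the shape of one of my task files. I’m sketching the layout from memory rather than quoting backlog.md’s exact schema, so treat the field names as illustrative and check the project’s docs:

```markdown
---
id: task-42
title: Add refresh-token rotation
status: In Progress
milestone: auth-hardening
priority: high
labels: [backend, security]
---

## Description
Rotate the refresh token on every use so a stolen token can
only be replayed once.

## Acceptance Criteria
- [x] New token pair is returned on each refresh
- [ ] Old refresh token is invalidated after use
```

Because it’s all plain text, restructuring a milestone is a quick edit, and the AI can read or update these files like any other part of the repository.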
The kanban view provides the visual organization I need to see what’s in flight and what’s coming next. The milestone tracking helps me group related tasks and maintain focus on larger goals. But there’s no ceremony, no mandatory fields, no fighting with a UI that assumes I work a certain way. I define the structure that makes sense for my project, and the tool adapts to me rather than the other way around.
Vibe-kanban takes a similar approach with its own spin on lightweight, markdown-based task management. It excels at giving you that visual kanban experience while maintaining the simplicity and control of working with plain text files. I can drag tasks between columns, nest subtasks, set due dates, and track progress: all the functionality you’d expect from a proper task manager, but without the bloat of enterprise tools that assume you’re managing a team of fifty.
The key insight here was realizing that I needed real task management functionality (kanban boards, milestone tracking, the ability to break down work and track progress) but not the complexity and rigidity of traditional project management tools. These markdown-based systems give me the structure and features I need while preserving the flexibility and control that keep me productive.
When I work with AI during this phase, I use it to help break down larger features into implementable tasks. The AI is good at thinking through dependencies and suggesting a logical order of implementation. I can ask it to suggest milestone groupings or help me estimate the scope of different tasks. But the final task organization lives in my chosen system, where I maintain complete control and can iterate on the structure as I learn more about the project.
Phase 3: Implementation with MCP Servers
The implementation phase is where everything comes together, and it’s where my use of MCP (Model Context Protocol) servers really shines. I learned quickly that not every available tool needs to be in my toolkit. In fact, loading up too many MCP servers just adds noise and confusion.
I’ve settled on a small, carefully chosen set of MCP servers that each serve a specific purpose; a sketch of how they’re wired into the Codex CLI follows the rundown below.
Codanna handles code inspections and quality checks. When I’m implementing a feature, Codanna can analyze the code for potential issues, suggest improvements, and help ensure I’m following good practices. It’s like having a senior developer doing code review as I write, catching problems before they make it into a commit.
Context7 provides access to framework documentation. Rather than constantly switching to a browser to look up API references or check how a particular framework feature works, I can query Context7 directly through the CLI. This keeps me in flow and ensures the AI has access to accurate, up-to-date information about the frameworks I’m using.
Playwright gives me browser control, which is essential for web development. I can automate testing, take screenshots, verify functionality, and even debug issues in real browsers without leaving my development environment.
Finally, I have MCP servers connected to my task management tools, whether that’s a custom server for my backlog.md or an integration with vibe-kanban. This lets the AI understand what I’m working on, check off completed tasks, and help me stay organized without manual context switching.
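For reference, wiring this set into the Codex CLI amounts to a few TOML entries. This is a sketch: the launch commands for each server are my assumptions from memory, so verify them against each project’s README:

```toml
# ~/.codex/config.toml — MCP servers the Codex CLI can launch on demand

[mcp_servers.codanna]
# Code inspections and quality checks; exact subcommand is an assumption,
# check codanna's documentation
command = "codanna"
args = ["serve", "--mcp"]

[mcp_servers.context7]
# Framework and library documentation lookup
command = "npx"
args = ["-y", "@upstash/context7-mcp"]

[mcp_servers.playwright]
# Browser control: automation, screenshots, debugging
command = "npx"
args = ["@playwright/mcp@latest"]

[mcp_servers.tasks]
# Task-management integration; "vibe-kanban-mcp" is a hypothetical
# placeholder for a custom backlog.md server or vibe-kanban's own
command = "npx"
args = ["-y", "vibe-kanban-mcp"]
```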
The restraint in this selection is deliberate. Early on, I had nearly a dozen MCP servers loaded, convinced that more tools meant more capability. Instead, it just meant more confusion, both for me and for the AI, which would sometimes suggest using the wrong tool or get overwhelmed by options.
With this focused set of tools, the AI can be genuinely helpful. It knows what’s available, knows what each tool does, and can make smart decisions about when to use them. The implementation phase becomes a smooth collaboration where the AI understands both my code and my workflow.
Phase 4: Memory and Verification
One of the challenges I encountered early on was that AI has no persistent memory of what we’ve done in previous sessions. Each conversation starts fresh, which is great for privacy but terrible for multi-day projects where context matters.
My solution is simple but effective: I maintain a markdown file that tracks the current technical progress. Think of it as a journal for the project, but written for the AI’s benefit as much as mine. After each significant session, I update this file with what we accomplished, decisions we made, and any important context that would be expensive to regenerate.
This file might include notes like “Implemented authentication using JWT with refresh tokens. Token expiry is 15 minutes, refresh is 7 days. Tokens stored in httpOnly cookies.” Or “Decided against using Redux for state management; the project is small enough that the Context API is sufficient.” These are the kinds of details that would take paragraphs to re-explain but are crucial for the AI to give good advice in future sessions.
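Concretely, the file is nothing fancier than a running log plus a snapshot of the current state. A trimmed example built from the notes above (project name invented for illustration):

```markdown
# Project Memory — my-app

## Current State
- Auth: JWT with refresh tokens; access expires in 15 minutes,
  refresh in 7 days, both stored in httpOnly cookies
- State management: React Context API (decided against Redux;
  the project is too small to justify it)

## Recent Sessions
- Implemented authentication end to end; refresh flow tested manually

## Open Questions
- Should tokens be revoked on password change?
```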
When I start a new session, I simply provide this markdown file as context. The AI immediately understands where we are, what we’ve built, and why we made certain decisions. It’s like bringing a colleague up to speed with a quick status update rather than a lengthy meeting.
The other critical piece of this phase is testing. I write tests not just for the usual reasons (ensuring code works, preventing regressions) but also because tests are something the AI can run and verify. I can ask the AI to implement a feature, have it write or update tests, run those tests, and get objective feedback on whether the implementation is correct.
This verification loop is crucial. Without it, I’d need to manually test every AI-generated code suggestion, which would slow me down significantly. With good test coverage, I can move faster while maintaining confidence that things actually work.
Tests also serve as a form of specification that’s harder to misinterpret than English. I can tell the AI “make sure all these tests pass” and have much higher confidence in the result than if I’d just described the desired behavior in prose.
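As a small illustration of tests-as-spec, here’s the kind of check I’d ask the AI to keep green for the token rules mentioned earlier. Assume a vitest setup; issueTokens is a hypothetical stand-in for whatever auth module your project actually has:

```typescript
// token-lifetimes.test.ts — encodes the auth rules from the memory file
// so the AI gets objective pass/fail feedback rather than prose approval.
import { describe, expect, it } from "vitest";
import { issueTokens } from "./auth"; // hypothetical module under test

const MINUTE = 60 * 1000;
const DAY = 24 * 60 * MINUTE;

describe("token lifetimes", () => {
  it("expires access tokens in 15 minutes and refresh tokens in 7 days", () => {
    const issuedAt = Date.now();
    const { accessToken, refreshToken } = issueTokens("user-1", issuedAt);

    expect(accessToken.expiresAt - issuedAt).toBe(15 * MINUTE);
    expect(refreshToken.expiresAt - issuedAt).toBe(7 * DAY);
  });
});
```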
Reflections and What I’ve Learned
A month isn’t a long time in the grand scheme of things, but it’s been long enough to move past the initial excitement and hype and start understanding what actually works. A few key lessons have emerged:
Simplicity beats features. The most sophisticated tools aren’t always the most useful. The lightweight, flexible solutions that adapt to my thinking often outperform the comprehensive, structured ones that try to impose their own methodology.
The CLI is underrated. While everyone talks about inline coding assistants and IDE extensions, the CLI interface has proven more powerful and flexible for my needs. It’s worth trying even if it seems less convenient at first.
Context is everything. Whether it’s through MCP servers, markdown memory files, or test suites, giving the AI the right context is far more important than having the most advanced model. A slightly older model with perfect context will outperform the latest model working blind.
Less is more with tools. Curating a small set of well-understood tools beats having every possible option available. It reduces cognitive load and helps the AI make better decisions.
Looking ahead, I’m sure this workflow will continue evolving. New tools will emerge, I’ll discover better approaches, and my needs will change as I work on different types of projects. But I feel like I’ve found something sustainable, a workflow that enhances my productivity without trying to replace my judgment or push me into somebody else’s idea of how development should work.
If you’re building your own AI-assisted development workflow, I hope some of these ideas spark thoughts about what might work for you. The key is experimentation, a willingness to change what isn’t working, and a focus on tools and processes that genuinely make you more effective rather than just busier.
What does your AI development workflow look like? I’d love to hear what’s working for you; feel free to reach out and let me know.