Published on 2025-09-01
It feels like every week there's a new AI coding tool or model promising to revolutionize how we write software. Especially since the beginning of this year, new tools have been coming out left and right. And since the switching costs between these tools are so low, I've found myself trying out a lot of different options to see what works best for me.
So let me walk you through my current setup and how I actually use these tools day-to-day.
Claude Code is my daily driver. I'm on the Pro plan ($20/month), and that usually covers what I need, though I sometimes hit the limits. At that point, I'll switch to API keys, but honestly, I haven't felt the need to upgrade to the Max plan yet (maybe that's a sign I should be using it more?).
Codex (the cloud thing from OpenAI) is where things get interesting. It spins up a container in a full cloud environment and actually runs your code, which means it can work completely autonomously, once you've gone through the hassle of setting up its environment properly. You can give it a task, go do something else, and come back to find it has figured out what needs to be done. And since it runs in independent containers in the cloud, it can work on multiple tasks in parallel.
For trickier problems, I'll sometimes have Codex generate four different solutions simultaneously and then pick the best one. Sometimes they converge on the same approach, sometimes they diverge wildly. This is also helpful during the analysis phase, because the parallel attempts surface different perspectives on the same problem.
Crush gets occasional use, mainly when I want to test-drive new models. When new models are released (sometimes as stealth models on OpenRouter), Crush is excellent for quick tests. The quality isn't always on par with the mainstream agents, but it's model-agnostic, so it's perfect for experimentation. And the development pace is rapid, so I expect it to improve quickly.
I also keep ChatGPT Pro and Claude AI subscriptions around. They're required for Codex and Claude Code anyway, but the web chat interfaces are also great for quick experiments or brainstorming (special shout-out to Claude Artifacts).
This is also handy when I want to quickly generate a spec or a plan based on a feature request or bug report.
Underrated: use Claude Artifacts to explore the UX of a new feature, then have Claude write a spec sheet for the feature based on the artifact. Then use that spec sheet to have a coding agent implement the feature.
IMHO, GitHub Copilot Agent on the GitHub platform is really underrated. Not the VS Code version, but the one built directly into GitHub itself.
The workflow is one of the best UX flows I've seen yet: someone creates an issue (or I do), I assign it to Copilot, and it automatically creates a pull request. Then I interact with it through PR comments. It feels like collaborating with another developer.
Its coding environment can do the same things as Codex's, and on top of that, it can run a browser and take screenshots for before/after comparisons.
The integration would be perfect if it could auto-read CI results and act on them. Imagine: Copilot makes a PR, CI fails because tests break, and instead of me having to tell it to fix the tests, it just does it automatically. I’m sure this is coming eventually.
This pairs beautifully with error tracking too. We use Sentry, and when a bug pops up, I can create a GitHub issue directly from the error report in Sentry. For simple bugs — wrong typecasts, silly mistakes — I just paste the error, add a quick note about where I think the problem is, and let Copilot handle the rest. Ten minutes later, I’ve got a PR with a fix. All without even opening my IDE.
The Model Context Protocol (MCP) has become the de-facto standard for connecting coding agents to external tools.
The one MCP server I use regularly is context7, which provides up-to-date documentation for most programming languages and libraries.
The Laravel MCP is something I’m looking forward to trying.
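To make the "connecting agents to external tools" part concrete, here's a rough sketch of what an MCP server looks like with the official Python SDK (the mcp package). The lookup_docs tool and its hard-coded data are made up for illustration; this is not how context7 works internally, just the general shape of a server an agent like Claude Code can connect to.

```python
# Toy MCP server exposing a single documentation-lookup tool.
# Requires the official MCP Python SDK: pip install mcp
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("docs-helper")

# Hypothetical, hard-coded "documentation" so the example is self-contained.
DOCS = {
    "laravel/routing": "Routes live in routes/web.php and are registered via Route::get(...).",
    "pytest/fixtures": "Fixtures are declared with the @pytest.fixture decorator.",
}


@mcp.tool()
def lookup_docs(library: str, topic: str) -> str:
    """Return a short documentation snippet for the given library and topic."""
    return DOCS.get(f"{library}/{topic}", "No documentation found.")


if __name__ == "__main__":
    # Runs over stdio, which is how coding agents typically launch local MCP servers.
    mcp.run()
```

Once a server like this is registered with the agent, the agent discovers the lookup_docs tool on its own and decides when to call it.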
My approach varies depending on the complexity of what I'm building. For small fixes where I know exactly what's broken, I'll just write direct instructions: “look at this file, something's wrong here, probably need to do X and Y.” The risk here is low and the results come fast, so I let the AI run with it. If it's wrong, I can just throw the result away and try again.
This is true for all coding agents I use.
For bigger features or mysterious bugs, I always start with a plan. I'll paste in a feature request or bug report and say “analyze the codebase and write me a plan for how to fix this.” Then I review the plan, make any adjustments, and only then do I tell it to implement it.
Since the plan is a markdown document, it's portable! Maybe I'll use Codex for the analysis phase because it can run autonomously, then take the resulting plan and implement it with Claude Code because I want to test things locally and get immediate feedback. Or use a different agent to refine the plan.
This makes the planning phase more important. The more detailed and clear the plan, the easier the implementation is for the coding agent.
Git worktrees complement the workflow nicely. They let me have multiple checkouts of the same repo, so I can work on several features simultaneously. I keep one worktree on the main branch, then create others for different features. While one agent is churning away for 10 minutes on feature A, I can be working on feature B.
Test-driven development works really well with AI. Write the test first, then let the model implement the feature to make it pass. This is where agents like Claude Code really shine, because they can run the tests, change the implementation, and run the tests again until everything passes. Just make sure to tell it not to change the tests; otherwise it might just delete them when they fail.
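Here's a minimal sketch of that loop with pytest. The slugify function and the myapp.text module are hypothetical, not from a real project: I write the failing tests first, then let the agent iterate on the implementation until they pass, with the instruction that the test file itself is off-limits.

```python
# tests/test_slugify.py -- written before the implementation exists.
# The agent's job is to make these pass without editing this file.
from myapp.text import slugify  # hypothetical module the agent will create


def test_lowercases_and_replaces_spaces():
    assert slugify("Hello World") == "hello-world"


def test_drops_non_alphanumeric_characters():
    assert slugify("Rock & Roll!") == "rock-roll"


def test_collapses_repeated_separators():
    assert slugify("a  --  b") == "a-b"
```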
The biggest change in working with these tools is that you need to spot when they're on the wrong track. It's a lot more like guiding a junior developer who has memorized every language and library out there.
The AI coding space is moving fast. New tools come and go, and models improve rapidly. Claude Code and Codex are probably here to stay for a while.
At some point in the future, the VC-subsidized frenzy will die down, and the tools will need to start charging more realistic prices. We saw this recently with Cursor’s pricing changes.
I hope local models will be good enough by then to replace the cloud-based tools, but we're not there yet. That's the great thing about tools like Crush: they let you experiment with different models easily, including local ones.
Because the switching costs are so low, I can keep trying new things without much hassle.
I’m curious to see how this space evolves and what my tool stack looks like in a few months.