Post by Ketan U.
Co-Founder and CEO at Union.ai | Flyte.org
I’ve been using Claude Code quite a bit lately, and overall I like it. I’ve gone fairly deep (sub-agents, skills, goal-oriented workflows), and I feel it has improved a lot at pushing toward an end result, what I suppose we call long-horizon tasks. That said, I’m running into some consistent challenges.

1. Code quality. I often see repeated blocks, oddly structured files, and decisions that make the code harder to reason about and maintain. I’ve tried steering it heavily and even added sub-agents to review for readability, correctness, and maintainability, but the results are still mixed.

2. Assumptions. Even with spec-driven development, the initial spec can look fine, but once you read the generated code closely, you start finding strange edge cases and incorrect interpretations. I’m working on a large distributed system with explicit safety and liveness properties; I ended up spelling those out very clearly, yet the generated code still drifts from them.

3. Side effects. Sometimes the implementation “works” and technically meets the goal, but introduces issues like unnecessary contention, extra database writes, or other inefficiencies that only become obvious if you really understand the system.

Where I’ve landed: AI-assisted coding is a fantastic tool if you already know what you’re doing. It helps you move faster, but it doesn’t replace judgment. If you generate code and just ship it, you’ll almost certainly end up with something brittle and hard to maintain. Code reviews and specs alone aren’t enough either, especially for complex systems.

For context, all of this was in Go. A type-safe, compiled language with fast build times makes iteration easier, but it doesn’t magically solve these problems.

Curious how others are navigating this tradeoff between speed and long-term quality.