We use Claude Code to build xeve — a personal analytics platform with a Next.js web dashboard, native macOS and Windows apps, an iOS companion, and several integrations. In one session, Claude Code shipped a full light mode (155 files), built and published an MCP server to npm, fixed app tracking bugs, and debugged broken CI pipelines. It also made every category of mistake an AI coding assistant can make.
This is not a hit piece. Claude Code is genuinely good — it does things in minutes that would take hours manually. But the mistakes it makes are instructive. They reveal a pattern: AI is excellent at generating code and terrible at verifying the consequences of that code across system boundaries. Understanding this pattern makes you a dramatically better user of AI coding tools.
Mistake 1: Fixing One Side of a Two-Sided Problem
We asked Claude Code to fix "loginwindow" showing as the #1 tracked app on xeve. The Mac tracker was recording macOS's lock screen process as app usage, inflating screen time by 8+ hours.
Claude Code found the issue quickly. The macOS app's handleAppSwitch() already had a guard against loginwindow, but old loginwindow sessions were still sitting in the database. It added filters in three places in the Swift code: the Supabase fetch, the live session aggregation, and the menu bar display. Good fix. We committed, pushed, tagged a release.
Then we refreshed the web dashboard. loginwindow was still there — 8 hours and 16 minutes, right at the top. The web dashboard queries Supabase directly. Claude Code fixed the data source (Mac app) and the Mac app's display, but completely forgot that the same data is read by a separate application (the web dashboard) with its own queries.
The pattern: AI tends to fix the specific component you point it at. It does not automatically think "what else reads this data?" or "what other surfaces display this information?" It solves the problem locally and moves on.
How to catch it: After any data-layer fix, ask explicitly: "What other applications or pages query this same data?" Force the AI to enumerate all consumers before declaring the fix complete. Better yet, show it a screenshot from the other surface — Claude Code responded to our dashboard screenshot instantly and fixed the remaining 20 files.
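Concretely, the fix is only complete when every consumer excludes the same data. Here is a minimal sketch of what that looks like on the web dashboard side, assuming a supabase-js client; the table and column names (app_sessions, app_name, duration_seconds) and env var names are hypothetical, and the actual xeve schema may differ.

```ts
import { createClient } from "@supabase/supabase-js";

// Hypothetical schema: an app_sessions table with app_name and duration_seconds columns.
const supabase = createClient(
  process.env.NEXT_PUBLIC_SUPABASE_URL!,
  process.env.NEXT_PUBLIC_SUPABASE_ANON_KEY!
);

// The dashboard needs the same exclusion the Mac app got; otherwise the
// loginwindow sessions already sitting in the table keep surfacing here.
export async function fetchTopApps() {
  const { data, error } = await supabase
    .from("app_sessions")
    .select("app_name, duration_seconds")
    .neq("app_name", "loginwindow");

  if (error) throw error;
  return data;
}
```

Better still, centralize the exclusion in one place (a shared constant or a database view) so that a future consumer cannot forget it.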
Mistake 2: The Self-Referencing Variable
During the light mode migration, the overview page had a variable called topApps (data from the database). We needed to create a filtered version that excluded system processes. Claude Code created filteredTopApps and then needed to replace all references to topApps ?? [] with filteredTopApps throughout the file.
It used replace_all to swap every instance of topApps ?? [] with filteredTopApps. Including the one inside the definition of filteredTopApps itself:
```ts
// Before (correct)
const filteredTopApps = (topApps ?? []).filter(...)

// After replace_all (broken)
const filteredTopApps = (filteredTopApps).filter(...)
```
TypeScript caught it: "implicitly has type 'any' because it is referenced directly or indirectly in its own initializer." But the error message is confusing enough that it takes a moment to understand what happened.
The pattern: Bulk find-and-replace operations do not understand scope. The AI applied a rule ("replace X with Y everywhere") without realizing that one instance was the definition site, not a usage site. This is the same class of bug as a regex that matches too broadly.
How to catch it: Be skeptical of replace_all operations. Ask the AI to show you every instance it plans to replace before executing. Or better: after a replace_all, immediately do a build. TypeScript and your compiler are your safety net. Claude Code did catch and fix this itself after the build failed — but the mistake should not have happened in the first place.
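For reference, the end state the replace_all was aiming for looks like this: the original topApps ?? [] survives only at the definition site, and everything below it reads the filtered version. The AppUsage type and SYSTEM_PROCESSES list are illustrative, not the actual xeve code.

```ts
type AppUsage = { name: string; seconds: number };

// topApps comes from the database query upstream; declared here so the sketch stands alone.
declare const topApps: AppUsage[] | undefined;

const SYSTEM_PROCESSES = new Set(["loginwindow"]);

// The definition site is the one place the original `topApps ?? []` must survive.
const filteredTopApps = (topApps ?? []).filter(
  (app) => !SYSTEM_PROCESSES.has(app.name)
);

// Every other reference in the file switches to the filtered version.
const totalSeconds = filteredTopApps.reduce((sum, app) => sum + app.seconds, 0);
```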
Mistake 3: Agents Making Inconsistent Changes
The light mode migration touched 155 files. To parallelize the work, we spawned multiple AI agents — one batch handling chart components, another handling special cases, a third handling org components. Each agent worked independently on its assigned files.
One agent renamed a constant from COLORS to BASE_COLORS in ProductivityBar.tsx — a reasonable change since we were adding a useThemeColors() hook that returns a colors object. But it only renamed the constant declaration and some references, not all of them. The component had two arrays using COLORS.productive — one got updated, one did not.
The build failed with "Cannot find name 'COLORS'. Did you mean 'colors'?" — a straightforward error, but one created by the gap between changes the parallel agents made independently of one another.
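Reconstructed, the state the agent left behind looked roughly like this; the constant names are from the session, everything else is a sketch for illustration.

```tsx
// Inconsistent state in ProductivityBar.tsx after the partial rename.
const BASE_COLORS = { productive: "#22c55e", unproductive: "#ef4444" };

const segmentColors = [BASE_COLORS.productive, BASE_COLORS.unproductive]; // updated
const legendColors = [COLORS.productive, COLORS.unproductive]; // missed: "Cannot find name 'COLORS'"
```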
The pattern: When you parallelize AI work across multiple agents, each agent has a limited view. Agent A renames a constant. Agent B (or even Agent A at a different point in execution) references the old name. There is no coordination mechanism — each agent makes locally reasonable decisions that conflict globally.
How to catch it: Build after every agent completes. Do not batch all agent work and build once at the end. Run next build or tsc after each agent returns. The cost is a few minutes of build time. The alternative is debugging mysterious errors caused by inconsistent partial changes.
Mistake 4: Documenting Features That Do Not Exist Yet
Claude Code built the MCP server, wrote the README, and published to npm. The README said: "Get your access token from xeve.io/dashboard/settings → API Access." The blog post said the same thing.
There was no API Access section on the settings page. It did not exist. We had not built it. Claude Code wrote documentation for a feature that was in its mental model of what should exist, but was not actually implemented.
When we asked "where do users get the API key?", Claude Code immediately realized the gap and built the settings section — access token with show/hide/copy, Claude Desktop config snippet, Claude Code one-liner. It took five minutes. But if we had not asked, every early user would have followed the README to a dead end.
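The missing piece really was that small. A hypothetical sketch of the show/hide/copy token block of the kind it added (component, prop, and markup choices are ours, not xeve's actual code):

```tsx
"use client";
import { useState } from "react";

// Illustrative "API Access" settings block: reveal and copy an access token.
export function ApiAccessSection({ token }: { token: string }) {
  const [visible, setVisible] = useState(false);

  return (
    <section>
      <h2>API Access</h2>
      <code>{visible ? token : "•".repeat(24)}</code>
      <button onClick={() => setVisible((v) => !v)}>{visible ? "Hide" : "Show"}</button>
      <button onClick={() => navigator.clipboard.writeText(token)}>Copy</button>
    </section>
  );
}
```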
The pattern: AI generates documentation that describes the ideal state, not the actual state. It writes what should be true, not what is true. This is especially dangerous for setup instructions where a missing step means the user cannot use the feature at all.
How to catch it: After the AI writes a README or setup guide, walk through every step yourself. Every "go to X" instruction should have a working X. Every "run this command" should actually work. Treat AI-generated docs the same way you treat AI-generated code: verify before shipping.
Mistake 5: Not Checking CI After Tagging
We tagged macos-v0.4.3 for the loginwindow fix and moved on. It was not until we checked Sparkle auto-updates and saw "You're up to date — xeve 0.4.0" that we realized something was wrong. Versions 0.4.1, 0.4.2, and 0.4.3 had all failed to build on CI.
Claude Code did not check whether previous releases had succeeded. It tagged a new version on top of three failed ones without looking at GitHub Actions history. A human developer who regularly ships releases would instinctively check CI status — it is muscle memory. AI does not have muscle memory.
The root cause was a Supabase Swift SDK update that changed .eq("key", value) to require a value: argument label. Three characters broke three releases. The fix was trivial. The failure to notice was not.
The pattern: AI operates in the current moment. It does not check the health of systems it is about to interact with. It will tag a release without checking if the last five releases succeeded. It will push code without checking if the deployment pipeline is healthy. It solves the task in front of it and does not look sideways.
How to catch it: Before any release, ask the AI: "Check the status of recent CI builds for this workflow." Make it a habit. Add it to your CLAUDE.md as a convention. Better yet, set up notifications for CI failures so you catch them in real time instead of discovering them a week later.
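That pre-release check is easy to make mechanical. A sketch against the GitHub REST API; the OWNER/REPO placeholder and the GITHUB_TOKEN env var are assumptions about your setup, and gh run list on the command line gives you the same answer.

```ts
// Check the last few GitHub Actions runs before tagging a release.
const res = await fetch(
  "https://api.github.com/repos/OWNER/REPO/actions/runs?per_page=5",
  { headers: { Authorization: `Bearer ${process.env.GITHUB_TOKEN}` } }
);
const { workflow_runs } = await res.json();

for (const run of workflow_runs) {
  // conclusion is "success", "failure", "cancelled", or null while a run is in progress
  console.log(run.name, run.head_branch, run.conclusion);
}

if (workflow_runs.some((run: { conclusion: string | null }) => run.conclusion === "failure")) {
  throw new Error("Recent CI runs failed; fix the pipeline before tagging a new release.");
}
```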
Mistake 6: The Wrong CSS Pattern for Next.js
For the anti-FOUC (flash of unstyled content) script, Claude Code put a <script> tag inside <head> in the root layout. This is a reasonable pattern in plain HTML. In Next.js App Router, it caused a cryptic webpack error during static generation: "Cannot read properties of undefined (reading 'call')."
The error message gave no indication that the <head> tag was the problem. Claude Code stashed all changes, rebuilt from the original to confirm the error came from our changes, and then swapped the manual <head> and inline script for Next.js's <Script strategy="beforeInteractive"> component. That worked.
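Roughly, the pattern that worked looks like this; the theme-detection body is a placeholder, not xeve's actual anti-FOUC script.

```tsx
// app/layout.tsx: inline anti-FOUC script via next/script instead of a manual <head>.
import Script from "next/script";
import type { ReactNode } from "react";

// Placeholder theme-detection logic for illustration only.
const themeInit = `
  try {
    var theme = localStorage.getItem("theme") ||
      (matchMedia("(prefers-color-scheme: dark)").matches ? "dark" : "light");
    document.documentElement.dataset.theme = theme;
  } catch (e) {}
`;

export default function RootLayout({ children }: { children: ReactNode }) {
  return (
    <html lang="en" suppressHydrationWarning>
      <body>
        {/* beforeInteractive injects the script before Next.js hydrates the page */}
        <Script id="theme-init" strategy="beforeInteractive">
          {themeInit}
        </Script>
        {children}
      </body>
    </html>
  );
}
```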
The pattern: AI applies patterns from its general training that are correct in one context but wrong in the specific framework you are using. A <head> tag with dangerouslySetInnerHTML is valid React. It is problematic in Next.js App Router. The AI does not always distinguish between "valid code" and "code that works in this specific framework version."
How to catch it: When you get a cryptic framework error after an AI change, isolate by reverting. Claude Code's instinct to git stash and rebuild from the original was correct — it identified the problem in one step. Framework-specific bugs require framework-specific knowledge, and AI's knowledge has gaps at the edges where frameworks behave differently from vanilla React/Node/Swift.
The Meta-Lesson: AI as a Junior Developer With Superhuman Speed
Every mistake on this list is a mistake a junior developer would make. Fixing one side of a bug. Regex that matches too broadly. Not checking CI. Writing docs for unbuilt features. These are not intelligence failures — they are experience failures.
The difference is speed. A junior developer would take a week to migrate 155 files, and the mistakes would be spread across days of PRs. Claude Code does it in an hour, and all the mistakes happen at once. You get the output of a week compressed into a session, including all the bugs that week would have contained.
This means the human's role shifts. You are not writing the code anymore — you are reviewing it at 10x speed. The skills that matter are:
- Verification instincts — "build after every change" is the #1 habit. If you build after each AI action, you catch errors in seconds instead of debugging a compounded pile of them later.
- System thinking — "what else does this affect?" The AI thinks locally. You need to think about the system: other applications that read this data, other pages that display this information, other pipelines that depend on this build.
- Screenshot debugging — showing the AI what the user actually sees is 10x more effective than describing the problem. A screenshot of loginwindow at the top of the dashboard communicated the problem instantly.
- Explicit checklists — tell the AI to check CI status before tagging. Tell it to verify every link in a README. Tell it to build after each agent finishes. These are things experienced developers do automatically. AI needs to be told.
The best AI-assisted development workflow is not "tell the AI what to do and trust the result." It is "tell the AI what to do, verify the result, show it what is wrong, and iterate." The AI brings the speed. You bring the correctness. Together, you ship things that neither could do alone.