Hey thanks for the comment and the question.
I would say my workflow for any meaningful amount of work is (all in Claude Code):
- PRD: I discuss and brainstorm with Claude Code using something like the Grill Me skill (https://github.com/mattpocock/skills/tree/main/skills/produc...), which I've modified a bit for my own style, until I have a good PRD (the goals and design decisions for what I'm building)
--- I run this PRD through multiple AI reviews (sometimes ChatGPT Pro for really important PRDs, because it seems to have some of the best critical feedback)
--- I read the PRD myself in detail before finalizing.
- PLAN: I have Claude Code develop the plan for implementing the PRD. Again, I have this reviewed several times by CC and sometimes by other tools for effectiveness, consistency with the PRD, consistency with the codebase, and internal consistency.
- EXECUTE: I have an orchestration command I made that has CC execute the PLAN and use a build journal, using sub-agents whenever possible to save context, so that it can operate for up to several hours.
- QUICK REVIEWS: I have these commands, /review-fix-loop and /quick-dual-review, which loop through running a Claude+Codex sub-agent review and then fixing anything critical (deferring items that need human judgment)
- CODE REVIEWS: This is when I run between one and several of the adamsreview reviews, starting with /review --ensemble, then /walkthrough, then /fix, until I am satisfied.
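To make the QUICK REVIEWS step concrete, here is a rough Python sketch of the loop shape: run several reviewers, fix anything critical, defer whatever needs human judgment, and stop when nothing fixable is left. This is not the actual /review-fix-loop command. `Finding`, `review_fix_loop`, and the toy reviewers and fixer are all hypothetical names made up for illustration; in practice the reviewers would be sub-agent calls rather than string checks.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Finding:
    """One review comment (hypothetical structure, for illustration only)."""
    message: str
    critical: bool
    needs_human: bool


Reviewer = Callable[[str], List[Finding]]
Fixer = Callable[[str, Finding], str]


def review_fix_loop(
    code: str,
    reviewers: List[Reviewer],
    fixer: Fixer,
    max_rounds: int = 3,
) -> Tuple[str, List[Finding]]:
    """Review, fix critical findings, defer human-judgment items, repeat."""
    deferred: List[Finding] = []
    for _ in range(max_rounds):
        # Collect findings from every reviewer, dropping exact duplicates.
        findings: List[Finding] = []
        for review in reviewers:
            for finding in review(code):
                if finding not in findings:
                    findings.append(finding)

        # Anything needing human judgment is set aside, never auto-fixed.
        for finding in findings:
            if finding.needs_human and finding not in deferred:
                deferred.append(finding)

        fixable = [f for f in findings if f.critical and not f.needs_human]
        if not fixable:
            break  # nothing critical left that can be fixed automatically
        for finding in fixable:
            code = fixer(code, finding)
    return code, deferred


# Toy stand-ins for the two reviewers and the fixer (assumptions, not real tools).
def todo_reviewer(code: str) -> List[Finding]:
    if "TODO" in code:
        return [Finding("remove TODO", critical=True, needs_human=False)]
    return []


def magic_reviewer(code: str) -> List[Finding]:
    if "magic" in code:
        return [Finding("is 'magic' intentional?", critical=False, needs_human=True)]
    return []


def todo_fixer(code: str, finding: Finding) -> str:
    return code.replace("TODO", "done")
```

Calling `review_fix_loop("x = magic  # TODO", [todo_reviewer, magic_reviewer], todo_fixer)` fixes the TODO in the first round and hands back the "magic" question as a deferred item. The `max_rounds` cap is there so a reviewer that keeps raising new critical findings cannot loop forever.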
Would it be useful if I packaged all this stuff into a GH repo to share with you and others?
I think there's no harm in that. But I will say, it sounds very similar to my own system, and probably a ton of other people's.
Yours might very well be better than most, but the thing that's missing from all of these is evals. You, me, everyone else, we're all vibe coding up these loops, we're getting work done and feeling excited about it, excited enough that we want to share, but nobody is doing real testing or benchmarks.
That is such a great point. We do need evals for this - and not just ones that the model companies use themselves. They have to be public and sharable and easy to use ourselves.
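As a starting point, a shareable eval could be as small as a fixed task list with pass/fail checks, run against each workflow. The sketch below is only an illustration of that idea under my own assumptions: `Task`, `pass_rates`, and the toy workflows are hypothetical, and a real version would run an agent workflow on a repo and check the result with tests instead of comparing strings.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Task:
    """A single eval task: a prompt plus a pass/fail check on the output."""
    name: str
    prompt: str
    check: Callable[[str], bool]


Workflow = Callable[[str], str]


def pass_rates(
    workflows: Dict[str, Workflow],
    tasks: List[Task],
    trials: int = 1,
) -> Dict[str, float]:
    """Return the fraction of (task, trial) runs each workflow passes."""
    if not tasks or trials < 1:
        raise ValueError("need at least one task and one trial")
    rates: Dict[str, float] = {}
    for name, run in workflows.items():
        passed = 0
        for task in tasks:
            # Repeat each task, since agent output is usually nondeterministic.
            for _ in range(trials):
                if task.check(run(task.prompt)):
                    passed += 1
        rates[name] = passed / (len(tasks) * trials)
    return rates


# Toy tasks and workflows so the harness can be run as-is (assumptions only).
TASKS = [
    Task("echo", "hello", lambda out: out == "hello"),
    Task("nonempty", "x", lambda out: len(out) > 0),
]

WORKFLOWS: Dict[str, Workflow] = {
    "identity": lambda prompt: prompt,
    "upper": lambda prompt: prompt.upper(),
}
```

With these toy inputs, `pass_rates(WORKFLOWS, TASKS)` scores "identity" at 1.0 and "upper" at 0.5. The point is that the task list and checks are plain data, so anyone could rerun the same comparison on their own loop.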
And in terms of sharing, I agree. On one hand, so many of us are already doing this ourselves. On the other hand, when I was first learning CC and agentic engineering (vibe coding at the time :) ), I did find some of these random people's templates useful.