It's designed around Claude Code, but the ideas are tool-agnostic. I've been a computer science researcher and full-stack software engineer for 25 years, working mainly in startups. I'd been using this approach on my personal projects for a while; when I decided to package it up as a scaffold for easier reuse, I figured it might be useful to others too. It's published under Apache 2.0, so fork it and make it yours.
You can easily try it out: follow the instructions in the README to start using it.
The problem it solves:
AI coding agents are great at writing code, but they work much better when they have clear context about what to build and why. Most projects jump straight to implementation. This scaffold provides a structured workflow for the pre-coding phases, and organizes the output so that agents can navigate it efficiently across sessions.
How it works:
Everything lives in the repo alongside source code. The AI guidance is split into three layers, each optimized for context-window usage:
1. Instruction files (CLAUDE.md, CLAUDE.<phase>.md): always loaded, kept small. They are organized hierarchically, describe repo structure, maintain artifact indexes, and define cross-phase rules like traceability invariants.
2. Skills (.claude/skills/SDLC-*): loaded on demand. Step-by-step procedures for each SDLC activity: eliciting requirements, gap analysis, drafting architecture, decomposing into components, planning tasks, implementation.
3. Project artifacts: structured markdown files that accumulate as work progresses: stakeholders, goals, user stories, requirements, assumptions, constraints, decisions, architecture, data model, API design, task tracking. Accessed selectively through indexes.
This separation matters because instruction files stay in the context window permanently and must be lean, skills can be detailed since they're loaded only when invoked, and artifacts scale with the project but are navigated via indexed tables rather than read in full.
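As a rough sketch, a repo following this three-layer split might look like the structure below. The phase and artifact file names here are illustrative, not the scaffold's exact layout; only CLAUDE.md, the CLAUDE.<phase>.md pattern, and the .claude/skills/SDLC-* convention come from the description above:

```
CLAUDE.md                     # layer 1: always loaded, kept lean
CLAUDE.requirements.md        # layer 1: phase-specific instructions
.claude/skills/
  SDLC-elicit-requirements/   # layer 2: loaded only when invoked
  SDLC-draft-architecture/
  SDLC-plan-tasks/
1-objectives/                 # layer 3: artifacts, grown over sessions
  stakeholders.md
  requirements.md
  INDEX.md                    # one-line entries so the agent reads selectively
```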
Key design choices:
Context-window efficiency: artifact collections use markdown index tables (a one-line description plus trigger conditions per artifact) so the agent can locate what it needs without reading everything.
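For instance, an index table could look like the following (the columns match the description above; the entries themselves are hypothetical):

```markdown
| Artifact                        | Summary                                        | Load when...                |
|---------------------------------|------------------------------------------------|-----------------------------|
| decisions/D-012-db-choice.md    | SQLite chosen over Postgres (single-user app)  | touching persistence code   |
| requirements/R-034-offline.md   | App must work fully offline                    | adding any network call     |
```

The agent scans one line per artifact and opens only the files whose trigger conditions match the current task.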
Decision capture: decisions made during AI reasoning and human feedback are persisted as a structured artifact, to make them reviewable, traceable, and consistently applied across sessions.
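A persisted decision entry might be as simple as this (a hypothetical format, in the spirit of the structured artifact described above):

```markdown
## D-007: No SPA framework on the settings page
- Status: accepted
- Context: the page has no dynamic data; client-side routing added complexity
- Decision: render via server-side templates
- Applies to: any session touching web/settings/
```

Because each entry carries an "applies to" condition, later sessions can pull in only the decisions relevant to the files they touch.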
Waterfall-ish flow: sequential phases with defined outputs. Tedious for human teams, but AI agents don't mind the overhead, and the explicit structure prevents the unconstrained "just start vibecoding" failure mode.
How I use it:
Short, focused sessions. Each session invokes one skill, produces its output, and ends. The knowledge organization means the next session picks up without losing context. I've found that free-form prompting between skills is usually a sign the workflow is missing a piece.
Current limitations:
I haven't found a good way to integrate Figma MCP for importing existing UI/UX designs into the workflow. Suggestions welcome.
Feedback, criticism, and contributions are very welcome!
How can you tell if your prompt process works? I feel like the outputs of an SDLC process are much more high-level than what evals typically measure, but I'm no eval expert.
How would you benchmark this?
I feel the problem of token waste acutely; in fact, that was the original reason I introduced the hierarchy of instructions and the artifact indexes: to avoid waste. Then I realized that this approach also keeps the context lean, which helps the AI agent deliver better results.
Note that in the initial phases token consumption is very limited; it is in the implementation phase that tokens are consumed fast and the project can proceed with minimal human intervention. You can try just the first requirement-collection phase to evaluate the approach; the implementation phase is fairly boring and not innovative.
- /tasks:capture — Quick capture idea/bug/task to tasks/ideas/
- /tasks:groom — Expand with detailed requirements → tasks/backlog/
- /tasks:plan — Create implementation plan → tasks/planned/
- /tasks:implement — Execute plan, run tests → tasks/done/
- /tasks:review-plan — Format plan for team review (optionally Slack)
- /tasks:send — Send to autonomous dev pipeline via GitHub issue
- /tasks:fast-track — Capture → groom → plan → review in one pass
- /tasks:status — Kanban-style overview of all tasks
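As a sketch, chaining the commands above might look like this (the task name and output paths are illustrative):

```
/tasks:capture "Exported CSV drops the header row"
# → tasks/ideas/csv-header-bug.md
/tasks:groom csv-header-bug
# → tasks/backlog/csv-header-bug.md, now with detailed requirements
/tasks:plan csv-header-bug
# → tasks/planned/csv-header-bug.md with an implementation plan
/tasks:implement csv-header-bug
# → code written, tests run, file moved to tasks/done/
```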
Workflow: capture → groom → plan → implement → done (with optional review-plan before implement, or send for autonomous execution).
It's currently in the coding phase, so the requirements definition and design phases are done.
You can see the repo yourself; the most interesting artifacts generated are from the objectives phase and are indexed in this file: https://github.com/pangon/local-TTS-web-app/blob/main/1-obje...
Another interesting output of the scaffold's skills is the execution plan, organized into phases with milestones, where newly delivered capabilities can be tested after completing each phase: https://github.com/pangon/local-TTS-web-app/blob/main/3-code...
I feel that my scaffold adheres more closely to old-style waterfall: for example, it begins with the definition of the stakeholders, and it takes advantage of the less-adopted practice of maintaining assumptions and constraints, not just user stories and requirements.
A big difference is that I have introduced decisions, which are not just design decisions but also coding decisions: after the initial requirement-elicitation phase, whenever the agent needs to decide on an approach or establish a pattern, that choice is crystallised in a decision artifact. Decisions are indexed so that future coding sessions automatically inject the relevant ones into their context. Another difference is that when using the scaffold you can state high-level goals, and if the project is complex enough the design will propose a split into multiple components. Every component can be seen as a separate codebase, with its own stack and procedures. In this way you obtain a mono-repo, but with shared requirements and design that help a lot with change management, because sometimes changes affect several components, and without the shared requirements and design it would be pretty hard to automate.
The piece I keep running into with solo builders is that even when they have a good structure, the failure mode is trusting Claude's output too uniformly — treating fast generation as a proxy for correctness. The code looks clean, tests pass, ships fine... and then a month later nobody (including you) can reason about it because the decisions that shaped it never got persisted anywhere. Your decision capture artifact solves exactly that.
One thing I've been exploring from a complementary angle: rather than scaffolding the SDLC, building a clearer internal model of how Claude reasons — where it's reliable vs. where it needs human review gates. Working on a free starter pack around this (panavy.gumroad.com/l/skmaha) — would be curious if any of this maps to patterns you've seen with the scaffold.