So I have been coding with agents for way too long at this point, and you always reach a point where your coding agent just casts `any`, makes things up, and generally writes slop.

The actual code for this is mostly my experiment in scaling this up, but if you prompt your agent right, you can literally use it as a simple prompt in your repo today. I personally use it in Antigravity as a workflow:

Løp workflow: [truncated 10K chars, look in the BioLabs repo]

So… How can I use it?

Step 1: Open your repo. I personally advise using a capable model for this step, since I have seen a lot of laziness from GLM 4.7 and I assume smaller models will behave similarly. To put it bluntly: send this workflow to Opus, tell it to define "Slices" for your codebase and write the /specs for each, and tell it to continue until it's finished.
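For concreteness, here is roughly what I mean by a "slice": a named chunk of the codebase paired with exactly one spec file. This is just an illustrative TypeScript sketch, not code from the repo — the `SliceSpec` shape and `specPathFor` helper are made up for this post; only the `frontend/specs` path is real.

```typescript
// Hypothetical shape of a "slice" -- every name here is illustrative.
interface SliceSpec {
  name: string;                              // e.g. "projects"
  specPath: string;                          // where the spec file lives
  components: string[];                      // source paths the slice covers
  status: "draft" | "aligned" | "drifting";  // spec <-> code sync state
}

// Convention assumed here: one .spec file per slice under frontend/specs.
function specPathFor(slice: string, root = "frontend/specs"): string {
  return `${root}/${slice}.spec`;
}

const projects: SliceSpec = {
  name: "projects",
  specPath: specPathFor("projects"),
  components: ["frontend/app/projects"],
  status: "draft",
};
```

The point of the structure is that every later prompt can name one slice and stay small.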

Step 2: Choose a spec and send this prompt (with @your-workflow-file, or however you load the workflow into context):

Let's do a digestible slice of improvements and code-to-spec alignment!

- Review the projects.spec
- Review all related components, comprehensively assessing the current implementation based on REAL code reads
- Based on the spec sheet and the code, compare the two and write an implementation plan to address uncovered gaps, functionality-wise or otherwise, and compile refinement/improvement/next tasks
- Review the newly added code, test that it compiles with no new errors, and update the spec to reflect the latest REAL code state
- Report to me how well it meets the spec and lay out next steps

Keep up: spec <-> code? Review the spec, REVIEW all related code. Keep both in sync. Please fill all identifiable gaps and address the tasks.

Step 3: You basically just loop until the specs and code align. You will notice that the agent eventually tells you "the spec and code are aligned" instead of engineering the F* out of your code.
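If you want to automate the "loop until aligned" part, a usable stopping condition is just the tsc error count. A rough sketch — `npx tsc --noEmit` is a real command, but the driver and the agent call are a hypothetical harness you would have to wire up yourself:

```typescript
import { execSync } from "node:child_process";

// Parse tsc's "Found N errors in M files." summary line. Returns 0 on a clean run.
function typeErrorCount(): number {
  try {
    execSync("npx tsc --noEmit", { stdio: "pipe" });
    return 0;
  } catch (e) {
    const out = String((e as { stdout?: unknown }).stdout ?? "");
    const m = out.match(/Found (\d+) error/);
    return m ? Number(m[1]) : 1; // tsc failed but the summary was unparseable: assume errors
  }
}

// Loop until clean or we give up -- a cap keeps a lazy agent from spinning forever.
function shouldContinue(errors: number, pass: number, maxPasses = 5): boolean {
  return errors > 0 && pass < maxPasses;
}

// Driver (pseudo): while (shouldContinue(typeErrorCount(), pass++)) sendAlignmentPrompt();
```

In practice I run the loop by hand; the check is just a convenient "am I done" signal.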

Step 4: You now have functional slices of your codebase, and you can take the entirety of your specs (it's not that much) -> send it to a SOTA LLM -> "What gaps are in my spec?"
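Bundling the specs for this step is trivial since they are small. A sketch of the concatenation, as a pure function — the map-based signature is mine; reading the files out of frontend/specs is left to the caller (in practice I just paste them):

```typescript
// Concatenate spec files into one markdown document to paste into an LLM.
// Takes a { filename: contents } map so it stays filesystem-agnostic.
function bundleSpecs(specs: Record<string, string>): string {
  return Object.entries(specs)
    .map(([name, body]) => `## ${name}\n\n${body}`)
    .join("\n\n");
}
```

The gap question from this step then goes on top of the bundle.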

Step 5: Take the gap and fill it. I use this prompt:

[Put the task here]

- Review the relevant spec in /frontend/specs/
- Review all related components, comprehensively assessing the current implementation based on REAL code reads
- Based on the spec sheet and the code, assess the validity of the task and formulate an implementation plan
- Review the newly added code, test that it compiles with no new errors, and update the spec to reflect the latest REAL code state
- Report to me how well it meets the spec and lay out next steps

Keep up: spec <-> code? Review the spec, REVIEW all related code. Keep both in sync. Please fill all identifiable gaps and address the tasks.

The aftermath

You are probably familiar with "let me implement this one thing / refactor this / add these features" and then grinding through 500 type issues until you get a somewhat working codebase again? This is what I get instead:

The Numbers

| Metric | Value |
|--------|-------|
| Parallel agents | 3 |
| Files changed | 149 |
| Lines added | +3,014 |
| Lines removed | -2,881 |
| Domain specs in repo | 47 |
| Conflicts | 0 |
| Agent communication | 0 |
| Orchestration code | 0 lines |

Changes by Directory

frontend/server: +1,301 -640
frontend/app: +1,269 -1,687
frontend/specs: +416 -471

npx tsc: Found 17 errors in 6 files

The repo (WIP) I am using this on (I only started applying this pattern ~2 days ago):

https://github.com/Mvgnu/BioLabs

Does it scale? So far I have yet to find the limit. If your code does not work, you likely just need more loops against the spec. Ironically, this also works in Claude Assistant Chat, which is what produced the Løp repo code.

  • KellyCriterion 2 hours ago:
    so how many agents do you have to spin up to get 0 errors? :-)