Why does it look like LLMs consistently overestimate implementation time?

3 points by bridgettegraham 13 hours ago | 12 comments

I have a suspicion: they estimate how long people would have taken to implement a feature, because they were trained on such data. I consistently see estimates of 2 weeks, 3 weeks, 5 days, etc. But then implementation takes a day or 2 max using agents within Claude/GPT. Unless I am missing something? Has anybody else noticed this?

sieve 13 hours ago |
This is very common as they have no conception of time. They are just using ballpark figures based on what they see in the wild. I see estimates for modules in weeks/months when they can produce them in a single afternoon of prompting.
kspetkov79 12 hours ago |
They tend to turn a small change into the whole cleanup plan. Sometimes that is useful, but it makes estimates too large.
dnnddidiej 11 hours ago |
You ask LLMs for estimates? Interesting.
The big model providers should probably do calibrations for that and add an estimation skill.
schappim 11 hours ago |
I've found that they declare estimates unprompted.
bridgettegraham 5 hours ago |
yes exactly - I have never asked them for an estimate
vishnukool 9 hours ago |
The problem is that LLMs do not have a conceptual grounding in actual time. They estimate based on statistical correlations found in their training data, which is filled with standard corporate project management timelines, legacy codebases, and waterfall estimates.
jaredsohn 8 hours ago |
>day or 2 max
I've frequently seen tasks that it thinks will take weeks being done in under an hour. And it will often recommend doing X instead of Y because Y requires so much extra work. Basically I just remind it that it is an LLM.
If it worries something is error prone, I ask it to write tools to verify it.
ZivenChang 7 hours ago |
they estimate based on how software is usually built in organizations, not how fast it can be built with modern AI tools and agents.
BeInLife 6 hours ago |
they estimate as if a human will build it
bridgettegraham 5 hours ago |
yeah this is my suspicion too
sometimelurker 2 hours ago |
maybe they're rewarded for being under their estimated implementation time in their training. they could learn a similar behavior (to safely overestimate) in other contexts and that could've spilled over.
you can blame everything on weird quirks in the training (claude overusing the words "true" and "genuine" when they're not needed, AIs using em-dashes because the pretrain has a ton of them)
gajo357 an hour ago |
If what you say is reality, I would keep that.
We have a tendency to give overly optimistic estimates, best case scenario, no other tasks, no roadblocks...
Whenever asked for an estimate, think how long it would take you to make it and multiply by 5.
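
The "multiply by a factor" rule above, and the calibration idea raised earlier in the thread, can be sketched in a few lines. This is only an illustration: the function names and the history numbers are made up, and using the median ratio of actual to estimated time is one possible choice, not something anyone in the thread proposed.

```python
from statistics import median


def calibration_factor(history):
    """Return the median ratio of actual to estimated time.

    `history` is a list of (estimated_hours, actual_hours) pairs from
    past tasks. The data layout is an assumption made for this sketch.
    """
    ratios = [actual / estimated for estimated, actual in history if estimated > 0]
    if not ratios:
        raise ValueError("need at least one past task with a positive estimate")
    return median(ratios)


def calibrated_estimate(estimate_hours, factor):
    """Scale a raw estimate by a previously computed calibration factor."""
    return estimate_hours * factor


# Hypothetical history: the model said 80h, 40h, 120h; the work took 8h, 2h, 12h.
history = [(80.0, 8.0), (40.0, 2.0), (120.0, 12.0)]
factor = calibration_factor(history)

# Apply the factor to a new raw estimate of 40 hours.
adjusted = calibrated_estimate(40.0, factor)
print(f"factor={factor:.2f}, adjusted estimate={adjusted:.1f}h")
```

A factor below 1 corresponds to the LLM case described in the thread (estimates too high), while a factor around 5 would match the human rule of thumb above.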