But yes, working with TS feels kludgy. I haven't had to deal with them in over a decade, but there was one tool that made it all super easy, MP2TSME, which I hear is no longer available.
I'm almost certain you've seen some of that output
As a rule, strong feelings about issues do not emerge from deep understanding.
One can find MPEG-2 TS in surprising places (see: DOCSIS encapsulating Ethernet frames into TS packets).
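For anyone who hasn't looked at the framing: every TS packet is 188 bytes, starts with the sync byte 0x47, and carries a 13-bit PID in a 4-byte header, which is what makes it easy to multiplex unrelated payloads. A minimal sketch of decoding that header follows; the names `TsHeader` and `parse_ts_header` are made up for illustration, and the PID used in the example (0x1FFE, which as far as I know is the one DOCSIS uses downstream) is an assumption, not something taken from this thread.

```python
from typing import NamedTuple

TS_PACKET_SIZE = 188  # fixed MPEG-2 TS packet length in bytes
SYNC_BYTE = 0x47      # first byte of every TS packet


class TsHeader(NamedTuple):
    """Hypothetical container for the fields of the 4-byte TS header."""
    transport_error: bool
    payload_unit_start: bool
    pid: int
    adaptation_field_control: int
    continuity_counter: int


def parse_ts_header(packet: bytes) -> TsHeader:
    """Decode the fixed 4-byte header of a single 188-byte TS packet."""
    if len(packet) != TS_PACKET_SIZE:
        raise ValueError(f"expected {TS_PACKET_SIZE} bytes, got {len(packet)}")
    if packet[0] != SYNC_BYTE:
        raise ValueError("missing 0x47 sync byte")
    b1, b2, b3 = packet[1], packet[2], packet[3]
    return TsHeader(
        transport_error=bool(b1 & 0x80),
        payload_unit_start=bool(b1 & 0x40),
        pid=((b1 & 0x1F) << 8) | b2,            # 13-bit packet identifier
        adaptation_field_control=(b3 >> 4) & 0x3,
        continuity_counter=b3 & 0x0F,           # 4-bit per-PID counter
    )


# Example: a packet with PUSI set, PID 0x1FFE, payload only, counter 0.
example = bytes([0x47, 0x5F, 0xFE, 0x10]) + bytes(184)
header = parse_ts_header(example)
```

A demuxer would do little more than this per packet: check the sync byte, route on `pid`, and watch `continuity_counter` for gaps.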
If I had to guess why MPEG-2 TS, it'd probably be the fact that it's a well-supported streaming format in both hardware and software. If you tried using QuickTime or MPEG-4 containers, you'd have to rely on hacks like ensuring the moov atom precedes mdat.
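To make the moov/mdat point concrete, here is a rough sketch that walks the top-level boxes of an MP4/QuickTime file held in memory and checks which comes first. The function names are hypothetical, it only handles the standard size conventions (32-bit size, size 1 meaning a 64-bit size follows, size 0 meaning "to end of file"), and for a real file you would seek rather than load everything, since mdat can be huge.

```python
import struct


def top_level_boxes(data: bytes) -> list[tuple[str, int]]:
    """Return (box type, offset) for each top-level box in the buffer."""
    boxes = []
    offset = 0
    while offset + 8 <= len(data):
        size, box_type = struct.unpack_from(">I4s", data, offset)
        header_len = 8
        if size == 1:
            # A 64-bit "largesize" follows the type field.
            if offset + 16 > len(data):
                break
            (size,) = struct.unpack_from(">Q", data, offset + 8)
            header_len = 16
        elif size == 0:
            # The box extends to the end of the file.
            size = len(data) - offset
        if size < header_len:
            raise ValueError(f"corrupt box size at offset {offset}")
        boxes.append((box_type.decode("latin-1"), offset))
        offset += size
    return boxes


def moov_precedes_mdat(data: bytes) -> bool:
    """True if both boxes exist and the first moov comes before the first mdat."""
    first_offset: dict[str, int] = {}
    for name, offset in top_level_boxes(data):
        first_offset.setdefault(name, offset)
    return (
        "moov" in first_offset
        and "mdat" in first_offset
        and first_offset["moov"] < first_offset["mdat"]
    )
```

A player that gets a file where this returns False has to fetch the end of the file before it can start decoding, which is exactly the problem for anything stream-like.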
Matroska may be worth considering (especially the subset used by WebM to make it stream-friendly and quicker to seek), but I have no idea how widespread hardware support is for (de)muxing that.
But seeing how many uses people have come up with for MP2TS just shows its flexibility and resilience.
Fast start is irrelevant because MoQ normally uses fragmented MP4, not progressive.
HLS, MPEG-DASH etc. do successfully work around much of that, but they're really mostly that – workarounds to present stream-like semantics over an HTTP + CDN based delivery mechanism.
There are significant gaps on the production/distribution side of things, i.e., everything that happens before the CDN (and for very low latency even beyond), and I suppose this is an attempt at filling those.
I suppose the main reason was compatibility with existing cable headend infrastructure, but I'm also curious to know if there were ever actual TV and DOCSIS data sharing a single physical transponder. Would be a nice way of leveraging spare bandwidth due to variable bit rate video encoding!