Developer experience
- create a session with a model
- stream tokens as they generate
- optionally get a full response

(There's a code snippet on the homepage showing the exact flow.)
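For a rough idea of that flow, here's a minimal Swift sketch. Every name in it (`InferenceSession`, `streamTokens`, `respond`) is hypothetical, since the actual API is only shown in the snippet on the homepage:

```swift
import Foundation

// Hypothetical API — type and method names are illustrative,
// not the SDK's real surface.

// Create a session with a model
let session = try await InferenceSession(model: .named("small-fast-model"))

// Stream tokens as they generate
for try await token in session.streamTokens(prompt: "Hello!") {
    print(token, terminator: "")
}

// Or get a full response in one call
let reply = try await session.respond(to: "Hello!")
print(reply)
```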
Why I’m posting
I’d love feedback from iOS devs (and anyone shipping on-device inference) on:
- What would make this a “must use” SDK for you
- Which models you’d want supported first (small, fast, good enough vs. bigger, slower, better)
- What your biggest pain is today: performance, model downloads, UX, memory, app size, safety, etc.
If you want updates / early access when it’s ready to share broadly, there’s a waitlist on the site (no credit card).