/blog
notes from the runtime
Engineering notes on local inference, native bindings, GGUF, and what it takes to embed a model in a real application.
- 2026-05-12 · architecture · bindings
Six languages, one runtime: why Mullama ships native bindings instead of HTTP
Most local LLM runtimes give you exactly one of two things: a hand-rolled binding in the maintainer's favorite language, or an HTTP daemon. Mullama refuses the choice. Here's why.
- 2026-04-22 · ollama · compatibility · migration
Drop-in for Ollama: same CLI, same Modelfile, same port — plus things Ollama doesn't have
Mullama is compatible with Ollama across the CLI, the Modelfile format, the model registry, and the HTTP port, so existing client code keeps working. Here's exactly what stays the same and what's new.
- 2026-03-15 · embedded · edge · deployment
Embed the runtime: why we put a static C ABI in the box
Daemons are great until you're shipping a desktop app, a CLI tool, or an on-device deployment. Mullama links into your binary as a static library. Here's what that unlocks — and what it costs.