
notes from the runtime

Name: mullama
Author: Cognisoc

Engineering notes on local inference, native bindings, GGUF, and what it takes to embed a model in a real application.

2026-05-12 · architecture · bindings

Six languages, one runtime: why Mullama ships native bindings instead of HTTP

Most local LLM runtimes give you exactly one of two things: a hand-rolled binding in the maintainer's favorite language, or an HTTP daemon. Mullama refuses the choice. Here's why.

2026-04-22 · ollama · compatibility · migration

Drop-in for Ollama: same CLI, same Modelfile, same port — plus things Ollama doesn't have

Mullama is wire-compatible with Ollama at the CLI, the Modelfile format, the model registry, and the HTTP port. Existing client code keeps working. Here's exactly what stays the same and what's new.

2026-03-15 · embedded · edge · deployment

Embed the runtime: why we put a static C ABI in the box

Daemons are great until you're shipping a desktop app, a CLI tool, or an on-device deployment. Mullama links into your binary as a static library. Here's what that unlocks — and what it costs.

Six languages, one runtime: why Mullama ships native bindings instead of HTTP