Run open models on your own machine.
Tessera is a quiet local runtime for open-weight language models. Pull a model, chat in your terminal, and point your app at localhost.
$ tessera run lumen-4
pulling manifest … ok
pulling 4.1 GB … ████████████████████ 100%
verifying sha256 … ok
ready → chat started.
> summarize the README in three bullets.
A shelf of open models, one command away.
From tiny 1B helpers to 70B workhorses: pulled once, cached forever, run entirely offline.
A friendly, balanced chat model. Good default. 7B, 4-bit.
Built for codegen and refactors. Long context, fast tool use.
The heavy one. 70B parameters for research and long-form writing.
1B parameters. Runs on a MacBook Air. Great for tools.
See images alongside text. Describe, extract, caption.
Small, fast embedding model. Pairs with your local vector store.
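Stocking the shelf ahead of time might look like the sketch below. Only run appears in the demo above; pull and list are assumed subcommand names, and the second run shows the cache doing its job.

$ tessera pull lumen-4   # assumed subcommand: fetch and verify the weights once
$ tessera list           # assumed subcommand: see what's cached locally
$ tessera run lumen-4    # already on disk, so this starts instantly, no network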
Pull once. Chat forever.
Even offline.
A single CLI and a local HTTP server. No dashboards, no accounts, no telemetry you didn't opt into. Your prompts stay on your machine.
$ curl localhost:11434/api/chat \
  -d '{"model":"lumen-4","messages":[
        {"role":"user","content":"hi!"}
      ]}'

{"message":{"role":"assistant",
  "content":"hey there — ready when you are."}}
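The messages array carries the whole conversation, so a follow-up turn just appends to it. A minimal sketch, reusing the response above; the system role is an assumption, not something this page documents.

# resend prior turns in "messages"; the "system" role is an assumption
$ curl localhost:11434/api/chat \
  -d '{"model":"lumen-4","messages":[
        {"role":"system","content":"answer in one sentence"},
        {"role":"user","content":"hi!"},
        {"role":"assistant","content":"hey there — ready when you are."},
        {"role":"user","content":"and what runs offline?"}
      ]}'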
Bring your own tools.
Tessera speaks a boring, open HTTP API. Drop it into editors, agents, RAG pipelines — whatever you already use.
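For example, a shell one-liner that feeds a local file to the chat endpoint shown above; a sketch assuming you have jq installed, with README.md standing in for whatever file you care about.

# jq -n --rawfile builds the JSON payload from the file's contents
$ curl -s localhost:11434/api/chat \
    -d "$(jq -n --rawfile doc README.md \
      '{model: "lumen-4",
        messages: [{role: "user", content: ("Summarize this file:\n" + $doc)}]}')"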
Private, portable, and yours.
Install Tessera in under a minute. It's free and open source.