Run open models on your own machine.

Tessera is a quiet local runtime for open-weight language models. Pull a model, chat in your terminal, and point your app at localhost.

~/projects/notes
$ tessera run lumen-4
pulling manifest…   ok
pulling 4.1 GB …    ████████████████████  100%
verifying sha256 …  ok
ready → chat started.

> summarize the README in three bullets.

A shelf of open models, one command away.

From tiny 1B helpers to 70B workhorses: pull once, cache forever, run entirely offline.

lumen-4 (general)
A friendly, balanced chat model and a good default. 7B parameters, 4-bit quantized.
tessera run lumen-4

dev-code (coding)
Built for code generation and refactors. Long context, fast tool use.
tessera run dev-code

atlas-large (reasoning)
The heavy one: 70B parameters for research and long-form writing.
tessera run atlas-large

nano-1 (tiny)
1B parameters; runs on a MacBook Air. Great for tools.
tessera run nano-1

vision-v (multimodal)
Sees images alongside text: describe, extract, caption.
tessera run vision-v

embed-3 (embeddings)
A small, fast embedding model. Pairs with your local vector store.
tessera run embed-3
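Pairing an embedding model with a local vector store comes down to nearest-neighbor search over the vectors it returns. A minimal in-memory sketch of that step, in pure Python with no external store (the example vectors are made up for illustration; a real app would get them from the embedding model):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query: list[float], docs: dict[str, list[float]], k: int = 3) -> list[str]:
    """Rank stored document vectors by similarity to the query vector."""
    ranked = sorted(docs.items(), key=lambda kv: cosine(query, kv[1]), reverse=True)
    return [name for name, _ in ranked[:k]]
```

Swap the toy dict for a real vector store once the corpus outgrows memory; the ranking logic stays the same.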

Pull once. Chat forever.
Even offline.

A single CLI and a local HTTP server. No dashboards, no accounts, no telemetry you didn't opt into. Your prompts stay on your machine.

~/app
$ curl localhost:11434/api/chat \
  -d '{"model":"lumen-4","messages":[
    {"role":"user","content":"hi!"}
  ]}'
{"message":{"role":"assistant",
  "content":"hey there — ready when you are."}}
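The same endpoint is easy to wrap in a few lines of stdlib Python. A minimal sketch matching the request and response shape of the curl example above (the helper names are ours, and error handling is omitted):

```python
import json
from urllib import request

TESSERA_URL = "http://localhost:11434/api/chat"  # the endpoint shown above

def build_chat_request(model: str, prompt: str) -> dict:
    """Build the JSON body for a single-turn chat, matching the curl example."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

def chat(model: str, prompt: str) -> str:
    """POST one chat turn to the local server and return the assistant's reply."""
    body = json.dumps(build_chat_request(model, prompt)).encode("utf-8")
    req = request.Request(
        TESSERA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read())["message"]["content"]
```

Because everything is local HTTP, the same wrapper works from any language with an HTTP client.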

Bring your own tools.

Tessera speaks a boring, open HTTP API. Drop it into editors, agents, RAG pipelines — whatever you already use.

Private, portable, and yours.

Install Tessera in under a minute. It's free and open source.