Run open models on your own machine.
Tessera is a quiet local runtime for open-weight language models. Pull a model, chat in your terminal, and point your app at localhost.
$ tessera run lumen-4
pulling manifest … ok
pulling 4.1 GB … ████████████████████ 100%
verifying sha256 … ok
ready → chat started.
> summarize the README in three bullets.
A shelf of open models, one command away.
From tiny 1B helpers to 70B workhorses: pulled once, cached forever, run entirely offline.
A friendly, balanced chat model. Good default. 7B, 4-bit.
Built for codegen and refactors. Long context, fast tool use.
The heavy one. 70B parameters for research and long-form writing.
1B parameters. Runs on a MacBook Air. Great for tools.
See images alongside text. Describe, extract, caption.
Small, fast embedding model. Pairs with your local vector store.
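Stocking the shelf ahead of time might look like the sketch below. Only run appears in the demo above; pull and list are assumed subcommand names, and the second run shows the cache doing its job.

$ tessera pull lumen-4   # assumed subcommand: fetch and verify the weights once
$ tessera list           # assumed subcommand: see what's cached locally
$ tessera run lumen-4    # already on disk, so this starts instantly, no network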
Pull once. Chat forever.
Even offline.
A single CLI and a local HTTP server. No dashboards, no accounts, no telemetry you didn't opt into. Your prompts stay on your machine.
$ curl localhost:11434/api/chat \
  -d '{"model":"lumen-4","messages":[
        {"role":"user","content":"hi!"}
      ]}'

{"message":{"role":"assistant",
  "content":"hey there — ready when you are."}}
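The messages array carries the whole conversation, so a follow-up turn just appends to it. A minimal sketch, reusing the response above; the system role is an assumption, not something this page documents.

# resend prior turns in "messages"; the "system" role is an assumption
$ curl localhost:11434/api/chat \
  -d '{"model":"lumen-4","messages":[
        {"role":"system","content":"answer in one sentence"},
        {"role":"user","content":"hi!"},
        {"role":"assistant","content":"hey there — ready when you are."},
        {"role":"user","content":"and what runs offline?"}
      ]}'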
Bring your own tools.
Tessera speaks a boring, open HTTP API. Drop it into editors, agents, RAG pipelines — whatever you already use.
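For example, a shell one-liner that feeds a local file to the chat endpoint shown above; a sketch assuming you have jq installed, with README.md standing in for whatever file you care about.

# jq -n --rawfile builds the JSON payload from the file's contents
$ curl -s localhost:11434/api/chat \
    -d "$(jq -n --rawfile doc README.md \
      '{model: "lumen-4",
        messages: [{role: "user", content: ("Summarize this file:\n" + $doc)}]}')"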
Private, portable, and yours.
Install Tessera in under a minute. It's free and open source.