BitNet-b1.58-2B-4T

A 1-bit language model running on a Sprite.

Model: BitNet-b1.58-2B-4T (2.4B params, 1.1GB)

Hardware: 8 AMD EPYC vCPUs, 16 GB RAM

Speed: ~50 tokens/sec

Uptime: 165h 56m 48s

Requests served: 17

Quick start

Grab the client (zero dependencies):

curl https://bitnet-llm-beony.sprites.app/client.py -o llm.py

Use it:

from llm import ask, classify

print(ask("What is a Sprite?"))
print(classify("server is down", ["bug", "feature", "ops"]))
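classify steers the model toward a fixed label set. If you want similar behavior without downloading client.py, here is one possible sketch; build_classify_prompt and pick_label are hypothetical helpers for illustration, not the client's actual internals:

```python
def build_classify_prompt(text, labels):
    # Hypothetical helper: ask the model to answer with exactly one allowed label.
    options = ", ".join(labels)
    return (
        f"Classify the following text as one of: {options}.\n"
        f"Text: {text}\n"
        "Answer with the label only."
    )

def pick_label(reply, labels):
    # Defensive matching: the model may add punctuation or change casing.
    reply_lower = reply.strip().lower()
    for label in labels:
        if label.lower() in reply_lower:
            return label
    return None  # no recognized label in the reply
```

Send the built prompt through the chat endpoint, then run the reply through pick_label to get a clean label back.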

Or hit the API directly:

curl https://bitnet-llm-beony.sprites.app/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages":[{"role":"user","content":"hello"}]}'
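The same call works from the Python standard library, no client download needed. A minimal sketch, assuming the endpoint returns an OpenAI-style response body (choices[0].message.content); the chat helper name is mine, not part of the API:

```python
import json
from urllib import request

API_URL = "https://bitnet-llm-beony.sprites.app/v1/chat/completions"

def build_payload(user_text):
    # Same request shape as the curl example above.
    return {"messages": [{"role": "user", "content": user_text}]}

def extract_reply(response_body):
    # Assumes an OpenAI-compatible response: choices[0].message.content.
    data = json.loads(response_body)
    return data["choices"][0]["message"]["content"]

def chat(user_text):
    # Network call: requires the Sprite to be reachable.
    req = request.Request(
        API_URL,
        data=json.dumps(build_payload(user_text)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return extract_reply(resp.read())
```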

Try it

Open chat UI

API

POST /v1/chat/completions — OpenAI-compatible chat endpoint

GET /client.py — zero-dependency Python client, preconfigured with this server's URL

GET /chat — chat UI

GET /health — health check
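Before scripting against the API, you can gate on /health. A hedged sketch: the /health response body isn't documented here, so this only checks for HTTP 200, and the wait_healthy helper is an illustration, not part of the service:

```python
import time
from urllib import request, error

HEALTH_URL = "https://bitnet-llm-beony.sprites.app/health"

def wait_healthy(timeout_s=30.0, interval_s=2.0):
    """Poll /health until it returns HTTP 200 or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            with request.urlopen(HEALTH_URL, timeout=5) as resp:
                if resp.status == 200:
                    return True
        except (error.URLError, TimeoutError):
            pass  # server not reachable yet; retry
        time.sleep(interval_s)
    return False
```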