CLOISTER
the gateway · to private · local models

Sealed-room
LLM inference.

Your prompt enters a hardware-sealed enclave, the model runs, the answer leaves. The host can't read it. The model operator can't read it. We can't read it. Private by physics, not by policy.

live qwen3-32b · tdx deepseek-v3 · sev-snp llama-3.3-70b · h100 cc
§ I · WHAT

The gateway
to private inference.

Cloister is the doorway. Behind it, every model runs inside a hardware-isolated enclave — a Trusted Execution Environment. The OS around it cannot read its memory. The hosting provider cannot inspect its state. The operator running your model literally cannot pull your prompt out.

Every other inference provider dilutes the privacy pitch with non-TEE endpoints. We don't. There is no non-TEE lane. If it isn't sealed, it isn't here.

§ II · HOW

Three steps. Nothing else.

01 · Get a code. One click in the dashboard. A 16-character code drops out. We hash it (Argon2id) and forget the original. You save it.

02 · Fund with TAO. Each account gets a derived Bittensor address. Send TAO. We watch for it, credit your balance, sweep funds to the master wallet.

03 · Call the API. Standard OpenAI-shaped POST. The code is your bearer token. Balance debits per token at the call-time TAO/USD rate.

$ curl https://api.cloister.space/v1/chat/completions \
    -H "Authorization: Bearer cloi-XXXX-XXXX-XXXX-XXXX" \
    -d '{"model":"qwen3-32b-tee","messages":[...]}'
§ III · MODELS

TEE-attested
models only.

Every model in the catalog runs inside a verifiable enclave. We list nothing else. Tokens billed in TAO at the call-time market rate — no tiers, no subscriptions.

slugenclave$/M in$/M outstatus
qwen3-32b-teetee0.092+15%0.276+15%● live
gemma-4-31b-turbo-teetee0.150+15%0.437+15%● live
glm-5.1-teetee1.208+15%4.025+15%● live
kimi-k2.5-teetee0.506+15%2.300+15%● live
minimax-m2.5-teetee0.173+15%1.380+15%● live
qwen3.5-397b-a17b-teetee0.449+15%2.691+15%● live
kimi-k2.6-teetee0.851+15%4.025+15%● live
deepseek-v3.2-teetee0.322+15%0.483+15%● live
glm-5-teetee1.093+15%2.933+15%● live
mistral-nemo-instruct-2407-teetee0.028+15%0.112+15%● live
qwen3.6-27b-teetee0.575+15%2.300+15%● live
qwen2.5-coder-32b-instruct-teetee0.028+15%0.112+15%● live
§ IV · WHY

Private by physics.
Not by policy.

A Trusted Execution Environment is a hardware-isolated region inside a CPU or GPU. Code and data inside it are encrypted in memory and invisible from outside — even to the kernel running it.

The chip itself signs an attestation receipt saying "this exact binary is running, untampered, inside this exact enclave." You verify against Intel (TDX), AMD (SEV-SNP), or NVIDIA (H100 CC) directly. No middleman vouching.

"We don't log your prompts" is a policy. Policies change. TEEs are silicon. Silicon doesn't.

§ V · IDENTITY

16 characters. No email.

Account = one base32 code. Generated client-side at signup, shown exactly once. No password reset because there is no password. No support email recovery because we never had the original.

XL9F-K2RP-7N3M-Q8BH
80 bitsargon2idloss = total
§ VI · PRICING

Pay per token.
In TAO.

+15%flat markup · no tiers · no minimums

That's the whole pricing page. No seats, no minimum spend, no monthly fee, no plan tiers, no annual discount, no enterprise quote, no free credits.

§ VII · FAQ

Short answers.

What if I lose my code?+
We don't have it. We never did. We hash it on signup and discard the original. silentium.
Do you log prompts?+
No. Per-call we record timestamp, model slug, token counts, and cost in plancks. Prompts and completions never hit disk. Because the enclave returns ciphertext, our logs are useless against you.
Why TAO and not USDC / SOL / ETH?+
TAO is the native settlement token of the inference network we route to. One token, one rail, no swaps, no bridges, no extra surface area.
What stops you from MITMing the TEE?+
Attestation. Each enclave publishes a remote-attestation receipt signed by the silicon vendor. Verify against Intel/AMD/NVIDIA directly. We don't stand between you and that receipt.
Can you read my prompt?+
No. Our proxy receives TLS-terminated ciphertext, forwards it inside a sealed channel to the enclave, and returns the response. We never decrypt.
Where can I read more about the infra?+
Full docs at /docs — compute partners, attestation flow, OpenAI-compatible API surface, streaming roadmap.
Streaming responses?+
Not in v1. Pass stream:false. We'll add SSE once token-count metering is wired in — billing accuracy first.
Enterprise / SLA / SOC2?+
Different product. We don't sell to enterprise. Talk to chutes directly — they sell that.
scroll to enter