Lifecycle states
| State | What’s happening | Billing |
|---|---|---|
| Provisioning | Pod allocated, image pulling, SSH config writing | Not billed |
| Running | SSH reachable, work happens | Metered per second |
| Stopping | Terminate request sent to provider | Final partial second settled |
| Stopped | Pod released, IP unbound | No further charges |
| Failed | Allocation or boot error | Refunded automatically |
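
The table can be read as a small state machine. The sketch below is one way to encode it in Python. The transition map is an assumption inferred from the state descriptions (for example, that Failed is only reached from Provisioning), and the names `State`, `TRANSITIONS`, `can_transition`, and `is_metered` are illustrative, not part of the service's API.

```python
from enum import Enum


class State(Enum):
    PROVISIONING = "Provisioning"
    RUNNING = "Running"
    STOPPING = "Stopping"
    STOPPED = "Stopped"
    FAILED = "Failed"


# Assumed transitions, inferred from the table; the real service may allow others.
TRANSITIONS = {
    State.PROVISIONING: {State.RUNNING, State.FAILED},
    State.RUNNING: {State.STOPPING},
    State.STOPPING: {State.STOPPED},
    State.STOPPED: set(),
    State.FAILED: set(),
}

# Per the Billing column, only Running accrues per-second charges.
METERED = {State.RUNNING}


def can_transition(src: State, dst: State) -> bool:
    """Return True if dst is an allowed next state after src."""
    return dst in TRANSITIONS[src]


def is_metered(state: State) -> bool:
    """Return True if time spent in this state is billed per second."""
    return state in METERED
```

Keeping the billing rule in a separate set from the transition map makes it easy to check either one against the table on its own.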
Templates
A template is a starting OS image plus pre-installed software. Six are wired in today:
- Ubuntu 24.04: Plain. CUDA drivers, that’s it. Default pick.
- PyTorch: Ubuntu + CUDA + PyTorch 2.x + transformers + accelerate.
- vLLM: Inference server preloaded. Pass a model with --model org/name.
- Ollama: Ollama daemon on port 11434, ready to ollama pull.
- A1111 (Stable Diffusion): AUTOMATIC1111 WebUI on port 7860.
- text-generation-webui: Oobabooga’s text-gen on port 7860.
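
Since a running pod is reachable over SSH, one way to reach a template's web port from a local machine is an SSH local forward. The sketch below only builds the command. It assumes the service ports are reached through an SSH tunnel and that the login user is `root`, neither of which is stated here. `TEMPLATE_PORTS` and `ssh_forward_command` are hypothetical names, and templates without a stated port are left out.

```python
# Ports stated in the template list; templates without a stated port are omitted.
TEMPLATE_PORTS = {
    "Ollama": 11434,
    "A1111 (Stable Diffusion)": 7860,
    "text-generation-webui": 7860,
}


def ssh_forward_command(template: str, host: str, user: str = "root") -> list[str]:
    """Build an `ssh -L` command that forwards the template's port to localhost.

    The default user is an assumption; substitute whatever your pod expects.
    """
    port = TEMPLATE_PORTS[template]
    # -N: do not run a remote command, just hold the tunnel open.
    return ["ssh", "-N", "-L", f"{port}:localhost:{port}", f"{user}@{host}"]


# Example with a documentation-range IP address as a placeholder host.
print(" ".join(ssh_forward_command("Ollama", "203.0.113.5")))
```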
Per-second metering
Every running pod sends a heartbeat every second. The metering worker:
- Reads the pod’s hourly rate
- Converts it to a per-second rate: rate_per_hour / 3600
- Atomically debits that amount from the rental’s wallet ledger entry
- If the wallet reaches $0, sends a terminate signal to the provider
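
The arithmetic of one metering tick can be sketched as follows. This is a minimal in-memory illustration, not the worker itself: it does not model atomicity or the ledger, the function names are hypothetical, and clamping the balance at zero on the final tick is an assumption. `Decimal` is used so that currency amounts do not pick up binary floating-point error.

```python
from decimal import Decimal

SECONDS_PER_HOUR = Decimal(3600)


def per_second_rate(rate_per_hour: Decimal) -> Decimal:
    """Convert an hourly rate to a per-second rate: rate_per_hour / 3600."""
    return rate_per_hour / SECONDS_PER_HOUR


def meter_tick(balance: Decimal, rate_per_hour: Decimal) -> tuple[Decimal, bool]:
    """Debit one second of usage; return (new_balance, should_terminate)."""
    new_balance = balance - per_second_rate(rate_per_hour)
    if new_balance <= 0:
        # Assumption: the balance is clamped at zero and the pod is terminated.
        return Decimal(0), True
    return new_balance, False


# A $3.60/hour pod costs $0.001 per second.
balance = Decimal("0.002")
balance, stop = meter_tick(balance, Decimal("3.60"))  # balance 0.001, keep running
balance, stop = meter_tick(balance, Decimal("3.60"))  # balance 0, terminate
```

In a real worker the read and debit would need to happen in a single transaction so that two heartbeats cannot both spend the same balance.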
Stopping
Click Stop in the dashboard or call POST /v1/rentals/:id/stop. The pod
becomes unreachable within ~5 seconds. We settle the final partial second so you
pay for exactly what you used, not a rounded-up minute.
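
A stop call from a script might look like the sketch below, using only the Python standard library. The path comes from the text above. The base URL and the bearer-token header are placeholders, since neither the API host nor the authentication scheme is described here, and `build_stop_request` is a hypothetical helper.

```python
import urllib.request

# Placeholder: the real API host is not given in this section.
BASE_URL = "https://api.example.com"


def build_stop_request(rental_id: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a POST to /v1/rentals/:id/stop."""
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/rentals/{rental_id}/stop",
        method="POST",
        # Assumption: bearer-token auth; replace with the scheme your account uses.
        headers={"Authorization": f"Bearer {token}"},
    )


# To actually send it:
# with urllib.request.urlopen(build_stop_request("rental_123", "TOKEN")) as resp:
#     print(resp.status)
```

Building the request separately from sending it keeps the URL and headers easy to inspect before anything touches a live rental.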