Credits and billing
How tenants pay for using your tool — the credits model from a tool author's view.
Vendo's payment surface is a prepaid credit balance, per tenant. Tenants buy credits in USD; every proxied API call drains a small amount; when the balance gets low, the tenant tops up. There are no subscriptions, no invoices, and no usage caps for you to set. As a tool author, most of the billing work is already done for you: you mainly need to know which defaults you get and where the levers are.
The model in one paragraph
A tenant clicks "Add credits", pays via Stripe, and gets a positive entry in their credit ledger. Each call your tool makes through {provider}-proxy.vendo.run writes a negative entry: the upstream cost (e.g. OpenAI's price for those tokens) plus a margin Vendo configures per integration. The balance is the running sum. When it hits zero, the tenant's deployments are suspended — paused, not destroyed — and a top-up resumes them. Data is preserved for 90 days; only after that does the deployment get torn down.
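The running-sum model can be sketched in a few lines. This is a toy illustration, not Vendo's ledger implementation: the `Ledger` class, its method names, and the dollar amounts are all hypothetical, and `Decimal` is used only to keep the arithmetic exact.

```python
from dataclasses import dataclass, field
from decimal import Decimal


@dataclass
class Ledger:
    """Toy append-only credit ledger; the balance is the running sum of entries."""
    entries: list[Decimal] = field(default_factory=list)

    def top_up(self, usd: str) -> None:
        # A credit purchase is recorded as a positive entry.
        self.entries.append(Decimal(usd))

    def charge(self, upstream_cost: str, margin_pct: str) -> None:
        # A proxied call is recorded as a negative entry: upstream cost plus margin.
        cost = Decimal(upstream_cost) * (1 + Decimal(margin_pct) / 100)
        self.entries.append(-cost)

    @property
    def balance(self) -> Decimal:
        return sum(self.entries, Decimal("0"))

    @property
    def suspended(self) -> bool:
        # Deployments are suspended once the balance reaches zero.
        return self.balance <= 0


ledger = Ledger()
ledger.top_up("5.00")
ledger.charge("0.50", "20")  # 0.50 upstream + 20% margin = 0.60 drained
```

After these two entries the balance is 4.40 and `ledger.suspended` is false; a further charge that brings the sum to zero flips it to true.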
The full ledger and suspension flow lives under Pricing & revenue. This page is the tool-author-facing summary.
What you declare: marketing.pricing
Your vendo.yaml has one field that toggles the wizard's payment step:
| marketing.pricing | What the wizard does | When to pick it |
|---|---|---|
| credits (default) | Asks the tenant to fund their wallet before launching | Any tool that calls the proxy |
| free | Skips the payment step | Tools that never call the proxy (pure utility, no LLM/TTS/etc) |
If your tool calls the proxy and you set pricing: "free", the wizard skips payment and the tenant will hit a 402 on the first proxy call. The flag does not exempt anyone from billing — it only controls the wizard.
pricing: "free" does not make your tool's calls free. It only suppresses the wizard's payment step. Tools that proxy any metered call must use pricing: "credits".
What costs what
Cost is per call, computed by the proxy. The formula:
cost = upstream_unit_cost × quantity × (1 + margin_pct / 100)
upstream_unit_cost comes from a rates table Vendo keeps in sync with each provider. For LLMs it's per million tokens; for TTS it's per character; for transcription it's per second.
quantity is what the upstream returned in the response (input + output tokens, characters, seconds).
margin_pct is Vendo's markup. The cascade has six levels: tenant_meter, tenant_integration, tenant, meter, integration, global (in that order); the narrowest match wins. The global default is 20%, but most calls hit a narrower row first (negotiated tenant rates or per-integration overrides), so the actual margin a tenant sees varies.
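As a worked example of the formula and the cascade, here is a minimal sketch. The function names, the dictionary-based lookup, and the rate of $2.00 per million tokens are assumptions made for illustration; the real proxy presumably matches rows on tenant, integration, and meter identifiers rather than on level names alone.

```python
# Cascade levels from narrowest to broadest, in the order listed above.
CASCADE = ["tenant_meter", "tenant_integration", "tenant", "meter", "integration", "global"]


def resolve_margin(overrides: dict[str, float]) -> float:
    """Return the margin_pct from the narrowest level that has a row."""
    for level in CASCADE:
        if level in overrides:
            return overrides[level]
    raise LookupError("no margin configured, not even a global default")


def call_cost(upstream_unit_cost: float, quantity: float, margin_pct: float) -> float:
    """cost = upstream_unit_cost * quantity * (1 + margin_pct / 100)"""
    return upstream_unit_cost * quantity * (1 + margin_pct / 100)


# Hypothetical rows: a 20% global default and a 15% per-integration override.
margins = {"global": 20.0, "integration": 15.0}
margin = resolve_margin(margins)  # the integration row is narrower than global

# Hypothetical rate: $2.00 per million tokens, 1,500 tokens in the response.
cost = call_cost(2.00 / 1_000_000, 1_500, margin)
```

With these numbers the upstream cost is $0.003 and the 15% margin brings the ledger entry to roughly $0.00345.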
You don't set rates or margin from vendo.yaml. The public pricing page at vendo.run/pricing reads live from the rates table, so what a tenant sees on the marketing site is what they'll pay.
What your tool can do at runtime
The SDK gives you read access to the tenant's balance:
import vendo

def handle_request():  # illustrative handler name
    remaining = vendo.billing.balance()  # tenant's balance as a USD float
    if remaining < 0.10:
        # Advisory only: warn before starting work the proxy would reject.
        return "Top up to continue"
This is purely advisory. The proxy is the enforcer: it will return a 402 for a metered call past zero regardless of whether your tool checks. The reason to read the balance is UX: showing a friendly banner before a long-running operation rather than letting it fail mid-stream.
vendo.billing.balance() is a Vendo-only feature; it raises VendoOnlyFeature in OSS mode. In OSS mode (no VENDO_API_KEY), calls use the tenant's own provider keys instead of going through the proxy, so there is no balance to read.
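One way to combine the advisory check with the OSS-mode behaviour is a small pre-flight helper. This is a sketch under assumptions: `preflight_banner` and its `estimated_cost` parameter are hypothetical, and the `VendoOnlyFeature` class below is a local stand-in because this page does not show where the SDK exposes the real exception. The balance reader is passed in as a callable so the helper can be exercised without the SDK.

```python
from typing import Callable, Optional


class VendoOnlyFeature(Exception):
    """Stand-in for the SDK exception; the real import path is not shown on this page."""


def preflight_banner(read_balance: Callable[[], float], estimated_cost: float) -> Optional[str]:
    """Return a banner message if the balance looks too low, else None."""
    try:
        remaining = read_balance()
    except VendoOnlyFeature:
        # OSS mode: no proxy and no balance, so there is nothing to warn about.
        return None
    if remaining < estimated_cost:
        return (
            f"Balance ${remaining:.2f} may not cover this job "
            f"(~${estimated_cost:.2f}). Top up to continue."
        )
    return None
```

In a real tool the first argument would be `vendo.billing.balance` and the `except` clause would catch the SDK's own exception. The helper only decides what to show; the proxy still enforces the limit.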
Suspension is gentle (until day 90)
When a tenant runs out of credits, their deployment goes through a state machine: running → suspending → suspended → resuming → running. The container is stopped, the database is paused (where the integration supports branch-pause), the public URL serves a status page. Everything is preserved.
A top-up via Stripe webhook auto-resumes any deployment suspended for insufficient_credits. Tenants who manually pause stay paused — Vendo doesn't override a deliberate decision.
Suspension is not infinite. The suspension-reaper worker warns the tenant at day 83 and day 89, and at day 90 transitions a still-suspended deployment through destroying → destroyed. Once destroyed, data is gone and the deployment must be reinstalled from scratch.
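The lifecycle described in this section can be written down as a transition table plus a day-based reaper rule. This is an illustrative model only: the names `TRANSITIONS`, `can_transition`, and `reaper_action` are hypothetical, and limiting the edge into destroying to the suspended state is an assumption drawn from the description above.

```python
# States and edges named in this section. Only "suspended" leads to "destroying"
# here, which is an assumption about where the reaper acts.
TRANSITIONS = {
    "running": {"suspending"},
    "suspending": {"suspended"},
    "suspended": {"resuming", "destroying"},
    "resuming": {"running"},
    "destroying": {"destroyed"},
    "destroyed": set(),
}


def can_transition(current: str, target: str) -> bool:
    """True if the model allows moving directly from current to target."""
    return target in TRANSITIONS.get(current, set())


def reaper_action(days_suspended: int) -> str:
    """What the reaper would do for a deployment suspended this many days."""
    if days_suspended >= 90:
        return "destroy"
    if days_suspended in (83, 89):
        return "warn"
    return "none"
```

Note that destroyed has no outgoing edges in this model, matching the statement that a destroyed deployment must be reinstalled from scratch.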
For your tool, this means: write your code assuming the deployment can be paused and resumed at any time, with no warning to the running container. Persist anything you care about to a database or an R2 bucket. The Build a tool section covers what is and isn't preserved across suspend cycles.
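A common way to tolerate an unannounced stop is to checkpoint progress after each unit of work and resume from the checkpoint on restart. The sketch below uses Python's built-in sqlite3 module purely as a stand-in for whatever database your tool actually uses; the table layout and function names are hypothetical.

```python
import sqlite3


def init(conn: sqlite3.Connection) -> None:
    # One row per job, holding the index of the next item to process.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS progress (job TEXT PRIMARY KEY, next_index INTEGER)"
    )


def load_checkpoint(conn: sqlite3.Connection, job: str) -> int:
    row = conn.execute(
        "SELECT next_index FROM progress WHERE job = ?", (job,)
    ).fetchone()
    return row[0] if row else 0


def process(conn: sqlite3.Connection, job: str, items: list, handle) -> None:
    """Process items in order, committing progress after each one."""
    for i in range(load_checkpoint(conn, job), len(items)):
        handle(items[i])
        conn.execute(
            "INSERT OR REPLACE INTO progress (job, next_index) VALUES (?, ?)",
            (job, i + 1),
        )
        conn.commit()
```

If the container stops between `handle` and the commit, that item is handled again after resume, so handlers written this way should be idempotent.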
Test-mode tenants and admins
Two categories of tenant bypass the credit gate: those marked test = true in the database (used internally for E2E suites) and Vendo admins. Both can deploy and call the proxy at zero balance. Your tool sees them like any other tenant — the SDK doesn't surface "test" status. If you need to gate something on it, do it at the application layer, not the SDK.
What's not covered here
Revenue share — the question of whether you, the author, see any of what tenants pay — is being designed separately and will land in Pricing & revenue. The rest of that section covers ledger semantics, rate cards, the deploy-time billing gate, and auto-reload (Vendo's opt-in mechanism for tenants who want a card on file to recharge automatically).