Updating a tool

Shipping a new version to deployed tenants — what changes, what doesn't, and how upgrades are sequenced.

An "update" can mean four different things depending on what changed. Each maps to a different Vendo workflow:

| You changed | The operation is | Triggered by |
| --- | --- | --- |
| The image (new code, no manifest change) | Restart (with new image tag) | Cut a new release, then upgrade |
| The manifest (env vars, integrations, wizard) | Upgrade | Cut a new release, then upgrade |
| User-editable env vars (kind='user' rows) | Restart | Edit + click Restart |
| Nothing, just want a fresh boot | Restart | Dashboard button |

This page covers shipping a new version to tenants. For the publishing pipeline that produces the new version, see Versioning & releases.

Cutting a release

The full flow lives in Versioning & releases. In short: push a new manifest file to vendo-templates, write a release migration that marks the old row inactive and inserts a new active row, then run supabase db push.

After the migration applies, new deploys get the new version. Existing deploys keep running on their pinned snapshot until upgraded.
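As a rough illustration of the "mark the old row inactive, insert a new active row" pattern, here is a minimal sketch that uses an in-memory SQLite database as a stand-in for the real Supabase migration. The tool_releases table and template_version column come from this page; the tool_id and active columns are hypothetical names chosen for the example, and a real release migration would be written as SQL against your actual schema.

```python
import sqlite3


def cut_release(conn: sqlite3.Connection, tool_id: str, new_version: str) -> None:
    """Deactivate the current active release and insert the new one in one transaction."""
    with conn:  # commits on success, rolls back if either statement fails
        conn.execute(
            "UPDATE tool_releases SET active = 0 WHERE tool_id = ? AND active = 1",
            (tool_id,),
        )
        conn.execute(
            "INSERT INTO tool_releases (tool_id, template_version, active) "
            "VALUES (?, ?, 1)",
            (tool_id, new_version),
        )


# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tool_releases (tool_id TEXT, template_version TEXT, active INTEGER)"
)
conn.execute("INSERT INTO tool_releases VALUES ('my-tool', '1.0.0', 1)")
cut_release(conn, "my-tool", "1.0.1")
```

Doing both statements in one transaction means there is never a moment with zero or two active rows for the tool.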

The upgrade workflow

POST /api/deployments/[id]/upgrade triggers UpgradeWorkflow in the deploy worker. The workflow:

  1. fetch_deployment — load the current row.
  2. resolve_upgrade_env (or fetch_upgrade_image_services for image-only bumps) — re-resolve env vars so new placeholders / integration bindings are picked up.
  3. snapshot_current — write deployments.previousTemplateVersion so rollback has a target.
  4. upgrade_compute — call serviceInstanceUpdate({ source: { image } }) on each Railway service and redeploy.
  5. await_ready_* — poll the readiness endpoint.
  6. finalize — flip back to running, progress: 100. The web API writes deployments.bundle_version to the target version before firing, and deployments.manifest is updated to the new snapshot.

Pinning is via deployments.deployed_template_version (varchar) and the manifest JSONB. There is no tool_release_id column on deployments — the dashboard compares deployed_template_version against tool_releases.template_version to decide whether an update is available.

Step 4 is the load-bearing one. If you forget to bump the image tag (e.g. you re-cut 1.0.0 with new code instead of releasing 1.0.1), Railway redeploys against the same image SHA and nothing changes.
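One way to catch this before upgrading is a small pre-flight check in your own release tooling. The sketch below is an assumption about how you might structure that check, not part of Vendo; the function name and the example registry URLs are hypothetical.

```python
def check_image_bumped(current_image: str, target_image: str) -> None:
    """Raise if the target release points at the image reference already running."""
    if current_image == target_image:
        raise ValueError(
            f"target image {target_image!r} matches the running image; "
            "release a new tag (for example 1.0.1) instead of re-cutting the old one"
        )


# Passes silently: the tag changed between releases.
check_image_bumped(
    "registry.example.com/my-tool:1.0.0",
    "registry.example.com/my-tool:1.0.1",
)
```

The check only compares image references as strings, which is enough to flag a re-cut of the same tag.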

Tenant-initiated upgrade

The dashboard shows an "Update available" banner on any deployment whose deployed_template_version is older than the active release's template_version. The tenant clicks Update, the workflow runs, and on success their deployment is on the new version. Failures show in deploy_logs and the deployment goes to failed — they can retry, or roll back via POST /api/deployments/[id]/rollback.

This is the path you want by default. Tenants control when their workload updates.
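The "older than" check behind the banner can be sketched as follows. This page does not specify how Vendo compares version strings, so the sketch assumes plain dotted numeric versions such as 1.0.1; the function names are hypothetical.

```python
def parse_version(version: str) -> tuple:
    """Turn '1.10.0' into (1, 10, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def update_available(deployed_template_version: str, active_template_version: str) -> bool:
    """True when the deployment is pinned to an older version than the active release."""
    return parse_version(deployed_template_version) < parse_version(active_template_version)


print(update_available("1.9.0", "1.10.0"))  # True
print(update_available("1.0.1", "1.0.1"))   # False
```

Comparing the raw strings would get the first case wrong, because "1.9.0" sorts after "1.10.0" lexicographically.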

Author-initiated upgrade (mass rollout)

For a security patch or a bug fix that should reach every tenant:

  1. Cut the new release.
  2. Query for deployments still pinned to old versions (see Rollout & upgrades).
  3. Script the per-deployment upgrade calls. Rate-limit; you don't want every tenant of your tool redeploying simultaneously.
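A rate-limited rollout loop might look like the sketch below. It is deliberately transport-agnostic: the upgrade callable is a hypothetical wrapper you would write around POST /api/deployments/[id]/upgrade (authentication and response shape are not covered on this page), and the 30-second default delay is an arbitrary placeholder.

```python
import time
from typing import Callable, Dict, Iterable


def rollout(
    deployment_ids: Iterable[str],
    upgrade: Callable[[str], bool],
    delay_seconds: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, bool]:
    """Upgrade deployments one at a time, pausing between calls."""
    results: Dict[str, bool] = {}
    for index, deployment_id in enumerate(deployment_ids):
        if index > 0:
            sleep(delay_seconds)  # spread the redeploys out
        try:
            results[deployment_id] = upgrade(deployment_id)
        except Exception:
            # Record the failure and keep going; failed IDs can be retried later.
            results[deployment_id] = False
    return results
```

Returning a per-deployment result map makes it easy to re-run the script against only the IDs that failed.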

Automatic update path (customize-enabled tools)

A second, distinct upgrade path exists for customize-enabled tools (those that opt into per-tenant code mutation). workers/update-watcher is an hourly cron ("0 * * * *") that:

  1. Selects every apps_catalog row with enabled=true AND tool_type='deployment', and checks each for upstream HEAD changes since the last poll.
  2. For each running deployment of that tool (with auto_update_paused_at IS NULL and a deployment_repos R2 bundle present), checks cadence + weekly cap + in-flight, then INSERTs a pending_updates row and fires UpdateWorkflow in the deploy worker.
  3. UpdateWorkflow runs resolver → judge → gate. The gate either auto-applies the patch (deploy worker calls /upgrade) or pauses for user confirmation — surfaced in the dashboard as a banner the tenant decides on via POST /api/deployments/[id]/updates/[updateId]/decide.

This path is opt-in per tool (customize is not the default) and capped at 5 updates/deployment/week. The manifest-driven release upgrade path covered above never auto-applies — only the customize/update-watcher path does.
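The per-deployment eligibility checks in step 2 can be sketched roughly as below. The field names on DeploymentState are hypothetical, the weekly cap is modeled as a rolling 7-day window (an assumption; the page only says "per week"), and the cadence check is omitted because its rules are not described here.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List, Optional

WEEKLY_CAP = 5  # from this page: 5 updates per deployment per week


@dataclass
class DeploymentState:
    status: str
    auto_update_paused_at: Optional[datetime]
    has_repo_bundle: bool          # a deployment_repos R2 bundle is present
    update_in_flight: bool
    recent_update_times: List[datetime] = field(default_factory=list)


def eligible_for_auto_update(d: DeploymentState, now: datetime) -> bool:
    """Return True if a new pending update may be queued for this deployment."""
    if d.status != "running":
        return False
    if d.auto_update_paused_at is not None:
        return False
    if not d.has_repo_bundle:
        return False
    if d.update_in_flight:
        return False
    week_ago = now - timedelta(days=7)
    updates_this_week = sum(1 for t in d.recent_update_times if t > week_ago)
    return updates_this_week < WEEKLY_CAP
```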

For tools that don't opt into customize, Vendo treats running deployments as the tenant's workload and never silently changes the running version.

What survives an upgrade

| Resource | Survives upgrade |
| --- | --- |
| Neon Postgres | Yes |
| R2 bucket | Yes |
| Railway volumes | Yes |
| Cloudflare KV mappings | Yes |
| Tenant's kind='user' env var edits | Yes |
| vendo_api_key (proxy key) | Yes (same key, new image) |
| In-container filesystem outside a volume | No (image is replaced) |
| In-memory Redis without persistence | No |

If your release adds a database column or migrates schema, your container's startup must run the migration before the readiness endpoint reports healthy. The readiness check is your serialization point.
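A minimal sketch of "migrate first, then report ready" is shown below. It uses SQLite as a stand-in for your real database, and the App class, the notes table, and the archived column are all hypothetical. The point is only the ordering: the readiness handler keeps answering 503 until the migration has finished.

```python
import sqlite3


class App:
    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn
        self.ready = False

    def run_migrations(self) -> None:
        """Add the new column if it is missing; safe to run on every boot."""
        columns = {row[1] for row in self.conn.execute("PRAGMA table_info(notes)")}
        if "archived" not in columns:
            self.conn.execute(
                "ALTER TABLE notes ADD COLUMN archived INTEGER NOT NULL DEFAULT 0"
            )
            self.conn.commit()

    def start(self) -> None:
        self.run_migrations()  # must complete before readiness flips
        self.ready = True

    def readiness_status(self) -> int:
        """HTTP status the readiness endpoint would return."""
        return 200 if self.ready else 503
```

Making the migration idempotent matters because a restart or retried upgrade boots the same image again against an already-migrated schema.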

Env-var changes across versions

Two patterns:

  • Renaming a user-required env var. Tenants need to re-enter the value. Add the new key to userInputs[] and remove the old; existing deployments will pick up NULL for the new key until the tenant edits + restarts. Surface this clearly in your release notes.
  • Renaming an integration-derived env var. Avoid. The env var name comes from the provider's connectionEnvVars registry, not your manifest. If you must change it, coordinate with Vendo to update the registry; otherwise old deployments will keep the old env var name in deployment_env_vars until an upgrade re-resolves.
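To list exactly which keys tenants will need to re-enter after a rename (useful for release notes), a helper along these lines can diff the new manifest against what a deployment already has stored. This is a sketch under assumptions: it treats userInputs[] as a flat list of key names and the stored env vars as a simple mapping, which may not match the real manifest or deployment_env_vars shape.

```python
from typing import Dict, List, Optional


def inputs_needing_values(
    user_inputs: List[str], stored: Dict[str, Optional[str]]
) -> List[str]:
    """Return the user-required keys that have no stored value yet."""
    return [key for key in user_inputs if stored.get(key) is None]


# Hypothetical rename: API_TOKEN became SERVICE_TOKEN in the new manifest.
stored_values = {"API_TOKEN": "abc123", "REGION": "eu"}
new_user_inputs = ["SERVICE_TOKEN", "REGION"]
print(inputs_needing_values(new_user_inputs, stored_values))  # ['SERVICE_TOKEN']
```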

Rolling back

Cut a new release pointing at the previous manifest version, then upgrade. Old manifest versions stay in R2; old image tags stay wherever you publish them. If you garbage-collected the old image tag, rollback fails, so retain at least the most recent N tags in your registry.

Tenant-initiated rollback is exposed at POST /api/deployments/[id]/rollback. The route swaps deployments.bundleVersion ↔ deployments.previousTemplateVersion and fires the deploy worker's /upgrade endpoint with the previous version as target, so the same idempotent UpgradeWorkflow runs in reverse. The dashboard surfaces this as a button on failed/recent-update deployments.
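The version swap can be pictured with the sketch below, which models a deployment row as a plain dict. The two field names are taken from this page; the function itself is hypothetical and is not the route's actual implementation.

```python
def swap_for_rollback(row: dict) -> dict:
    """Return a copy of the row with current and previous versions exchanged."""
    previous = row.get("previousTemplateVersion")
    if previous is None:
        raise ValueError("no previous version recorded; nothing to roll back to")
    updated = dict(row)
    updated["bundleVersion"] = previous
    updated["previousTemplateVersion"] = row["bundleVersion"]
    return updated


row = {"bundleVersion": "1.0.1", "previousTemplateVersion": "1.0.0"}
print(swap_for_rollback(row))
```

Because the two values are exchanged rather than overwritten, rolling back twice returns the row to where it started.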

Restart without upgrade

Sometimes you don't need a new version — you just need a fresh boot (env var changed, transient crash loop, etc.):

  • Dashboard → Restart. Pushes every deployment_env_vars row into Railway, redeploys each service, re-runs the readiness check, flips the row back to running.
  • Atomic. Status transitions running → restarting → running. A concurrent restart returns 409.

Restart is the right tool when the image is fine but the runtime state isn't.
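One common way to get this kind of atomic transition is a conditional update that only succeeds when the row is still in the running state. The sketch below demonstrates the idea with SQLite; the deployments table layout is simplified, the function names are hypothetical, and this page does not say how Vendo implements the check internally.

```python
import sqlite3


def try_begin_restart(conn: sqlite3.Connection, deployment_id: str) -> bool:
    """Claim the restart by flipping running -> restarting in one statement."""
    with conn:  # commit on success, roll back on error
        cursor = conn.execute(
            "UPDATE deployments SET status = 'restarting' "
            "WHERE id = ? AND status = 'running'",
            (deployment_id,),
        )
    # rowcount is 1 only for the caller that won the transition.
    return cursor.rowcount == 1


def finish_restart(conn: sqlite3.Connection, deployment_id: str) -> None:
    """Flip restarting -> running once the readiness check has passed."""
    with conn:
        conn.execute(
            "UPDATE deployments SET status = 'running' "
            "WHERE id = ? AND status = 'restarting'",
            (deployment_id,),
        )
```

A caller for whom try_begin_restart returns False would answer the request with 409, matching the behavior described above.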

Next: Suspend & resume.
