Azure Well-Architected × Vercel production engineering
Best practices, with receipts
Five Azure pillars, each one pointed at a real file in this repo and a CI gate that fails if the file lies. Read the table; click the path; watch the gate run. No slideware.
Full contract: platform-elite-practices.md
Pillar I
Reliability
Design for failure. Stale-while-revalidate (SWR) caches, content-addressed assets, an offline cockpit bundle, health probes that keep telling the truth when something upstream goes down.
Implementation map
Gate: verify_all · curl smoke
| Practice | Stonewall | Gate |
|---|---|---|
| Tiered CDN cache | Root vercel.json — HTML s-maxage=3600 + SWR 7d; CSS/JS immutable | Post-deploy curl -sI |
| Content-hashed assets | asset_version_stamp.py | verify_all |
| Reproducible public build | vercel_build.sh — stdlib-only; install no-op | Vercel build log |
| Cockpit degraded mode | operator-snapshot.json + web/lib/api.ts fallbacks | Runbook § bundled data |
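
The tiered-cache row can be checked mechanically from a `curl -sI` response. The following is a minimal sketch, not the repo's actual gate: it assumes the 7-day SWR window is expressed as `stale-while-revalidate=604800` in the `Cache-Control` header, and the function names are hypothetical.

```python
from typing import Dict, Optional

SEVEN_DAYS_SECONDS = 7 * 24 * 3600  # 604800; assumed encoding of "SWR 7d"


def parse_cache_control(value: str) -> Dict[str, Optional[str]]:
    """Split a Cache-Control value into {directive: argument or None}."""
    directives: Dict[str, Optional[str]] = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') if sep else None
    return directives


def html_policy_ok(value: str) -> bool:
    """True when an HTML response carries s-maxage=3600 and a 7-day SWR window."""
    d = parse_cache_control(value)
    return (
        d.get("s-maxage") == "3600"
        and d.get("stale-while-revalidate") == str(SEVEN_DAYS_SECONDS)
    )


def asset_policy_ok(value: str) -> bool:
    """True when a CSS/JS response is marked immutable."""
    return "immutable" in parse_cache_control(value)
```

A post-deploy smoke step could feed the `Cache-Control` line from each probed URL into these helpers and exit non-zero on the first `False`.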
Pillar II
Security
Secrets out of git. Export guards on the public build. Clerk on the operator boundary. HSTS, frame-deny, nosniff — on every surface, not just the one with the login.
Implementation map
Gate: gitleaks · build fail
| Practice | Stonewall | Gate |
|---|---|---|
| Secret scanning | gitleaks on PR + main | Branch protection scan |
| Public export guard | vercel_build.sh — no case_codex.md in docs/ | Build exit 1 |
| Operator auth | Clerk + STONEWALL_ALLOWED_OPERATOR_EMAILS | web/middleware.ts |
| Headers | HSTS preload, frame deny, nosniff — apex + cockpit | vercel.json · next.config.ts |
| Supply chain | SHA-pinned Actions; zizmor | verify workflow |
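
The public export guard amounts to "fail the build if a private file lands in the export directory". The table says the real guard lives in `vercel_build.sh`; the Python below is only an illustrative sketch of the same idea, with hypothetical function names and the single forbidden filename taken from the table.

```python
from pathlib import Path
from typing import Iterable, List

# Filename taken from the table above; anything else added here is an assumption.
FORBIDDEN_NAMES = frozenset({"case_codex.md"})


def find_forbidden(export_root: Path, forbidden: Iterable[str] = FORBIDDEN_NAMES) -> List[Path]:
    """Return every file under export_root whose name is on the forbidden list."""
    names = set(forbidden)
    return sorted(p for p in export_root.rglob("*") if p.is_file() and p.name in names)


def guard(export_root: Path) -> int:
    """Return a process exit code: 1 if any forbidden file would ship, else 0."""
    hits = find_forbidden(export_root)
    for hit in hits:
        print(f"export guard: {hit} must not ship in the public build")
    return 1 if hits else 0
```

Calling `sys.exit(guard(Path("docs")))` at the end of a build step would give the "Build exit 1" behavior listed in the Gate column.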
Pillar III
Performance efficiency
Defer the observability scripts past LCP. ISR on the cockpit routes. Perf invariants in CI so a regression fails the build before it reaches a phone.
Implementation map
Gate: check_web_perf_invariants
| Practice | Stonewall | Gate |
|---|---|---|
| Speed Insights | vercel-observability.js: requestIdleCallback | Canary script URL 200 |
| Self-hosted fonts | _shared/geist-fonts.min.css | Perf invariant tests |
| Cockpit TTFB | force-cache + revalidate; withShellBudget; Suspense dashboard | check_web_perf_invariants.py |
| Bundle discipline | optimizePackageImports in next.config.ts | pnpm check:bundles |
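
One invariant a perf gate could enforce is that no external script blocks rendering. The sketch below is an assumption about what such a check might look like, not a description of `check_web_perf_invariants.py`; it flags `<script src>` tags that lack `defer`, `async`, or `type="module"`.

```python
from html.parser import HTMLParser
from typing import List


class _ScriptCollector(HTMLParser):
    """Collects the src of every external script that would block parsing."""

    def __init__(self) -> None:
        super().__init__()
        self.blocking: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "script":
            return
        attr = dict(attrs)  # boolean attributes such as defer arrive with value None
        src = attr.get("src")
        if src is None:
            return  # inline scripts are out of scope for this sketch
        if attr.get("type") == "module":
            return  # module scripts are deferred by default
        if "defer" in attr or "async" in attr:
            return
        self.blocking.append(src)


def blocking_scripts(html: str) -> List[str]:
    """Return the src of each external script without defer, async, or type=module."""
    collector = _ScriptCollector()
    collector.feed(html)
    collector.close()
    return collector.blocking
```

A CI step would run this over each built HTML file and fail when the returned list is non-empty.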
Pillar IV
Operational excellence
Runbooks match what production actually does. Drift gets caught by an automated editor. Two Vercel projects share one truth file so they cannot disagree.
Implementation map
Gate: render_surface_docs --check
| Practice | Stonewall | Gate |
|---|---|---|
| Surface truth | catalog/intake/state/surfaces.toml | audit_surface_state.py |
| Doc drift watch | docs_api_drift.py daily | Rolling editor PR |
| Deploy runbooks | vercel-deploy-playbook.md, cockpit runbook | PR deploy checklist |
| Path-scoped CI | verify | GitHub Actions |
Pillar V
Cost optimization
Pay for what actually runs. The public build installs nothing — stdlib only. Doc PRs verify the corpus, not the cockpit.
Implementation map
Gate: detect_pr_change_scope
| Practice | Stonewall | Gate |
|---|---|---|
| Public build | Vercel installCommand no-op | Build minutes |
| Doc PR CI | verify_all --corpus-only | Scope detector |
| Cockpit lockfile | pnpm install --frozen-lockfile | web/vercel.json |
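
Scope detection reduces to classifying the changed paths of a PR. The following is a rough sketch, not the repo's `detect_pr_change_scope`: the rule "Markdown outside `web/` counts as a doc change" and the scope labels are assumptions chosen for illustration.

```python
from pathlib import PurePosixPath
from typing import Sequence

COCKPIT_PREFIXES = ("web/",)  # assumed location of the cockpit app
DOC_SUFFIXES = frozenset({".md"})  # assumed definition of a "doc" file


def detect_scope(changed_paths: Sequence[str]) -> str:
    """Return 'corpus-only' when every changed path is a doc, else 'full'."""
    if not changed_paths:
        return "full"  # nothing to classify: stay conservative and run everything
    for raw in changed_paths:
        is_doc = (
            PurePosixPath(raw).suffix in DOC_SUFFIXES
            and not raw.startswith(COCKPIT_PREFIXES)
        )
        if not is_doc:
            return "full"
    return "corpus-only"
```

A workflow could pass the output of `git diff --name-only` against the base branch into `detect_scope` and skip the cockpit jobs when it returns `corpus-only`.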
One line per pillar
What proves each one
| Pillar | What proves it |
|---|---|
| Reliability | Offline cockpit + SWR + content-hashed assets + health probes |
| Security | gitleaks + export guards + Clerk + HSTS + pinned CI |
| Performance | Speed Insights discipline + ISR + perf invariant script |
| Operations | surfaces.toml + drift automation + runbooks |
| Cost | Static public build + corpus-only CI + frozen lockfile |