Observability for small teams: minimum viable monitoring
Three signals, two dashboards, one on-call rotation. How we keep production calm without a platform team.

Observability is a big topic dominated by big teams. For small operators, most of the canonical advice is either overkill or written for orgs that have a dedicated platform team. We don't.

Three signals

Latency, errors, and saturation. That's it. If you can answer those three for every critical path in your app, you have observability, regardless of whether you're using OpenTelemetry, Prometheus, or a CSV file.

Most outages are visible in the first sixty seconds, if anyone is looking.

The hard part isn't choosing tools. It's setting thresholds that page humans only when something is genuinely broken. Get this wrong and you'll either miss outages or burn out your on-call.

Two dashboards, one rotation

One overview dashboard for the whole product, one drill-down per service. One on-call person at a time. If you need more than this, you're not a small team anymore.
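To make the threshold advice above concrete, here is a minimal Python sketch of one way to evaluate the three signals for a single critical path and decide whether to page. Everything in it is an assumption made for illustration, not a description of our actual setup: the names `Window`, `Thresholds`, `breaches`, and `should_page` are hypothetical, the default threshold values are placeholders, and the "page only after several consecutive bad windows" rule is just one common way to avoid waking someone up for a brief blip.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Window:
    """One evaluation window (say, one minute) for a single critical path."""
    latencies_ms: Sequence[float]  # one entry per request, failed ones included
    errors: int                    # number of failed requests in the window
    saturation: float              # 0.0 to 1.0, e.g. CPU or connection-pool usage


@dataclass(frozen=True)
class Thresholds:
    # Placeholder values (assumptions); tune them per critical path.
    p95_latency_ms: float = 500.0
    error_rate: float = 0.02
    saturation: float = 0.90


def p95(values: Sequence[float]) -> float:
    """Nearest-rank 95th percentile; 0.0 for an empty window."""
    if not values:
        return 0.0
    ordered = sorted(values)
    rank = (95 * len(ordered) + 99) // 100  # integer ceil(0.95 * n)
    return ordered[rank - 1]


def breaches(window: Window, limits: Thresholds) -> List[str]:
    """Return the names of the signals that exceed their threshold."""
    total = len(window.latencies_ms)
    error_rate = window.errors / total if total else 0.0
    found = []
    if p95(window.latencies_ms) > limits.p95_latency_ms:
        found.append("latency")
    if error_rate > limits.error_rate:
        found.append("errors")
    if window.saturation > limits.saturation:
        found.append("saturation")
    return found


def should_page(windows: Sequence[Window], limits: Thresholds,
                sustained: int = 3) -> bool:
    """Page only if every one of the last `sustained` windows has a breach."""
    if len(windows) < sustained:
        return False
    return all(breaches(w, limits) for w in windows[-sustained:])
```

With this shape, a single bad minute shows up on the dashboard but does not page; only a breach that persists across `sustained` windows does. Raising `sustained` trades slower detection for fewer false pages, which is the same trade-off described above between missing outages and burning out on-call.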

