SentraWatch: probe results in, alerts out
How an uptime-monitoring platform runs tiered health checks on Centrali: ingesting probe results, updating hourly rollups inline, driving the incident state machine in the same execution, and fanning out alerts to email, SMS, and customer-configured webhooks.
The Challenge
SentraWatch, a BlueInit-portfolio product, needed an uptime-monitoring backend that could schedule probes at multiple cadences, keep both raw results and long-lived hourly aggregates, run the incident lifecycle, and deliver notifications across three outbound channels (email, SMS, webhooks) with delivery-state tracking, all without operating a separate queue, cron, or notification service. Concretely, the platform had to:
- Schedule health-check probes across four interval tiers (60s, 300s, 600s, 1800s)
- Ingest probe results and update hourly rollups inline, without a separate aggregation pipeline
- Run the incident state machine — auto-create on down, auto-resolve on recovery — alongside the probes that create them
- Fan out down and recovery alerts to email, SMS, and customer-configured webhooks
- Persist outbound webhook delivery status and consecutive-failure counts back to the subscribing record
- Isolate every customer behind a multi-tenant org layer with plan-gated features
The Solution
Centrali gave SentraWatch the full ingest → store → send spine in one SDK. Scheduled triggers pick up due checks; a single function pings URLs, updates hourly rollups inline, and drives the incident state machine. Event-driven triggers on incidents fan out the outbound alerts.
Tiered probe scheduling
Four scheduled interval triggers (60s, 300s, 600s, 1800s) call the same scheduler function with an interval tier. Each run queries the health checks whose next_check_at is due, pings URLs in parallel batches, and anchors the next run time to prevent drift.
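A minimal TypeScript sketch of the drift-free anchoring, assuming a health-check row carrying the next_check_at field mentioned above; the other field names, the batch size, and the plain fetch probe are illustrative stand-ins for what the Centrali SDK would provide:

```ts
// Hypothetical shape of a health-check row; nextCheckAt mirrors the
// next_check_at column named in the text, the rest are assumptions.
interface HealthCheck {
  id: string;
  url: string;
  intervalSeconds: 60 | 300 | 600 | 1800;
  nextCheckAt: number; // epoch ms
}

// Anchor the next run to the *scheduled* slot, not to when the probe
// actually finished, so slow probes don't drift the cadence.
function nextAnchoredRun(check: HealthCheck, now: number): number {
  const interval = check.intervalSeconds * 1000;
  // Advance from the last scheduled slot by whole intervals until we
  // land strictly in the future, skipping any missed slots.
  const missed = Math.floor((now - check.nextCheckAt) / interval) + 1;
  return check.nextCheckAt + Math.max(missed, 1) * interval;
}

// Probe due checks in parallel batches of a fixed size (the batch size
// is an assumption; the case study only says "parallel batches").
async function runTier(due: HealthCheck[], now: number, batchSize = 25) {
  for (let i = 0; i < due.length; i += batchSize) {
    const batch = due.slice(i, i + batchSize);
    await Promise.allSettled(
      batch.map(async (check) => {
        const res = await fetch(check.url, { method: "GET" });
        const up = res.ok;
        check.nextCheckAt = nextAnchoredRun(check, now);
        // ...persist the result and nextCheckAt via the SDK here
        return up;
      }),
    );
  }
}
```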
Inline hourly rollups
As each probe result lands, the function upserts the hour-bucket rollup in the same execution. No separate aggregation job on a cron — 30-day and 90-day uptime math is ready on read.
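A sketch of the hour-bucket upsert, shown here against an in-memory map for self-containment; field names beyond the hour bucket itself are assumptions, and the real write would go through the Centrali SDK in the same execution:

```ts
// Minimal stand-in for the hourly rollup row.
interface HourlyRollup {
  checkId: string;
  hourStart: number; // epoch ms, truncated to the hour
  total: number;
  failures: number;
  totalLatencyMs: number;
}

const HOUR_MS = 3_600_000;

function hourBucket(timestamp: number): number {
  return Math.floor(timestamp / HOUR_MS) * HOUR_MS;
}

// Upsert keyed on (checkId, hourStart), applied as each result lands.
function applyProbeResult(
  rollups: Map<string, HourlyRollup>,
  checkId: string,
  timestamp: number,
  up: boolean,
  latencyMs: number,
) {
  const hourStart = hourBucket(timestamp);
  const key = `${checkId}:${hourStart}`;
  const row = rollups.get(key) ?? {
    checkId, hourStart, total: 0, failures: 0, totalLatencyMs: 0,
  };
  row.total += 1;
  if (!up) row.failures += 1;
  row.totalLatencyMs += latencyMs;
  rollups.set(key, row);
}

// 30-day uptime is then a read-time sum over 720 hourly rows.
function uptimePercent(rows: HourlyRollup[]): number {
  const total = rows.reduce((n, r) => n + r.total, 0);
  const failures = rows.reduce((n, r) => n + r.failures, 0);
  return total === 0 ? 100 : (100 * (total - failures)) / total;
}
```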
Incident state machine next to the data
The same function that records probe results auto-creates an incident when a check goes down and auto-resolves it on recovery, computing downtime duration from started_at. One execution, end to end.
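The transition logic is small enough to show in full. A hypothetical sketch: only started_at is taken from the text above, and the rest of the incident shape is assumed:

```ts
type IncidentStatus = "open" | "resolved";

interface Incident {
  checkId: string;
  status: IncidentStatus;
  startedAt: number; // maps to the started_at column above
  resolvedAt?: number;
  downtimeMs?: number;
}

// Run in the same execution that stored the probe result:
// auto-create on down if no incident is open, auto-resolve on recovery.
function advanceIncident(
  open: Incident | undefined,
  checkId: string,
  up: boolean,
  now: number,
): Incident | undefined {
  if (!up && !open) {
    return { checkId, status: "open", startedAt: now }; // auto-create
  }
  if (up && open) {
    return {
      ...open,
      status: "resolved",
      resolvedAt: now,
      downtimeMs: now - open.startedAt, // duration from started_at
    }; // auto-resolve
  }
  return open; // no transition: still up, or still down
}
```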
Outbound alerts — email, SMS, customer webhooks
Event-driven triggers on incidents (created and updated) fire the alert function. Email via Azure Communication Services, SMS via Twilio (paid plans), and customer-configured webhooks with Slack/Discord-compatible payloads. Webhook delivery status and consecutive failures persist back to the subscribing app record.
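A sketch of the customer-webhook leg only (the email and SMS legs go through Azure Communication Services and Twilio and are not shown). The subscription record assumes the delivery-status and consecutive-failure fields described above; the top-level text field is the one Slack incoming webhooks accept, and everything else is illustrative:

```ts
// Assumed shape of a customer webhook subscription record.
interface WebhookSubscription {
  url: string;
  lastDeliveryStatus?: number;
  consecutiveFailures: number;
}

interface IncidentEvent {
  checkName: string;
  kind: "down" | "recovered";
}

async function deliverWebhook(sub: WebhookSubscription, event: IncidentEvent) {
  const text =
    event.kind === "down"
      ? `:red_circle: ${event.checkName} is down`
      : `:large_green_circle: ${event.checkName} recovered`;
  try {
    const res = await fetch(sub.url, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ text }), // Slack-compatible payload
    });
    sub.lastDeliveryStatus = res.status;
    sub.consecutiveFailures = res.ok ? 0 : sub.consecutiveFailures + 1;
  } catch {
    sub.consecutiveFailures += 1; // network error counts as a failure
  }
  // ...persist sub back to the subscribing app record via the SDK
}
```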
Multi-tenant orgs with plan gating
Clerk Organizations mirrored into organizations and org-members structures for server-side querying. Stripe billing lives on the org; plan-gated features (SMS alerts, read-only API keys) check the org's plan at request time.
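A request-time plan gate might look like the following sketch; the plan names and the feature matrix beyond the two gated features named above are assumptions:

```ts
type Plan = "free" | "pro" | "business";

// Hypothetical feature matrix; only sms_alerts and readonly_api_keys
// are named in the case study.
const PLAN_FEATURES: Record<Plan, ReadonlySet<string>> = {
  free: new Set<string>(),
  pro: new Set<string>(["sms_alerts", "readonly_api_keys"]),
  business: new Set<string>(["sms_alerts", "readonly_api_keys"]),
};

interface Org {
  id: string;
  plan: Plan; // synced from Stripe billing on the org
}

// Checked at request time, before sending an SMS or minting a key.
function requireFeature(org: Org, feature: string): void {
  if (!PLAN_FEATURES[org.plan].has(feature)) {
    throw new Error(`Plan '${org.plan}' does not include '${feature}'`);
  }
}
```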
Self-maintaining retention
Daily cron triggers clean up raw probe results (7-day retention) and audit logs (90-day retention). Rollups and incidents stay forever — small and queryable from the same SDK.
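The retention rule itself reduces to a cutoff filter. A sketch, assuming a createdAt timestamp on each raw row; the daily cron trigger wiring and the SDK delete call are not shown:

```ts
const DAY_MS = 86_400_000;

interface Timestamped {
  createdAt: number; // epoch ms; assumed field name
}

// Keep only rows newer than the retention window.
function pruneOlderThan<T extends Timestamped>(
  rows: T[],
  retentionDays: number,
  now: number,
): T[] {
  const cutoff = now - retentionDays * DAY_MS;
  return rows.filter((r) => r.createdAt >= cutoff);
}

// Daily cron: 7-day retention for probe results, 90 for audit logs.
// probeResults = pruneOlderThan(probeResults, 7, Date.now());
// auditLogs    = pruneOlderThan(auditLogs, 90, Date.now());
```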
Architecture
SentraWatch runs its entire backend on Centrali. A separate Next.js app powers the dashboard and customer status pages, reading from Centrali via the SDK. Clerk handles auth; Stripe handles billing.
Results
4 scheduled interval tiers driving the probe loop
1 function handling probes, rollups, and incident state
3 outbound channels: email, SMS, and customer webhooks
0 separate aggregation pipelines, queues, or schedulers
Build on the backend for webhooks in and out
Ingest third-party events, store them as data, and send your own — from one SDK.