Running Mission Control in a team or production environment requires additional attention to security, reliability, backups, and observability. This lesson covers the key hardening steps.
Rotate bearer tokens regularly — Treat bearer tokens like passwords. Rotate them on a regular schedule (monthly at minimum) and immediately when a team member with access leaves.
Validate HMAC signatures — Webhook HMAC verification is enabled by default and should never be disabled. Ensure your webhook secret is long (32+ characters) and randomly generated.
Use TLS — If Mission Control is reachable over any network, put it behind a reverse proxy (nginx, Caddy) that terminates TLS. Never transmit bearer tokens over plain HTTP.
Restrict network access — Use firewall rules or Tailscale ACLs to limit which machines can reach the Mission Control port. The dashboard should not be publicly accessible on the internet.
Audit API key storage — Agent API keys are stored encrypted in SQLite. Ensure the database file has restricted filesystem permissions (chmod 600 mission-control.db).
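As a rough illustration of the secret-generation and file-permission items above, the sketch below reads 32 random bytes from /dev/urandom and locks a file down to its owner. The function names generate_secret and lock_down are hypothetical helpers, not part of Mission Control, and how a new token or secret is actually registered with Mission Control is not covered here.

```shell
# Print a 64-character hex string built from 32 bytes of /dev/urandom.
generate_secret() {
  head -c 32 /dev/urandom | od -An -v -tx1 | tr -d ' \n'
}

# Restrict a file to owner read/write, then print its permission string.
lock_down() {
  chmod 600 "$1" && ls -l "$1" | cut -c1-10
}

# Example usage (file names are placeholders):
#   generate_secret > webhook-secret.txt && lock_down webhook-secret.txt
#   lock_down mission-control.db
```

A 64-character hex value comfortably exceeds the 32-character minimum suggested for the webhook secret, and the same helper can produce replacement bearer tokens at rotation time.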
Health endpoint — Mission Control exposes GET /api/health which returns a simple 200 OK with database connectivity status. Monitor this endpoint with your uptime monitoring tool.
Log aggregation — Pipe Mission Control’s stdout to a log aggregation service. Activity events are valuable for post-incident analysis and should be retained for at least 30 days.
Alert on failure rates — Set up alerts if the task failure rate exceeds a threshold (e.g., >10% failures in a 15-minute window). Sudden spikes usually indicate an agent configuration problem or API key expiration.
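The monitoring items above could be wired into a cron job or uptime script along these lines. This is a minimal sketch: check_health and failure_rate_exceeded are hypothetical helper names, the base URL is whatever address your deployment listens on, and the failed/total task counts are assumed to come from your own logs or metrics, since the source does not say how Mission Control exposes them.

```shell
# Fails on connection errors, timeouts (5 s), and HTTP 4xx/5xx responses.
check_health() {
  curl -fsS -o /dev/null --max-time 5 "$1/api/health"
}

# Exit 0 when failed/total is strictly above the threshold percentage
# (default 10). Integer math avoids needing bc or awk.
failure_rate_exceeded() {
  failed=$1; total=$2; threshold=${3:-10}
  [ "$total" -gt 0 ] && [ $((failed * 100)) -gt $((threshold * total)) ]
}

# Example usage (URL and counts are placeholders):
#   check_health "http://localhost:3000" || echo "health check failed"
#   failure_rate_exceeded 4 25 10 && echo "task failure rate above 10%"
```

The comparison is strict, so exactly 10% failures does not trigger an alert, which matches the ">10%" wording of the example threshold.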
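SQLite backup strategy — SQLite’s single-file format makes backups straightforward. Schedule a daily backup with the sqlite3 CLI’s .backup command, which produces a consistent snapshot even while Mission Control is writing to the database; a plain cp of a live database can capture a half-written file: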
# Safe online backup using SQLite's backup API
sqlite3 mission-control.db ".backup /backups/mc-$(date +%Y%m%d).db"
Store backups off-machine. Retain at least 14 days of daily backups.
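One way to combine the daily backup with the 14-day retention window is sketched below. The helper names backup_db and prune_backups are hypothetical, the mc-YYYYMMDD.db naming simply follows the command above, and copying the result off-machine (rsync, object storage, and so on) is left out because the source does not name a destination.

```shell
# Take an online backup, verify it, and print the backup path on success.
backup_db() {
  db=$1; dir=$2
  out="$dir/mc-$(date +%Y%m%d).db"
  sqlite3 "$db" ".backup '$out'" &&
    [ "$(sqlite3 "$out" 'PRAGMA integrity_check;')" = "ok" ] &&
    printf '%s\n' "$out"
}

# Delete mc-*.db backups last modified more than N days ago (default 14).
prune_backups() {
  find "$1" -type f -name 'mc-*.db' -mtime +"${2:-14}" -exec rm -f {} +
}

# Example usage (paths are placeholders):
#   backup_db mission-control.db /backups && prune_backups /backups 14
```

Running the integrity check on the copy rather than the live database means a corrupt backup is caught at creation time instead of during a restore.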
Before restarting Mission Control, check for in-progress tasks via the API. If tasks are running, either wait for them to complete or notify the relevant agents that the coordinator is going offline. In-flight SSE connections will drop on restart; clients reconnect automatically.
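The pre-restart check can be scripted as a polling loop. In the sketch below, wait_until_idle is a generic helper that is independent of Mission Control, while count_running_tasks is purely illustrative: the /api/tasks?status=in_progress path, the JSON-array response shape, and the MC_URL and MC_TOKEN variables are all assumptions, so substitute whatever your installation's API actually provides.

```shell
# HYPOTHETICAL: endpoint, query parameter, and response shape are assumed.
# Prints the number of in-progress tasks.
count_running_tasks() {
  curl -fsS --max-time 5 -H "Authorization: Bearer $MC_TOKEN" \
    "$MC_URL/api/tasks?status=in_progress" | jq 'length'
}

# Run command "$1" (which must print a count) up to "$2" times, sleeping
# "$3" seconds between tries. Exit 0 once it prints 0, 1 if tasks are
# still running after the last try, 2 if the command itself fails.
wait_until_idle() {
  counter=$1; attempts=${2:-30}; delay=${3:-10}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    n=$("$counter") || return 2
    [ "$n" -eq 0 ] && return 0
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Example usage:
#   wait_until_idle count_running_tasks 30 10 && echo "safe to restart"
```

Passing the counting command as an argument keeps the loop usable even if the real API differs from the assumed one.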
Keep Mission Control’s dependencies up to date, particularly the Next.js version and any packages with security advisories. Run npm audit regularly and address high-severity issues promptly.
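To make the audit a routine gate rather than a manual step, something like the following could run in CI or a scheduled job. The audit_gate name and the checkout path are hypothetical; npm audit's --audit-level=high flag makes the command exit non-zero only when high or critical advisories are present.

```shell
# Exit non-zero if the project in "$1" has high or critical advisories
# (or if the directory cannot be entered).
audit_gate() {
  ( cd "$1" && npm audit --audit-level=high )
}

# Example usage (path is a placeholder):
#   audit_gate /srv/mission-control || echo "high-severity advisories found"
```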