Find out your backups are empty before you need them, not during the incident
Verifies your backups are actually restorable — checks completeness, freshness, and integrity. Runs automated restore tests to catch the silent failures that make backup confidence an illusion.
Create a skill called "Backup Prober". Verify backup health across my infrastructure:

1. Inventory all backup systems:
   - AWS RDS snapshots: `aws rds describe-db-snapshots`
   - AWS Backup: `aws backup list-backup-jobs`
   - Kubernetes Velero: `velero backup get`
   - Any custom backup scripts (check cron logs)
2. For each backup system, verify:
   - Last successful backup time (is it within the expected RPO?)
   - Backup size (is it reasonable? Has it changed dramatically?)
   - Backup completeness (are all expected databases/volumes included?)
3. If possible, run a restore test:
   - Restore to a test instance
   - Run basic data verification (row counts, key table spot checks)
   - Clean up test resources after verification
4. Generate a DR readiness report:
   - Actual RPO (time since last good backup)
   - Estimated RTO (based on backup size and restore process)
   - Gaps: what's not being backed up that should be

Flag any backup system outside its expected RPO as CRITICAL. If I don't provide an RPO target, default to 24 hours.
During its 2017 database outage, GitLab discovered that none of its five
backup and replication methods was working reliably. This skill guards
against the same surprise by verifying backup health on a regular schedule.
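The freshness rule from the prompt (flag anything outside its RPO as CRITICAL, defaulting to 24 hours) can be sketched in a few lines. This is a minimal illustration, not the skill's actual implementation: the names `classify_backup` and `DEFAULT_RPO` are hypothetical, and the timestamp of the last successful backup is assumed to have been collected already from whichever backup system is being checked.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Default taken from the prompt: 24 hours when no RPO target is provided.
DEFAULT_RPO = timedelta(hours=24)


def classify_backup(
    last_success: Optional[datetime],
    now: datetime,
    rpo: timedelta = DEFAULT_RPO,
) -> str:
    """Return "CRITICAL" if there is no good backup or it is older than the RPO."""
    if last_success is None:
        # No successful backup on record is the worst case, not an unknown.
        return "CRITICAL"
    age = now - last_success  # this age is the "actual RPO" in the report
    return "CRITICAL" if age > rpo else "OK"


# Fixed timestamps keep the example deterministic.
now = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
print(classify_backup(now - timedelta(hours=3), now))   # OK
print(classify_backup(now - timedelta(hours=30), now))  # CRITICAL
```

Treating a missing timestamp as CRITICAL is a deliberate choice here: a backup system that reports nothing is the silent failure this skill is meant to catch.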
Document and monitor the cron jobs nobody understands
Audits all cron jobs on a server, documents what each one does, wraps them with monitoring and alerting, and migrates critical ones to a proper scheduler if needed. No more "30 cron jobs and nobody knows what half of them do."
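An audit like this usually starts with a machine-readable inventory. The sketch below parses crontab text into schedule/command pairs so each job can then be documented and wrapped. It assumes the per-user crontab format (five time fields followed by the command, or an `@` shortcut such as `@reboot`); system files like `/etc/crontab` carry an extra user column that this does not handle. `CronEntry` and `parse_crontab` are illustrative names.

```python
import re
from typing import List, NamedTuple


class CronEntry(NamedTuple):
    schedule: str
    command: str


# Environment assignments such as MAILTO=ops@example.com are not jobs.
ENV_LINE = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*\s*=")


def parse_crontab(text: str) -> List[CronEntry]:
    """Extract (schedule, command) pairs from user-crontab text."""
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or ENV_LINE.match(line):
            continue  # skip blanks, comments, and environment settings
        if line.startswith("@"):
            # Shortcut form: "@daily /path/to/job"
            parts = line.split(None, 1)
            if len(parts) == 2:
                entries.append(CronEntry(parts[0], parts[1]))
            continue
        # Standard form: minute hour day-of-month month day-of-week command
        parts = line.split(None, 5)
        if len(parts) == 6:
            entries.append(CronEntry(" ".join(parts[:5]), parts[5]))
    return entries


sample = """\
# nightly backup
MAILTO=ops@example.com
0 2 * * * /usr/local/bin/backup.sh --full
@reboot /opt/agent/start
"""
for entry in parse_crontab(sample):
    print(entry.schedule, "->", entry.command)
```

Lines that do not fit either form are dropped silently in this sketch; a real audit would likely want to report them, since malformed entries are often the jobs nobody understands.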
Turn 500 daily alerts into the 5 that actually matter
Analyzes your alerting rules and history to find the noise — duplicate alerts, overly sensitive thresholds, alerts that never lead to action. Generates tuned configurations that keep the signal and kill the spam.
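One way to find "alerts that never lead to action" is to compare how often each rule fires with how often someone acts on it. The sketch below assumes the alert history has already been exported as `(rule_name, acted_on)` pairs; that shape, the function name `noisy_rules`, and both thresholds are hypothetical and would need adapting to whatever the alerting system actually records.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def noisy_rules(
    history: Iterable[Tuple[str, bool]],
    min_fires: int = 10,
    max_action_rate: float = 0.05,
) -> List[str]:
    """Return rules that fire often but are almost never acted on."""
    fires: Counter = Counter()
    actions: Counter = Counter()
    for rule, acted_on in history:
        fires[rule] += 1
        if acted_on:
            actions[rule] += 1
    flagged = [
        rule
        for rule, count in fires.items()
        if count >= min_fires and actions[rule] / count <= max_action_rate
    ]
    # Noisiest first; ties broken alphabetically for stable output.
    return sorted(flagged, key=lambda rule: (-fires[rule], rule))


history = (
    [("DiskAt80Pct", False)] * 40      # fires constantly, never acted on
    + [("ApiErrorRate", True)] * 3     # rare and always actionable
    + [("CpuSpike", False)] * 12
    + [("CpuSpike", True)] * 6         # noisy, but acted on a third of the time
)
print(noisy_rules(history))  # ['DiskAt80Pct']
```

The `min_fires` floor keeps rare alerts out of the result: a rule that fired twice without action has too little history to call noise.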
Stop audio drift by quarantining variable-frame-rate clips at ingest
Audio slowly drifts out of sync or randomly desyncs in your timeline when footage is variable frame rate — common with iPhone footage, screen recordings, and some OBS workflows. This recipe catches VFR clips at ingest, transcodes them to constant frame rate, and quarantines the originals so drift never reaches your edit.
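A common heuristic for spotting variable-frame-rate clips is to compare the two frame-rate fields that `ffprobe` reports per video stream, `r_frame_rate` and `avg_frame_rate` (for example via `ffprobe -v error -select_streams v:0 -show_entries stream=r_frame_rate,avg_frame_rate -of json clip.mp4`). The sketch below assumes those two strings have already been read. It is a heuristic rather than a definitive test, the use of `ffprobe` is an assumption not stated in the recipe, and the 1% tolerance and the names `parse_rate` and `looks_vfr` are illustrative.

```python
from fractions import Fraction


def parse_rate(rate: str) -> Fraction:
    """Parse an ffprobe-style rate such as "30000/1001"; "0/0" becomes 0."""
    num, _, den = rate.partition("/")
    if not den:
        return Fraction(num)
    if int(den) == 0:
        return Fraction(0)  # ffprobe reports 0/0 when the rate is unknown
    return Fraction(int(num), int(den))


def looks_vfr(r_frame_rate: str, avg_frame_rate: str, tolerance: float = 0.01) -> bool:
    """Flag a clip when its nominal and average frame rates disagree."""
    nominal = parse_rate(r_frame_rate)
    average = parse_rate(avg_frame_rate)
    if nominal == 0 or average == 0:
        return True  # unknown rate: quarantine for manual review
    return abs(nominal - average) / nominal > tolerance


print(looks_vfr("30000/1001", "30000/1001"))  # False
print(looks_vfr("60/1", "17234/579"))         # True
```

Clips flagged this way would go to the transcode-and-quarantine step; constant-rate clips with small rounding differences stay under the tolerance and pass through.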
Catch the "stuck at 99%" class of export failures before they happen
Exports hang or fail late in the process — often near completion — due to insufficient free space, problematic clips, or unstable settings. This recipe checks disk space, validates export targets, and provides a fallback render path before you waste an hour waiting for a doomed export.
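The disk-space and target checks can be sketched as a small preflight function. This assumes an estimated output size is available from the export settings; the 1.5x safety factor and the names `has_headroom` and `preflight` are illustrative choices, not values taken from any particular editing application.

```python
import os
import shutil
from typing import List


def has_headroom(free_bytes: int, estimated_bytes: int, safety_factor: float = 1.5) -> bool:
    """True when free space covers the estimate plus a safety margin."""
    return free_bytes >= estimated_bytes * safety_factor


def preflight(target_dir: str, estimated_bytes: int, safety_factor: float = 1.5) -> List[str]:
    """Return a list of problems; an empty list means the export can start."""
    if not os.path.isdir(target_dir):
        return [f"target directory does not exist: {target_dir}"]
    problems = []
    if not os.access(target_dir, os.W_OK):
        problems.append(f"target directory is not writable: {target_dir}")
    free = shutil.disk_usage(target_dir).free
    if not has_headroom(free, estimated_bytes, safety_factor):
        needed = int(estimated_bytes * safety_factor)
        problems.append(f"insufficient free space: {free} bytes free, {needed} needed")
    return problems


print(has_headroom(100, 50))  # True
print(has_headroom(100, 80))  # False
```

Returning every problem at once, rather than stopping at the first, lets the recipe decide whether to abort or switch to the fallback render path before any rendering time is spent.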