Airflow DAG Monitoring with DeadManCheck
Airflow tells you when a DAG fails. It doesn't tell you when a DAG runs successfully but processes
nothing — zero rows exported, zero records synced, zero files moved. DeadManCheck does.
What Airflow's built-in alerting misses
Airflow's on_failure_callback fires when
a task or DAG run fails. That's necessary, but it's not sufficient. The failures that hurt most
aren't crashes — they're silent successes: DAGs that complete with exit code 0 while doing
nothing useful.
- Database query returns 0 rows — upstream data pipeline broke. Your export DAG runs, exports nothing, marks itself green.
- API token expired — your sync DAG catches the exception, logs it, exits cleanly. All tasks: success.
- DAG paused by accident — someone hit the pause button in the UI. No alerts fire. DAG silently stops running.
- Schedule drift — DAG starts late every day, 5 minutes, then 20, then an hour. Airflow doesn't alert on duration anomalies.
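The "API token expired" case above is the classic trap. Here is a minimal sketch of how a cleanly swallowed exception produces a green run; the fetch function and names are hypothetical:

```python
import logging

def sync_records(fetch):
    # fetch is a hypothetical API call that raises once the token expires.
    try:
        records = fetch()
    except Exception:
        # Logged and swallowed: the task exits with code 0, Airflow marks
        # the run a success, and zero records were synced.
        logging.exception("sync failed, continuing")
        return 0
    return len(records)
```

From Airflow's point of view this task succeeded either way; only a count-based check can tell the two outcomes apart.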
Option 1: Ping via DAG-level callback
The cleanest approach for DAGs that should complete successfully: use
on_success_callback at the DAG level.
If the DAG doesn't succeed on schedule, DeadManCheck alerts you.
# airflow/dags/my_export_dag.py
# Airflow 2.x imports. For 3.x: from airflow.sdk import DAG
# from airflow.providers.standard.operators.python import PythonOperator
import requests
from airflow import DAG
from airflow.operators.python import PythonOperator
from datetime import datetime

DEADMANCHECK_TOKEN = "your-monitor-token"

def ping_deadmancheck(context):
    # Called on DAG success; pass the row count via ?count= for output assertions
    rows = context["ti"].xcom_pull(task_ids="export_data", key="rows_exported")
    url = f"https://deadmancheck.io/ping/{DEADMANCHECK_TOKEN}"
    requests.get(url, params={"count": rows or 0}, timeout=5)

def ping_fail(context):
    requests.get(
        f"https://deadmancheck.io/ping/{DEADMANCHECK_TOKEN}/fail",
        timeout=5,
    )

with DAG(
    dag_id="daily_export",
    schedule="0 2 * * *",
    start_date=datetime(2026, 1, 1),
    catchup=False,  # avoid firing pings for backfilled past intervals
    on_success_callback=ping_deadmancheck,
    on_failure_callback=ping_fail,
) as dag:
    ...  # your tasks here
Set the monitor interval to 25 hours (slightly longer
than your schedule) so DeadManCheck alerts if the DAG doesn't complete within the expected window.
Option 2: Ping as a final task
If you want the ping to run only after all upstream work succeeds — and you want it visible in
the Airflow task graph — add it as a final PythonOperator.
import requests
from airflow.operators.python import PythonOperator

def notify():
    requests.get(f"https://deadmancheck.io/ping/{DEADMANCHECK_TOKEN}", timeout=5)

notify_success = PythonOperator(
    task_id="notify_deadmancheck",
    python_callable=notify,
    dag=dag,
)

# Or skip the custom callable with HttpOperator from the HTTP provider
# (named SimpleHttpOperator in provider versions before 4.x):
from airflow.providers.http.operators.http import HttpOperator

notify_success = HttpOperator(
    task_id="notify_deadmancheck",
    http_conn_id="deadmancheck_conn",
    endpoint=f"/ping/{DEADMANCHECK_TOKEN}",
    method="GET",
    dag=dag,
)

# Chain after your last real task:
export_data >> validate_output >> notify_success
Use Airflow's Connections UI (Admin → Connections) to store the DeadManCheck host
as deadmancheck_conn: host = deadmancheck.io, schema = https.
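If you prefer configuration over the UI, the same connection can be defined as an environment variable: Airflow resolves any AIRFLOW_CONN_<CONN_ID> variable as a connection URI. A sketch, assuming the ping endpoint lives at deadmancheck.io:

```shell
# Equivalent to the UI entry above: conn_id "deadmancheck_conn",
# scheme https, host deadmancheck.io, expressed as a connection URI.
export AIRFLOW_CONN_DEADMANCHECK_CONN='https://deadmancheck.io'
```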
Output assertions: catch DAGs that do nothing
DeadManCheck's output assertion feature is the check Airflow
itself can't provide. Pass a count with your ping and configure an alert threshold:
- Export DAG exported 0 rows → alert (even though Airflow shows "success")
- Sync DAG synced 0 records → alert
- Report generation produced 0 files → alert
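For the count to reach DeadManCheck, the export task has to publish it. A sketch of the producing side, matching the rows_exported XCom key pulled in the Option 1 success callback; the export logic itself is a stand-in:

```python
def run_export(**context):
    # Stand-in for the real export; replace with your query/upload logic.
    rows = 1234
    # Publish the count under the key the DAG-level success callback pulls:
    # xcom_pull(task_ids="export_data", key="rows_exported")
    context["ti"].xcom_push(key="rows_exported", value=rows)
    return rows
```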
No other cron monitoring tool checks this. Cronitor, Healthchecks.io, Better Stack — all of them
just verify the ping arrived. Only DeadManCheck lets you assert on the work done.
See how output assertions work →
Duration monitoring for slow DAGs
DeadManCheck tracks how long each DAG run takes and alerts when a run takes significantly
longer than the rolling average. Send a start ping before your DAG begins:
def ping_start(context):
    requests.get(
        f"https://deadmancheck.io/ping/{DEADMANCHECK_TOKEN}/start",
        timeout=5,
    )

# Airflow 2.x: attach on_execute_callback to your first task, not the DAG
export_task = PythonOperator(
    task_id="export_data",
    python_callable=run_export,
    on_execute_callback=ping_start,
)
Once you have 5+ runs of history, DeadManCheck builds a rolling average and alerts when
a DAG takes 2× longer than usual — a leading indicator of data volume spikes or upstream slowdowns.
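The threshold rule is easy to picture. A toy illustration of the behavior described above, not DeadManCheck's actual implementation:

```python
def should_alert(durations, factor=2.0, min_history=5):
    # Alert when the latest run exceeds factor x the rolling average of
    # all prior runs, once at least min_history runs of history exist.
    if len(durations) <= min_history:
        return False
    *history, latest = durations
    return latest > factor * (sum(history) / len(history))
```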
Learn more about duration monitoring →
Start monitoring free — no credit card needed
Free for 5 monitors. $12/mo for 100. Self-host for free.