Airflow DAG Monitoring with DeadManCheck
Airflow tells you when a DAG fails. It doesn't tell you when a DAG runs successfully but processes
nothing — zero rows exported, zero records synced, zero files moved. DeadManCheck does.
What Airflow's built-in alerting misses
Airflow's on_failure_callback fires when
a task or DAG run fails. That's necessary, but it's not sufficient. The failures that hurt most
aren't crashes — they're silent successes: DAGs that complete with exit code 0 while doing
nothing useful.
- Database query returns 0 rows — upstream data pipeline broke. Your export DAG runs, exports nothing, marks itself green.
- API token expired — your sync DAG catches the exception, logs it, exits cleanly. All tasks: success.
- DAG paused by accident — someone hit the pause button in the UI. No alerts fire. DAG silently stops running.
- Schedule drift — DAG starts late every day, 5 minutes, then 20, then an hour. Airflow doesn't alert on duration anomalies.
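The "API token expired" case above is the classic trap. Here is a minimal sketch of how a cleanly swallowed exception produces a green run; the fetch function and names are hypothetical:

```python
import logging

def sync_records(fetch):
    # fetch is a hypothetical API call that raises once the token expires.
    try:
        records = fetch()
    except Exception:
        # Logged and swallowed: the task exits with code 0, Airflow marks
        # the run a success, and zero records were synced.
        logging.exception("sync failed, continuing")
        return 0
    return len(records)
```

From Airflow's point of view this task succeeded either way; only a count-based check can tell the two outcomes apart.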
Option 1: Ping via DAG-level callback
The cleanest approach for DAGs that should complete successfully: use
on_success_callback at the DAG level.
If the DAG doesn't succeed on schedule, DeadManCheck alerts you.
# airflow/dags/my_export_dag.py
# Airflow 2.x imports. For 3.x: from airflow.sdk import DAG
# from airflow.providers.standard.operators.python import PythonOperator
import requests
from airflow import DAG
from airflow.operators.python import PythonOperator
from datetime import datetime

DEADMANCHECK_TOKEN = "your-monitor-token"

def ping_deadmancheck(context):
    # Called on DAG success; pass the row count via ?count= for output assertions
    rows = context["ti"].xcom_pull(task_ids="export_data", key="rows_exported")
    url = f"https://deadmancheck.io/ping/{DEADMANCHECK_TOKEN}"
    requests.get(url, params={"count": rows or 0}, timeout=5)

def ping_fail(context):
    requests.get(
        f"https://deadmancheck.io/ping/{DEADMANCHECK_TOKEN}/fail",
        timeout=5,
    )

with DAG(
    dag_id="daily_export",
    schedule="0 2 * * *",
    start_date=datetime(2026, 1, 1),
    catchup=False,  # avoid firing pings for backfilled past intervals
    on_success_callback=ping_deadmancheck,
    on_failure_callback=ping_fail,
) as dag:
    ...  # your tasks here
Set the monitor interval to 25 hours (slightly longer
than your schedule) so DeadManCheck alerts if the DAG doesn't complete within the expected window.
Option 2: Ping as a final task
If you want the ping to run only after all upstream work succeeds — and you want it visible in
the Airflow task graph — add it as a final PythonOperator.
import requests
from airflow.operators.python import PythonOperator

def notify():
    requests.get(f"https://deadmancheck.io/ping/{DEADMANCHECK_TOKEN}", timeout=5)

notify_success = PythonOperator(
    task_id="notify_deadmancheck",
    python_callable=notify,
    dag=dag,
)

# Or skip the custom callable with HttpOperator from the HTTP provider
# (named SimpleHttpOperator in provider versions before 4.x):
from airflow.providers.http.operators.http import HttpOperator

notify_success = HttpOperator(
    task_id="notify_deadmancheck",
    http_conn_id="deadmancheck_conn",
    endpoint=f"/ping/{DEADMANCHECK_TOKEN}",
    method="GET",
    dag=dag,
)

# Chain after your last real task:
export_data >> validate_output >> notify_success
Use Airflow's Connections UI (Admin → Connections) to store the DeadManCheck host
as deadmancheck_conn: host = deadmancheck.io, schema = https.
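If you prefer configuration over the UI, the same connection can be defined as an environment variable: Airflow resolves any AIRFLOW_CONN_<CONN_ID> variable as a connection URI. A sketch, assuming the ping endpoint lives at deadmancheck.io:

```shell
# Equivalent to the UI entry above: conn_id "deadmancheck_conn",
# scheme https, host deadmancheck.io, expressed as a connection URI.
export AIRFLOW_CONN_DEADMANCHECK_CONN='https://deadmancheck.io'
```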
Output assertions: catch DAGs that do nothing
DeadManCheck's output assertion feature is the check Airflow
itself can't provide. Pass a count with your ping and configure an alert threshold:
- Export DAG exported 0 rows → alert (even though Airflow shows "success")
- Sync DAG synced 0 records → alert
- Report generation produced 0 files → alert
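For the count to reach DeadManCheck, the export task has to publish it. A sketch of the producing side, matching the rows_exported XCom key pulled in the Option 1 success callback; the export logic itself is a stand-in:

```python
def run_export(**context):
    # Stand-in for the real export; replace with your query/upload logic.
    rows = 1234
    # Publish the count under the key the DAG-level success callback pulls:
    # xcom_pull(task_ids="export_data", key="rows_exported")
    context["ti"].xcom_push(key="rows_exported", value=rows)
    return rows
```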
No other cron monitoring tool checks this. Cronitor, Healthchecks.io, Better Stack — all of them
just verify the ping arrived. Only DeadManCheck lets you assert on the work done.
See how output assertions work →
Duration monitoring for slow DAGs
DeadManCheck tracks how long each DAG run takes and alerts when a run takes significantly
longer than the rolling average. Send a start ping before your DAG begins:
def ping_start(context):
    requests.get(
        f"https://deadmancheck.io/ping/{DEADMANCHECK_TOKEN}/start",
        timeout=5,
    )

# Airflow 2.x: attach on_execute_callback to your first task, not the DAG
export_task = PythonOperator(
    task_id="export_data",
    python_callable=run_export,
    on_execute_callback=ping_start,
)
Once you have 5+ runs of history, DeadManCheck builds a rolling average and alerts when
a DAG takes 2× longer than usual — a leading indicator of data volume spikes or upstream slowdowns.
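The threshold rule is easy to picture. A toy illustration of the behavior described above, not DeadManCheck's actual implementation:

```python
def should_alert(durations, factor=2.0, min_history=5):
    # Alert when the latest run exceeds factor x the rolling average of
    # all prior runs, once at least min_history runs of history exist.
    if len(durations) <= min_history:
        return False
    *history, latest = durations
    return latest > factor * (sum(history) / len(history))
```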
Learn more about duration monitoring →
Start monitoring free — no credit card needed
Free for 5 monitors. $12/mo for 100. Self-host for free.