Tags: airflow, rest-api, jwt, automation, rbac

Airflow 3 REST API & Remote Schedule Control

From JWT auth to triggering, pausing, and polling DAGs to completion — practical patterns for controlling Airflow 3 remotely from external systems.

Data Dynamics | July 3, 2026 | 10 min read

This is Part 9 of the Airflow 3 in Practice series. In the previous part, Integrating External Systems & Calling Sinks, we covered calling out to external systems from within Airflow. This time we flip the direction and talk about external systems controlling Airflow remotely. The next part is Monitoring & Operations.

Why the REST API Matters Again

Back in the Airflow 2 days there were two REST APIs. One was /api/experimental, which was unstable and had weak authentication; the other was the stable API introduced in 2.0. The overlap confused people, and some teams ran the experimental API in production and had incidents as a result.

Airflow 3 cleared up this confusion. The old experimental API was removed entirely, and the API server (the component that replaces the 2.x webserver) now serves a versioned, stable REST API alongside the UI. Authentication is unified around JWT tokens. In other words, the scenario of "controlling Airflow from the outside" became a first-class citizen for the first time.

The only channel through which external systems call Airflow is now the API server's stable REST API.

Why does this matter in practice? Data pipelines don't run in isolation. A "regenerate report" button on an internal portal, a webhook from an external SaaS, another orchestrator, even a Slack bot a human uses — all of these need to be able to say "run this specific DAG now." The REST API is that entry point.

The JWT Authentication Flow

Airflow 3's API requires a JWT (JSON Web Token) on nearly every call. The flow is simple: first you obtain a token using your credentials, then you send it in an Authorization: Bearer <token> header on every subsequent request. Tokens expire, so you request a new one when the current one runs out.

The exact endpoint path for obtaining a token and the credential method depend on the auth manager configured in your deployment (basic auth, OAuth, external IdP integration, etc.). That's why we don't assert a specific path here. For the actual path and payload, check the official REST API reference for the version you're running. The core principle is the same across every version: get a token first, then call with the Bearer header.
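
Because the flow is the same whichever auth manager you use, the reissue-on-expiry step can be sketched without committing to an endpoint. In the sketch below, fetch_token is a hypothetical callable you supply (it performs whatever token request your deployment expects), and TokenCache and jwt_expiry are illustrative names, not part of Airflow. It assumes the token is a standard JWT carrying an exp claim, and it reads that claim without verifying the signature, which is only appropriate for deciding when to refresh.

```python
import base64
import json
import time


def jwt_expiry(token: str) -> float:
    """Return the `exp` claim (Unix seconds) from a JWT, without verifying it."""
    payload_b64 = token.split(".")[1]
    # JWT segments are base64url without padding; restore it before decoding
    payload_b64 += "=" * (-len(payload_b64) % 4)
    payload = json.loads(base64.urlsafe_b64decode(payload_b64))
    return float(payload["exp"])


class TokenCache:
    """Reuse a token until shortly before it expires, then fetch a new one."""

    def __init__(self, fetch_token, skew: float = 30.0, clock=time.time):
        self._fetch_token = fetch_token  # hypothetical: performs the token request
        self._skew = skew                # refresh this many seconds early
        self._clock = clock
        self._token = None
        self._expires_at = 0.0

    def get(self) -> str:
        if self._token is None or self._clock() >= self._expires_at - self._skew:
            self._token = self._fetch_token()
            self._expires_at = jwt_expiry(self._token)
        return self._token
```

A client would call cache.get() before each request and build the Authorization header from the result, so a long-lived process never sends a stale token.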

Here's the skeleton of a client that takes an already-issued token and attaches it to every call.

import requests

class AirflowClient:
    """Thin wrapper that sends an already-issued JWT on every request."""

    def __init__(self, base_url: str, token: str):
        self.base_url = base_url.rstrip("/")
        # Attach the token as a shared header on every call
        self.session = requests.Session()
        self.session.headers.update({
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        })

    def get(self, path: str, **kw):
        kw.setdefault("timeout", 30)  # default timeout; callers may override it
        r = self.session.get(f"{self.base_url}{path}", **kw)
        r.raise_for_status()  # surface 4xx/5xx responses as exceptions
        return r.json()

    def post(self, path: str, **kw):
        kw.setdefault("timeout", 30)
        r = self.session.post(f"{self.base_url}{path}", **kw)
        r.raise_for_status()
        return r.json()

A token is like a password. Don't hardcode it in source, and keep it out of logs and URL query strings; inject it from an environment variable or a secrets manager.
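
One way to follow that advice is to read the settings from the environment at startup and keep the token out of any printed representation. The variable names AIRFLOW_URL and AIRFLOW_API_TOKEN and the ApiSettings class are assumptions made for this sketch, not Airflow conventions.

```python
import os
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ApiSettings:
    base_url: str
    # repr=False keeps the token out of logs if the object is ever printed
    token: str = field(repr=False)


def load_settings(env=None) -> ApiSettings:
    """Read connection settings from environment variables (names are assumed)."""
    env = os.environ if env is None else env
    missing = [k for k in ("AIRFLOW_URL", "AIRFLOW_API_TOKEN") if not env.get(k)]
    if missing:
        raise RuntimeError(f"missing environment variables: {', '.join(missing)}")
    return ApiSettings(env["AIRFLOW_URL"].rstrip("/"), env["AIRFLOW_API_TOKEN"])
```

Failing fast on a missing variable is deliberate here: a client that starts without a token tends to fail later with a less obvious 401.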

Triggering a DAG from the Outside

The most common scenario is "run this specific DAG once, right now." In the REST API this is expressed as creating a DAG Run for that DAG (POST .../dagRuns). You can send two things along with it.

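logical_date: the point in time this run logically processes data for. In Airflow 3, execution_date was removed in favor of logical_date. For manual/external triggers the value may be null (None); some 3.x versions still expect the field to be present in the request body, so check the reference for yours. If the run needs an explicit data interval, that is supplied through data_interval_start/data_interval_end.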
conf: parameters that apply only to this run. Inside the DAG code you read them out via dag_run.conf or via Params.

With curl the shape is intuitive (the path is a generalized example; the exact path follows the official docs).

# Trigger a DAG (inject parameters via conf)
curl -X POST "$AIRFLOW_URL/api/v2/dags/sales_report/dagRuns" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
        "logical_date": "2026-07-02T00:00:00Z",
        "conf": { "region": "APAC", "dry_run": false }
      }'
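
The same request body can be assembled in Python. The helper below is a sketch (the name build_trigger_body is made up for illustration). It serializes logical_date as an ISO 8601 UTC timestamp, as in the curl example, and passes None through as JSON null. It insists on timezone-aware datetimes on the assumption that the API expects an explicit offset.

```python
from datetime import datetime, timezone


def build_trigger_body(conf=None, logical_date=None) -> dict:
    """Build the JSON body for creating a DAG Run (illustrative helper)."""
    if logical_date is not None:
        if logical_date.tzinfo is None:
            raise ValueError("logical_date must be timezone-aware")
        # Normalize to UTC and use the trailing-Z form shown in the curl example
        logical_date = (
            logical_date.astimezone(timezone.utc)
            .isoformat(timespec="seconds")
            .replace("+00:00", "Z")
        )
    return {"logical_date": logical_date, "conf": conf or {}}
```

With the client from earlier, the call would then be client.post("/api/v2/dags/sales_report/dagRuns", json=build_trigger_body({"region": "APAC"})).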

There are two ways for the DAG code to receive conf. The quick way is to pull it directly from the context's dag_run.conf; the robust way is to declare a schema and defaults with Params, which also gives you validation.

# In Airflow 3, Param is exported from airflow.sdk alongside dag and task
from airflow.sdk import Param, dag, task
 
@dag(
    schedule="0 6 * * *",          # The schedule itself is fixed in code
    catchup=False,                  # The 3.x default. Stated explicitly to make intent clear
    params={
        "region": Param("KR", type="string"),
        "dry_run": Param(False, type="boolean"),
    },
)
def sales_report():
    @task
    def run(params: dict):
        # The conf passed from outside arrives as params
        region = params["region"]
        if params["dry_run"]:
            print(f"[dry-run] skipping {region} report generation")
            return
        print(f"generating {region} report")
    run()
 
sales_report()
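
To see what the robust way buys you compared with reading dag_run.conf directly, here is a plain-Python illustration of the idea: declared defaults are merged with the trigger's conf, and types are checked before the task body runs. This is not Airflow's implementation (Params validate against JSON Schema), and SCHEMA and resolve_params are hypothetical names. Keys that are not declared are simply ignored in this sketch.

```python
SCHEMA = {
    # name: (default, expected type), mirroring the params declared in the DAG
    "region": ("KR", str),
    "dry_run": (False, bool),
}


def resolve_params(conf: dict, schema: dict = SCHEMA) -> dict:
    """Merge trigger conf over declared defaults and reject wrong types."""
    resolved = {}
    for key, (default, expected) in schema.items():
        value = conf.get(key, default)
        if not isinstance(value, expected):
            raise ValueError(f"{key} must be of type {expected.__name__}")
        resolved[key] = value
    return resolved
```

With raw dag_run.conf, a caller sending "dry_run": "yes" would pass a truthy string straight into the task; with a declared schema the run is rejected before any work starts.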

Pausing/Resuming and "What Should You Change Remotely"

Turning a DAG on and off remotely takes a single API call too. PATCH the is_paused attribute of the DAG resource to pause/unpause it.

# Pause a DAG
curl -X PATCH "$AIRFLOW_URL/api/v2/dags/sales_report" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{ "is_paused": true }'
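
The same call from Python might look like the sketch below. It takes a requests.Session that already carries the Bearer header (like the one inside AirflowClient). The function name set_paused is an assumption of this example, and the path mirrors the generalized curl example rather than asserting the exact route for your version.

```python
def set_paused(session, base_url: str, dag_id: str, paused: bool) -> dict:
    """PATCH is_paused on a DAG; `session` is a requests.Session with auth set."""
    r = session.patch(
        f"{base_url.rstrip('/')}/api/v2/dags/{dag_id}",
        json={"is_paused": paused},
        timeout=30,
    )
    r.raise_for_status()  # surface 4xx/5xx responses as exceptions
    return r.json()
```

A maintenance script would call set_paused(..., True) before the window opens and set_paused(..., False) once it closes.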

Here let's pause and sort out the boundaries. "Changing the schedule remotely" splits into two distinct things.

| What you want to change | Where you change it | Why |
| --- | --- | --- |
| Run cadence (cron, asset triggers, etc.) | Code (`schedule=...`) + Git | The schedule is part of the pipeline's definition. It should be reviewed and history-tracked via GitOps |
| Behavior parameters (thresholds, target region, toggles) | Variable or trigger `conf`/Params | Changes often, and you want it reflected immediately without a code deploy |
| Stop/resume just this once | The API's `is_paused` PATCH | A temporary measure for operational situations (maintenance, incidents) |

Schedules in code, behavior parameters in Variables. Hold this line and you can trace "why did it run differently yesterday?" through code history.

You may be tempted to change schedule dynamically based on a Variable value, but we don't recommend it. If the schedule changes every time the DAG is parsed, tracing becomes impossible, and you throw away the benefit Airflow 3's DAG versioning gives you — "which version did this run use?" Instead, fix the cron string in code and use Variables/Params only to adjust "what to do this time" within it.

Variables can be read and written via the REST API too, which makes them great for managing toggles from an external control plane. That said, fetching a frequently read Variable at the top of the DAG (at parse time) puts load on the DAG processor, so read it at task execution time whenever possible.
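
The difference between parse-time and run-time reads is easy to see in miniature. The sketch below uses a stand-in FakeVariableStore (hypothetical, not Airflow's Variable) that counts lookups. The two parse functions imitate a DAG file being parsed: the eager one reads the value during parsing, the lazy one defers the read into the task body.

```python
class FakeVariableStore:
    """Stand-in for a Variable backend that counts how often it is read."""

    def __init__(self, values):
        self.values = values
        self.reads = 0

    def get(self, key):
        self.reads += 1
        return self.values[key]


def parse_dag_eager(store):
    # The read happens while the DAG file is being parsed
    threshold = store.get("threshold")
    return lambda: threshold


def parse_dag_lazy(store):
    # The read is deferred into the task body, so parsing triggers no lookup
    return lambda: store.get("threshold")
```

Parsing is repeated continuously by the DAG processor, so the eager variant pays one lookup per parse whether or not the task ever runs, while the lazy variant pays one lookup per task execution.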

Synchronous Trigger + Polling to Completion

A common requirement from an external system's point of view is "run a DAG, wait until it finishes, and get back success/failure." The REST API is inherently asynchronous (triggering immediately returns a run id), so to use it synchronously the client has to implement the pattern of trigger → poll DAG Run state → check the terminal state.

Let's build a polling client on top of the AirflowClient we made earlier. The key points are to (1) clearly distinguish terminal states (success/failed), (2) set a timeout to prevent waiting forever, and (3) add a little backoff to the polling interval.

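import time
import uuid

TERMINAL = {"success", "failed"}

def trigger_and_wait(client, dag_id, conf=None,
                     poll_interval=5, max_interval=60, timeout=1800):
    # 1) Trigger: specifying run_id yourself makes tracking easy
    run_id = f"ext__{uuid.uuid4().hex[:12]}"
    client.post(
        f"/api/v2/dags/{dag_id}/dagRuns",
        # logical_date is sent explicitly as null because some 3.x versions
        # reject a body that omits it (verify against your version's reference)
        json={"dag_run_id": run_id, "logical_date": None, "conf": conf or {}},
    )

    # 2) Poll until a terminal state, backing off gently between checks
    deadline = time.monotonic() + timeout  # monotonic: immune to clock changes
    interval = poll_interval
    while time.monotonic() < deadline:
        run = client.get(f"/api/v2/dags/{dag_id}/dagRuns/{run_id}")
        state = run.get("state")
        if state in TERMINAL:
            if state == "failed":
                raise RuntimeError(f"DAG run {run_id} failed")
            return run  # success
        # Never sleep past the deadline
        time.sleep(max(0, min(interval, deadline - time.monotonic())))
        interval = min(interval * 1.5, max_interval)

    raise TimeoutError(f"DAG run {run_id} did not finish within {timeout}s")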

There are practical caveats when using this pattern. If you set the polling interval too short (under a second), you put unnecessary load on the API server; if you set it too long, responses get delayed. A few seconds to a few tens of seconds is usually fine. And always put a timeout and failure handling (alerting/retries) on the calling side. Some DAGs can run for hours, and if the client sits blocked the whole time, that system will die first. For longer workflows, having the DAG's last task fire a webhook to the outside (the callback pattern from Part 8) is cleaner than polling.
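
Failure handling on the calling side can be as small as a bounded retry around the call. The sketch below is generic Python, not an Airflow API; call_with_retries and its parameters are names chosen for this example. Which exceptions are worth retrying is a judgment call: a dropped connection usually is, while a run that reached the failed state or timed out usually is not unless the DAG is idempotent, because a retry triggers a new run. The default only covers the built-in ConnectionError; with requests you would pass requests.exceptions.ConnectionError in retry_on.

```python
import time


def call_with_retries(fn, attempts=3, delay=10.0,
                      retry_on=(ConnectionError,), sleep=time.sleep):
    """Call fn(); on a retryable exception wait and try again, up to `attempts`."""
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: let the caller alert on the error
            sleep(delay * attempt)  # wait a little longer after each failure
```

Usage would look like call_with_retries(lambda: trigger_and_wait(client, "sales_report")), with alerting hooked onto whatever exception finally escapes.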

Security: Don't Take Tokens and Permissions Lightly

Opening the REST API means you've created "a door through which the pipeline can be moved without a code deploy." That door must be locked.

Token management: Use JWTs as short-lived tokens with an expiry, and inject them from a secrets manager. Never leave them in Git, logs, or URLs. Issue a separate account/token per external system so audit logs can distinguish who did what.
RBAC least privilege: Airflow provides role-based access control. Don't give an account for a "regenerate report button" admin rights; grant only the minimum permission needed to trigger that DAG. If a client that only needs to trigger DAGs is also allowed to write Variables or read Connections, then the moment its token leaks, everything is exposed.
Protecting the transport layer: Put TLS in front of the API server (HTTPS) and, where possible, narrow it further with an internal network/VPN/IP allowlist.
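
On the client side, a small guard can refuse to send a Bearer token over plain HTTP. This is a sketch with assumed names (require_https, and a localhost exemption for local development). It complements, and does not replace, TLS termination and network restrictions in front of the API server.

```python
from urllib.parse import urlsplit

LOCAL_HOSTS = {"localhost", "127.0.0.1", "::1"}


def require_https(base_url: str) -> str:
    """Return base_url unchanged, or raise if a token would travel unencrypted."""
    parts = urlsplit(base_url)
    if parts.scheme == "https":
        return base_url
    if parts.scheme == "http" and parts.hostname in LOCAL_HOSTS:
        return base_url  # assumed acceptable for local development only
    raise ValueError(f"refusing to send credentials to non-HTTPS URL: {base_url}")
```

Calling require_https on the base URL before constructing the client turns a misconfigured URL into an immediate error instead of a leaked token.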

Wrap-up

Airflow 3's REST API is no longer a side feature but a proper gateway connecting the outside world to your pipelines. To summarize:

The old experimental API is gone; external access is unified into a single stable REST API on the API server, authenticated with JWT.
Triggering is creating a DAG Run, parameters are conf/Params, and turning on/off is an is_paused PATCH.
Schedules in code, behavior parameters in Variables — this boundary preserves traceability and GitOps.
If you need synchronous execution, trigger then poll state (timeout and failure handling are mandatory); for long jobs, a webhook is better.
Tokens, RBAC, and TLS are not optional — they're the baseline.

For exact endpoint paths and payload schemas, check the official REST API reference for the version you're running. In the next part, Monitoring & Operations, we cover how to observe the runs you triggered this way and set up alerts.