Tags: iceberg, lakehouse, rest-catalog, data-platform

Apache Iceberg REST Catalog Server — Specification, Implementations, and Operational Patterns

Why the Iceberg REST Catalog spec has become the new standard for multi-engine Lakehouses, the structure of the OpenAPI-based spec, a comparison of the major implementations (Polaris, Unity OSS, Lakekeeper, Nessie, Gravitino), self-hosting operations, and engine wiring.

Data Dynamics · May 20, 2026 · 19 min read

The moment you decide to run Apache Iceberg in a multi-engine environment, the next decision almost always shifts to the catalog. "Which catalog?" is not just a metastore choice — it is a question of where to place the control plane for your entire data infrastructure.

The Iceberg REST Catalog specification consolidated between 2023 and 2024, and the open-source servers that implement it, have become the new baseline for this question. This post lays out the spec essentials, how the major implementations differ, the operational patterns of self-hosting, and practical recipes for wiring up engines.

1. Why a REST Catalog

1.1 The catalog problem in a multi-engine era

Through the 2010s, before Iceberg, the de facto standard catalog was the Hive Metastore (HMS). AWS Glue offered a managed, HMS-compatible variant, and the alternatives included JDBC catalogs, file system catalogs, and Project Nessie.

In that era, catalog choice translated directly into these problems:

Every catalog had a different client. HMS used Thrift, Glue the AWS SDK, JDBC its SQL drivers. Engines had to ship a separate adapter for each catalog.
Permission models differed per catalog. HMS Ranger, Glue Lake Formation, Databricks Unity — each expressed permissions in its own model.
Compute nodes had to access storage directly. The catalog only told you where the metadata lived; the worker still had to carry the credentials to access the actual data. In multi-tenant environments this was risky.
Catalogs reinforced engine lock-in. If your catalog was tied to one engine (e.g., Databricks), accessing it from a different engine was awkward.

1.2 The arrival of the REST Catalog spec

Since 2022 the Iceberg community had been discussing a specification in which every engine speaks the same HTTP API to its catalog, and v1 was formally agreed in 2023.

Core decisions:

A specified HTTP API based on OpenAPI — engines don't need to ship per-catalog clients; one REST client talks to every REST Catalog.
Metadata transactions live on the server — the metadata.json swap that the client used to perform under the old HMS model is now done atomically on the server. Client-side locking and retry logic gets simpler.
Vended credentials — the server hands out short-lived storage credentials to a client that has passed the catalog's permission checks. Workers no longer need to hold long-lived credentials.
Freedom of backend — the server can store its metadata anywhere (JDBC, HMS, Glue, its own DB). Only the external interface (the REST spec) needs to be identical.

1.3 What the REST Catalog changed

The consequences:

Engines and catalogs become genuinely decoupled. When you move from one engine to another, the catalog can stay in place.
Control concentrates at the catalog. Permissions, audits, and policies are decided in one place, and the result is the same regardless of which engine the user came in through.
Workers no longer hold long-lived keys. Vended credentials shrink the security surface.

These three points are why every camp (Snowflake, Databricks, AWS, Google, Trino, Apache) has been redesigning its catalog on top of the REST Catalog spec since 2024.

2. The Iceberg REST Catalog Specification

2.1 Where the spec lives

The official spec is open-api/rest-catalog-open-api.yaml in the Iceberg repository. It's an OpenAPI 3.0 document covering the following areas:

Namespace management — the isolation boundary for databases/schemas.
Table lifecycle — create, read, rename, delete.
Metadata commits — atomically apply a new metadata.json.
Authentication and credential vending — OAuth2/SigV4/JWT-based authentication, short-lived storage credentials.
Transactions — multi-table commits (optional, depends on the server implementation).
Views / multi-table transactions / materialized views — gradually introduced from spec v1.5+.

2.2 Key endpoints (summary)

GET    /v1/config                          Server config negotiation (default location, vended options, etc.)

GET    /v1/{prefix}/namespaces             List namespaces
POST   /v1/{prefix}/namespaces             Create a namespace
GET    /v1/{prefix}/namespaces/{ns}        Namespace metadata
DELETE /v1/{prefix}/namespaces/{ns}        Drop a namespace

GET    /v1/{prefix}/namespaces/{ns}/tables           List tables
POST   /v1/{prefix}/namespaces/{ns}/tables           Create a table
GET    /v1/{prefix}/namespaces/{ns}/tables/{table}   Fetch current metadata
POST   /v1/{prefix}/namespaces/{ns}/tables/{table}   Update metadata (commit)
DELETE /v1/{prefix}/namespaces/{ns}/tables/{table}   Drop a table

POST   /v1/{prefix}/namespaces/{ns}/register         Register an externally-created metadata.json
POST   /v1/{prefix}/transactions/commit              Multi-table transaction

POST   /v1/oauth/tokens                              OAuth2 token issuance

{prefix} is a catalog/workspace identifier (whether it is used depends on the implementation).
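To make the path layout concrete, here is a minimal Python sketch that builds table paths from the routes above. The helper names (namespace_segment, table_path) are hypothetical. It assumes the spec's default convention that a multi-level namespace is joined with the unit-separator character (0x1F), which shows up as %1F once URL-encoded, and that an empty prefix is simply left out of the path.

```python
from urllib.parse import quote

UNIT_SEPARATOR = "\x1f"  # joins the levels of a multi-level namespace


def namespace_segment(levels):
    """Encode namespace levels such as ("analytics", "web") as one path segment."""
    return quote(UNIT_SEPARATOR.join(levels), safe="")


def table_path(prefix, levels, table):
    """Build the path used to load, commit to, or drop a table."""
    parts = ["v1"]
    if prefix:  # assumption: servers without a prefix omit the segment
        parts.append(quote(prefix, safe=""))
    parts += ["namespaces", namespace_segment(levels), "tables", quote(table, safe="")]
    return "/" + "/".join(parts)


print(table_path("lake", ("db",), "events"))
# /v1/lake/namespaces/db/tables/events
print(table_path("", ("analytics", "web"), "events"))
# /v1/namespaces/analytics%1Fweb/tables/events
```

In practice the client learns the prefix from the GET /v1/config response rather than hard-coding it.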

2.3 The metadata commit flow

The heart of the REST Catalog is one question: how do you atomically swap the metadata.json pointer?

What this model means:

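Clients do not hold locks. On concurrent writes, the first commit the server applies wins; any later commit whose requirements (for example, the expected current snapshot) no longer hold is rejected with a 409 conflict.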
Conflict handling is the client's responsibility. The Iceberg library retries automatically via commit.retry.* properties.
The catalog's consistency is exactly the consistency guarantee of the backend. Picking a transactional backend like Postgres, MySQL, or DynamoDB is standard practice.
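The optimistic commit model can be illustrated with a toy in-memory catalog. This is only a sketch under simplifying assumptions: the class and function names are hypothetical, the "requirement" is reduced to a single expected pointer value, and a real server performs the check-and-swap inside a backend transaction or conditional write.

```python
class CommitConflict(Exception):
    """Stands in for an HTTP 409 response from the catalog."""


class InMemoryCatalog:
    """Toy server side: one metadata pointer per table, swapped atomically."""

    def __init__(self):
        self._pointers = {}  # table name -> current metadata.json location

    def load(self, table):
        return self._pointers.get(table)

    def commit(self, table, expected, new_location):
        # Check the requirement and swap the pointer as one step.
        if self._pointers.get(table) != expected:
            raise CommitConflict(f"{table}: pointer moved since {expected}")
        self._pointers[table] = new_location


def commit_with_retry(catalog, table, make_location, num_retries=4):
    """Client side: reload the base, rebuild the change on it, try again."""
    for attempt in range(num_retries + 1):
        base = catalog.load(table)
        try:
            catalog.commit(table, base, make_location(base, attempt))
            return attempt  # number of retries that were needed
        except CommitConflict:
            continue
    raise CommitConflict(f"{table}: gave up after {num_retries} retries")


catalog = InMemoryCatalog()
catalog.commit("db.events", None, "metadata/00000.metadata.json")  # table creation

base_a = catalog.load("db.events")  # writers A and B read the same base
base_b = catalog.load("db.events")

catalog.commit("db.events", base_a, "metadata/00001-a.metadata.json")  # A wins
try:
    catalog.commit("db.events", base_b, "metadata/00001-b.metadata.json")
except CommitConflict as exc:
    print("writer B rejected:", exc)

# B reloads the new base and retries; nothing moves underneath it this time.
retries = commit_with_retry(
    catalog, "db.events", lambda base, n: "metadata/00002-b.metadata.json"
)
print("writer B committed, retries needed:", retries)
```

The point of the sketch is that no writer ever holds a lock: the loser of a race finds out at commit time and redoes its work on the fresh base.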

2.4 Vended credentials — delegating credentials

The other defining feature of the REST Catalog spec.

Why this matters:

Workers don't carry long-lived IAM keys. Smaller security surface.
Permissions are consistent at the catalog. "Can read events" equals "gets short-lived credentials scoped to that table's S3 prefix."
The catalog absorbs per-cloud credential mechanics. The client doesn't need to know the difference between S3, ADLS, and GCS token issuance.
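From the client's side, credential vending looks roughly like the following sketch. It assumes the client asks for delegation with the X-Iceberg-Access-Delegation request header and that an S3-backed server returns temporary keys in the config map of the load-table response. The sample response is hypothetical and truncated, and the helper name is an illustration rather than a library function.

```python
# Header a client adds when loading a table to request vended credentials.
DELEGATION_HEADER = {"X-Iceberg-Access-Delegation": "vended-credentials"}

S3_KEYS = ("s3.access-key-id", "s3.secret-access-key", "s3.session-token")


def extract_s3_credentials(load_table_response):
    """Return the temporary S3 credentials from a load-table response, or None."""
    config = load_table_response.get("config", {})
    if not all(key in config for key in S3_KEYS):
        return None  # the server did not vend credentials for this table
    return {key: config[key] for key in S3_KEYS}


# Hypothetical, truncated response body for a load-table call on db.events.
sample = {
    "metadata-location": "s3://data-lake/warehouse/db/events/metadata/00003.metadata.json",
    "config": {
        "s3.access-key-id": "ASIAEXAMPLE",
        "s3.secret-access-key": "example-secret",
        "s3.session-token": "example-token",
    },
}

creds = extract_s3_credentials(sample)
print(sorted(creds))
```

The worker then configures its file IO with these short-lived values instead of a long-lived IAM key. ADLS and GCS use different property names, which the engine's client library normally handles.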

2.5 Authentication models

Accessing the REST Catalog itself typically uses one of:

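OAuth2 (Client Credentials Grant) — fetch a token from /v1/oauth/tokens and call with Authorization: Bearer .... Recent spec versions mark this built-in token endpoint as deprecated in favor of an external authorization server, so check what your implementation recommends.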
SigV4 — for AWS-compatible catalogs (e.g., the Glue REST endpoint).
mTLS — client certificates inside the corporate network.
External IdP integration — federate with OIDC IdPs like Okta, Entra, or Keycloak.

OAuth2 is the most common and supported by virtually every implementation.

# Token issuance
curl -X POST https://catalog.example.com/v1/oauth/tokens \
  -H "Content-Type: application/x-www-form-urlencoded" \
  -d "grant_type=client_credentials" \
  -d "client_id=svc-spark-prod" \
  -d "client_secret=..." \
  -d "scope=catalog"
 
# Response
{ "access_token": "eyJ...", "token_type": "Bearer", "expires_in": 3600 }
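A client usually wraps this exchange in a small cache so it does not request a token on every call. The sketch below assumes the response shape shown above. TokenCache, the fetch callable, and the 60-second refresh margin are hypothetical choices for illustration; the actual HTTP call is left to the caller.

```python
import time
from urllib.parse import urlencode


def token_request_body(client_id, client_secret, scope="catalog"):
    """Form-encode the same fields the curl example sends."""
    return urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "scope": scope,
    })


class TokenCache:
    """Reuse a bearer token until shortly before it expires."""

    def __init__(self, fetch, clock=time.monotonic, skew_seconds=60):
        self._fetch = fetch        # callable returning the parsed token response
        self._clock = clock        # injectable so the cache can be tested
        self._skew = skew_seconds  # refresh this many seconds early
        self._token = None
        self._expires_at = 0.0

    def get(self):
        now = self._clock()
        if self._token is None or now >= self._expires_at:
            response = self._fetch()
            self._token = response["access_token"]
            self._expires_at = now + response["expires_in"] - self._skew
        return self._token
```

Engine clients such as the Iceberg Java library do this internally, so a hand-written cache like this is mainly useful for custom tooling that calls the catalog directly.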

3. Major Implementations

The number of open-source servers implementing the REST Catalog spec has grown fast since 2024. The main candidates:

3.1 Apache Polaris

Origin — Open-sourced by Snowflake in 2024, donated to the ASF → incubated as Apache Polaris, now Top-Level.
Language — Java (Dropwizard / Quarkus-based).
Backend — JDBC (Postgres recommended), in-memory (for tests).
Permission model — RBAC with a Principal · Role · Privilege hierarchy. Fine-grained at the catalog, namespace, and table level.
Cloud — Vended credentials for S3, ADLS, and GCS.
Positioning — Started as a way to unify Snowflake and external engines on the same catalog, but is usable independently of Snowflake.

3.2 Unity Catalog OSS

Origin — Databricks open-sourced Unity Catalog in 2024.
Language — Java + Scala.
Backend — JDBC.
Permission model — The same three-level (catalog.schema.table) model as Databricks Unity, plus partial ABAC.
Cloud — S3, ADLS, GCS.
Positioning — Keeps compatibility with the Databricks environment while sharing with external engines. A distinguishing feature is that UC handles both Iceberg and Delta.

3.3 Lakekeeper

Origin — A lightweight Rust-based REST Catalog that grew quickly from 2024.
Language — Rust.
Backend — Postgres.
Permission model — OpenFGA-based fine-grained authorization (relationship-based access control, ReBAC).
Strengths — Small memory footprint, fast cold start, k8s-friendly.
Positioning — Emphasizes lightweight and performance advantages over JVM-based catalogs.

3.4 Project Nessie

Origin — Started by Dremio in 2020.
Distinctive feature — Provides Git-like branch/merge semantics at the catalog level. These catalog-level branches are separate from Iceberg's own table-level branches and tags.
Language — Java (Quarkus).
Backend — RocksDB, Postgres, MongoDB.
REST Catalog compatibility — Supports the Iceberg REST spec while also exposing Nessie's own API.
Positioning — Strong for multi-table transactions and experimental branching.

3.5 Apache Gravitino

Origin — Donated to the ASF by Datastrato in 2024.
Distinctive feature — Unifies not only Iceberg REST catalogs but a variety of catalogs and metastores (HMS, JDBC, Kafka schema registry, …) — a "metadata lake" orientation.
Language — Java.
Positioning — A unified metadata plane within an organization, not limited to Iceberg.

3.6 Tabular (Iceberg's founders)

Origin — A SaaS started by the Iceberg founders. Acquired by Databricks in 2024.
Status — Absorbed into Databricks' Unity Catalog post-acquisition. No new signups.
Note — Significant influence on the spec and architecture, but excluded as an option for new adoption.

3.7 Cloud-managed

AWS Glue (with the Iceberg REST endpoint) — A REST-compatible endpoint was added to Glue. Zero operational burden, AWS-bound.
Snowflake Polaris (managed) — Polaris hosted by Snowflake.
Databricks Unity (managed) — Unity hosted by Databricks.
Google BigLake — Partial Iceberg REST spec support.

3.8 Implementation comparison matrix (as of May 2026)

Aspect	Polaris	Unity OSS	Lakekeeper	Nessie	Gravitino	Glue REST
License	Apache 2.0	Apache 2.0	Apache 2.0	Apache 2.0	Apache 2.0	(managed)
Language	Java	Java	Rust	Java	Java	(managed)
Backend	JDBC	JDBC	Postgres	Rocks/PG/Mongo	JDBC	AWS-managed
REST spec compliance	Full	Full	Full	Full	Partial–Full	Partial
Vended credentials	✓	✓	✓	✓	✓	✓
RBAC	✓	✓	✓ (FGA)	✓	✓	IAM
Multi-cloud	✓	✓	✓	✓	✓	AWS only
Branch/merge (catalog)	✗	✗	✗	✓	✗	✗
Non-Iceberg catalogs	✗	Delta	✗	✗	✓ (many)	Glue assets
Operational burden	Medium	Medium	Low	Medium	Medium	None

Each implementation is moving fast; the table above is a generalized snapshot. Check the latest release notes from each project at adoption time.

4. Self-Hosted vs. Managed

4.1 When self-hosting makes sense

Multi-cloud / mixed on-prem — when you cannot afford to be tied to a single cloud vendor.
Aligning permission and audit models with internal standards — when integration with an internal OIDC IdP, Ranger, or OPA is required.
Data residency requirements — when even the metadata must stay in a specific region.
When the catalog must serve as the control plane of the internal data platform — for policy decisions you can't outsource to an external SaaS.

4.2 When managed is the right fit

Single-cloud, fast-adoption priority — Glue REST, Snowflake Polaris, Databricks Unity managed.
Limited operations headcount — when you don't have the bandwidth to operate the catalog itself.
You want to outsource the burden of keeping up with spec evolution — the REST spec is evolving fast, and self-hosting demands continual patching.

4.3 Side-by-side summary

Aspect	Self-hosted	Managed
Initial setup cost	High	Low
Operations headcount	Required	Minimal
Permission/policy flexibility	Very high	Limited (vendor model)
Multi-cloud	Free	Typically single-cloud
Data-residency control	Free	Vendor-dependent
Spec updates	You follow them	Automatic
Cost model	Infra + people	Usage-based

5. Self-Hosted Operational Patterns

5.1 Deployment topology

Key decisions:

REST Server is stateless, with metadata state held in the backend DB.
Horizontal scale-out — add instances as load grows. Start with 2–6 and adjust to traffic.
Backend DB HA — Postgres streaming replication plus automatic failover (e.g., Patroni, Cloud SQL).
Align object-storage lifecycle policies with the catalog's housekeeping jobs.

5.2 Authentication & authorization integration

The common pattern places an OIDC IdP (Okta, Entra, Keycloak) in front of the catalog.

The essentials:

Both users and services use only tokens issued by the IdP.
The catalog is only responsible for token → principal → authorization mapping.
Vended credentials are issued by the catalog (scoped to table prefix and expiry).
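The token → principal → authorization mapping can be sketched as two small functions. Everything here is an assumption made for illustration: the claim names (sub, roles), the privilege names, and the grant table. Real catalogs have their own privilege models, and the token must already have been verified against the IdP before this step.

```python
# Hypothetical grants: role -> set of (privilege, namespace) pairs.
ROLE_GRANTS = {
    "analyst": {("TABLE_READ", "db")},
    "etl": {("TABLE_READ", "db"), ("TABLE_WRITE", "db")},
}


def principal_from_claims(claims):
    """Map already-verified IdP token claims to a catalog principal."""
    return {"name": claims["sub"], "roles": set(claims.get("roles", []))}


def is_allowed(principal, privilege, namespace):
    """True if any of the principal's roles carries the requested privilege."""
    return any(
        (privilege, namespace) in ROLE_GRANTS.get(role, set())
        for role in principal["roles"]
    )


spark = principal_from_claims({"sub": "svc-spark-prod", "roles": ["etl"]})
print(is_allowed(spark, "TABLE_WRITE", "db"))
```

Only after a check like this succeeds would the catalog issue vended credentials scoped to the table's storage prefix.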

5.3 Backup and recovery

You need backups for both catalog assets.

Backend DB — Postgres point-in-time recovery (PITR). Daily base + WAL retention, typically 14–30 days.
metadata.json and manifests in object storage — object-storage versioning plus appropriate lifecycle.

Recovery scenarios:

Catalog DB corruption: restore the DB → restart REST Server → catalog points at the prior metadata.json location.
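Table metadata corruption: point the catalog at an older metadata.json in object storage. In most implementations the register API creates a new table entry, so this typically means dropping the table entry without purging its data and then re-registering the older metadata.json. This is distinct from time travel, which reads older snapshots recorded inside the current metadata.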
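When recovering a single table, the first step is finding the metadata file to fall back to. The sketch below assumes the common naming convention in which metadata files carry a zero-padded version prefix followed by a UUID; the helper names and sample paths are hypothetical. Where the current metadata.json is still readable, its metadata-log field is a more reliable source than a directory listing.

```python
import re

# Matches names such as 00002-<uuid>.metadata.json at the end of a path.
METADATA_NAME = re.compile(r"(?:^|/)(\d+)-[0-9a-fA-F-]+\.metadata\.json$")


def metadata_version(path):
    """Return the numeric version prefix of a metadata file, or None."""
    match = METADATA_NAME.search(path)
    return int(match.group(1)) if match else None


def previous_metadata(paths, current):
    """Pick the newest metadata file that is older than the current one."""
    current_version = metadata_version(current)
    if current_version is None:
        return None
    older = []
    for path in paths:
        version = metadata_version(path)
        if version is not None and version < current_version:
            older.append((version, path))
    return max(older)[1] if older else None


listing = [
    "s3://data-lake/warehouse/db/events/metadata/00001-0a0a0a0a-0000-4000-8000-000000000001.metadata.json",
    "s3://data-lake/warehouse/db/events/metadata/00002-0b0b0b0b-0000-4000-8000-000000000002.metadata.json",
    "s3://data-lake/warehouse/db/events/metadata/00003-0c0c0c0c-0000-4000-8000-000000000003.metadata.json",
    "s3://data-lake/warehouse/db/events/metadata/snap-123.avro",
]
print(previous_metadata(listing, listing[2]))
```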

5.4 Monitoring metrics

Metric	Meaning	Red flag
Commit RPS	Catalog load	Sharp spike over baseline
Commit p95 latency	Backend DB performance	Above 500 ms
Conflict (409) ratio	Concurrent-write pressure	Above 5%
OAuth token issuance RPS	Authentication load	Sharp spike over baseline
Backend DB connections	Pool size	Above 80%
Vended credential issuance rate	Traffic shape	Abnormal pattern detected
5xx ratio	Service health	Above 0.1%
Average metadata.json size	Metadata-bloat signal	Above 8 MiB — review expiration policy
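The fixed thresholds in the table translate directly into a simple check. This is a minimal sketch: the metric names are hypothetical, ratios are expressed as fractions, and the "sharp spike" and "abnormal pattern" rows are left out because they need a baseline rather than a constant.

```python
MIB = 1024 * 1024

# Thresholds copied from the table above; tune them to your own baseline.
THRESHOLDS = {
    "commit_p95_ms": 500,
    "conflict_ratio": 0.05,
    "db_pool_utilization": 0.80,
    "error_5xx_ratio": 0.001,
    "avg_metadata_bytes": 8 * MIB,
}


def red_flags(sample):
    """Return the names of metrics in `sample` that exceed their threshold."""
    return sorted(
        name for name, limit in THRESHOLDS.items()
        if sample.get(name, 0) > limit
    )


sample = {
    "commit_p95_ms": 620,
    "conflict_ratio": 0.02,
    "db_pool_utilization": 0.85,
    "error_5xx_ratio": 0.0005,
    "avg_metadata_bytes": 2 * MIB,
}
print(red_flags(sample))
# ['commit_p95_ms', 'db_pool_utilization']
```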

5.5 Upgrade strategy

The REST spec evolves rapidly. A safe upgrade pattern:

Canary instance — Bring up the new version on a single host with 5% traffic. Observe for 1–7 days.
Compatibility checks — Confirm in-house engines correctly interpret responses from the new spec.
Backend migration — When a spec change demands schema changes, separate the migration into its own job.
Rollback plan — Take a DB snapshot before any migration, and pre-validate the rollback procedure.
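One way to send a stable 5% of traffic to the canary is to hash a client identifier into buckets, so the same client always lands on the same version. This is a sketch of that idea rather than a recommendation over load-balancer weights; the function name and the choice of client ID as the routing key are assumptions.

```python
import hashlib


def routes_to_canary(client_id, percent=5):
    """Deterministically send roughly `percent`% of clients to the canary."""
    digest = hashlib.sha256(client_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100  # stable bucket 0..99
    return bucket < percent


clients = [f"client-{i}" for i in range(10000)]
on_canary = sum(routes_to_canary(c) for c in clients)
print(f"{on_canary} of {len(clients)} clients routed to the canary")
```

Keeping each client pinned to one version makes it easier to attribute a compatibility problem to the new release during the observation window.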

6. Engine Wiring Examples

6.1 Apache Spark

spark.conf.set("spark.sql.catalog.lake",                "org.apache.iceberg.spark.SparkCatalog")
spark.conf.set("spark.sql.catalog.lake.catalog-impl",   "org.apache.iceberg.rest.RESTCatalog")
// Base URL only: the REST client appends the /v1/... paths itself.
spark.conf.set("spark.sql.catalog.lake.uri",            "https://catalog.example.com")
spark.conf.set("spark.sql.catalog.lake.warehouse",      "s3://data-lake/warehouse")
spark.conf.set("spark.sql.catalog.lake.credential",
                 "svc-spark-prod:SECRET")              // OAuth2 client_credentials
spark.conf.set("spark.sql.catalog.lake.scope",          "catalog")
spark.conf.set("spark.sql.catalog.lake.token-refresh-enabled", "true")
spark.conf.set("spark.sql.catalog.lake.io-impl",        "org.apache.iceberg.aws.s3.S3FileIO")
 
spark.sql("USE lake.db")
spark.sql("SELECT count(*) FROM events").show()

6.2 Trino

catalog/iceberg.properties:

connector.name=iceberg
iceberg.catalog.type=rest
# Base URL only: the REST client appends the /v1/... paths itself.
iceberg.rest-catalog.uri=https://catalog.example.com
iceberg.rest-catalog.warehouse=s3://data-lake/warehouse
iceberg.rest-catalog.security=OAUTH2
iceberg.rest-catalog.oauth2.credential=svc-trino-prod:SECRET
iceberg.rest-catalog.oauth2.scope=catalog
iceberg.rest-catalog.vended-credentials-enabled=true
fs.native-s3.enabled=true

6.3 Apache Flink

CREATE CATALOG lake WITH (
  'type'='iceberg',
  'catalog-impl'='org.apache.iceberg.rest.RESTCatalog',
  'uri'='https://catalog.example.com',
  'warehouse'='s3://data-lake/warehouse',
  'credential'='svc-flink-prod:SECRET',
  'scope'='catalog',
  'io-impl'='org.apache.iceberg.aws.s3.S3FileIO'
);
 
USE CATALOG lake;
SELECT count(*) FROM db.events;

6.4 PyIceberg

from pyiceberg.catalog.rest import RestCatalog
 
catalog = RestCatalog(
    name="lake",
    **{
        # Base URL only: the client appends the /v1/... paths itself.
        "uri": "https://catalog.example.com",
        "warehouse": "s3://data-lake/warehouse",
        "credential": "svc-pyiceberg:SECRET",
        "scope": "catalog",
    },
)
 
table = catalog.load_table(("db", "events"))
# to_arrow() returns a pyarrow.Table, not a DataFrame.
arrow_table = table.scan(row_filter="event_ts >= '2026-05-01'").to_arrow()
print(arrow_table.num_rows)

6.5 Snowflake (Polaris as an external catalog)

-- CATALOG_URI is the base URL; the /v1/... paths are appended by the client.
CREATE OR REPLACE CATALOG INTEGRATION ext_polaris
  CATALOG_SOURCE = POLARIS
  TABLE_FORMAT   = ICEBERG
  REST_CONFIG = (
    CATALOG_URI = 'https://catalog.example.com'
    CATALOG_NAME = 'lake'
  )
  REST_AUTHENTICATION = (
    TYPE = OAUTH
    OAUTH_CLIENT_ID = 'svc-snowflake-prod'
    OAUTH_CLIENT_SECRET = '...'
    OAUTH_ALLOWED_SCOPES = ('catalog')
  )
  ENABLED = TRUE;
 
CREATE ICEBERG TABLE events
  EXTERNAL_VOLUME = 'vol_data_lake'
  CATALOG = 'ext_polaris'
  CATALOG_TABLE_NAME = 'events'
  CATALOG_NAMESPACE = 'db';
 
SELECT count(*) FROM events;

7. Migrating from HMS / Glue to REST Catalog

7.1 Two approaches

Backend adapter — place a REST Server on top of the existing HMS/Glue.
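- Some REST servers can sit in front of an existing HMS or Glue catalog. Support varies, and Section 3 lists Polaris and Lakekeeper with JDBC/Postgres backends, so confirm against each project's current documentation.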
- Leave data and metadata locations in place; only point engines at the REST endpoint.
- The standard pattern for a gradual migration.
Metadata re-registration — move to an entirely new backend.
- Stand up a new REST Catalog (e.g., Polaris with Postgres) separately.
- Register existing tables' metadata.json locations via the register API.
- Keep the old catalog read-only for a period → confirm every engine has migrated → decommission.
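The re-registration step can be scripted against the register endpoint from section 2.2. The sketch below only builds the requests and leaves sending them to the caller. It assumes a single-level namespace and a request body with name and metadata-location fields; the function name and the sample location are hypothetical.

```python
import json


def register_request(base_url, prefix, namespace, table, metadata_location):
    """Describe the POST that registers one existing table in the new catalog."""
    path = "/".join(
        part for part in ("v1", prefix, "namespaces", namespace, "register") if part
    )
    body = {"name": table, "metadata-location": metadata_location}
    return {
        "method": "POST",
        "url": f"{base_url.rstrip('/')}/{path}",
        "body": json.dumps(body),
    }


request = register_request(
    "https://catalog.example.com",
    "lake",
    "db",
    "events",
    "s3://data-lake/warehouse/db/events/metadata/00003.metadata.json",
)
print(request["url"])
# https://catalog.example.com/v1/lake/namespaces/db/register
```

Registering does not copy or rewrite any data files. The new catalog simply starts tracking the table from the given metadata.json, which is why the old catalog must stop accepting writes before the switch.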

7.2 Phase-by-phase checklist

Phase 0 — Inventory

Table count and namespace structure in the current catalog (HMS/Glue)
List of in-house engines and the type of catalog client each uses
Mapping plan for the current permission model (Ranger, Lake Formation, Unity, …) onto the REST Catalog

Phase 1 — Build a shadow catalog

Deploy N REST Server instances
Configure the backend DB (HMS adapter or a fresh Postgres)
Validate OIDC IdP / OAuth integration
Design IAM roles and grant permissions for vended credentials

Phase 2 — Pilot one domain

Migrate one non-critical domain to the REST Catalog
Verify identical results across two engines (e.g., Spark + Trino)
Confirm operational automation jobs (compaction, expiration) work on the new catalog
Verify the permission/audit policies behave as intended
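The cross-engine check in this phase can be as simple as comparing per-table summaries. The sketch assumes you have already run the same count and checksum queries on both engines and collected the results into dictionaries; the function name and the sample values are hypothetical.

```python
def compare_engine_results(results_a, results_b):
    """Return tables whose (row_count, checksum) summary differs between engines."""
    mismatches = {}
    for table in sorted(set(results_a) | set(results_b)):
        a, b = results_a.get(table), results_b.get(table)
        if a != b:
            mismatches[table] = (a, b)
    return mismatches


spark_results = {"db.events": (1200, "9f2c"), "db.users": (85, "41aa")}
trino_results = {"db.events": (1200, "9f2c"), "db.users": (84, "7be0")}
print(compare_engine_results(spark_results, trino_results))
```

A table missing from one side shows up as a mismatch with None, which also catches namespaces that one engine cannot see because of a permission gap.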

Phase 3 — Gradual migration of core domains

Migrate domain by domain, keeping the old catalog in parallel for a period
Move in-house BI/ML pipelines to the new endpoint
Replace all worker IAM keys with the vended-credentials model

Phase 4 — Decommission the old catalog

Announce internally that the new catalog is the single source of truth
Make the old catalog read-only → decommission
Clean up IAM credentials and service accounts associated with the old catalog

8. Common Mistakes and Pitfalls

Starting without a catalog — Beginning with a file system catalog and then paying a heavy migration price when multi-engine adoption arrives. Use a REST Catalog from day one.
Disabling vended credentials — The decision to "just keep using our existing IAM keys" gives up more than half of the security wins a REST Catalog brings.
Leaving the backend DB as a single point of failure — The REST Server is stateless and easy to scale, but the catalog is only as available as its backend DB. The DB needs its own HA design.
Forcing the permission model into the spec — If internal policy can't be expressed directly in the REST spec, pair it with an external policy engine like OPA. Don't oversimplify by looking only at the spec.
Falling behind on spec updates — The REST spec evolves quickly, so plan a catalog upgrade every 6–12 months to pick up new features (multi-table transactions, MV spec, etc.).
No catalog monitoring — If you aren't watching commit latency, conflicts, and 5xx, you only learn the catalog has become a bottleneck for the entire data pipeline after the fact.

9. Adoption Recommendations

9.1 A new multi-engine Lakehouse

Adopt one of Polaris / Unity OSS / Lakekeeper and start on the REST standard from day one.
Backend: Postgres + HA. OAuth2 with internal IdP.
Vended credentials are mandatory.

9.2 Single-cloud, fast adoption

AWS-centric → start with the Glue REST endpoint, move to self-hosted later if needed.
Snowflake-centric → managed Polaris.
Databricks-centric → managed Unity.

9.3 Environments with significant existing HMS / Glue assets

Adapter-style REST (a REST server in front of the existing HMS/Glue, where the chosen implementation supports it) → gradual migration → move to the new backend.
Don't migrate all at once. The order is: pilot domain → core domains → decommission.

9.4 Environments with strong data-residency or security requirements

Self-hosting is effectively the only option. Evaluate Lakekeeper (lightweight) or Polaris (feature-rich).
Design IdP and policy engine (OPA, Ranger) integration up front.

10. Wrap-up

The REST Catalog is a specification, and there are multiple open-source servers implementing it. "Picking a catalog" really means "picking an implementation."
Three core values: engine/catalog decoupling, the catalog becoming the control plane, and shrinking the security surface with vended credentials.
For a new multi-engine environment, starting with a REST Catalog is the de facto standard. If single-cloud and fast adoption matter most, go managed; if multi-cloud and policy flexibility matter most, self-host.
Operate the catalog as stateless REST servers + a transactional backend + an IdP + object storage. Bake monitoring, backup, and upgrade strategies into the design from the start.

The REST Catalog is, more than even the Iceberg spec itself, the decisive factor that changes the Lakehouse operating model. When the catalog is the control plane, permissions, audits, and policies converge in one place, and the cost of adding a new engine drops dramatically. That is why nearly every data platform in 2026 is redesigning its catalog.

References

Iceberg REST Catalog OpenAPI spec — open-api/rest-catalog-open-api.yaml in the Iceberg repository
Apache Polaris — polaris.apache.org
Unity Catalog OSS — unitycatalog.io
Lakekeeper — lakekeeper.io
Project Nessie — projectnessie.org
Apache Gravitino — gravitino.apache.org
AWS Glue Iceberg REST endpoint documentation
Iceberg official spec — iceberg.apache.org/spec
