
Comparing Apache Iceberg Spec Versions — What V1, V2, and V3 Each Changed

What each Apache Iceberg format-version (V1, V2, V3) introduced and how it affects operations, engine compatibility, and workloads. A practitioner-oriented summary covering Position vs Equality deletes, Deletion Vectors, Variant and Geospatial types, and Row Lineage.

Data Dynamics · May 20, 2026 · 18 min read

One commonly overlooked decision when adopting Apache Iceberg is format-version selection. "Let's use Iceberg" and "Let's use Iceberg V2" are not the same decision. Even on the same table, the available operations, operational cost, and engine compatibility differ significantly depending on the spec version.

This post lays out what V1, V2, and V3 each solved, where they fell short, and how to choose in practice.

1. What `format-version` Is

At the top of an Iceberg table's metadata.json there is a format-version field.

{
  "format-version": 2,
  "table-uuid": "5f8a...e9",
  "location": "s3://bucket/warehouse/db/events",
  ...
}

This single integer decides which spec the table must be interpreted under. Engines (readers and writers) look at it to decide what operations are possible and what metadata fields exist.

Three key observations:

The spec is cumulative. V2 contains every concept in V1 and adds new features on top. V3 adds on top of V2.
The spec can be upgraded. A table created as V1 can be promoted to format-version=2. Only the metadata is updated; no data is rewritten.
The spec cannot be downgraded. Once a table is on V2, you cannot move it back to V1, even if it only uses features that V1 also has. Engines assume this.
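The upgrade and downgrade rules above can be sketched as a small check. This is a minimal illustration, not how any engine implements it; the function name `can_change_format_version` and the `SUPPORTED_VERSIONS` constant are hypothetical.

```python
SUPPORTED_VERSIONS = (1, 2, 3)  # hypothetical constant for this sketch


def can_change_format_version(current: int, target: int) -> bool:
    """Return True if a table at `current` may be moved to `target`."""
    for version in (current, target):
        if version not in SUPPORTED_VERSIONS:
            raise ValueError(f"unknown format-version: {version}")
    # Upgrades and no-ops are allowed; a lower target is a downgrade.
    return target >= current


print(can_change_format_version(1, 2))  # True
print(can_change_format_version(2, 1))  # False
```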

2. V1 (2018) — Laying the Foundation

2.1 What it introduced

Iceberg V1 started at Netflix in 2018 and graduated to an Apache top-level project in 2020. Its five essentials were:

A three-tier metadata model — catalog → metadata.json → manifest list → manifest → data file
Snapshot-based isolation — every change creates a new snapshot, and the pointer in the catalog is swapped atomically.
Field-ID-based column mapping — columns get permanent IDs, so rename and reorder happen without rewriting data.
Hidden partitioning — partition keys are hidden from the user; query predicates are automatically used for pruning.
Time travel — query past state by snapshot ID or timestamp.

V1 was the first comprehensive answer to "how do you resolve Hive's structural limitations with metadata?".
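To make time travel concrete, here is a rough sketch of how "query by timestamp" can resolve to a snapshot ID. It assumes a simplified list of snapshots carrying only `snapshot-id` and `timestamp-ms`; the helper name `snapshot_as_of` is hypothetical, and real engines resolve this through the table's metadata rather than a plain list.

```python
from typing import Optional


def snapshot_as_of(snapshots: list[dict], timestamp_ms: int) -> Optional[int]:
    """Return the id of the latest snapshot committed at or before timestamp_ms."""
    candidates = [s for s in snapshots if s["timestamp-ms"] <= timestamp_ms]
    if not candidates:
        return None  # the table did not exist yet at that time
    latest = max(candidates, key=lambda s: s["timestamp-ms"])
    return latest["snapshot-id"]


# Made-up snapshot history for illustration.
snapshots = [
    {"snapshot-id": 101, "timestamp-ms": 1_000},
    {"snapshot-id": 102, "timestamp-ms": 2_000},
    {"snapshot-id": 103, "timestamp-ms": 3_000},
]
print(snapshot_as_of(snapshots, 2_500))  # 102
```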

2.2 V1 metadata.json (simplified)

{
  "format-version": 1,
  "table-uuid": "5f8a-...",
  "location": "s3://...",
  "last-updated-ms": 1714000000000,
  "last-column-id": 4,
  "schema": {
    "schema-id": 0,
    "fields": [
      { "id": 1, "name": "event_id",   "required": true,  "type": "long" },
      { "id": 2, "name": "user_id",    "required": false, "type": "long" },
      { "id": 3, "name": "event_ts",   "required": true,  "type": "timestamptz" },
      { "id": 4, "name": "event_type", "required": true,  "type": "string" }
    ]
  },
  "partition-spec": [
    { "name": "event_day", "source-id": 3, "transform": "day", "field-id": 1000 }
  ],
  "current-snapshot-id": 8123412345678901234,
  "snapshots": [ ... ]
}

Notable differences against V2:

schema and partition-spec each hold a single definition (V2 keeps arrays of them, schemas and partition-specs, plus a pointer to the current one).
No sequence number.
No delete-file-related fields.

2.3 What V1 didn't solve

V1 was sufficient for analytical workloads, but the moment you started operating it, the following limits showed up.

Problem 1 — Row-level UPDATE/DELETE cost

V1 has no concept of a delete file. To delete or update a single row, you must rewrite the entire data file that row belongs to. This is Copy-on-Write (CoW).

Scenario: UPDATE one row out of a 1 GB Parquet file
─────────────────────────────────────────
V1 behavior: read all 1 GB → modify one row → write 1 GB
Actual cost: ~2 GB of I/O, shuffle, S3 PUTs

This cost was fatal for these workloads:

CDC (Change Data Capture) — Unusable on tables receiving thousands of row-level updates per second.
GDPR/legal corrections — To erase one user, you must rewrite every file that user appears in.
Frequent MERGE — Cost explodes for Slowly Changing Dimension (SCD) patterns.
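The arithmetic behind the 1 GB scenario can be written out as follows. This is a back-of-the-envelope sketch: the 200-byte row size is an assumption chosen for illustration, and the function names are hypothetical.

```python
GIB = 1024 ** 3


def cow_io_bytes(file_size_bytes: int) -> int:
    """Bytes read plus bytes written when one data file is rewritten."""
    return 2 * file_size_bytes


def write_amplification(file_size_bytes: int, changed_bytes: int) -> float:
    """Bytes rewritten per byte of logical change under Copy-on-Write."""
    if changed_bytes <= 0:
        raise ValueError("changed_bytes must be positive")
    return file_size_bytes / changed_bytes


print(cow_io_bytes(GIB) / GIB)                    # 2.0 (GiB of I/O)
print(f"{write_amplification(GIB, 200):,.0f}x")   # 5,368,709x
```

Under these assumptions, a single small row change costs millions of times its own size in rewritten bytes, which is why CoW breaks down for high-frequency updates.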

Problem 2 — Consistency limits without sequence numbers

V1 orders snapshots in time, but there is no explicit sequence number linking metadata files and data files. With many concurrent writers, conflict detection became awkward.

2.4 Workloads that fit V1

Append-only logs: clickstream, event ingestion
Daily/hourly batch ETL result tables
Read-heavy mart tables

In 2026, there is virtually no reason to choose V1 for a new table. V2 contains everything V1 has while providing stronger consistency and row-level operations.

3. V2 (2021+) — Row-Level Operations Arrive

3.1 What was added

V2 brought two big changes:

Sequence numbers — A unique monotonically increasing number per snapshot. Guarantees consistency for concurrent writes and deletes.
Delete files — Two kinds (position / equality) of delete file express row-level operations without rewriting data files.

This made Merge-on-Read (MoR) mode possible.

3.2 Position delete files

A position delete file records which row, at which position, in which data file was deleted, as (file_path, position) pairs.

position delete file (Parquet)
┌──────────────────────────────────────────────────────┬──────────┐
│ file_path                                            │ position │
├──────────────────────────────────────────────────────┼──────────┤
│ s3://.../data/00000-aab.parquet                      │       42 │
│ s3://.../data/00000-aab.parquet                      │      105 │
│ s3://.../data/00000-aab.parquet                      │      219 │
│ s3://.../data/00001-bcd.parquet                      │        7 │
└──────────────────────────────────────────────────────┴──────────┘

Properties:

No data file rewrite — only delete markers are appended.
At read time, the engine loads the data file together with the position delete and filters the deleted rows out in memory.
An UPDATE is expressed as "delete the old position + append a new data file."
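The read-time filtering described above can be sketched in a few lines. This is a simplified in-memory illustration, assuming 0-based row positions and rows already loaded as a Python list; the helper names are hypothetical and do not correspond to any engine's reader API.

```python
from collections import defaultdict


def index_position_deletes(delete_rows):
    """Group (file_path, position) pairs by data file."""
    by_file = defaultdict(set)
    for file_path, position in delete_rows:
        by_file[file_path].add(position)
    return by_file


def read_live_rows(file_path, rows, deletes_by_file):
    """Return the rows of one data file with deleted positions filtered out."""
    deleted = deletes_by_file.get(file_path, set())
    return [row for pos, row in enumerate(rows) if pos not in deleted]


# Made-up file names and rows for illustration.
deletes = index_position_deletes([("a.parquet", 1), ("a.parquet", 3), ("b.parquet", 0)])
print(read_live_rows("a.parquet", ["r0", "r1", "r2", "r3"], deletes))  # ['r0', 'r2']
```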

Workload fit:

CDC ingestion (when you know each change's prior position).
MERGE jobs (when you can find the old position via JOIN).

3.3 Equality delete files

An equality delete file expresses deletes as (column = value) predicates.

equality delete file (Parquet)
─────────────────────────────────
schema: (user_id long)

user_id
───────
   1024
   2050
   3199

This means every row with user_id in (1024, 2050, 3199) that was written before this delete is removed. More precisely, an equality delete applies only to data files whose sequence number is lower than the delete file's own, so if the same key is written again in a later snapshot, that row is considered alive.
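How sequence numbers disambiguate a re-inserted key can be sketched as below. The sketch assumes the Iceberg rule that an equality delete applies only to data files with a strictly lower sequence number than the delete file; the function name and the tuple layout are hypothetical.

```python
def is_deleted_by_equality(user_id, data_seq, equality_deletes):
    """Check one row against equality deletes.

    equality_deletes is a list of (delete_seq, set_of_user_ids) pairs.
    A delete only affects rows from data files written before it.
    """
    return any(
        data_seq < delete_seq and user_id in keys
        for delete_seq, keys in equality_deletes
    )


deletes = [(10, {1024, 2050, 3199})]
print(is_deleted_by_equality(1024, 9, deletes))   # True: written before the delete
print(is_deleted_by_equality(1024, 11, deletes))  # False: re-inserted afterwards
print(is_deleted_by_equality(7, 9, deletes))      # False: key not in the delete
```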

Workload fit:

When you only know keys: "delete all rows for these user IDs" (the typical GDPR-correction pattern).
Delete events from external systems where you can't track each old position (file + row number).

3.4 Position vs Equality

Aspect	Position delete	Equality delete
Unit	(file path, row position)	(column = value)
Needs old position	Required	Not required
Write cost	Cost of locating positions	Cost of extracting keys
Read-apply cost	Per-file matching	Evaluate keys across all data files
Best for	CDC, MERGE	GDPR / key-based corrections
Typical performance	Faster	Higher apply cost

Recommendation:

Prefer position delete when possible. Equality is the safety net for "no positional info."
It is normal for both to coexist; Iceberg uses sequence numbers to decide apply order.

3.5 V2 metadata.json — what differs from V1

{
  "format-version": 2,
  "table-uuid": "5f8a-...",
  "last-sequence-number": 142,
  "schemas": [ { "schema-id": 0, "fields": [ ... ] } ],
  "current-schema-id": 0,
  "partition-specs": [ { "spec-id": 0, "fields": [ ... ] } ],
  "default-spec-id": 0,
  "current-snapshot-id": 8123...,
  "snapshots": [
    {
      "snapshot-id": 8123...,
      "sequence-number": 142,
      "timestamp-ms": 1747804790000,
      "summary": {
        "operation": "delete",
        "added-position-delete-files": "1",
        "added-position-deletes": "47"
      },
      "manifest-list": "s3://.../snap-8123-1-abc.avro",
      "schema-id": 0
    }
  ]
}

Changes versus V1:

schema → schemas array (preserves full schema-evolution history).
partition-spec → partition-specs array (enables partition evolution).
A top-level last-sequence-number appears.
Each snapshot has a sequence-number.
The snapshot summary now reports delete-file statistics.
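Tooling that inspects metadata.json directly has to cope with both shapes. The sketch below shows one way to resolve the current schema, assuming the metadata has already been parsed into a dict; the helper name `current_schema` is hypothetical.

```python
def current_schema(metadata: dict) -> dict:
    """Return the active schema from either a V1-style or V2-style metadata dict."""
    if "schemas" in metadata:
        # V2 style: pick the entry that current-schema-id points at.
        current_id = metadata["current-schema-id"]
        for schema in metadata["schemas"]:
            if schema["schema-id"] == current_id:
                return schema
        raise KeyError(f"schema-id {current_id} not found")
    # V1 style: a single schema object.
    return metadata["schema"]


v1 = {"format-version": 1, "schema": {"schema-id": 0, "fields": []}}
v2 = {
    "format-version": 2,
    "current-schema-id": 1,
    "schemas": [{"schema-id": 0, "fields": []}, {"schema-id": 1, "fields": []}],
}
print(current_schema(v1)["schema-id"])  # 0
print(current_schema(v2)["schema-id"])  # 1
```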

3.6 Operational impact of adopting V2

The moment you turn V2 on, new operational responsibilities arrive.

Delete-file accumulation — Frequent corrections and deletes make delete files grow rapidly. Read cost grows cumulatively.
Compaction is mandatory — V2 MoR tables must absorb deletes via periodic rewrite_data_files.
Read-side memory pressure — When there are many equality deletes, keys are evaluated against every data file, increasing memory usage.

-- The standard compaction pattern for V2 tables
CALL system.rewrite_data_files(
  table => 'db.events',
  options => map(
    'target-file-size-bytes',  '536870912',
    'delete-file-threshold',   '5',     -- Rewrite when 5+ deletes per data file
    'min-input-files',         '5'
  )
);
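The effect of the delete-file-threshold option can be approximated in plain code. This is only a sketch of the selection idea (rewrite a data file once enough deletes reference it), not the procedure's actual planning logic; the function name and the per-file counts are hypothetical.

```python
def files_needing_rewrite(delete_counts: dict[str, int], threshold: int = 5) -> list[str]:
    """Return data files whose associated delete count meets the threshold."""
    return sorted(path for path, count in delete_counts.items() if count >= threshold)


# Made-up counts of deletes associated with each data file.
delete_counts = {"00000-aab.parquet": 7, "00001-bcd.parquet": 2, "00002-cde.parquet": 5}
print(files_needing_rewrite(delete_counts))  # ['00000-aab.parquet', '00002-cde.parquet']
```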

3.7 Why V2 became the default recommendation

Since roughly 2024, V2 has been the default choice for almost every new Iceberg table. Reasons:

V2 contains every V1 feature (no reason to choose V1 for analytics).
Operational features like partition evolution only behave correctly from V2.
Most engines treat V2 as first-class.
The moment you start CDC or MERGE, you need V2.

4. V3 (2025+) — Leaps in Expressiveness and Efficiency

V3 was finalized in 2024–2025 and has been gradually adopted by major engines since 2025. It brings five big changes.

4.1 Deletion Vectors (Puffin)

Solves the efficiency problem of V2 position deletes.

V2 position deletes listed (file_path, position) rows inside a Parquet file. Deleting 10,000 rows from one data file produces a 10,000-row Parquet file. At read time you load it into memory and apply via hash or sort-merge.

V3 stores the same information compressed as a Roaring bitmap inside a Puffin format file.

Puffin file
  └─ blob: "deletion-vector-v1"
       schema: Roaring bitmap of deleted positions
       referencedDataFile: s3://.../data/00000-aab.parquet

Effects:

Space efficiency: even tens of thousands of deletes per file fit in kilobytes.
Read efficiency: each row position is checked against the bitmap directly. No sort or hash join is needed.
One deletion vector per data file: multiple position deletes for the same data file no longer accumulate.

After V3 adoption, read performance for MoR workloads typically improves 30–70% (depending on the table and query).
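The bitmap idea can be illustrated with a plain Python integer standing in for the bitmap. This is deliberately not a Roaring bitmap and not the Puffin encoding; it only shows why merging and applying deletes becomes cheap bit arithmetic. All function names are hypothetical.

```python
def make_deletion_vector(positions) -> int:
    """Build a bitmap (as an int) with one bit set per deleted row position."""
    bits = 0
    for position in positions:
        bits |= 1 << position
    return bits


def merge_vectors(a: int, b: int) -> int:
    """Combine two sets of deletes for the same data file into one vector."""
    return a | b


def is_deleted(vector: int, position: int) -> bool:
    return bool((vector >> position) & 1)


def live_rows(rows, vector: int):
    return [row for pos, row in enumerate(rows) if not is_deleted(vector, pos)]


dv = merge_vectors(make_deletion_vector([42, 105, 219]), make_deletion_vector([7]))
print(is_deleted(dv, 42), is_deleted(dv, 43))                   # True False
print(live_rows(["r0", "r1", "r2"], make_deletion_vector([1])))  # ['r0', 'r2']
```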

4.2 Column default values

In V1 and V2, adding a column always filled existing rows with NULL. V3 supports default values at the spec level.

ALTER TABLE events ADD COLUMN priority STRING DEFAULT 'normal';
 
-- Existing rows read as 'normal'. No data rewrite.
SELECT count(*) FROM events WHERE priority = 'normal';

The essentials:

Default applied without data rewrite. The default value is recorded in metadata.json, and engines apply it automatically when reading old data files.
Two kinds of defaults: initial-default for rows that existed before the column was added, and write-default applied to new writes.
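The difference between the two defaults can be sketched as follows. The sketch models a row as a dict and treats a missing key as "the data file predates the column"; that modeling and the function names are assumptions made for illustration.

```python
def read_column(row: dict, name: str, initial_default):
    """Read a column from a row whose data file may predate the column."""
    # A missing key means the file was written before the column existed,
    # so the initial-default is substituted at read time.
    return row[name] if name in row else initial_default


def write_row(values: dict, name: str, write_default) -> dict:
    """Fill in the write-default when a new row omits the column."""
    out = dict(values)
    out.setdefault(name, write_default)
    return out


print(read_column({"event_id": 1}, "priority", "normal"))                    # normal
print(read_column({"event_id": 2, "priority": "high"}, "priority", "normal"))  # high
print(write_row({"event_id": 3}, "priority", "normal"))
```

Note that an explicit NULL written after the column was added stays NULL; only rows from files that never contained the column pick up the initial-default.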

4.3 Variant type

Stores semi-structured data like JSON with a consistent encoding. Spark, Snowflake, and Databricks already had their own Variants, but with different encodings. V3 defines a common Variant at the spec level, ensuring cross-engine compatibility.

CREATE TABLE events (
  event_id BIGINT,
  payload  VARIANT
) USING iceberg
TBLPROPERTIES ('format-version' = '3');
 
INSERT INTO events VALUES
  (1, parse_json('{"type":"click","url":"/blog","ua":"chrome/120"}'));
 
-- Extract with engine-specific functions
SELECT event_id,
       payload:type::STRING AS type,
       payload:url::STRING  AS url
FROM events
WHERE payload:type::STRING = 'click';

Strengths:

Another schema-evolution tool — Putting frequently-changing fields in a Variant avoids the cost of column additions and removals.
Cross-engine consistency — Spark, Trino, and Snowflake interpret the same Variant identically.

Caveat:

Columns inside a Variant have no column statistics. Fields hit by frequent predicates are still better as separate columns.

4.4 Geospatial types

V3 defines Geometry and Geography types based on the OGC Simple Feature Access standard (the model behind the ST_ functions standardized in ISO SQL/MM).

CREATE TABLE locations (
  id     BIGINT,
  name   STRING,
  geom   GEOMETRY
) USING iceberg
TBLPROPERTIES ('format-version' = '3');
 
-- Spatial predicate (engine function names vary)
SELECT id, name
FROM locations
WHERE ST_Contains(
        ST_GeomFromText('POLYGON((...))'),
        geom);

With a consistent cross-engine representation, the same spatial data is portable across Spark, Trino, Snowflake, and BigQuery (function names differ, but the data is compatible).

4.5 Row Lineage

Assigns each row a stable ID and a last-updated sequence. Two hidden columns are added.

Field	Meaning
`_row_id`	Permanent unique ID within the table. Once assigned, never changes.
`_last_updated_sequence_number`	Sequence number of the snapshot that last modified the row.

Applications:

Reliable CDC — External systems track rows by _row_id and express change detection precisely.
Reproducible ML training — Recording the set of row IDs used for training lets you reproduce "exactly these rows," beyond what time travel alone can do.
Data governance — Row-level lineage tracking.

-- CDC pattern example
SELECT _row_id,
       _last_updated_sequence_number,
       *
FROM events
WHERE _last_updated_sequence_number > 1024;  -- Changes since the last processed sequence
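On the consumer side, the same pattern usually pairs the filter with a checkpoint. The sketch below assumes rows have already been fetched as dicts that include the two lineage columns; the function name `changes_since` and the sample rows are hypothetical.

```python
def changes_since(rows, last_seq: int):
    """Return rows modified after last_seq and the next checkpoint value."""
    changed = [r for r in rows if r["_last_updated_sequence_number"] > last_seq]
    next_checkpoint = max(
        (r["_last_updated_sequence_number"] for r in changed),
        default=last_seq,  # nothing changed: keep the old checkpoint
    )
    return changed, next_checkpoint


# Made-up rows for illustration.
rows = [
    {"_row_id": 1, "_last_updated_sequence_number": 1020},
    {"_row_id": 2, "_last_updated_sequence_number": 1025},
    {"_row_id": 3, "_last_updated_sequence_number": 1031},
]
changed, checkpoint = changes_since(rows, 1024)
print([r["_row_id"] for r in changed], checkpoint)  # [2, 3] 1031
```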

4.6 Other changes

Multi-argument transforms: partition transforms are generalized to accept multiple arguments.
Nanosecond timestamps: new timestamp_ns and timestamptz_ns types carry nanosecond precision explicitly.
Improved manifest statistics: new stats fields improve pruning efficiency.

5. V1 vs V2 vs V3 at a Glance

Item	V1	V2	V3
Introduced	2018	2021	2025
Metadata model	3-tier tree	Same + sequence numbers	Same
Schema evolution	✓	✓	✓ + default values
Partition evolution	Limited	✓ (specs array)	✓
Hidden partitioning	✓	✓	✓ + multi-arg transforms
Time travel	✓	✓	✓
Branch / Tag	Basic (main only)	✓	✓
Sequence number	None	Yes	Yes
Row-level delete	CoW only (no delete files)	Position / Equality delete	Deletion Vectors (Puffin) + Equality delete
Concurrency consistency	Basic	Strengthened	Strengthened
Variant type	No	No	✓
Geospatial types	No	No	✓
Row Lineage	No	No	✓
Default value	NULL only	NULL only	✓
Engine support	Widespread	Widespread	Spreading

6. Upgrading the Spec

6.1 V1 → V2

The simplest upgrade.

ALTER TABLE db.events SET TBLPROPERTIES ('format-version' = '2');

After upgrade:

Old data files remain in place. Only new data files and delete files are added in V2 format.
Sequence numbers start being assigned (old snapshots stay at 0 or null; new ones increase monotonically).
New writes can use MoR (write.delete.mode = 'merge-on-read').

Checkpoints:

Confirm every in-house engine supports V2 (as of 2026 nearly every major engine does).
Announce internally that downgrade is not possible.

6.2 V2 → V3

V3 is richer, so the upgrade requires more care.

ALTER TABLE db.events SET TBLPROPERTIES ('format-version' = '3');

What to confirm:

Every reader engine can read V3. If even one engine doesn't know V3, that engine may fail on the entire table.
You don't have to use new features immediately. After upgrading to V3, if you don't use Variant, Geospatial, or Row Lineage, the table effectively behaves like V2. Upgrading to V3 is the decision to "keep options open."
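Deletion Vectors phase in gradually. Existing V2 position delete files stay valid after the upgrade; new positional deletes are written as deletion vectors, and older position deletes for a data file are merged into its vector the next time a delete is written for that file or the file is compacted.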

6.3 Pre-upgrade checklist

Engine support matrix for the spec (especially V3)
Back up the catalog and metadata snapshots
Test that operational automation jobs work under the new spec
Adjust monitoring thresholds for the new spec (add deletion-vector size metrics, for example)
Reflect the change in the internal data catalog and docs

7. Engine Support by Version (as of May 2026)

Engine	V1 read	V1 write	V2 read	V2 write	V3 read	V3 write
Apache Spark	✓	✓	✓	✓	✓	✓ (in progress)
Trino	✓	✓	✓	✓	Partial	In progress
Apache Flink	✓	✓	✓	✓	In progress	In progress
Snowflake	✓	✓	✓	✓	In progress	In progress
Databricks (Unity)	✓	✓	✓	✓	In progress	In progress
BigQuery (BigLake)	✓	Partial	✓	Partial	In progress	In progress
AWS Athena	✓	✓	✓	✓	In progress	In progress
ClickHouse	✓	Experimental	✓	Experimental	Partial	✗
DuckDB	✓	Experimental	✓	Experimental	Partial	✗
PyIceberg	✓	✓	✓	✓	In progress	In progress

The table is a generalized snapshot of typical support levels. At adoption time, consult the latest release notes for each engine alongside the compatibility tables in the official Iceberg spec.

Key observations:

V1 and V2 are mature across every major engine.
V3 is still early in 2026, with significant gaps between engines.
For multi-engine environments, V3 must be aligned to the weakest engine's level of support.
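The "weakest engine" rule reduces to taking a minimum. In the sketch below the engine names and their supported versions are placeholders, not a statement about any real engine; fill them in from your own support matrix.

```python
def max_safe_format_version(engine_support: dict[str, int]) -> int:
    """Highest format-version that every engine in the environment can handle."""
    if not engine_support:
        raise ValueError("no engines listed")
    return min(engine_support.values())


# Placeholder values: highest format-version each engine fully supports.
engine_support = {"engine_a": 3, "engine_b": 2, "engine_c": 3}
print(max_safe_format_version(engine_support))  # 2
```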

8. Which Version Should You Pick

8.1 New tables

Default recommendation: V2. Stable across every major engine, and provides every must-have operational capability — MoR, partition evolution, sequence numbers.
Start with V3 when: every in-house engine supports V3 and you have a clear use case for at least one of Variant, Geospatial, or Row Lineage.

8.2 Existing V1 tables

Append-only analytical tables: Staying on V1 is fine. Still, upgrading to V2 is recommended for operational consistency and access to partition evolution.
Tables that may take on CDC, MERGE, or corrections: Upgrade to V2 immediately.

8.3 Existing V2 tables

No immediate need to move to V3. Upgrade after you have a clear reason to use a V3 feature.
Safer to adopt once your in-house engines' V3 support has matured.

8.4 Workload-by-workload guide

Workload	Recommended spec	Notes
Daily batch ETL result marts	V2	V1 also works, but V2 for operational consistency
Clickstream / log appends	V2	Benefits from sequence-number consistency
CDC ingestion (with positional tracking)	V2 (position delete)	V3 brings further efficiency
GDPR corrections / key-based deletes	V2 (equality delete)	Deletion vectors replace position deletes only; equality deletes still need compaction
Frequent MERGE (SCD)	V2 (MoR) + periodic compaction	V3 recommended (deletion vectors)
Semi-structured JSON payloads	V3 (Variant)	If any engine lacks V3, design a separate column
Location-based analysis	V3 (Geospatial)	Confirm engine function compatibility
Precise row-level CDC / ML reproducibility	V3 (Row Lineage)	Validate engine maturity

9. Decisions That Travel with the Spec Version

Picking a spec version is not a standalone decision. Bundle the following decisions for a clean operation.

9.1 Catalog

The catalog has to understand V3 metadata. Check REST Catalog spec support (v1.6+), or the V3-aware versions of Glue, Unity, and Polaris.

9.2 Write mode (CoW vs MoR)

Even after upgrading to V2/V3, CoW may still be the right choice for analytics-heavy tables.
Set write.delete.mode, write.update.mode, and write.merge.mode per table explicitly.
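One way to keep these settings explicit is to generate the statement from a reviewed per-table config. The helper below is a hypothetical sketch that only builds the SQL string; it assumes the two mode values `copy-on-write` and `merge-on-read` and does not execute anything.

```python
VALID_MODES = {"copy-on-write", "merge-on-read"}


def write_mode_ddl(table: str, delete_mode: str, update_mode: str, merge_mode: str) -> str:
    """Build an ALTER TABLE statement that pins all three write modes."""
    modes = {
        "write.delete.mode": delete_mode,
        "write.update.mode": update_mode,
        "write.merge.mode": merge_mode,
    }
    for key, value in modes.items():
        if value not in VALID_MODES:
            raise ValueError(f"{key}: unsupported mode {value!r}")
    props = ", ".join(f"'{key}' = '{value}'" for key, value in modes.items())
    return f"ALTER TABLE {table} SET TBLPROPERTIES ({props});"


print(write_mode_ddl("db.events", "merge-on-read", "merge-on-read", "copy-on-write"))
```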

9.3 Operational automation

After V2, compaction, expiration, and orphan cleanup are not optional — they are mandatory.
Add delete-file / data-file ratios and deletion-vector accumulation to your monitoring.
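A delete-file to data-file ratio can be derived from a snapshot summary. The sketch assumes the summary exposes `total-data-files` and `total-delete-files` as string values (summary values are strings in the V2 example earlier); treat those key names and the sample numbers as assumptions to verify against your own metadata.

```python
def delete_file_ratio(summary: dict) -> float:
    """Delete files per data file, from a snapshot summary dict."""
    data_files = int(summary.get("total-data-files", "0"))
    delete_files = int(summary.get("total-delete-files", "0"))
    if data_files == 0:
        return 0.0  # empty table: nothing to compare against
    return delete_files / data_files


# Made-up summary for illustration.
summary = {"operation": "delete", "total-data-files": "200", "total-delete-files": "50"}
print(delete_file_ratio(summary))  # 0.25
```

An alert on this ratio crossing a chosen threshold is one simple trigger for scheduling compaction.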

9.4 Engine standardization

In a multi-engine environment, the weakest engine constrains the spec version.
Bake spec-compatibility checks into your process whenever a new engine is introduced.

10. Wrap-up

V1 established Iceberg's base model but cannot handle row-level operations. There is essentially no reason to pick V1 for a new table in 2026.
V2 introduced sequence numbers and delete files, enabling MoR, CDC, and correction workloads. It is the standard recommendation for new tables in 2026.
V3 lifts expressiveness and efficiency with Deletion Vectors, Variant, Geospatial, Row Lineage, and default values. Adopt gradually in areas where engine support has matured.
The spec version must be decided together with the catalog, write mode, operational automation, and engine standards.

The Iceberg spec is evolving fast, and the direction is consistent — implementing database semantics on top of object storage, more deeply and more efficiently. Understanding spec versions is not just curiosity; it is a key variable that directly drives adoption decisions and operational cost.

References

Apache Iceberg official spec — iceberg.apache.org/spec
Iceberg V2 spec — PR and design documents introducing format-version: 2
Iceberg V3 spec — Variant, Geospatial, and Row Lineage proposal documents
Puffin file format — iceberg.apache.org/puffin-spec
Roaring Bitmap — roaringbitmap.org