Features | DataEngLab

📄️ Apache Iceberg Features

Apache Iceberg is a table format for large analytic datasets. Its main features come from keeping table state in metadata instead of relying on directory layout or engine-specific conventions.

📄️ Schema Evolution

Schema evolution lets a table's schema change over time without rewriting existing data files. Iceberg tracks columns by stable field IDs rather than by name or position, so it can distinguish a renamed column from a column that was deleted and then recreated.
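The role of field IDs can be illustrated with a toy model. This is a minimal sketch, not Iceberg's API: the `Field` class and `read_row` helper are hypothetical, and a stored row is represented as a plain dict keyed by field ID.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    field_id: int  # stable ID, never reused (assumption of this sketch)
    name: str      # current column name, free to change


def read_row(stored: dict, schema: list) -> dict:
    """Project a stored row (keyed by field ID) onto the current schema."""
    return {f.name: stored.get(f.field_id) for f in schema}


# A row written under the original schema: values are looked up by field ID.
stored_row = {1: 42, 2: "alice"}

# Rename: same ID, new name, so the old value is still visible.
schema_renamed = [Field(1, "id"), Field(2, "full_name")]

# Drop and re-add: the new column gets a fresh ID, so old values do not leak in.
schema_readded = [Field(1, "id"), Field(3, "name")]

renamed_view = read_row(stored_row, schema_renamed)
readded_view = read_row(stored_row, schema_readded)
print(renamed_view)
print(readded_view)
```

Resolving columns by ID rather than by name is what keeps a rename from looking like data loss, and a re-added column from silently picking up stale values.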

📄️ Hidden Partitioning

Hidden partitioning separates the logical table schema from the physical partition layout. Users query real columns, while Iceberg derives partition values internally.
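As a rough illustration, the sketch below derives a day partition from a timestamp column and prunes partitions for a filter written against that column. It is a plain-Python model, not the Iceberg API; the `event_ts` column, the `day_transform` helper, and the `scan` function are assumptions made for the example.

```python
from datetime import date, datetime, timezone

EPOCH = date(1970, 1, 1)


def day_transform(ts: datetime) -> int:
    """Derive a partition value: whole days since 1970-01-01 in UTC."""
    return (ts.astimezone(timezone.utc).date() - EPOCH).days


rows = [
    {"event_id": 1, "event_ts": datetime(2024, 1, 1, 23, 59, tzinfo=timezone.utc)},
    {"event_id": 2, "event_ts": datetime(2024, 1, 2, 0, 1, tzinfo=timezone.utc)},
]

# Writers never supply a partition column; it is computed from event_ts.
partitions = {}
for row in rows:
    partitions.setdefault(day_transform(row["event_ts"]), []).append(row)


def scan(lo: datetime, hi: datetime) -> list:
    """Filter on the real column; skip partitions outside the derived range."""
    lo_day, hi_day = day_transform(lo), day_transform(hi)
    hits = []
    for day, part in partitions.items():
        if lo_day <= day <= hi_day:  # partition pruning
            hits.extend(r for r in part if lo <= r["event_ts"] <= hi)
    return [r["event_id"] for r in hits]


result = scan(
    datetime(2024, 1, 2, tzinfo=timezone.utc),
    datetime(2024, 1, 2, 23, 59, 59, tzinfo=timezone.utc),
)
print(result)
```

The query mentions only `event_ts`; the partition value never appears in the schema or in the filter.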

📄️ Partition Evolution

Partition evolution lets you change a table's partition strategy without rewriting existing data files. Old files keep their original partition spec, and new files use the new spec.
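One way to picture this: each data file records the ID of the spec it was written under, and scan planning evaluates a filter against each file using that file's own spec. The sketch below is a simplified model with hypothetical file names and two made-up specs (monthly, then daily); it is not Iceberg's planner.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataFile:
    path: str
    spec_id: int      # which partition spec the file was written under
    partition: tuple  # partition value under that spec


# Spec 0 partitions by month, spec 1 by day (both applied to an ISO date string).
specs = {
    0: lambda d: (d[:7],),  # "2024-02-10" -> ("2024-02",)
    1: lambda d: (d,),      # "2024-02-10" -> ("2024-02-10",)
}

files = [
    DataFile("f0.parquet", 0, ("2024-01",)),
    DataFile("f1.parquet", 0, ("2024-02",)),
    DataFile("f2.parquet", 1, ("2024-02-10",)),
    DataFile("f3.parquet", 1, ("2024-02-11",)),
]


def plan(day: str) -> list:
    """Keep files whose partition could contain rows for the given day."""
    return [f.path for f in files if specs[f.spec_id](day) == f.partition]


planned = plan("2024-02-10")
print(planned)
```

The old monthly file for February is still scanned because it may contain the requested day, while the newer daily files are pruned more precisely.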

📄️ Sort Order Evolution

Sort order evolution lets you change how rows are ordered within newly written data files. Files written under an earlier sort order are left unchanged. Sort order is separate from partitioning and from the logical order of columns in the schema.

📄️ Time Travel

Time travel lets you query an Iceberg table as it existed at a previous snapshot. This is useful for audits, reproducible reports, backfills, debugging, and incident recovery.
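Conceptually, a timestamp-based read resolves to the latest snapshot committed at or before the requested time. The sketch below models that lookup over an in-memory snapshot log; the `Snapshot` class, the `as_of` helper, and the file names are illustrative assumptions, not engine syntax.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    snapshot_id: int
    timestamp_ms: int
    files: tuple  # data files visible in this snapshot


# Snapshot log in commit order (timestamps ascending).
history = [
    Snapshot(1, 1_000, ("a.parquet",)),
    Snapshot(2, 2_000, ("a.parquet", "b.parquet")),
    Snapshot(3, 3_000, ("b.parquet", "c.parquet")),
]


def as_of(timestamp_ms: int) -> Snapshot:
    """Return the latest snapshot committed at or before the timestamp."""
    times = [s.timestamp_ms for s in history]
    idx = bisect_right(times, timestamp_ms)
    if idx == 0:
        raise ValueError("no snapshot at or before that time")
    return history[idx - 1]


snap = as_of(2_500)
print(snap.snapshot_id, snap.files)
```

Because each snapshot lists its own set of files, reading an old snapshot does not depend on what later commits added or removed.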

📄️ Branching and Tagging

Branches and tags are named references to snapshots. They make snapshot history easier to manage for audit, experiments, validation, and release workflows.
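A small model of the idea, assuming a branch is a reference that advances with new commits and a tag is a reference that stays fixed. The `refs` mapping, the ref names, and the `commit_to` helper are hypothetical.

```python
# Named references: each maps to (kind, snapshot_id).
refs = {
    "main": ("branch", 3),
    "audit-2024-q1": ("tag", 2),
}


def resolve(ref_name: str) -> int:
    """Look up the snapshot a named reference points at."""
    return refs[ref_name][1]


def commit_to(ref_name: str, new_snapshot_id: int) -> None:
    """Advance a branch to a new snapshot; tags are treated as immutable here."""
    kind, _ = refs[ref_name]
    if kind != "branch":
        raise ValueError(f"{ref_name} is a {kind} and cannot be advanced")
    refs[ref_name] = (kind, new_snapshot_id)


commit_to("main", 4)
print(resolve("main"), resolve("audit-2024-q1"))
```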

📄️ Row-Level Deletes

Row-level deletes let Iceberg remove rows from immutable data files without immediately rewriting the whole file. Deletes are recorded in separate delete files and applied to the data at read time.
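The read-time merge can be sketched with position deletes, where each delete entry names a data file and a row position within it. This is a toy model: the file path and the `read` helper are made up, and real readers work on columnar files rather than Python lists.

```python
# An immutable data file, modelled as a path plus its rows in order.
data_file = {"path": "data-001.parquet", "rows": ["a", "b", "c", "d"]}

# Position deletes: (file path, row position) pairs stored apart from the data.
position_deletes = {
    ("data-001.parquet", 1),
    ("data-001.parquet", 3),
}


def read(data_file: dict, deletes: set) -> list:
    """Return live rows by skipping positions listed in the delete set."""
    path = data_file["path"]
    return [
        row
        for pos, row in enumerate(data_file["rows"])
        if (path, pos) not in deletes
    ]


live_rows = read(data_file, position_deletes)
print(live_rows)
```

The data file itself is never modified; a later compaction can rewrite it without the deleted rows and drop the delete entries.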

📄️ Concurrency and Isolation

Iceberg is designed for concurrent readers and writers. Readers use committed snapshots, while writers create new metadata and commit changes atomically through a catalog.
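The atomic commit can be modelled as a compare-and-swap on the catalog's pointer to the current table metadata. The sketch below is an in-memory stand-in, not a real catalog client: the `Catalog` class, its `compare_and_swap` method, and the integer "metadata version" are assumptions for illustration.

```python
import threading


class Catalog:
    """Toy catalog holding one pointer to the current metadata version."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self.current = 0

    def compare_and_swap(self, expected: int, new: int) -> bool:
        """Swap the pointer only if nobody else committed in the meantime."""
        with self._lock:
            if self.current != expected:
                return False
            self.current = new
            return True


def commit(catalog: Catalog, max_retries: int = 3) -> int:
    """Optimistic commit: re-read the base and retry when the swap fails."""
    for _ in range(max_retries):
        base = catalog.current
        new = base + 1  # stands in for writing a new metadata file
        if catalog.compare_and_swap(base, new):
            return new
    raise RuntimeError("commit failed after retries")


catalog = Catalog()
base = catalog.current                         # two writers start from the same base
first_ok = catalog.compare_and_swap(base, 1)   # writer A wins
second_ok = catalog.compare_and_swap(base, 1)  # writer B's base is now stale
retried = commit(catalog)                      # writer B retries on the new base
print(first_ok, second_ok, retried)
```

Readers are unaffected by the race: they keep using whichever committed version they resolved before the swap.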