What is Apache Iceberg?
Iceberg is a high-performance open table format for huge analytic datasets. Iceberg brings the reliability and simplicity of SQL tables to big data, while making it possible for engines like Spark, Trino, Flink, Presto, Hive and Impala to safely work with the same tables, at the same time.
Open Standard
Iceberg has been designed and developed to be an open community standard with a
specification to ensure compatibility across languages and implementations.
Apache Iceberg is open source, and is developed at the
Apache Software Foundation.
IOMETE and Apache Iceberg
IOMETE is a fully managed (ready to use, batteries included) data platform. IOMETE optimizes clustering, compaction, and access control for Iceberg tables. At the heart of the IOMETE platform is a
serverless lakehouse that uses Apache Iceberg as its core table format. The IOMETE platform includes the following modules:
Apache Iceberg Benefits
Iceberg avoids unpleasant surprises. Schema evolution works and won’t inadvertently un-delete data. Users don’t need to know about partitioning to get fast queries.
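One way to see why schema evolution does not bring deleted data back is to track columns by a unique ID instead of by name. The sketch below is a toy model, not Iceberg's implementation: the `Schema` class, its methods, and the row layout (a dict keyed by field ID) are all hypothetical names chosen for illustration.

```python
class Schema:
    """Toy schema that identifies columns by a never-reused field ID (hypothetical model)."""

    def __init__(self):
        self._next_id = 1
        self.columns = {}  # column name -> field ID

    def add_column(self, name):
        if name in self.columns:
            raise ValueError(f"column {name!r} already exists")
        # Every added column gets a fresh ID, even if the name was used before.
        self.columns[name] = self._next_id
        self._next_id += 1

    def drop_column(self, name):
        del self.columns[name]

    def read(self, stored_row):
        """Project a stored row (field ID -> value) through the current schema."""
        return {name: stored_row.get(field_id) for name, field_id in self.columns.items()}


schema = Schema()
schema.add_column("id")     # field ID 1
schema.add_column("email")  # field ID 2

# A row written while "email" had field ID 2.
old_row = {1: 42, 2: "a@example.com"}

schema.drop_column("email")
schema.add_column("email")  # same name, but a new field ID (3)

row = schema.read(old_row)
print(row)  # {'id': 42, 'email': None}
```

Because the re-added `email` column has a new ID, the old values stored under ID 2 stay hidden. A name-based lookup would have resurfaced them.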
Reliability and Performance
Iceberg was built for huge tables. It is used in production where a single table can contain tens of petabytes of data, and even these huge tables can be read without a distributed SQL engine.
- Scan planning is fast – a distributed SQL engine isn’t needed to read a table or find files.
- Advanced filtering – data files are pruned with partition and column-level stats, using table metadata.
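The pruning described in the two points above can be sketched in a few lines. This is a simplified illustration that assumes each data file records one partition value and the minimum and maximum of a single column. `DataFile`, `plan_scan`, and the file names are hypothetical and do not reflect Iceberg's actual metadata layout or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataFile:
    path: str
    partition: str    # assumed: one partition value per file, e.g. an event date
    lower_bound: int  # minimum value of the filtered column in this file
    upper_bound: int  # maximum value of the filtered column in this file


def plan_scan(files, partition, lo, hi):
    """Return paths of files that may hold rows in `partition` with lo <= value <= hi."""
    selected = []
    for f in files:
        if f.partition != partition:
            continue  # pruned by partition value
        if f.upper_bound < lo or f.lower_bound > hi:
            continue  # pruned by column-level min/max stats
        selected.append(f.path)
    return selected


# Metadata only: no data file is opened while planning.
manifest = [
    DataFile("a.parquet", "2024-01-01", 0, 99),
    DataFile("b.parquet", "2024-01-01", 100, 199),
    DataFile("c.parquet", "2024-01-02", 0, 199),
]

result = plan_scan(manifest, "2024-01-01", 150, 160)
print(result)  # ['b.parquet']
```

Planning is a single pass over a small metadata list, so it can run in one process without a cluster.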
Iceberg was designed to solve correctness problems in eventually-consistent cloud object stores.
- Works with any cloud store and reduces NameNode (NN) congestion on HDFS by avoiding file listing and renames.
- Multiple concurrent writers use optimistic concurrency and will retry to ensure that compatible updates succeed, even when writes conflict.
- Serializable isolation – table changes are atomic and readers never see partial or uncommitted changes.
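The optimistic concurrency and atomic commits described above can be illustrated with a compare-and-swap loop. This is a minimal in-memory sketch under stated assumptions: `Catalog`, `CommitConflict`, and `append_file` are hypothetical names, and a real catalog would swap a pointer to a table metadata file instead of a Python set.

```python
import threading


class CommitConflict(Exception):
    """Raised when the table changed between reading and committing."""


class Catalog:
    """Hypothetical in-memory catalog holding one table's current state."""

    def __init__(self):
        self._lock = threading.Lock()
        self.version = 0
        self.files = frozenset()

    def load(self):
        with self._lock:
            return self.version, self.files

    def swap(self, expected_version, new_files):
        # Atomic compare-and-swap: commit only if nobody else committed first.
        with self._lock:
            if self.version != expected_version:
                raise CommitConflict(f"expected v{expected_version}, found v{self.version}")
            self.version += 1
            self.files = new_files


def append_file(catalog, path, max_retries=100):
    """Append one data file, retrying against the latest state on conflict."""
    for _ in range(max_retries):
        version, files = catalog.load()
        try:
            catalog.swap(version, files | {path})
            return
        except CommitConflict:
            continue  # another writer won; reload and reapply the append
    raise RuntimeError("too many conflicting commits")


catalog = Catalog()
writers = [
    threading.Thread(target=append_file, args=(catalog, f"file-{i}.parquet"))
    for i in range(8)
]
for w in writers:
    w.start()
for w in writers:
    w.join()

print(catalog.version, len(catalog.files))
```

Readers only ever see a state returned by `load`, which is always a fully committed version. A writer that loses the race reapplies its append to the newer state, so no append is lost.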
The Apache Iceberg Community
Iceberg has a vibrant and active community. Code contributions have grown consistently year over year, and contributors come from many companies, including Netflix, Apple, and AWS. Regular monthly community syncs offer an open discussion of current development items and highlights of recently merged features. The Iceberg community has fostered a very friendly and engaging culture, which has been a key part of the project’s success.
Related Docs
- Check out our blog post on Apache Iceberg.
- Download the IOMETE Apache Iceberg Cheat Sheet (PDF).
Apache Iceberg Resources