Z-ORDER sorting during compaction
Apache Iceberg supports Z-ORDER sorting during compaction (the rewrite_data_files procedure), but not during regular inserts or as a setting in CREATE TABLE.
To order an entire existing dataset with Z-ORDER, follow these steps:
- Set a default WRITE ORDERED BY for the table. This applies a regular sort to future writes; it does not Z-ORDER them.
ALTER TABLE db.table_name WRITE ORDERED BY (col1, col2);
- Perform a rewrite_data_files operation with the sort strategy specified and the rewrite-all option set to true.
CALL spark_catalog.system.rewrite_data_files(
table => 'db.table_name',
strategy => 'sort',
sort_order => 'zorder(col1, col2)',
options => map('rewrite-all', 'true')
);
Additional notes
- Rewriting the whole dataset reads and rewrites every data file, which can be very expensive, so do this only when necessary.
- There is an open issue on GitHub to add support for Z-ORDER sorting during normal inserts.
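
Because a full rewrite is expensive, one way to limit the cost is to restrict the compaction to a subset of the table with the where argument of rewrite_data_files. The following is a minimal sketch, not a recommended configuration: the event_date column and the date value are hypothetical, and the filter works best on a column the table is partitioned by.

```sql
-- Sketch: Z-ORDER only the files that match a filter instead of the whole table.
-- event_date is a hypothetical column; replace it with a real (ideally partition) column.
CALL spark_catalog.system.rewrite_data_files(
  table => 'db.table_name',
  strategy => 'sort',
  sort_order => 'zorder(col1, col2)',
  -- only files that may contain rows matching this predicate are selected
  where => 'event_date >= "2024-01-01"',
  -- rewrite every selected file, even if it is already well sized
  options => map('rewrite-all', 'true')
);
```

Files outside the filter are left untouched, so they keep whatever ordering they had before the call.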