Z-ORDER sorting during compaction

October 25, 2023 · One min read

Aytan Jalilova

Developer Advocate @ IOMETE

Apache Iceberg supports Z-ORDER sorting only during compaction (the rewrite_data_files procedure). It cannot be applied during normal inserts or configured when the table is created.

To force the whole dataset to be ordered using Z-ORDER, follow these steps:

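Set a default write order for the table with WRITE ORDERED BY. This is a regular column sort rather than a Z-ORDER, and it applies to subsequent writes.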

ALTER TABLE db.table_name WRITE ORDERED BY (col1, col2);

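Run rewrite_data_files with the sort strategy, a zorder(...) sort order, and the rewrite-all option set to true, so that every data file is rewritten rather than only the files that would normally qualify for compaction.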

CALL spark_catalog.system.rewrite_data_files(
	table => 'db.table_name',
	strategy => 'sort',
	sort_order => 'zorder(col1, col2)',
	options => map('rewrite-all', 'true')
);
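
Once the procedure finishes, one way to sanity-check the result is to look at the table's files metadata table. The query below is a minimal sketch, assuming the same spark_catalog catalog and db.table_name table used above; it only lists the data files and their sizes, and does not by itself prove that the rows are Z-ordered.

```sql
-- List the data files that make up the current table snapshot.
-- Assumes the catalog and table names from the example above.
SELECT file_path, record_count, file_size_in_bytes
FROM spark_catalog.db.table_name.files;
```

Running the same query before and after the rewrite should show a different set of file paths, since rewrite-all replaces every data file.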

Additional notes

Rewriting the whole dataset can be a very expensive operation, so do it only when necessary.
There is also an open issue on GitHub to add support for Z-ORDER sorting during normal inserts.
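
If a full rewrite is too costly, one option is to limit the rewrite to a subset of the data with the procedure's where argument. The following is a sketch only: event_date is a hypothetical column used for illustration, and only the files matching the filter are rewritten, so the rest of the table stays in its existing order.

```sql
-- Z-ORDER only the files that match the filter.
-- event_date is a hypothetical column; replace it with a real one.
CALL spark_catalog.system.rewrite_data_files(
	table => 'db.table_name',
	strategy => 'sort',
	sort_order => 'zorder(col1, col2)',
	where => 'event_date >= "2023-10-01"',
	options => map('rewrite-all', 'true')
);
```

Filtering on a partition column usually keeps the amount of rewritten data predictable.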