Data Compaction Job
Over time, Iceberg tables can slow down and need data compaction to clean them up. IOMETE provides a built-in job that runs data compaction for each table. This job triggers the following Iceberg processes:
- Expire Snapshots - See Maintenance - Expire Snapshots
- Delete Orphan Files - See Maintenance - Delete Orphan Files
- Rewrite Data Files - See Maintenance - Rewrite Data Files
- Rewrite Manifests - See Maintenance
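The four processes listed above correspond to Iceberg's Spark maintenance procedures (`expire_snapshots`, `remove_orphan_files`, `rewrite_data_files`, `rewrite_manifests`). The sketch below only illustrates what running them by hand for one table could look like; it is not the built-in job's actual code. The helper names `compaction_statements` and `run_compaction`, the catalog name `spark_catalog`, and the table `analytics.events` are assumptions made for this example.

```python
from typing import List, Optional


def compaction_statements(
    table: str,
    catalog: str = "spark_catalog",  # assumed catalog name; adjust to your setup
    expire_older_than: Optional[str] = None,
) -> List[str]:
    """Build the four Iceberg maintenance CALL statements for one table.

    `table` is a 'database.table' identifier. `expire_older_than` is an
    optional timestamp string such as '2024-01-01 00:00:00'.
    """
    expire_args = f"table => '{table}'"
    if expire_older_than is not None:
        expire_args += f", older_than => TIMESTAMP '{expire_older_than}'"
    # Same order as the list above: expire snapshots, delete orphan files,
    # rewrite data files, rewrite manifests.
    return [
        f"CALL {catalog}.system.expire_snapshots({expire_args})",
        f"CALL {catalog}.system.remove_orphan_files(table => '{table}')",
        f"CALL {catalog}.system.rewrite_data_files(table => '{table}')",
        f"CALL {catalog}.system.rewrite_manifests(table => '{table}')",
    ]


def run_compaction(spark, table: str, catalog: str = "spark_catalog") -> None:
    """Run each statement in order on an existing SparkSession-like object."""
    for statement in compaction_statements(table, catalog):
        spark.sql(statement)


if __name__ == "__main__":
    # Hypothetical table name, printed rather than executed.
    for statement in compaction_statements("analytics.events"):
        print(statement)
```

Building the statements separately from running them makes it possible to review the SQL before it touches a table. With a real session, `run_compaction(spark, "analytics.events")` would run the four procedures in sequence.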
To enable the data compaction Spark job, follow these steps:
- Navigate to Job Templates, then click the Deploy button on the Data Compaction Job card.
[Image: IOMETE Spark Jobs]
- You will see the job creation page with all inputs filled.
[Image: Create data compaction job]
Job Configurations
[Image: Data compaction job configurations]
Instance
[Image: Data compaction job instance]
GitHub
We've created an initial data compaction job that should be enough in most cases. Feel free to fork it and create a new data compaction image based on your company's requirements. View on GitHub