File Streaming
Continuously transfer files to Iceberg.
File formats
Tested file formats.
- CSV
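
For illustration, a hypothetical CSV input file (the file name and columns are made up for this example) placed in the configured source path would be picked up by the stream and appended to the destination Iceberg table:

```csv
id,address
1,"221B Baker Street, London"
2,"350 Fifth Avenue, New York"
```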
 
Job creation
- In the left sidebar menu choose Spark Jobs
 - Click on Create
 
Specify the following parameters (these are examples; you can change them based on your preference):

- Name: `file-streaming-job`
- Docker image: `iomete/iomete_file_streaming_job:0.2.0`
- Main application file: `local:///app/driver.py`
- Environment variables: `LOG_LEVEL`: `INFO` or `ERROR`


You can use environment variables to store sensitive data such as passwords and secrets. You can then reference these variables in your config file using the `${DB_PASSWORD}` syntax.
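For example, assuming a hypothetical `DB_PASSWORD` environment variable has been added to the job, a config value could reference it like this (the `password` key is illustrative and not part of the file-streaming config shown below):

```hocon
# Hypothetical fragment: ${DB_PASSWORD} is substituted from the job's
# environment variables when the configuration is loaded.
database: {
  password: ${DB_PASSWORD}
}
```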
Config file
Scroll down and expand the Application configurations section, click Add config file, and paste the following configuration:
```hocon
{
  file: {
    format: csv,
    path: "files/",
    max_files_per_trigger: 1,
    latest_first: false,
    max_file_age: "7d"
  },
  database: {
    schema: default,
    table: awesome_csv_addresses
  },
  processing_time: {
    interval: 5,
    unit: seconds # or: minutes
  }
}
```
Configuration properties

| Property | Description |
|---|---|
| `file` | Required properties for connecting to and reading the source files. |
| `database` | Destination database properties. |
| `processing_time` | Processing-time trigger controlling how often incoming data is persisted to Iceberg. |
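
To see how these properties fit together, below is a minimal PySpark sketch of the kind of Structured Streaming pipeline such a job runs. It assumes an Iceberg-enabled Spark session; the schema, checkpoint location, and option mapping are illustrative assumptions, not the actual `driver.py` implementation:

```python
# Minimal sketch, assuming Spark Structured Streaming with an
# Iceberg-enabled catalog. Names and schema are illustrative only.
from pyspark.sql import SparkSession
from pyspark.sql.types import StringType, StructField, StructType

spark = SparkSession.builder.appName("file-streaming-job").getOrCreate()

# The file.* properties map naturally onto Spark's file-source options.
stream = (
    spark.readStream.format("csv")                      # file.format
    .schema(StructType([StructField("address", StringType())]))  # assumed schema
    .option("maxFilesPerTrigger", 1)                    # file.max_files_per_trigger
    .option("latestFirst", "false")                     # file.latest_first
    .option("maxFileAge", "7d")                         # file.max_file_age
    .load("files/")                                     # file.path
)

# processing_time.* controls the trigger; database.* names the target table.
query = (
    stream.writeStream
    .trigger(processingTime="5 seconds")                # interval: 5, unit: seconds
    .option("checkpointLocation", "checkpoints/file-streaming-job")  # assumed path
    .toTable("default.awesome_csv_addresses")           # database.schema + database.table
)
query.awaitTermination()
```

With a processing-time trigger, Spark checks the source path every 5 seconds and, with `maxFilesPerTrigger: 1`, ingests at most one new file per micro-batch.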
Create Spark Job - Deployment

Create Spark Job - Instance

Create Spark Job - Application Config

Tests
Prepare the dev environment:

```shell
virtualenv .env  # or: python3 -m venv .env
source .env/bin/activate
pip install -e ".[dev]"
```

Run the tests:

```shell
python3 -m pytest  # or just: pytest
```