Skip to main content

JDBC Sync Job


IOMETE provides JDBC Sync Job to easily replicate tables from JDBC databases (MySQL, PostgreSQL, etc.) to IOMETE. Only you need to provide configuration properties. Powerful feature of job is Sync mode. You can choose a technique that how you want to move your data. Full load or Incremental. You can find more details bellow about sync mode

Sync Mode

You can define sync mode for each table. Currently, supported sync modes are:

  • FullLoad: Read everything in the source and overwrites whole table at the destination at each sync
  • IncrementalSnapshot: It creates the snapshot of table in the destination and only move the newly inserted and updated records. While writing to IOMETE it uses merge statement. This mode requires 2 parameters: identification_column will be used on merge statement, and tracking_column to track the where it should continue to get data from the source table

Installation

  • In the left sidebar menu choose Spark Jobs
  • Click on Create

Specify the following parameters (these are examples, you can change them based on your preference):

  • Name: jdbc-sync-job
  • Schedule: 0 0/22 1/1 * *
  • Docker image: iomete/iomete_jdbc_sync:0.2.1
  • Main application file: local:///app/driver.py
  • Environment variables: DB_PASSWORD: 9tVDVEKp
IOMETE Spark Jobs Create
Environment variables

You can use Environment variables to store your sensitive variables like password, secrets, etc. Then you can use these variables in your config file using the ${DB_PASSWORD} syntax.


### Config file
  • Config file: Scroll down and expand Application configurations section and click Add config file and paste following JSON.

    IOMETE Spark Jobs add config file
{
source_connection: {
type: mysql,
host: "iomete-tutorial.cetmtjnompsh.eu-central-1.rds.amazonaws.com",
port: 3306,
username: tutorial_user,
password: ${DB_PASSWORD}
},
syncs: [
{
source.schema: employees
source.tables: ["*"]
source.exclude_tables: ["salaries"]
destination.schema: employees_raw
sync_mode.type: full_load
},
{
source.schema: employees
source.tables: [ departments, dept_manager ]
destination.schema: employees_dep
sync_mode.type: full_load
}
]
}

Configuration properties

PropertyDescription
source_connection

Required properties to connect source.

  • type your database type like Mysql, Postgres, etc.
  • host your database host. Example: iomete-tutorial.cetmtjnompsh.eu-central-1.rds.amazonaws.com
  • port port number. Example: 3306
  • username database username
  • password database password
syncs
source.schema

Database where your tables stored.

source.tables

List of tables which you want to replicate. If you want to move all tables set ["*"]

source.exclude_tables

List of tables which you do not want to replicate.

(Optional)
destination.schema

Database name where you want to store tables in your warehouse.

typefull_load overwrites whole table at the destination at each sync or incremental_snapshot only move the newly inserted and updated records.
  • full_load
  • incremental_snapshot
    • identification_column
    • tracking_column

IOMETE Spark job sync create spark job application configuration

And, hit the create button.

The job will be run based on the defined schedule. But, you can trigger the job manually by clicking on the Run button.

IOMETE Spark job sync manual run

Github

You can find source code of JDBC Sync Job in github. View in Github