
Data Catalog Sync Job


The Data Catalog Sync Job synchronizes data from multiple sources with the IOMETE Data Catalog. Centralizing it in one catalog simplifies discovering, accessing, and managing your data for analytical and ML/AI workloads.

Installation

Automated Installation

You can install the Data Catalog Sync Job from the Spark Job Marketplace in IOMETE. Search for the job in the marketplace and click "Deploy".


Manual Installation

To install the Data Catalog Sync Job manually:

  • In the left sidebar menu, choose Spark Jobs.
  • Click Create.

Specify the following parameters. The values shown are examples; adjust them to suit your environment:

  • Name: catalog-sync
  • Schedule: 0 * * * * (a cron expression: minute 0 of every hour)
  • Concurrency: FORBID
  • Deployment:

    • Docker image: iomete/iom-catalog-sync:1.8.0

    • Main Application File: spark-internal

    • Main Class: com.iomete.catalogsync.App

    • Instance Config:

      • Driver Type: driver-small

      • Executor Type: exec-small

      • Executor Count: 1


Then click the Create button.
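As a rough illustration of the example values above, the sketch below collects them in a plain Python dictionary and breaks the schedule `0 * * * *` into its five cron fields. The names `job_settings`, `describe_cron`, and `next_top_of_hour` are hypothetical helpers written for this example; the dictionary layout is not an IOMETE API payload, and the job itself is configured through the form described above.

```python
from datetime import datetime, timedelta

# Example values copied from the parameter list above.
# The dictionary layout is hypothetical and only groups the values for reading.
job_settings = {
    "name": "catalog-sync",
    "schedule": "0 * * * *",
    "concurrency": "FORBID",
    "docker_image": "iomete/iom-catalog-sync:1.8.0",
    "main_application_file": "spark-internal",
    "main_class": "com.iomete.catalogsync.App",
    "driver_type": "driver-small",
    "executor_type": "exec-small",
    "executor_count": 1,
}

# Standard five-field cron layout, in order.
CRON_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def describe_cron(expression):
    """Return a mapping from each cron field name to its value."""
    parts = expression.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError("expected 5 cron fields, got %d" % len(parts))
    return dict(zip(CRON_FIELDS, parts))


def next_top_of_hour(now):
    """Return the next time `0 * * * *` fires strictly after `now`."""
    # Truncate to the start of the current hour, then move one hour forward.
    return now.replace(minute=0, second=0, microsecond=0) + timedelta(hours=1)


if __name__ == "__main__":
    print(describe_cron(job_settings["schedule"]))
    print(next_top_of_hour(datetime(2024, 1, 1, 10, 30)))
```

With the example schedule, only the minute field is fixed (to 0) and every other field is a wildcard, so a job created at 10:30 would first be scheduled for 11:00 and then hourly after that.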

Run the job

You can trigger the job manually by clicking on the Run job button.

Tip: The job runs automatically on the defined schedule. Use the Run job button whenever you want to trigger a run outside that schedule.
