Unifying Data Sources in a Single Query with IOMETE


Imagine running a single SQL query that seamlessly pulls together data from a MySQL database, a PostgreSQL database, a CSV file stored in an S3 bucket, and a Snowflake table. Sounds like a dream, right? Well, with IOMETE, this is not only possible but also incredibly straightforward. In this guide, we'll show you how to do just that. Let's dive in!

The Power of a Single Query

Before we get into the nitty-gritty details, let's see where we're headed: a single SQL query that joins data from multiple sources.

-- Joining MySQL, PostgreSQL, CSV, and Snowflake tables
-- (the join keys below are illustrative; use the columns that relate your tables)
SELECT m.*, p.*, c.*, s.*
FROM mysqlTable m
JOIN postgreTable p ON m.id = p.id
JOIN csvTable c ON m.flight_id = c.flight_id
JOIN snowflake_table s ON m.snowflake_id = s.id;

This query brings together data from MySQL, PostgreSQL, a CSV file, and a Snowflake table. Now, let's backtrack and see how we set up each of these data sources in IOMETE.

Data Sources Supported by IOMETE

  • IOMETE Managed Data Lake Tables (Iceberg)
  • Files in Object Storage (CSV, JSON, Parquet, ORC)
  • JDBC Sources (MySQL, PostgreSQL, Oracle, etc.)
  • Snowflake Tables

JDBC Sources: MySQL and PostgreSQL

JDBC sources like MySQL, PostgreSQL, MS SQL Server, and Oracle can be easily integrated into IOMETE. You create a proxy table that links to your database table and then query it as if it were a local table.


-- Creating a proxy table for a MySQL table
CREATE TABLE mysqlTable
USING org.apache.spark.sql.jdbc
OPTIONS (
  url "jdbc:mysql://db_host:db_port/db_name",
  dbtable "schema.tablename",
  driver "com.mysql.cj.jdbc.Driver",
  user "username",
  password "password"
);


-- Creating a proxy table for a PostgreSQL table
CREATE TABLE postgreTable
USING org.apache.spark.sql.jdbc
OPTIONS (
  url "jdbc:postgresql://db_host:db_port/db_name",
  dbtable "schema.tablename",
  user "username",
  password "password"
);
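Once defined, a proxy table behaves like any other table in your catalog; queries against it are sent to the source database over JDBC. As a quick sanity check (assuming the `postgreTable` defined above):

```sql
-- Preview a few rows coming live from PostgreSQL
SELECT * FROM postgreTable LIMIT 10;

-- Dropping the proxy removes only the table definition in IOMETE,
-- not the underlying PostgreSQL table
DROP TABLE postgreTable;
```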

For more, visit our JDBC Sources Documentation.

Object Storage: CSV, JSON, Parquet, and ORC Files

IOMETE allows you to read various file formats directly from object storage services like S3.

CSV Files

CREATE TABLE csvTable
USING csv
OPTIONS (
  header "true",
  path "s3a://iomete-lakehouse-shared/superset_examples/tutorial_flights.csv"
);

JSON Files

CREATE TABLE countries
USING org.apache.spark.sql.json
OPTIONS (
  path "s3a://iomete-lakehouse-shared/superset_examples/countries.json"
);

Parquet and ORC Files

-- Parquet
CREATE TABLE parquetTable
USING org.apache.spark.sql.parquet
OPTIONS (
  path "s3a://iomete-lakehouse/trial-staging-area/parquet/userdata1.parquet"
);

-- ORC
CREATE TABLE orcTable
USING org.apache.spark.sql.orc
OPTIONS (
  path "s3a://iomete-lakehouse-shared/orc/userdata1_orc"
);
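File-backed tables are also a convenient way to land data in the managed lakehouse. As a sketch (the target table name `flights_iceberg` is hypothetical), you could materialize a CSV-backed table into an IOMETE-managed Iceberg table with a CTAS statement:

```sql
-- Hypothetical: copy the CSV data into a managed Iceberg table,
-- so later queries read the lakehouse copy instead of the raw file
CREATE TABLE flights_iceberg
USING iceberg
AS SELECT * FROM csvTable;
```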

For more details, check our documentation on CSV, JSON, Parquet, and ORC.

Snowflake Integration

IOMETE also supports Snowflake, and the connector is already included in the distribution package.

-- Create a new table in Spark linked to a Snowflake table
CREATE TABLE snowflake_table
USING snowflake
OPTIONS (
  dbtable '<table-name>',
  sfUrl '<snowflake-account-url>',
  sfUser '<username>',
  sfPassword '<password>',
  sfDatabase '<database-name>',
  sfSchema '<schema-name>',
  sfWarehouse '<warehouse-name>'
);
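Once the linked table exists, Snowflake data can be queried alongside any other table in IOMETE. For example (the `region` column is illustrative):

```sql
-- Aggregate directly over the linked Snowflake table
SELECT region, COUNT(*) AS row_count
FROM snowflake_table
GROUP BY region;
```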

For more details, check the Snowflake Connector.


IOMETE makes it incredibly easy to join multiple disparate data sources into a single query, enabling you to perform complex analytics without the hassle of data movement or transformation. With just a few lines of SQL, you can bring together data from JDBC sources, object storage files, and even Snowflake tables.

For more information, you can always refer to the comprehensive IOMETE documentation. Happy querying!