Read and write data from Snowflake
IOMETE provides a Snowflake connector to support reading and writing data from Snowflake.
info
The connector is already included in the IOMETE distribution package. No additional installation is required.
Query a Snowflake table in IOMETE
You can configure a connection to Snowflake and then query data:
-- Create a new table in Spark linked to a Snowflake table
CREATE TABLE snowflake_table
USING snowflake
OPTIONS (
dbtable '<table-name>', -- Name of the table in Snowflake
sfUrl '<snowflake-account-url>', -- Snowflake account URL
sfUser '<username>', -- Username for Snowflake account
sfPassword '<password>', -- Password for the username
sfDatabase '<database-name>', -- Database name in Snowflake
sfSchema '<schema-name>', -- Schema name within the database
sfWarehouse '<warehouse-name>' -- Virtual warehouse for data processing
);
-- Query the table
SELECT * FROM snowflake_table LIMIT 100;
The CREATE TABLE
statement with the Snowflake Spark Connector is used to define a new table in Spark that is backed by a table in Snowflake. Below is an explanation of each parameter in the OPTIONS
clause:
Parameters
dbtable '<table-name>'
- Description: Specifies the name of the table in Snowflake that you want to connect to.
- Example: If your Snowflake table is named
employees
, you would setdbtable 'employees'
.
sfUrl '<snowflake-account-url>'
- Description: The URL of the Snowflake account you are connecting to.
- Example:
<account_identifier>.<region_id>.<cloud_provider>.snowflakecomputing.com
(e.g.xy12345.us-east-1.aws.snowflakecomputing.com
).
sfUser '<username>'
- Description: The username used to authenticate with Snowflake.
- Example: If your Snowflake username is
john_doe
, you would setsfUser 'john_doe'
.
sfPassword '<password>'
- Description: The password corresponding to the username, used for authentication.
- Example: If your password is
myPassword123
, you would setsfPassword 'myPassword123'
.
sfDatabase '<database-name>'
- Description: The name of the Snowflake database you want to use.
- Example: If you are using a database named
hr_db
, you would setsfDatabase 'hr_db'
.
sfSchema '<schema-name>'
- Description: The name of the schema within the Snowflake database.
- Example: If you are using a schema named
public
, you would setsfSchema 'public'
.
sfWarehouse '<warehouse-name>'
- Description: The name of the Snowflake virtual warehouse to use for the data processing tasks.
- Example: If you are using a warehouse named
my_warehouse
, you would setsfWarehouse 'my_warehouse'
.
Data frame API
Data frame readers and writers are also available:
Pyspark example:
snowflake_table = (spark.read
.format("snowflake")
.option("dbtable", table_name)
.option("sfUrl", database_host_url)
.option("sfUser", username)
.option("sfPassword", password)
.option("sfDatabase", database_name)
.option("sfSchema", schema_name)
.option("sfWarehouse", warehouse_name)
.load()
)
Scala example:
val snowflake_table = spark.read
.format("snowflake")
.option("dbtable", table_name)
.option("sfUrl", database_host_url)
.option("sfUser", username)
.option("sfPassword", password)
.option("sfDatabase", database_name)
.option("sfSchema", schema_name)
.option("sfWarehouse", warehouse_name)
.load()
Pyspark example writing data to Snowflake and read it back
import os
snowflake_user = os.getenv('SNOWFLAKE_USER')
snowflake_password = os.environ.get('SNOWFLAKE_PASSWORD')
options = {
"sfUrl": "<snowflake-url>",
"sfUser": snowflake_user,
"sfPassword": snowflake_password,
"sfDatabase": "<snowflake-database>",
"sfSchema": "<snowflake-schema>",
"sfWarehouse": "<snowflake-cluster>"
}
# Generate a simple dataset containing five values and write the dataset to Snowflake.
spark.range(5).write \
.format("snowflake") \
.options(**options) \
.option("dbtable", "table_name") \
.save()
# Read the data written by the previous cell back.
df = spark.read \
.format("snowflake") \
.options(options) \
.option("dbtable", "table_name") \
.load()
df.show()
See Snowflake Connector for Spark for more usage information.