
Compute Clusters

A compute cluster provides dedicated CPU and memory resources, powered by Apache Spark, for executing queries. Table data is stored separately in S3-compatible object storage. This separation allows multiple clusters to access the same data while keeping compute environments fully isolated. IOMETE uses the Apache Iceberg table format to support reliable ACID transactions.

Because storage and compute are decoupled, you can right-size each cluster for its specific workload, whether batch ETL, interactive analytics, or a dedicated BI connection. When a cluster is no longer needed, you can shut it down to stop compute costs.


Viewing the Cluster List

The Cluster List page shows all compute clusters you have permission to access, along with their current state. Open it by selecting Compute in the left sidebar.

Each row represents one cluster and includes the following columns:

  • Name: Opens the cluster detail page. The cluster ID appears below the name (hover to copy).
  • Driver: Displays the driver status (STARTING, ACTIVE, STOPPED, FAILED) and its node type. Sortable by status.
  • Executor: Shows executor state, such as Running 2/4, along with the executor node type. Displays Single node for single-node clusters.
  • Namespace: The Kubernetes namespace where the cluster runs. Sortable.
  • Auto scaling: Displays the idle timeout for auto-suspend. Shows Single node for single-node clusters.
  • Image: Configured Docker image name. Hidden by default. Use the column selector to display it.
  • Actions: Ellipsis menu with actions based on the cluster's current state. See Managing a Compute Cluster.

Filtering the List

Use the controls above the table to filter results:

  • Namespace. Filter clusters by deployment namespace.
  • Status. Filter by driver state (Starting, Active, Stopped, Failed).
  • Search. Match clusters by name or cluster ID.
Compute cluster list with filters | IOMETE

Creating a Cluster

Create a separate cluster for each workload so resources remain isolated and predictable. The setup typically takes about a minute.

  1. Go to the Compute page.
  2. Click New Compute Cluster in the top-right corner.
  3. Complete the configuration across the six tabs: General, Configurations, Dependencies, Docker settings, Tags, and Review & Create.
  4. Open Review & Create, verify the summary, then click Create.

You can move between tabs using Previous and Next, or by selecting a tab directly. The Next button validates the current tab before proceeding. If validation fails, the tab shows a red exclamation mark and you must fix the errors before continuing.

General Tab

The General tab defines the core configuration of the cluster.

  • Name (required): A unique name using lowercase letters, numbers, and hyphens. It must start and end with a letter or number. This value cannot be changed after creation.

    Naming Constraints

    Maximum 53 characters. Pattern: ^[a-z0-9]([-a-z0-9]*[a-z0-9])?$.

  • Description (optional): A short explanation of the cluster's purpose.

  • Bundle (required if resource-level access control is enabled): Associates the cluster with a resource bundle that defines access permissions. Hidden when resource-level access control is disabled. Like the name, this cannot be changed later.

  • Namespace (required): The Kubernetes namespace where the cluster will be deployed. Only namespaces available to your account are shown.

  • Deployment type: Choose between:

    • Multi-node (default): Uses separate driver and executor pods.
    • Single-node: Runs only the Spark driver. Executor-related fields and Auto scaling are hidden.
  • Driver node (required): The node type assigned to the Spark driver. The driver coordinates executors and handles incoming connections.

  • Executor node (required for multi-node): The node type used for executor pods.

  • Executor count (required for multi-node): The maximum number of executor pods; you can also set a minimum, which cannot exceed the maximum. Default is 1.

  • Use spot instances (optional): Enables spot or preemptible instances for executor pods to reduce cost. Disabled by default.

  • Volume (optional): Attach a persistent volume. See Volumes for configuration details.

  • Auto scaling (multi-node only): Enabled by default. Executors scale down to zero after the configured idle period and scale back up when a query runs. Idle timeout options range from 1 minute to 3 hours. Default is 30 minutes. Select Disabled to keep executors running continuously.

    Keep Auto Scaling Enabled

    Only executors in the Running state are billed. Scale-up takes 10 to 15 seconds with a hot pool, or 1 to 2 minutes otherwise.
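
The naming constraints above can be checked with a short sketch. The `validate_cluster_name` helper is illustrative only, not part of any IOMETE SDK; it simply applies the documented pattern and length limit.

```python
import re

# Pattern and length limit from the naming constraints above.
NAME_PATTERN = re.compile(r"^[a-z0-9]([-a-z0-9]*[a-z0-9])?$")
MAX_NAME_LENGTH = 53

def validate_cluster_name(name: str) -> bool:
    """Return True if the name satisfies the documented constraints:
    lowercase letters, numbers, and hyphens; starts and ends with a
    letter or number; at most 53 characters."""
    return len(name) <= MAX_NAME_LENGTH and NAME_PATTERN.match(name) is not None

print(validate_cluster_name("etl-nightly-01"))  # True
print(validate_cluster_name("-bad-start"))      # False (leading hyphen)
print(validate_cluster_name("Bad-Caps"))        # False (uppercase)
```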

Create compute cluster -- General tab | IOMETE

Configurations Tab

The Configurations tab lets you tune Spark behavior, inject secrets, and set JVM options without rebuilding a Docker image.

  • Environment variables: Key-value pairs injected at runtime. Supports plain text and secret-backed values.
  • Spark config: Standard Spark properties (for example, spark.executor.memoryOverhead = 512m). Also supports secret-backed values.
  • Arguments: Command-line arguments passed to the Spark application.
  • Java options: JVM flags for driver and executor processes (for example, -XX:+UseG1GC).
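
Spark size properties such as spark.executor.memoryOverhead accept values with binary suffixes (k, m, g, t). The sketch below shows how such a value resolves to bytes; it is illustrative only (Spark does this parsing internally, and for a bare number Spark's default unit varies by property, whereas this sketch treats it as bytes).

```python
def spark_size_to_bytes(value: str) -> int:
    """Convert a Spark-style size string like '512m' or '2g' to bytes.

    Suffixes are binary multiples: k = KiB, m = MiB, g = GiB, t = TiB.
    A bare number is treated as bytes in this sketch.
    """
    units = {"k": 1024, "m": 1024**2, "g": 1024**3, "t": 1024**4}
    value = value.strip().lower()
    if value and value[-1] in units:
        return int(value[:-1]) * units[value[-1]]
    return int(value)

print(spark_size_to_bytes("512m"))  # 536870912
```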
Create compute cluster -- Configurations tab | IOMETE

Dependencies Tab

The Dependencies tab loads external JARs, Python packages, and Maven artifacts at Spark startup.

  • Jar file locations: URLs or paths to JAR files on the classpath (for example, https://repo.example.com/my-udf.jar).
  • Files: URLs or paths to additional files available at runtime.
  • PY file locations: Paths to Python files (.py, .egg, or .zip) for PySpark (for example, local:///app/package.egg).
  • Maven packages: Maven coordinates resolved at startup (for example, org.apache.spark:spark-avro_2.13:3.5.0).
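
Fields like these commonly correspond to the standard Spark dependency properties (spark.jars, spark.files, spark.submit.pyFiles, spark.jars.packages), each of which takes a comma-separated list. The mapping below is an assumption for illustration, not a statement of how IOMETE wires the form:

```python
def dependencies_to_spark_conf(jars=(), files=(), py_files=(), packages=()):
    """Assemble comma-separated Spark dependency properties.

    Assumed mapping from the form fields to standard Spark properties;
    empty fields are omitted from the result.
    """
    mapping = {
        "spark.jars": jars,
        "spark.files": files,
        "spark.submit.pyFiles": py_files,
        "spark.jars.packages": packages,
    }
    return {key: ",".join(values) for key, values in mapping.items() if values}

conf = dependencies_to_spark_conf(
    jars=["https://repo.example.com/my-udf.jar"],
    packages=["org.apache.spark:spark-avro_2.13:3.5.0"],
)
print(conf["spark.jars.packages"])  # org.apache.spark:spark-avro_2.13:3.5.0
```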
Create compute cluster -- Dependencies tab | IOMETE

Docker Settings Tab

This tab lets you override the default Spark runtime image. The Docker image field (optional) lists images from your registered Docker registries. See Private Docker Registry for setup details.

Create compute cluster -- Docker Settings tab | IOMETE

Tags Tab

Add Resource tags (key-value metadata pairs) to categorize the cluster. Tags appear in the cluster detail view and help with cost allocation or operational filtering.

Create compute cluster -- Tags tab | IOMETE

Review & Create Tab

The Review & Create tab displays a read-only summary of your configuration. Review each section carefully. To make changes, select any tab to return and update the settings. When everything looks correct, click Create.

If creation succeeds, IOMETE provisions and starts the cluster, then redirects you to its detail page. If the cluster name is already in use, you’re returned to the General tab with a validation error prompting you to choose a different name. If resource quotas are exceeded, the form highlights the affected fields with error messages.

Create compute cluster -- Review & Create tab | IOMETE

Viewing a Compute Cluster

The cluster detail page is where you monitor status, manage lifecycle actions, and connect external tools. To open it, click a cluster name in the list.

The page title displays Compute: {name}, and the breadcrumb shows Compute > {name}.

The header includes state-aware action buttons (see Managing a Compute Cluster) and two monitoring links:

  • Spark Metrics UI. Opens the Spark metrics dashboard. Always available.
  • Spark UI. Opens the live Spark web interface. Enabled only when the driver state is ACTIVE.

If another user deletes the cluster while you are viewing the page, a yellow banner appears stating: This compute has been deleted.

Details Tab

The Details tab shows the current state and configuration of the cluster.

Compute Section

  • Displays identity fields: ID and Name.
  • Shows the Driver state badge.
  • When the state is FAILED, a tooltip explains the failure reason.
  • Lists resource settings, including driver and executor node types, executor counts, Volume, and Auto scaling timeout.
  • For single-node clusters, executor-related fields are hidden.

Metadata Section

  • Namespace
  • Created by user and timestamp
  • Tags
  • Description
Compute cluster detail -- Details tab | IOMETE

Connections Tab

The Connections tab has ready-to-use snippets for connecting BI tools and applications. Click a connection type card to reveal its configuration.

Available types: Python, JDBC, DBT, Tableau, Power BI, Superset, Metabase, Redash, and Spark Connect. If the Arrow Flight module is enabled, Arrow Flight also appears.

The SQLAlchemy connection URL follows this format:

iomete://{userId}:{accessToken}@{host}{port}/{db}?lakehouse={lakehouseName}

The HTTP path depends on whether a namespace is configured:

  • With namespace: data-plane/{namespace}/lakehouse/{name}
  • Without namespace: lakehouse/{name}
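
The URL format and path rules above can be sketched as a small helper. This is illustrative only: the function names are invented here, and a colon is assumed between host and port.

```python
def http_path(name, namespace=None):
    """Build the HTTP path; the namespace segment appears only when one is set."""
    if namespace:
        return f"data-plane/{namespace}/lakehouse/{name}"
    return f"lakehouse/{name}"

def sqlalchemy_url(user_id, access_token, host, port, db, lakehouse):
    """Format the SQLAlchemy connection URL shown above.

    Assumes host and port are joined with a colon.
    """
    return f"iomete://{user_id}:{access_token}@{host}:{port}/{db}?lakehouse={lakehouse}"

print(http_path("reporting", namespace="analytics"))
# data-plane/analytics/lakehouse/reporting
print(http_path("reporting"))
# lakehouse/reporting
```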
Compute cluster detail -- Connections tab | IOMETE

Logs Tab

The Logs tab streams Spark driver logs in real time. Use the time range selector to narrow the window, and click Download to save them as spark-driver-logs.txt.

When per-executor logging is enabled, an Instance dropdown appears above the log viewer so you can inspect individual pod logs.

Compute cluster detail -- Logs tab | IOMETE

Kubernetes Events Tab

The Kubernetes events tab lists events for the cluster's pods and highlights warnings. The tab badge shows warning and total counts (for example, 2 / 15) so you can spot problems at a glance. Kubernetes retains events for one hour by default.

Compute cluster detail -- Kubernetes events tab | IOMETE

Activity Tab

The Activity tab logs every start and terminate event for the cluster. Each row shows the Action, Time, and User who triggered it. Results paginate at 20 rows per page.

Compute cluster detail -- Activity tab | IOMETE

Configuration Tab

The Configuration tab lists every active Spark key-value pair (both your custom values and IOMETE system defaults) in read-only form.

Compute cluster detail -- Configuration tab | IOMETE

Cluster States

Understanding cluster states helps you predict billing, diagnose failures, and pick the right action.

Driver States

The driver moves through four states during its lifecycle. The page refreshes automatically when the state changes.

  • STARTING: Driver pod is booting. Takes 1 to 2 minutes, or 10 to 15 seconds with a hot pool. Available actions: Terminate, Restart. Not billed.
  • ACTIVE: Driver is running and accepting connections. The Spark UI link becomes active. Available actions: Terminate, Restart. Billed.
  • STOPPED: Driver is offline. No connections accepted. Available actions: Start, Configure. Not billed.
  • FAILED: Driver crashed or didn't start. Check the Details tab for the error. Available actions: Terminate, Restart, Configure. Not billed.

A newly created cluster enters STARTING automatically. It moves to ACTIVE once the driver is ready, or to FAILED if a deployment error occurs.
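
The states above can be captured as a small lookup, useful when scripting against the lifecycle. This is a sketch: the state and action names are taken from this page, not from a published API.

```python
# Available actions and billing per driver state, per the table above.
DRIVER_STATES = {
    "STARTING": {"actions": {"Terminate", "Restart"}, "billed": False},
    "ACTIVE":   {"actions": {"Terminate", "Restart"}, "billed": True},
    "STOPPED":  {"actions": {"Start", "Configure"}, "billed": False},
    "FAILED":   {"actions": {"Terminate", "Restart", "Configure"}, "billed": False},
}

def can(action: str, state: str) -> bool:
    """Check whether an action is available in the given driver state."""
    return action in DRIVER_STATES[state]["actions"]

print(can("Start", "STOPPED"))     # True
print(can("Configure", "ACTIVE"))  # False (terminate first)
```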

Executor States (Multi-Node Only)

Both the Details tab and the Executor column in the cluster list show executor state. IOMETE hides this for single-node clusters.

  • No running executors: All executors scaled to zero (auto-suspend kicked in). They scale up when a query arrives.
  • Running N/M: N executors are active out of M configured.
  • Scaling N/M: N executors are pending, waiting for Kubernetes resources.
  • Running N/M + Scaling P/M: A mix of active and pending executors. Load is increasing.

IOMETE only bills for executors in the Running state. Executors scaled to zero don't incur compute charges.
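
The display strings above follow a simple rule. A hedged sketch of how they might be derived from executor counts (illustrative only; not IOMETE's actual rendering code):

```python
def executor_display(running: int, scaling: int, configured: int) -> str:
    """Render the executor state display used in the cluster list."""
    if running == 0 and scaling == 0:
        return "No running executors"
    parts = []
    if running:
        parts.append(f"Running {running}/{configured}")
    if scaling:
        parts.append(f"Scaling {scaling}/{configured}")
    return " + ".join(parts)

print(executor_display(2, 0, 4))  # Running 2/4
print(executor_display(2, 1, 4))  # Running 2/4 + Scaling 1/4
print(executor_display(0, 0, 4))  # No running executors
```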

Managing a Compute Cluster

Once a cluster exists, you control its lifecycle from two places: the detail page header and the ellipsis menu on each list row.

Cluster detail page header showing Restart, Terminate, and Configure action buttons | IOMETE

Configuring a Cluster

To reconfigure a cluster, the driver must be STOPPED or FAILED. While the cluster is running, the Configure button shows a tooltip asking you to terminate first.

Configure button disabled with tooltip: Terminate the compute before configuring | IOMETE

Configure opens the same six-tab form used during creation, pre-populated with current values. Name and Bundle are read-only in edit mode. After you make changes, review them on the Review & Save tab and click Save.

After saving, the cluster may show an amber Restart required label in the list. Starting or restarting the cluster applies the new settings and clears the label.

Starting a Cluster

Click Start in the header or the list row ellipsis menu. There's no confirmation dialog. The driver moves from STOPPED to STARTING, then to ACTIVE. Start is only enabled when the driver is STOPPED with no other operation pending.

Restarting a Cluster

Click Restart in the header or the list row ellipsis menu, then confirm with Yes, restart it. The driver cycles through STOPPED, STARTING, and ACTIVE. This action is available when the driver is ACTIVE, STARTING, or FAILED.

Restart confirmation popover with Cancel and Yes, restart it buttons | IOMETE
Restart Is Not Atomic

If the start phase fails after a successful terminate, the cluster stays STOPPED. Check the Logs and Kubernetes events tabs to diagnose the failure.

Terminating a Cluster

Click Terminate in the header or the list row ellipsis menu, then confirm with Yes, terminate it. The driver transitions to STOPPED, all active connections drop, and executor state clears. Available when the driver is ACTIVE, STARTING, or FAILED.

Unlike restart, termination leaves the cluster stopped. Start it again manually when ready.

Terminate confirmation popover with Cancel and Yes, terminate it buttons | IOMETE

Deleting a Cluster

Deletion permanently removes the cluster and its configuration. Data in cloud object storage is not affected.

  1. Select Delete from the detail page or list row ellipsis menu.
  2. In the confirmation modal, type the exact cluster name.
  3. Click Delete. The button stays disabled until the typed name matches.
Deletion Is Permanent

You cannot undo this action. IOMETE permanently removes the cluster configuration and drops any active connections.

Delete cluster confirmation modal | IOMETE

Access Permissions

Permissions are granted to users or groups and enforced at two levels:

  • Domain level
    The Create Compute permission allows a user to create new clusters. Administrators assign this either directly through member permissions or indirectly through a domain bundle. See Domain Authorization for configuration details.

  • Resource level
    Per-cluster permissions (VIEW, EXECUTE, CONSUME, UPDATE, DELETE) are inherited from the cluster’s assigned resource bundle. The CONSUME permission allows a user to submit queries against the cluster. The cluster list displays only clusters where you have at least VIEW permission. See Resource Bundles to manage bundle-based access control.

Explore these guides for features referenced on this page.

  • Node Types: Create and manage node types for driver and executor pods.
  • Volumes: Attach persistent volumes via the Volume field on the General tab.
  • Secrets: Reference secret values in environment variables and Spark configuration.
  • Private Docker Registry: Register Docker registries so their images appear in the Docker settings tab.
  • Domain Authorization: Manage domain-level permissions for users and groups.
  • Resource Bundles: Control per-resource access through bundle permissions.