Skip to main content

Life Cycle Management (LCM)

Installation

IOMETE is designed to operate within a Kubernetes environment. The installation process, streamlined through Helm, is well-organized, efficient, and straightforward.

On-prem infrastructure overview | IOMETEOn-prem infrastructure overview | IOMETE

Prerequisites:

  1. A pre-configured Kubernetes cluster.
  2. Necessary credentials and access to external storage and database (a built-in database solution can also be used).

Steps for Installation:

  1. Repository Cloning: Clone the dedicated GitHub repository that contains the Helm charts and other essential files for deployment.
  2. Helm Installation: Utilize the provided Helm charts from the repository to initiate the installation process.

All required files, examples, and a comprehensive installation guide can be found in our GitHub repository. We highly recommend following the step-by-step guide provided in the repository and using the example configurations as templates.

Installation Example

Here is an example for data plane installation. You'll only need to modify the values in the data-plane-values.yaml file and run the following command:

helm upgrade --install -n iomete iomete-dataplane iomete/iomete-dataplane \
-f data-plane-values.yaml --version 1.0.5

Below is an example of the data-plane-values.yaml file that the Helm chart uses to install the IOMETE data plane:

data-plane-values.yaml
dataPlane:
adminUser:
username: "admin"
email: "admin@example.com" # required for password reset
firstName: Admin
lastName: Admin
temporaryPassword: "temp_password_here" # you will be prompted to change this on first login

istioGateway:
# if you want to enable TLS, set the tlsCertSecretName
# name of the secret with certificates. Secret should contain 2 files: tls.crt and tls.key
# if redirectHttpToHttps is set to true, this secret must be provided to handle https traffic
tlsCertSecretName: ""

spark:
logLevel: warn

jupyter:
authToken: "********"

keycloak:
adminUser: "kc_admin"
adminPassword: "********"
# if adminPasswordSecret is set, the password will be read from the secret
adminPasswordSecret: {}
# name: secret-name
# key: password

storage:
lakehouse_bucket_name: "lakehouse"
assets_bucket_name: "assets"
settings:
type: "minio"
endpoint: "http://minio.default.svc.cluster.local:9000"
accessKey: "admin"
secretKey: "********"
# if secretKeySecret is set, the secretKey will be read from the secret
secretKeySecret: {}
# name: secret-name
# key: secretKey
additionalConfigOverrides: {}
# core-site-xml-overrides: |
# fs.s3a.path.style.access: true
# fs.s3a.connection.ssl.enabled: false

database:
host: "mysql"
port: "3306"
user: "db_user"
password: "********"
# if passwordSecret is set, the password will be read from the secret
passwordSecret: {}
# name: byo-db-creds
# key: password
prefix: "iomete_" # all IOMETE databases should be prefixed with this. See database init script.

clusterDomain: cluster.local

ranger:
enabled: true

Upgrades

To ensure our customers benefit from the continuous improvements, feature additions, and optimizations we make, we release updates periodically. With every release, our primary focus remains the seamless transition of our users from one version to another, ensuring stability, data integrity, and operational continuity.

Semantic Versioning

IOMETE follows the Semantic Versioning standard. This ensures that our customers can easily understand the impact of a new release and plan accordingly.

  • Major Version: A major version release indicates that the update contains significant changes. These changes can include new features, breaking changes, or other significant improvements.
  • Minor Version: A minor version release indicates that the update contains new features, improvements, and bug fixes. These changes are backward-compatible and do not require any significant changes to the existing setup.
  • Patch Version: A patch version release indicates that the update contains bug fixes and minor improvements. These changes are backward-compatible and do not require any significant changes to the existing setup.

Each release is accompanied by a detailed changelog, which provides a comprehensive overview of the changes included in the update.

Upgrade Notification and Support

Upon the release of a new update:

  1. Notification: Customers will receive a notification regarding the new release. This will detail the new features, improvements, and any critical fixes included.
  2. Upgrade Guide: Accompanying the notification, customers will be provided with a comprehensive upgrade guide. This manual will have step-by-step instructions, tailored to help users navigate through the upgrade process with ease.
  3. Dedicated Support: Understanding that upgrades can sometimes be intricate, our support team is always ready to assist. Whether you need clarification on a step or face any technical challenges, our experts will be there to guide and ensure a successful upgrade.

Upgrade Process

  1. Backup: Before starting the upgrade, take backups of configurations, databases, and any vital data. This precautionary step ensures data safety.
  2. Access New Release: Fetch the updated Helm chart corresponding to the new release.
  3. Apply the Upgrade: Use the Helm upgrade command with the fetched chart. Follow the instructions provided in the upgrade guide. Using Kubernetes replication and RollingUpdate strategies we ensure smooth transition of our services.
  4. Validation: Post-upgrade, verify the platform's state. Check if all the services are running correctly and test key functionalities to ensure everything operates as expected.

Upgrade Example

In most cases, just running the below command will be enough to upgrade the platform. Assuming data-plane-values.yaml is adjusted to the new version:

important

Replace --version 1.0.5 with the version you want to upgrade to

helm upgrade --install -n iomete iomete-dataplane iomete/iomete-dataplane \
-f data-plane-values.yaml --version 1.0.5

Rollback Process

While we strive to make every upgrade seamless, unforeseen issues can arise. In such scenarios, the platform is designed to support smooth rollbacks.

  1. Rollback Guide: If the need arises, a rollback guide will be provided, detailing how to revert to the previous stable version without loss of data or functionality.
  2. Rollback Execution: Use Helm's rollback command to revert to the previous state/version. Helm’s rollback feature ensures that the platform is restored to its previous operational state without hitches.
  3. Support During Rollback: During a rollback, our support team remains available to address any challenges or concerns, ensuring that the system is restored to its optimal state without unnecessary downtime.

Rollback Example

Again, in most cases, rollback is just running helm upgrade commands with the previous version. Assuming data-plane-values.yaml is adjusted to the previous version:

tip

It is recommended to keep a copy of the previous version of the file. Version control systems like Git can be used to track changes.

helm upgrade --install -n iomete iomete-dataplane iomete/iomete-dataplane \
-f data-plane-values.yaml --version 1.0.5

Monitoring and Management

Monitoring and management are pivotal to ensuring the health and performance. We've integrated leading monitoring tools such as Prometheus, Loki, and Grafana.

Monitoring Features:

  1. Prometheus: Collects metrics from configured targets at specified intervals, evaluates rule expressions, and can trigger alerts if certain conditions are observed.
  2. Loki: A horizontally scalable, highly available, multi-tenant log aggregation system.
  3. Grafana: Offers visualization panels to view metrics and logs.

IOMETE provides built-in monitoring for Spark services

Spark jobs metrics | IOMETESpark jobs metrics | IOMETE

Monitoring Kubernetes nodes through Grafana.

Additional metrics dashboard on Grafana | IOMETEAdditional metrics dashboard on Grafana | IOMETE

High Availability / Disaster Recovery

1. External Dependencies and HA

  • MySQL Database:
    • Our platform integrates with a high-availability MySQL database setup. This ensures that database operations remain resilient to failures, guaranteeing uninterrupted data access.
    • Features like automated failover, replication, and backup capabilities of the MySQL setup mean that the platform can continue its operations even if a primary database node fails.
  • Storage Solution:
    • The platform is designed to work seamlessly with high-availability storage solutions. This guarantees that the data remains accessible and safe.
    • Through redundancy, data replication, and failover protocols present in these storage solutions, data loss or inaccessibility issues are effectively negated.

2. Stateless Architecture

One of the foundational strengths of our platform is its stateless nature:

  • Being stateless ensures that no session or user-specific data is stored on any single server or node. This means, in the event of a node failure, another node can easily take over without any loss of data or disruption to the user.
  • This architecture also aids in effortless horizontal scaling, as new nodes can be added without needing any unique configurations or data.

3. Kubernetes-driven Replication and Failover

Leveraging Kubernetes' powerful orchestration capabilities, our platform achieves enhanced availability:

  • Replication Across Nodes: Kubernetes ensures that the platform's services and pods are replicated across multiple nodes. If a node faces issues or goes down, traffic is automatically directed to healthy nodes, ensuring uninterrupted service.

4. Disaster Recovery and Quick Switching

In the improbable event of a complete Kubernetes cluster failure:

  • Customers have the option to maintain a separate standby Kubernetes cluster with our platform deployed.
  • This standby instance can be quickly pointed to the same high-availability database and storage. This ensures a swift switch without any data loss or significant downtime.