Data Lakehouse Features
The best-performing SQL Data Lakehouse service, combining data warehouse functionality with data lake flexibility across all of your data. Run all SQL and BI applications at scale with up to 10x better price performance.
- BI - business intelligence workloads that require handling a high volume of concurrent requests
- Exploratory SQL
- SQL ETL/ELT (for example, using DBT or custom backend applications)
- Data science and ML - Prepare (clean/enrich/transform) training data, build feature stores, etc.
- Collect and build a centralized data lake for your whole organization
Separation of compute and storage brings greater flexibility and cost savings to organizations planning to monetize their data using big data and advanced analytics
Leveraging modern, battle-tested open-source engines:
- Apache Spark
- Apache Iceberg (Storage Format)
- Ensuring the highest data reliability and integrity through ACID transaction support
- Ensuring the highest data reliability and integrity through schema enforcement and governance
- Multi-Cluster Lakehouse for workload isolation
- Enjoy blazing-fast performance - query petabytes of data in seconds
- Enjoy the benefit of unlimited scaling backed by AWS compute and storage capacity
- One source of truth - keep all your structured and unstructured data in one place
AWS S3 provides outstanding durability and unlimited scalability for the data. Your data is stored in your AWS S3 bucket in the open standard Apache Parquet format. Data is compressed by 5-20x, which translates into equivalent monetary savings.
Full ANSI SQL compatibility.
Run any ANSI SQL-compatible code on IOMETE without modification.
Write custom functions in Scala/Java and use them in SQL just like built-in functions (see the sketch below).
- UDFs: User-Defined Functions (UDFs) are user-programmable routines that act on one row.
- UDAFs: User-Defined Aggregate Functions (UDAFs) are user-programmable routines that act on multiple rows at once and return a single aggregated value as a result.
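As an illustration, here is a minimal sketch of how such functions are typically defined with Spark's Scala API; the function, column, and table names are illustrative, not part of IOMETE's API.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{udf, udaf}
import org.apache.spark.sql.expressions.Aggregator
import org.apache.spark.sql.{Encoder, Encoders}

val spark = SparkSession.builder().getOrCreate()

// UDF: operates on one row at a time
val withTax = udf((amount: Double) => amount * 1.18)
spark.udf.register("with_tax", withTax)

// UDAF: aggregates many rows into a single value (geometric mean here)
object GeoMean extends Aggregator[Double, (Double, Long), Double] {
  def zero: (Double, Long) = (1.0, 0L)
  def reduce(b: (Double, Long), a: Double): (Double, Long) = (b._1 * a, b._2 + 1)
  def merge(b1: (Double, Long), b2: (Double, Long)): (Double, Long) = (b1._1 * b2._1, b1._2 + b2._2)
  def finish(b: (Double, Long)): Double = math.pow(b._1, 1.0 / b._2)
  def bufferEncoder: Encoder[(Double, Long)] = Encoders.tuple(Encoders.scalaDouble, Encoders.scalaLong)
  def outputEncoder: Encoder[Double] = Encoders.scalaDouble
}
spark.udf.register("geo_mean", udaf(GeoMean))

// Both can now be used in SQL like built-in functions
spark.sql("SELECT customer_id, geo_mean(with_tax(amount)) FROM sales GROUP BY customer_id").show()
```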
The IOMETE Lakehouse works on data stored in cheap and scalable cloud storage provided by the three major cloud vendors.
- This means IOMETE can handle all types of data (structured, semi-structured, and unstructured).
- It can also handle everything from AI to BI.
The future is open. Vendor lock-in and proprietary data formats slow down innovation. A single company cannot out-innovate a global community of innovators. Even the most regulated industries realize that open source is the best way to foster innovation, recruit and retain the best talent, and future-proof a technology platform.
Open
Built on two well-known open-source technologies, Apache Spark and Apache Iceberg.
Your data stays in your own cloud accounts in an open format, using the open standard ORC/Parquet file formats. No proprietary file formats.
IOMETE is built on open standards to ensure your data is secure while being universally and easily accessible.
Apache Spark and Apache Iceberg are already adopted by world-leading companies such as Apple, Netflix, Alibaba, and Adobe.
Running multiple compute clusters on shared data allows you to isolate your compute workloads by team or use case without duplicating data.
Use separate clusters for your BI and ETL workloads, so those workloads will not affect each other's performance while still accessing the same shared data.
Define dedicated clusters for each team (sales, marketing, engineering, etc.) to give each its own compute budget, so teams can do their work without affecting other teams' compute resources. As your organization grows, the system can scale horizontally.
You can easily see the history of your previously run queries in the SQL editor and profile the query plan using an intuitive UI.
Query history also keeps the result of each query for 30 days, so you can compare today's result with the result you got when you ran the same query yesterday.
Each lakehouse cluster has a defined size, which indicates the number of executors in the cluster. Executors are the compute nodes where the actual execution happens. The higher the cluster size (executor count), the greater the cluster's total performance.
Lakehouse clusters support the following sizes:
- XSmall (1 executor)
- Small (2 executors)
- Medium (4 executors)
- Large (8 executors)
- XLarge (16 executors)
- Gold (32 executors)
- Platinum (64 executors)
- Diamond (128 executors)
IOMETE Time Travel enables accessing historical data (i.e. data that has been changed or deleted) at any point, with no time limitation. Think of it as your magical undo button; it serves as a powerful tool for performing the following tasks:
- Restoring data-related objects (tables, schemas, and databases) that might have been accidentally or intentionally deleted.
- Duplicating and backing up data from key points in the past.
- Analyzing data usage/manipulation over specified periods of time.
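For example, with Apache Iceberg tables a past state can be queried directly. The sketch below assumes a Spark session; the table name, timestamp, and snapshot id are illustrative.

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().getOrCreate()

// Query the table as it looked at a specific point in time
spark.sql("SELECT * FROM sales.orders TIMESTAMP AS OF '2023-06-01 00:00:00'").show()

// Or pin the query to a specific Iceberg snapshot id
spark.sql("SELECT * FROM sales.orders VERSION AS OF 4348512839410423108").show()

// Restore an accidentally deleted or overwritten state into a new table
spark.sql(
  """CREATE TABLE sales.orders_restored AS
    |SELECT * FROM sales.orders TIMESTAMP AS OF '2023-06-01 00:00:00'
  """.stripMargin)
```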
Exceptional data infrastructure, set up in minutes
Read structured/semi-structured files from any location without moving data to IOMETE.
Key benefits
- ETL-less access to your data, wherever it lives!
- Analyze data without moving it into IOMETE.
- Automatically infer the schema on table creation. No need to manually declare all columns and types.
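A minimal sketch of what this looks like with Spark's Scala API, assuming illustrative S3 paths; the schema is inferred automatically from the files.

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().getOrCreate()

// Read Parquet files in place; the schema is inferred from the file metadata
val events = spark.read.parquet("s3a://my-raw-bucket/events/2023/")

// Semi-structured JSON works the same way; columns and types are inferred
val clicks = spark.read.json("s3a://my-raw-bucket/clicks/")

// Query the external data with SQL without copying it into the lakehouse
events.createOrReplaceTempView("raw_events")
spark.sql("SELECT event_type, count(*) FROM raw_events GROUP BY event_type").show()
```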
Read data from other databases using JDBC within IOMETE.
Supported databases:
- MySQL
- PostgreSQL
- Oracle
- Microsoft SQL Server
- ETL-less access to your data, wherever it lives!
- Analyze data without moving it into IOMETE.
- Automatically infer the schema on table creation. No need to manually declare all columns and types.
- Fast reads, as IOMETE reads the data in a distributed manner.
- Bi-directional connection: read from and write to the source database.
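As an illustration, here is a sketch of a distributed JDBC read and write-back using Spark's Scala API; the host, credentials, and table names are placeholders.

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().getOrCreate()

// Read a PostgreSQL table in parallel without copying it into the lakehouse first
val orders = spark.read
  .format("jdbc")
  .option("url", "jdbc:postgresql://pg-host:5432/shop")   // placeholder connection
  .option("dbtable", "public.orders")
  .option("user", "reader")
  .option("password", sys.env("PG_PASSWORD"))
  .option("numPartitions", "8")                            // distributed, parallel read
  .option("partitionColumn", "id")
  .option("lowerBound", "1")
  .option("upperBound", "1000000")
  .load()

// Bi-directional: write aggregated results back to the source database
orders.groupBy("customer_id").count()
  .write.format("jdbc")
  .option("url", "jdbc:postgresql://pg-host:5432/shop")
  .option("dbtable", "public.order_counts")
  .option("user", "writer")
  .option("password", sys.env("PG_PASSWORD"))
  .mode("append")
  .save()
```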
Read data from NoSQL databases within IOMETE.
Supported databases:
- MongoDB
- Cassandra
- AWS DocumentDB (MongoDB Compatible)
- ElasticSearch
- ETL-less access to your data, wherever it lives!
- Analyze data without moving it into IOMETE.
- Automatically infer the schema on table creation. No need to manually declare all columns and types.
- Fast reads, as IOMETE reads the data in a distributed manner.
- Bi-directional connection: read from and write to the source database.
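A sketch of a MongoDB read and write-back, assuming the MongoDB Spark connector (v10+) is available; the connection URI, database, and collection names are placeholders.

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().getOrCreate()

// Read a MongoDB collection; the connector infers the schema by sampling documents
val users = spark.read
  .format("mongodb")
  .option("connection.uri", "mongodb://mongo-host:27017")   // placeholder connection
  .option("database", "crm")
  .option("collection", "users")
  .load()

users.createOrReplaceTempView("mongo_users")
spark.sql("SELECT country, count(*) FROM mongo_users GROUP BY country").show()

// Bi-directional: write filtered results back to another collection
users.filter("active = true")
  .write.format("mongodb")
  .option("connection.uri", "mongodb://mongo-host:27017")
  .option("database", "crm")
  .option("collection", "active_users")
  .mode("append")
  .save()
```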
Using the provided JDBC, ODBC, and Python drivers, you can connect from your backend applications to the Lakehouse Clusters. The experience is the same as with other operational database connections like MySQL.
Supported languages
- Java
- Python
- Scala
- Kotlin
- Node.js
- Ruby
- Go
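For example, a Scala or Java backend can use the standard java.sql API with the provided JDBC driver. The connection URL, credentials, and table name below are placeholders, not IOMETE's actual endpoint format; copy the real connection string from your IOMETE console.

```scala
import java.sql.DriverManager

object LakehouseQuery extends App {
  // Placeholder URL and credentials - replace with the connection string from the IOMETE console
  val url  = "jdbc:<driver>://<lakehouse-endpoint>:<port>/<database>"
  val conn = DriverManager.getConnection(url, "analytics_user", sys.env("LAKEHOUSE_PASSWORD"))

  try {
    val stmt = conn.createStatement()
    val rs   = stmt.executeQuery("SELECT order_date, sum(amount) FROM sales.orders GROUP BY order_date")
    while (rs.next()) {
      println(s"${rs.getString(1)} -> ${rs.getDouble(2)}")
    }
  } finally {
    conn.close()
  }
}
```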
IOMETE provides a dedicated DBT adapter that offers native integration with Apache Iceberg and the whole IOMETE ecosystem.
IOMETE provides integrations with all major BI platforms.
Currently supported BI platforms:
- Metabase
- Tableau
- Looker
- Power BI
- Apache Superset
Connect from notebook instances to IOMETE Lakehouse clusters using the JDBC/Python drivers.
The maximum number of executors that can be running at the same time across all jobs.
A worksheet is a document that allows you to save, search, and share SQL statements.
With our built-in query editor, you can easily query large data sets from an intuitive interface. With auto-complete and syntax highlighting, writing SQL couldn't get any easier.
- Organize data by creating data labels and tags at the column, row, or cell level
- Easily find data with advanced discovery and search functionality
For example, in the IOMETE web interface, a user connects by clicking the IdP option on the login page:
- If they have already been authenticated by the IdP, they are immediately granted access to IOMETE.
- If they have not yet been authenticated by the IdP, they are taken to the IdP interface where they authenticate, after which they are granted access to IOMETE.