Open data studio is an open initiative to put open-source machine learning and large-scale data processing software a click away for everyone.
Please visit open-datastudio.io
| Component | Project | Description | Integration Status |
|---|---|---|---|
| Notebook | jupyter | Jupyter Lab | Integrated |
| | zeppelin | Apache Zeppelin with Apache Spark in Kubernetes mode | Integrated |
| Data Lake | hive-metastore | Hive Metastore server backed by a PostgreSQL database | Integrated |
| | spark-thriftserver | Spark cluster on Kubernetes for ODBC/JDBC connections | Integrated |
| Computing | ray-cluster | Ray cluster | Integrated |
| | spark-serverless | On-demand Spark cluster from anywhere | Integrated |
| Machine learning | mlflow-server | MLflow remote model tracking server and UI | Integrated |
| | mlflow-model-serving | Deploy models from mlflow-server and get an endpoint | Integrated |
| Business Intelligence | metabase | Metabase Business Intelligence | Integrated |
| | superset | Apache Superset Business Intelligence | Integrated |
| Misc | spark | Not integrated with Staroid itself; publishes a Docker image used by other projects | - |
You can create issues or pull requests to contribute to the individual repositories under open-datastudio.
If you'd like to create a new integration project here, please create an issue in this repository.
We need your help!
- Open data studio slack channel - Join
Open data studio is an open source project. A LICENSE file is included in each repository.