slatedb Public
Forked from slatedb/slatedb
A cloud native embedded storage engine built on object storage.
Rust Apache License 2.0 Updated Oct 2, 2024
limbo Public
Forked from tursodatabase/limbo
Limbo is a work-in-progress, in-process OLTP database management system, compatible with SQLite.
Rust MIT License Updated Aug 20, 2024
toy-olap-db Public
OLAP DB implementation from scratch for educational purposes
Rust Updated Jul 16, 2024
arrow-datafusion Public
Forked from apache/datafusion
Apache Arrow DataFusion SQL Query Engine
Rust Apache License 2.0 Updated Feb 5, 2024
risinglight-tutorial Public
Forked from risinglightdb/risinglight-tutorial
Let's build an OLAP database from scratch! 🚧 UNDER CONSTRUCTION 🚧
Rust Apache License 2.0 Updated Jan 23, 2024
delta-rs Public
Forked from delta-io/delta-rs
A native Rust library for Delta Lake, with bindings into Python
Rust Apache License 2.0 Updated Nov 5, 2023
polars Public
Forked from pola-rs/polars
Fast multi-threaded, hybrid-out-of-core query engine focussing on DataFrame front-ends
Rust MIT License Updated Oct 28, 2023
arrow2 Public
Forked from jorgecarleitao/arrow2
Transmute-free Rust library to work with the Arrow format
Rust Apache License 2.0 Updated Sep 11, 2023
diane Public
Hive helper functions for Apache Spark users
delta Public
Forked from delta-io/delta
An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs
Scala Apache License 2.0 Updated Mar 27, 2023
levi Public
Forked from mrpowers-io/levi
Delta Lake helper methods. No Spark dependency.
Python MIT License Updated Mar 19, 2023
delta-examples Public
Forked from delta-io/delta-examples
Delta Lake examples
Jupyter Notebook Updated Mar 9, 2023
data-scrapbook Public
Forked from MrPowers/data-scrapbook
A collection of images and captions to explain core data concepts
Updated Feb 3, 2023
jodie Public
Forked from mrpowers-io/jodie
Delta Lake and filesystem helper methods
Scala Updated Jan 24, 2023
twitter4s Public
Forked from DanielaSfregola/twitter4s
An asynchronous non-blocking Scala client for both the Twitter REST and Streaming API
Scala Apache License 2.0 Updated Jul 13, 2022
incubator-pinot Public
Forked from apache/pinot
Apache Pinot (Incubating) - A realtime distributed OLAP datastore
Java Apache License 2.0 Updated Jul 17, 2021
Foundational and adjacent skills in data engineering.
Updated Jul 20, 2020
netflix-content-reviews Public
This project is a data pipeline that generates data about opinions of Netflix's content on Reddit. This data will serve a Twitter bot that will tweet every time someone w…
airflow-pipeline Public
Forked from dsaidgovsg/airflow-pipeline
An Airflow Docker image preconfigured to work well with Spark and Hadoop/EMR
Data modeling of an OLAP database from the streaming music datasets
Python Updated Mar 11, 2020