The open source high performance ELT framework powered by Apache Arrow
Updated Jun 11, 2024 - Go
Workflow Engine for Kubernetes
Example API implementation for Data Caterer
The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
Privacy- and security-focused Segment alternative, in Golang and React
Infinitely scalable, event-driven, language-agnostic orchestration and scheduling platform to manage millions of workflows declaratively in code.
Journey through the cryptic corridors of code. Unravel the secrets encoded in the shadows. Welcome to the realm where algorithms whisper in binary tongues. Dare to explore, for within lies the essence of innovation.
lakeFS - Data version control for your data lake | Git for data
Main website for the Seedcase Project
MLRun is an open source MLOps platform for quickly building and managing continuous ML applications across their lifecycle. MLRun integrates into your development and CI/CD environment and automates the delivery of production data, ML pipelines, and online applications.
Turns Data and AI algorithms into production-ready web applications in no time.
Simple datalake
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
Conduit streams data between data stores. Kafka Connect replacement. No JVM required.
Jupyter Notebook Databases Stack
My personal project for data engineering zoomcamp
Data Analytics with Apache Spark ⭐
Jayvee is a domain-specific language and runtime for automated processing of data pipelines
Service for bulk-loading data to databases with automatic schema management (Redshift, Snowflake, BigQuery, ClickHouse, Postgres, MySQL)
data load tool (dlt) is an open source Python library that makes data loading easy 🛠️