Python ETL framework for stream processing, real-time analytics, LLM pipelines, and RAG.
The open-source, high-performance ELT framework powered by Apache Arrow
A fast, simple, and cost-effective tool to replicate data from Postgres to data warehouses, queues, and storage
The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
Flink CDC is a streaming data integration tool
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
Use SQL to instantly query Turbot Pipes resources across workspaces. Open source CLI. No DB required.
A CLI tool for transforming large RDF datasets using pure SPARQL.
SQL stream processing, analytics, and management. We decouple storage and compute to offer instant failover, dynamic scaling, speedy bootstrapping, and efficient joins.
Postgres to Elasticsearch/OpenSearch sync
Privacy- and security-focused Segment alternative, in Golang and React
Infinitely scalable, event-driven, language-agnostic orchestration and scheduling platform to manage millions of workflows declaratively in code.
A machine-readable, human-editable database of the Yu-Gi-Oh! Trading Card Game, Official Card Game, Master Duel, Rush Duel, Speed Duel.
Pull and standardize data on cloud compute resources.
Conduit streams data between data stores. Kafka Connect replacement. No JVM required.
Log data preprocessing in Python