datalake
Here are 225 public repositories matching this topic...
World's most powerful data catalog service, providing a high-performance, geo-distributed, and federated metadata lake.
Updated Jun 2, 2024 - Java
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
Updated Jun 2, 2024 - Java
lakeFS - Data version control for your data lake | Git for data
Updated Jun 2, 2024 - Go
StarRocks, a Linux Foundation project, is a next-generation sub-second MPP OLAP database for full analytics scenarios, including multi-dimensional analytics, real-time analytics, and ad-hoc queries. Winner of InfoWorld’s 2023 BOSSIE Award for best open source software.
Updated Jun 2, 2024 - Java
Upserts, Deletes And Incremental Processing on Big Data.
Updated Jun 2, 2024 - Java
A collection of Apache Hudi-related resources
Updated Jun 2, 2024
Postgres for Search and Analytics
Updated Jun 1, 2024 - Rust
Dinky is a real-time data development platform based on Apache Flink, enabling agile data development, deployment and operation.
Updated May 31, 2024 - Java
An IDE and translation engine for detection engineers and threat hunters. Be faster, write smarter, keep 100% privacy.
Updated May 31, 2024 - Python
Open Control Plane for Tables in Data Lakehouse
Updated Jun 1, 2024 - Java
LOGVERZ APPLICATION BUNDLE. Logverz is a cutting-edge self-service data platform and instant data lake. The fastest route from AWS S3 to instant reports. The application bundle is the packaged repository incorporating the "LogverzPortalAccess", "LogverzPortal", and "LogverzCore" components.
Updated May 30, 2024 - PowerShell
Scalable identity resolution, entity resolution, data mastering and deduplication using ML
Updated May 30, 2024 - Java
Database for AI. Store Vectors, Images, Texts, Videos, etc. Use with LLMs/LangChain. Store, query, version, & visualize any AI data. Stream data in real-time to PyTorch/TensorFlow. https://activeloop.ai
Updated May 30, 2024 - Python
LakeSoul is an end-to-end, real-time, cloud-native Lakehouse framework with fast data ingestion, concurrent updates, and incremental data analytics on cloud storage for both BI and AI applications.
Updated May 30, 2024 - Java