- Hangzhou, China
- https://yaooqinn.github.io
- @kent_zju
- in/kent-yao
- u/kentyao
-
spark Public
Forked from apache/spark
Apache Spark - A unified analytics engine for large-scale data processing
-
velox Public
Forked from facebookincubator/velox
A C++ vectorized database acceleration library aimed at optimizing query engines and data processing systems.
C++ Apache License 2.0 Updated Apr 17, 2026
-
spark.js Public
Apache Spark Connect Client for JavaScript
-
PyHive Public
Forked from dropbox/PyHive
Python interface to Hive and Presto. 🐝
-
presto Public
Forked from prestodb/presto
The official home of the Presto distributed SQL query engine for big data
Java Apache License 2.0 Updated Apr 9, 2026
-
spark-website Public
Forked from apache/spark-website
Apache Spark Website
HTML Apache License 2.0 Updated Apr 8, 2026
-
claude-code-source-code Public
Forked from sanbuphy/learn-coding-agent
Claude Code v2.1.88 Source Code
TypeScript Updated Mar 31, 2026
-
LakeBench Public
Forked from microsoft/LakeBench
A multi-modal Python library for benchmarking lakehouse engines and ELT scenarios, supporting both industry-standard and novel benchmarks.
Python MIT License Updated Mar 30, 2026
-
spark-history-cli Public
CLI tool for querying Apache Spark History Server REST API
-
skills-for-fabric Public
Forked from microsoft/skills-for-fabric
A collection of skills and MCP systems to enable users of CLI, VSCode, Claude to operate over Microsoft Fabric
PowerShell MIT License Updated Mar 9, 2026
-
oci-oracle-free Public
Forked from gvenzl/oci-oracle-free
Build scripts for Oracle Database FREE container/docker images
-
gluten-benchmark Public
Fine-grained micro-benchmark for comparing Gluten+Velox with Vanilla Spark
Scala Updated Feb 4, 2026
-
awesome-copilot Public
Forked from github/awesome-copilot
Community-contributed instructions, prompts, and configurations to help you make the most of GitHub Copilot.
JavaScript MIT License Updated Jan 21, 2026
-
skills Public
Forked from anthropics/skills
Public repository for Agent Skills
Python Updated Dec 20, 2025
-
delta Public
Forked from delta-io/delta
An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs
Scala Apache License 2.0 Updated Dec 18, 2025
-
supabase Public
Forked from supabase/supabase
The Postgres development platform. Supabase gives you a dedicated Postgres database to build your web, mobile, and AI applications.
TypeScript Apache License 2.0 Updated Nov 6, 2025
-
netty Public
Forked from netty/netty
Netty project - an event-driven asynchronous network application framework
Java Apache License 2.0 Updated Nov 4, 2025
-
official-images Public
Forked from docker-library/official-images
Primary source of truth for the Docker "Official Images" program
Shell Apache License 2.0 Updated Oct 31, 2025
-
arrow-java Public
Forked from apache/arrow-java
Official Java implementation of Apache Arrow
Java Apache License 2.0 Updated Aug 5, 2025
-
spark-status-connector Public
A Spark DSv2 connector for querying its runtime status
-
spark-postgres Public
PostgreSQL and GreenPlum Data Source for Apache Spark
-
orc-format Public
Forked from apache/orc-format
Apache ORC - the smallest, fastest columnar storage for Hadoop workloads
Apache License 2.0 Updated Apr 25, 2025
-
incubator-kyuubi Public
Forked from apache/kyuubi
Apache Kyuubi is a distributed multi-tenant JDBC server for large-scale data processing and analytics, built on top of Apache Spark
-
lancedb Public
Forked from lancedb/lancedb
Developer-friendly, serverless vector database for AI applications. Easily add long-term memory to your LLM apps!
Python Apache License 2.0 Updated Mar 13, 2025
-
arrow Public
Forked from apache/arrow
Apache Arrow is a multi-language toolbox for accelerated data interchange and in-memory processing
C++ Apache License 2.0 Updated Jan 23, 2025
-
duckdb Public
Forked from duckdb/duckdb
DuckDB is an analytical in-process SQL database management system
C++ MIT License Updated Dec 2, 2024