Dragonfly Development Guide

Essential reference for working with the Dragonfly codebase: architecture, build system, testing infrastructure, and development workflows.


Table of Contents

  1. Critical Workflow Rules
  2. Quick Command Reference
  3. Project Overview
  4. Repository Structure
  5. Build Instructions
  6. Testing
  7. CI/CD Pipeline
  8. Code Style & Pre-commit Hooks
  9. Third-Party Dependencies
  10. Platform Support
  11. CMake Build Options
  12. Key Files Reference
  13. Common Pitfalls
  14. Debugging Tips
  15. Validation Checklist

Critical Workflow Rules

MANDATORY - Always Follow This Order:

  1. Read Before Edit - Always read files before modifying
  2. Use Correct Build Commands - See Quick Command Reference below
  3. Test After Changes - Build and run a relevant unit test: ninja <unit_test> && ./<unit_test>
  4. Format Code - pre-commit run --files <files>
  5. Follow Architecture - See Architecture Patterns below
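
The build, test, and format steps above can be chained in a small shell helper. This is a minimal sketch, not a script from the repository: the names `run` and `dev_cycle` and the `DRY_RUN` switch are hypothetical, and it assumes it is called from the repository root after `./helio/blaze.sh` has created `build-dbg`.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the Dragonfly repo): build one unit test,
# run it, then format the files that were touched.
# Usage: dev_cycle <unit_test_target> <file>...
# With DRY_RUN=1 the commands are printed instead of executed.

run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

dev_cycle() {
  if [ "$#" -lt 2 ]; then
    echo "usage: dev_cycle <unit_test_target> <file>..." >&2
    return 2
  fi
  local target=$1
  shift
  # The subshell keeps the directory change local to the build and test steps.
  (
    run cd build-dbg &&
      run ninja "$target" &&        # step 3: build the relevant test
      run "./$target"               # step 3: run it
  ) && run pre-commit run --files "$@"   # step 4: format the changed files
}
```

For example, `DRY_RUN=1 dev_cycle generic_family_test <files>` prints the four commands without running them, which is a cheap way to check the sequence before using it for real. The chain stops at the first failing step, so formatting only runs once the test has passed.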

Pull Request Guidelines

Conciseness is Key: PR descriptions should be short, focused, and easy to scan.

  • Title: Imperative, descriptive (e.g., "Fix fiber stack overflow in test_reply_guard_oom")
  • Summary: 1-2 sentences explaining what changed and why
  • Changes: Bullet points for key changes
  • Fixes: Link issues (e.g., "Fixes #123")
  • Commit messages: Keep every line (subject and body) <= 100 characters; wrap long descriptions
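
The 100-character limit on commit message lines is easy to check mechanically. The sketch below is an illustration only: `check_commit_msg` is a hypothetical function name, not an existing hook in the repository, and `length` may count bytes rather than characters in some awk implementations.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report commit-message lines longer than 100 characters.
# Reads the message from the file given as $1, or from stdin when omitted.
# Prints "<line number>: <length> chars" for each offending line and
# returns 1 if any line is too long, 0 otherwise.

check_commit_msg() {
  awk -v max=100 '
    /^#/ { next }    # skip the comment lines git adds to the message template
    length($0) > max { printf "%d: %d chars\n", NR, length($0); bad = 1 }
    END { exit bad }
  ' "${1:--}"
}
```

One way to use it is `git log -1 --format=%B | check_commit_msg`, which checks the subject and body of the most recent commit. Because git passes the message file path as the first argument to a `commit-msg` hook, the same function could also be called from such a hook.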

Quick Command Reference

CRITICAL: Read the full sections below for context. These are shortcuts only.

Building (see Build Instructions for details)

# Debug build (for development)
./helio/blaze.sh -DWITH_AWS=OFF -DWITH_GCP=OFF
cd build-dbg && ninja dragonfly              # Build main binary
cd build-dbg && ninja generic_family_test    # Build specific test

# Release build for local benchmarking
./helio/blaze.sh -release -DWITH_AWS=OFF -DWITH_GCP=OFF
cd build-opt && ninja dragonfly

Testing (see Testing for details)

# C++ Unit Tests
cd build-dbg
ctest -V -L DFLY                                    # Run all tests
./generic_family_test                               # Run specific test binary
./generic_family_test --gtest_filter="Set.*"        # Run specific test case

Code Formatting

# Setup (once)
pipx install pre-commit clang-format black
pre-commit install

# Format code
pre-commit run --files <files>              # Format specific files
pre-commit run --all-files                  # Format all files

Common Operations

# Check git status
git status

# Check current branch
git branch

# View recent commits
git log --oneline -10

Architecture Patterns

Code Style: .clang-format - snake_case vars, PascalCase functions, kPascalCase constants

DO ✅:

DON'T ❌:

  • std::thread, std::mutex (deadlocks!)
  • Global mutable state
  • Edit without reading
  • Skip tests
  • Use ./tools/docker/build.sh for local development (use ninja instead)
  • Use make for incremental builds (use ninja instead)

Project Overview

Dragonfly is a high-performance, Redis and Memcached compatible in-memory data store written in C++20. It delivers significantly higher throughput than traditional single-threaded Redis implementations through a shared-nothing, multi-threaded architecture.

Key Characteristics

  • Language: C++20 (Google C++ Style Guide 2020 version)
  • Architecture: Shared-nothing multi-threaded design (via helio library)
  • Performance: Uses io_uring (Linux 5.11+) for high-performance async I/O, with epoll fallback
  • Threading Model: Fiber-based cooperative multitasking with lock-free data structures
  • Build System: CMake + Ninja via helio/blaze.sh wrapper script
  • Target Platform: Linux (kernel 5.11+ recommended), FreeBSD support available
  • Protocols: Redis RESP2/RESP3, Memcached binary protocol
  • Compatibility: Drop-in replacement for Redis, with broad coverage of the Redis API

Architectural Highlights

For detailed architecture documentation, see docs/df-share-nothing.md

  1. Shared-Nothing Design: Each thread operates independently with its own data structures, minimizing lock contention
  2. Helio Framework: Custom I/O and threading library built on io_uring/epoll with fiber support
  3. DashTable: Novel hash table implementation optimized for multi-core systems - see docs/dashtable.md
  4. Transaction Model: Non-blocking optimistic transactions - see docs/transaction.md
  5. Tiering Support: Optional disk-backed storage for large datasets
  6. Search Module: Full-text search capabilities (when enabled with WITH_SEARCH)

Repository Structure

dragonfly/
├── src/                      # Main C++ source code
│   ├── server/               # Core server implementation
│   │   ├── dfly_main.cc      # Main entry point
│   │   ├── main_service.cc   # Service lifecycle & command routing
│   │   ├── db_slice.cc       # Per-thread database shard
│   │   ├── engine_shard_set.cc # Shard management
│   │   ├── cluster/          # Cluster mode implementation
│   │   ├── journal/          # Replication journal
│   │   ├── tiering/          # Tiered storage
│   │   ├── search/           # Search module
│   │   └── acl/              # Access control lists
│   ├── core/                 # Core data structures
│   │   ├── dash.h            # DashTable hash table
│   │   ├── dense_set.h       # Compact set implementation
│   │   ├── string_map.h      # Optimized string-keyed maps
│   │   ├── search/           # Search core algorithms
│   │   └── json/             # JSON support
│   ├── facade/               # Network & command handling
│   │   ├── dragonfly_connection.cc # Connection management
│   │   ├── redis_parser.cc   # RESP protocol parser
│   │   └── memcache_parser.cc # Memcached protocol
│   └── redis/                # Redis-specific implementations
│       └── lua/              # Lua scripting support
│
├── helio/                    # Git submodule: I/O and threading library
│   │                         # ** DO NOT EDIT unless contributing to helio **
│   ├── util/                 # Utilities: fibers, I/O, synchronization
│   ├── io/                   # io_uring & epoll abstraction
│   └── blaze.sh              # Build configuration wrapper
│
├── tests/                    # Test suite
│   ├── dragonfly/            # Python pytest integration/regression tests
│   │   ├── conftest.py       # Pytest fixtures & configuration
│   │   ├── requirements.txt  # Python test dependencies
│   │   └── *.py              # Test files
│   └── pytest.ini            # Pytest configuration & markers
│
├── docs/                     # Documentation
│   ├── build-from-source.md  # Build instructions
│   ├── dashtable.md          # DashTable internals
│   ├── transaction.md        # Transaction model
│   ├── df-share-nothing.md   # Shared-nothing architecture
│   └── differences.md        # Differences from Redis
│
├── contrib/                  # Utilities
│   ├── docker/               # Docker configurations
│   └── charts/dragonfly/     # Helm chart for Kubernetes
│
├── tools/                    # Benchmarking & utility tools
│   └── packaging/            # Packaging scripts
│
├── CMakeLists.txt            # Root CMake configuration
├── .clang-format             # C++ formatting rules (clang-format v14.0.6)
├── .pre-commit-config.yaml   # Pre-commit hooks configuration
├── pyproject.toml            # Python formatting (Black, 100 chars)
└── CONTRIBUTING.md           # Contribution guidelines

Critical Paths to Remember

  • Main entry: src/server/dfly_main.cc
  • Command dispatch: src/server/main_service.cc
  • Data storage: src/server/db_slice.cc
  • Networking: src/facade/dragonfly_connection.cc
  • Helio library: helio/ (I/O and threading library)

Build Instructions

For complete build instructions, see docs/build-from-source.md

Quick Start

Debug build (for development):

./helio/blaze.sh
cd build-dbg && ninja dragonfly
./dragonfly --alsologtostderr

Release build (for production/benchmarking):

./helio/blaze.sh -release
cd build-opt && ninja dragonfly

Production release build (static linking, optimized):

make release           # Configure + build
make package           # Create release packages with debug symbols

The Makefile builds production releases with:

  • Static linking: libstdc++, libgcc, Boost, OpenSSL
  • Architecture optimizations (x86_64: -march=core2 -msse4.1 -mtune=skylake)
  • Debug symbols (compressed)
  • Output: build-release/dragonfly-{arch}.tar.gz

Common build options: see CMake Build Options below.


Testing

For complete testing documentation, see tests/README.md

Quick Reference

C++ Unit Tests:

cd build-dbg
ctest -V -L DFLY                                    # Run all tests
./generic_family_test                               # Run specific test binary
./generic_family_test --gtest_filter="Set.*"        # Run specific test case

CI/CD Pipeline

For complete CI configuration, see .github/workflows/ci.yml

The CI workflow runs on all PRs and includes:

  • Pre-commit checks: clang-format, black formatters
  • Build matrix: Multiple OS/compiler/sanitizer combinations (Ubuntu 20/24, Alpine, GCC/Clang, ASAN/UBSAN)
  • Test execution: C++ unit tests, Python integration tests, cluster mode tests
  • Additional validations: Helm charts, Docker image builds

Code Style & Pre-commit Hooks

For complete contribution guidelines, see CONTRIBUTING.md

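Code style configuration files:

  • .clang-format - C++ formatting rules
  • pyproject.toml - Python formatting (Black, 100 chars)
  • .pre-commit-config.yaml - Pre-commit hooks configuration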

Quick setup:

pipx install pre-commit clang-format black
pre-commit install
pre-commit run --all-files                          # Run all formatters

Third-Party Dependencies

Key Libraries: Abseil (strings/flags), Boost 1.71+ (context/intrusive), mimalloc (allocator), jsoncons (JSON), OpenSSL (TLS), libunwind (traces)

Build artifacts: build-dbg/third_party/ - DO NOT edit

For complete dependency info, see docs/build-from-source.md


Platform Support

Linux: Primary platform. Kernel 5.11+ is recommended (io_uring); 5.1+ provides basic io_uring support; kernels older than 5.1 fall back to epoll.

  • Check: uname -r
  • Force epoll: --proactor_type=epoll
  • Docker: --security-opt seccomp=unconfined
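
The kernel thresholds above can be turned into a quick check on the output of `uname -r`. This is a sketch under stated assumptions: `kernel_io_tier` is a hypothetical function, and the tier labels it prints are illustrative names, not values that Dragonfly itself reports or accepts.

```shell
#!/usr/bin/env bash
# Hypothetical helper: map a kernel release string (as printed by `uname -r`)
# to the I/O tier described in this section. Tier labels are illustrative.
# Usage: kernel_io_tier [release]   (defaults to the running kernel)

kernel_io_tier() {
  local release=${1:-$(uname -r)}
  local major minor
  major=${release%%.*}          # "5.15.0-91-generic" -> "5"
  minor=${release#*.}           # -> "15.0-91-generic"
  minor=${minor%%[!0-9]*}       # -> "15"
  if ! [[ $major =~ ^[0-9]+$ && $minor =~ ^[0-9]+$ ]]; then
    echo "unknown"
    return 1
  fi
  if (( major > 5 || (major == 5 && minor >= 11) )); then
    echo "io_uring"             # 5.11+: recommended
  elif (( major == 5 && minor >= 1 )); then
    echo "io_uring-basic"       # 5.1 to 5.10: basic support
  else
    echo "epoll"                # older than 5.1: epoll fallback
  fi
}
```

Calling `kernel_io_tier` with no argument inspects the running kernel. When it prints `epoll`, the `--proactor_type=epoll` flag listed above selects that backend explicitly.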

FreeBSD: Supported (kqueue backend)

macOS: Not supported for production (use Docker/Linux)

For complete platform info, see docs/build-from-source.md


CMake Build Options

For complete list of build options, see docs/build-from-source.md

Common Options

Pass options to helio/blaze.sh with -D prefix:

./helio/blaze.sh -DWITH_SEARCH=OFF -DWITH_AWS=ON

Most useful options:

  • WITH_ASAN=ON / WITH_USAN=ON - Enable sanitizers for debugging
  • WITH_SEARCH=OFF - Disable search module for faster builds
  • WITH_AWS=OFF / WITH_GCP=OFF - Disable cloud libraries
  • WITH_TIERING=OFF - Disable disk storage
  • USE_MOLD=ON - Faster linking with LTO (production builds)

Quick configurations:

# Minimal build (fast compilation)
./helio/blaze.sh -DWITH_GPERF=OFF -DWITH_AWS=OFF -DWITH_GCP=OFF -DWITH_TIERING=OFF -DWITH_SEARCH=OFF

# Full-featured (all options ON by default)
./helio/blaze.sh

# Production optimized
./helio/blaze.sh -release -DUSE_MOLD=ON
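
When several features are switched off at once, the `-DWITH_<NAME>=OFF` arguments can be generated rather than typed. The function below is a convenience sketch only: `with_off_flags` is a hypothetical name, and it does not verify that an option with that name exists in the CMake configuration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: turn feature names into -DWITH_<NAME>=OFF arguments
# for helio/blaze.sh. Names are upper-cased; nothing is validated.
# Usage: with_off_flags <name>...

with_off_flags() {
  local name
  local -a flags=()
  for name in "$@"; do
    flags+=("-DWITH_${name^^}=OFF")   # ${name^^} upper-cases (bash 4+)
  done
  printf '%s\n' "${flags[*]}"         # join with single spaces
}
```

A call such as `./helio/blaze.sh $(with_off_flags aws gcp search)` then expands to `./helio/blaze.sh -DWITH_AWS=OFF -DWITH_GCP=OFF -DWITH_SEARCH=OFF`. The command substitution is left unquoted on purpose so that each flag becomes a separate argument.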

Key Files Reference

Quick reference to the most important files in the codebase.

  Purpose                             File Path

Entry Points & Core
  Main entry point                    src/server/dfly_main.cc
  Server lifecycle & command routing  src/server/main_service.cc
  Per-thread database shard           src/server/db_slice.cc
  Shard management                    src/server/engine_shard_set.cc

Data Structures
  DashTable hash table                src/core/dash.h
  Dense set implementation            src/core/dense_set.h
  String map                          src/core/string_map.h

Networking
  Connection handling                 src/facade/dragonfly_connection.cc
  Redis protocol parser               src/facade/redis_parser.cc
  Memcached protocol parser           src/facade/memcache_parser.cc

Build System
  Root CMake config                   CMakeLists.txt
  Build script wrapper                helio/blaze.sh
  Server CMake config                 src/server/CMakeLists.txt

CI/CD
  Main CI workflow                    .github/workflows/ci.yml
  Pre-commit config                   .pre-commit-config.yaml

Code Style
  C++ formatting                      .clang-format
  Python formatting                   pyproject.toml

Testing
  Pytest configuration                tests/pytest.ini
  Pytest fixtures                     tests/dragonfly/conftest.py
  Test requirements                   tests/dragonfly/requirements.txt

Documentation
  Build instructions                  docs/build-from-source.md
  Architecture overview               docs/df-share-nothing.md
  DashTable internals                 docs/dashtable.md
  Transaction model                   docs/transaction.md

Configuration
  Contributing guide                  CONTRIBUTING.md
  CLA agreement                       CLA.txt

Common Pitfalls

  1. Pre-commit not installed: pipx install pre-commit clang-format black && pre-commit install
  2. Wrong binary: Debug: build-dbg/dragonfly, Release: build-opt/dragonfly
  3. Wrong build command: Use cd build-dbg && ninja <target>, NOT ./tools/docker/build.sh
  4. Test timeouts: timeout 20m ctest -V -L DFLY
  5. ASAN leaks: Check CI, suppress in helio/util/asan_suppressions.txt
  6. Helio modifications: DON'T edit helio/ (it's a git submodule - changes go upstream)
  7. CodeQL checks: DON'T run codeql_checker when testing changes - it's slow and unnecessary for development

Debugging Tips

Logging: --alsologtostderr --v=1 --vmodule=module=2

ASAN: ASAN_OPTIONS=detect_leaks=1:symbolize=1, suppressions: helio/util/asan_suppressions.txt

CI reproduction: See .github/workflows/ci.yml

Troubleshooting: Check fiber deadlocks (use util::fb2 not std::mutex), timeout issues (--test_timeout), ASAN reports


Validation Checklist

Before claiming a task is complete, verify:

Code Quality

  • Code compiles without errors: cd build-dbg && ninja dragonfly
  • Code compiles without warnings (CI uses -Werror)
  • Code follows Google C++ Style Guide (run clang-format)
  • No new ASAN/UBSAN violations

Testing

  • All existing C++ unit tests pass: ctest -V -L DFLY
  • New feature has corresponding test coverage
  • Tests pass in both Debug and Release builds
  • Tests pass with ASAN/UBSAN enabled (if applicable)
  • DO NOT run codeql_checker - it's slow and unnecessary for development testing

Pre-commit & Style

  • Pre-commit hooks installed: pre-commit install
  • Code formatted with clang-format (C++) and black (Python)

Documentation

  • Public APIs have comments explaining purpose
  • Complex algorithms have explanatory comments
  • README or docs updated if behavior changes
  • No commented-out code left in final commit

Performance

  • No obvious performance regressions (run benchmarks if needed)
  • No unnecessary allocations in hot paths
  • Lock-free data structures used where appropriate