paiml/pacha

pacha

Model, Data and Recipe Registry with full lineage tracking.


Overview

Pacha is a unified registry for machine learning artifacts -- models, datasets, and training recipes -- with full lineage tracking, semantic versioning, and cryptographic integrity verification. It provides content-addressed storage with BLAKE3 hashing for deduplication, tamper detection, and efficient delta storage.

Key Capabilities

  • Model Registry - Register, version, and stage ML models with metadata and metrics
  • Data Registry - Track datasets with schema validation and provenance
  • Recipe Registry - Store training configurations with hyperparameters and environment specs
  • Lineage Tracking - Full dependency graph from data to deployed model
  • Content-Addressed Storage - BLAKE3-based deduplication and integrity verification
  • Cryptographic Signing - Ed25519 signatures for artifact authenticity
  • Experiment Tracking - Record training runs with metrics, parameters, and artifacts
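The lineage capability above amounts to walking a dependency graph from a model back to the data and recipe that produced it. The following is a minimal sketch of that idea using only the standard library. `LineageGraph`, `add_edge`, `ancestors`, and the artifact id strings are hypothetical names chosen for illustration and are not Pacha's API.

```rust
use std::collections::{BTreeSet, HashMap};

/// Hypothetical lineage graph (not Pacha's type): each artifact id maps to
/// the ids of the artifacts it was derived from.
#[derive(Default)]
struct LineageGraph {
    parents: HashMap<String, Vec<String>>,
}

impl LineageGraph {
    /// Record that `child` was produced from `parent`.
    fn add_edge(&mut self, child: &str, parent: &str) {
        self.parents
            .entry(child.to_string())
            .or_default()
            .push(parent.to_string());
    }

    /// Collect every transitive ancestor of `id`, returned in sorted order.
    fn ancestors(&self, id: &str) -> Vec<String> {
        let mut seen = BTreeSet::new();
        let mut stack = vec![id.to_string()];
        while let Some(current) = stack.pop() {
            if let Some(parents) = self.parents.get(&current) {
                for parent in parents {
                    // Visit each ancestor once, so shared inputs and cycles terminate.
                    if seen.insert(parent.clone()) {
                        stack.push(parent.clone());
                    }
                }
            }
        }
        seen.into_iter().collect()
    }
}
```

With edges from a model to its recipe and dataset, and from that dataset to a raw source, `ancestors` on the model id returns all three upstream artifacts.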

Architecture

+------------------+     +------------------+     +------------------+
|  Model Registry  |     |  Data Registry   |     | Recipe Registry  |
|  (.apr files)    |     |  (.ald files)    |     | (TOML configs)   |
+--------+---------+     +--------+---------+     +--------+---------+
         |                        |                        |
         +------------------------+------------------------+
                                  |
                      +-----------+-----------+
                      |   Content-Addressed   |
                      |   Storage (BLAKE3)    |
                      +-----------+-----------+
                                  |
                      +-----------+-----------+
                      |  SQLite Metadata DB   |
                      |  (~/.pacha/registry)  |
                      +-----------------------+
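
To illustrate the content-addressed layer in the diagram, here is a rough in-memory sketch of how keying blobs by their digest yields deduplication and tamper detection. It is not Pacha's implementation: `ContentStore`, `put`, and `get_verified` are hypothetical names, and the standard library's `DefaultHasher` is swapped in for BLAKE3 so the example needs no external crates. `DefaultHasher` is not cryptographic and would not be suitable for real integrity checks.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Stand-in digest. Pacha uses BLAKE3; `DefaultHasher` is NOT cryptographic
/// and is used here only to keep the sketch dependency-free.
fn digest(bytes: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    bytes.hash(&mut hasher);
    hasher.finish()
}

/// Hypothetical in-memory content-addressed store: blobs are keyed by digest.
#[derive(Default)]
struct ContentStore {
    blobs: HashMap<u64, Vec<u8>>,
}

impl ContentStore {
    /// Store `bytes` and return their address. Identical content maps to the
    /// same key, so it is kept only once.
    fn put(&mut self, bytes: &[u8]) -> u64 {
        let address = digest(bytes);
        self.blobs.entry(address).or_insert_with(|| bytes.to_vec());
        address
    }

    /// Fetch a blob and re-hash it. A digest mismatch means the stored bytes
    /// no longer match their address, so nothing is returned.
    fn get_verified(&self, address: u64) -> Option<&[u8]> {
        let bytes = self.blobs.get(&address)?;
        (digest(bytes) == address).then(|| bytes.as_slice())
    }
}
```

Calling `put` twice with the same bytes returns the same address and stores one copy. `get_verified` returns `None` if the bytes under an address have been altered.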

Installation

Add to your Cargo.toml:

[dependencies]
pacha = "0.2.5"

Or install the CLI:

cargo install pacha

Usage

use pacha::prelude::*;

fn main() -> Result<()> {
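    let registry = Registry::open(RegistryConfig::default())?;

    // Load the serialized model to register (the path is illustrative)
    let model_bytes = std::fs::read("model.apr").expect("failed to read model.apr");

    // Register a model with documentation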
    let card = ModelCard::builder()
        .description("Fraud detection model")
        .metrics([("auc", 0.95), ("f1", 0.88)])
        .build();

    registry.register_model(
        "fraud-detector",
        &ModelVersion::new(1, 0, 0),
        &model_bytes,
        card,
    )?;

    // Retrieve and inspect
    let model = registry.get_model(
        "fraud-detector",
        &ModelVersion::new(1, 0, 0),
    )?;
    println!("Stage: {}", model.stage);

    Ok(())
}

Data Registry

use pacha::data::*;

// Assumes `registry` is an open Registry and `data_bytes` holds the serialized dataset.
// Register a dataset with schema
let schema = DataSchema::new(vec![
    Column::new("feature_1", DataType::Float64),
    Column::new("label", DataType::Int32),
]);

registry.register_data(
    "training-set-v2",
    &DataVersion::new(2, 0, 0),
    &data_bytes,
    schema,
)?;
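
As a rough illustration of what schema validation could involve, the sketch below checks one row against a list of typed columns. The `DataType`, `Column`, `Value`, and `validate_row` definitions are local stand-ins that reuse names from the snippet above. They are assumptions for illustration, not Pacha's types, and the README does not say how or when Pacha validates data.

```rust
/// Local stand-ins for the README's `DataType` and `Column`; not Pacha's types.
#[derive(Debug, Clone, Copy, PartialEq)]
enum DataType {
    Float64,
    Int32,
}

struct Column {
    name: String,
    data_type: DataType,
}

/// A single cell value in a row (hypothetical).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Value {
    Float64(f64),
    Int32(i32),
}

impl Value {
    /// The declared type this value corresponds to.
    fn data_type(&self) -> DataType {
        match self {
            Value::Float64(_) => DataType::Float64,
            Value::Int32(_) => DataType::Int32,
        }
    }
}

/// Check that `row` has one value per column and that each value's type
/// matches the column's declared type.
fn validate_row(schema: &[Column], row: &[Value]) -> Result<(), String> {
    if schema.len() != row.len() {
        return Err(format!(
            "expected {} columns, got {}",
            schema.len(),
            row.len()
        ));
    }
    for (column, value) in schema.iter().zip(row) {
        if value.data_type() != column.data_type {
            return Err(format!(
                "column `{}`: expected {:?}, got {:?}",
                column.name,
                column.data_type,
                value.data_type()
            ));
        }
    }
    Ok(())
}
```

For the two-column schema above, a row of `[Value::Float64(0.5), Value::Int32(1)]` passes, while a row with an integer in the `feature_1` position is rejected with a message naming that column.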

Experiment Tracking

use pacha::experiment::*;

let experiment = Experiment::builder()
    .name("fraud-detection-v3")
    .model("fraud-detector", &ModelVersion::new(1, 0, 0))
    .dataset("training-set-v2")
    .hyperparams([("lr", "0.001"), ("epochs", "50")])
    .build();

registry.log_experiment(experiment)?;

CLI

# Initialize a registry
pacha init

# Model operations
pacha model register fraud-detector model.apr -v 1.0.0
pacha model list
pacha model stage fraud-detector -v 1.0.0 -t production
pacha model inspect fraud-detector -v 1.0.0

# Data operations
pacha data register training-set data.ald -v 1.0.0
pacha data list

# Registry statistics
pacha stats
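
The `-v 1.0.0` arguments above correspond to the `MAJOR.MINOR.PATCH` triples passed to `ModelVersion::new(1, 0, 0)` in the library examples. A minimal sketch of that mapping follows. `Version` and `parse_version` are hypothetical helpers written for illustration, and the CLI's actual parsing rules may differ (for example, this sketch ignores pre-release tags).

```rust
/// Hypothetical semantic version triple, mirroring `ModelVersion::new(1, 0, 0)`.
/// Deriving `Ord` compares major, then minor, then patch.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct Version {
    major: u32,
    minor: u32,
    patch: u32,
}

/// Parse a `MAJOR.MINOR.PATCH` string such as "1.0.0".
/// Returns `None` for missing or extra components, or for components that
/// do not parse as `u32`.
fn parse_version(text: &str) -> Option<Version> {
    let mut parts = text.split('.');
    let major: u32 = parts.next()?.parse().ok()?;
    let minor: u32 = parts.next()?.parse().ok()?;
    let patch: u32 = parts.next()?.parse().ok()?;
    if parts.next().is_some() {
        return None; // more than three components
    }
    Some(Version { major, minor, patch })
}
```

Because the fields are compared numerically, `1.10.0` orders after `1.9.9`, which a plain string comparison would get wrong.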

Features

Feature                 Description                                Default
----------------------  -----------------------------------------  -------
compression             Zstd compression for stored artifacts      Yes
cli                     Command-line interface                     Yes
signing                 Ed25519 cryptographic signing              Yes
encryption              ChaCha20-Poly1305 encryption at rest       No
remote                  HTTP remote registry support               No
lineage-graph           Graph-based lineage visualization          No
aprender-integration    Integration with aprender ML library       No
alimentar-integration   Integration with alimentar data library    No

Enable all features:

[dependencies]
pacha = { version = "0.2.5", features = ["full"] }

Benchmarks

Run benchmarks with:

cargo bench

Content-addressing operations (BLAKE3 hashing, storage, retrieval) are benchmarked using Criterion. See benches/content_address.rs for benchmark definitions.

Testing

468 tests passing with zero warnings.

# Unit tests
cargo test --lib

# All tests (unit + integration)
cargo test

# All features
cargo test --all-features

# With nextest (faster)
cargo nextest run

# Quality gates
make tier1   # Fast feedback: fmt, clippy, check
make tier2   # Pre-commit: tests + clippy
make tier3   # Pre-push: full validation

# Coverage
make coverage

# Mutation testing
cargo mutants --no-times --timeout 300

Recent Fixes (v0.2.5)

  • Manifest writes are now atomic: the manifest is written to a temp file and renamed into place for crash safety
  • find_best_run handles empty input gracefully instead of panicking
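
The temp-file-plus-rename pattern mentioned in the first fix can be sketched with the standard library alone. This is a general illustration of the technique, not Pacha's code: `write_atomic` and the `.tmp` suffix are assumptions made for the example.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::{Path, PathBuf};

/// Write `contents` to `path` so that readers see either the old file or the
/// complete new one, never a partially written file.
fn write_atomic(path: &Path, contents: &[u8]) -> io::Result<()> {
    // Keep the temp file next to the target so the rename stays on one filesystem.
    let mut tmp_name = path.as_os_str().to_owned();
    tmp_name.push(".tmp");
    let tmp_path = PathBuf::from(tmp_name);

    let mut file = File::create(&tmp_path)?;
    file.write_all(contents)?;
    // Flush data to disk before the rename so a crash cannot expose a truncated file.
    file.sync_all()?;
    drop(file);

    // Replace the target in a single step.
    fs::rename(&tmp_path, path)
}
```

A stricter variant would also sync the parent directory after the rename so the new directory entry itself survives a crash. That step is platform-specific and is omitted here.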

Security

  • Cryptographic Integrity: All artifacts are content-addressed with BLAKE3
  • Ed25519 Signing: Optional artifact signing for authenticity verification
  • Encryption at Rest: Optional ChaCha20-Poly1305 encryption
  • Dependency Auditing: cargo-deny and cargo-audit in CI pipeline
  • No Unsafe Code: #![deny(unsafe_code)] enforced project-wide

To report a security vulnerability, please email security@paiml.com.

Contributing

Contributions welcome! Please follow the PAIML quality standards:

  1. Fork the repository
  2. Ensure all tests pass: cargo test
  3. Run quality checks: cargo clippy -- -D warnings && cargo fmt --check
  4. Submit a pull request

MSRV

Minimum Supported Rust Version: 1.75

License

MIT - see LICENSE for details.
