This repository implements benchmarks to measure the overhead of cryptographically secure random number generation in differentially private stochastic gradient descent (DP-SGD) training. The benchmarks specifically focus on quantifying the performance impact of using AES-CTR (Counter Mode) for generating the noise required in DP-SGD, compared to standard pseudo-random number generation.
Differential privacy formally requires cryptographically secure random number generation to maintain its guarantees, yet existing DP-ML frameworks typically use standard PRNGs for performance reasons. This benchmark quantifies the overhead of using a proper cryptographic RNG in private training workflows.
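To get a rough feel for the gap being measured, the stdlib sketch below times Gaussian sampling from Python's Mersenne Twister PRNG against `random.SystemRandom` (backed by `os.urandom`). This is an illustrative microbenchmark only, not the repository's DP-SGD benchmark; the sample count is an arbitrary choice.

```python
import random
import time

def time_gaussian(rng, n=100_000):
    """Time drawing n standard-normal samples from the given RNG."""
    start = time.perf_counter()
    for _ in range(n):
        rng.gauss(0.0, 1.0)
    return time.perf_counter() - start

# Standard (insecure) Mersenne Twister PRNG.
standard = random.Random(0)
# Cryptographically secure RNG backed by os.urandom.
secure = random.SystemRandom()

t_std = time_gaussian(standard)
t_sec = time_gaussian(secure)
print(f"standard PRNG: {t_std:.3f}s  secure RNG: {t_sec:.3f}s  "
      f"overhead: {t_sec / t_std:.1f}x")
```

The ratio printed here reflects only Python-level sampling; the repository's benchmark measures the analogous cost inside a full DP-SGD training step.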
- Benchmarks DP-SGD training steps with both standard and secure noise generation
- Implements hardware-accelerated AES-CTR random number generation
- Supports both ResNet-18 and Transformer architectures
- Detailed timing breakdown of forward/backward passes and noise generation
- Memory usage tracking for CPU and GPU
- Compatible with CUDA GPUs and Apple Metal (MPS)
The repository includes two AES-based random number generators:
- `BatchedAESRandomGenerator`: Base implementation using AES in Counter Mode
- `HWAccelBatchedAESRandomGenerator`: Version that utilizes hardware acceleration via:
  - Intel AES-NI instructions on x86 processors
  - Secure Enclave on Apple Silicon
The generators use batch generation and caching to improve performance, generating blocks of 10M random numbers at a time.
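The batching-and-caching pattern described above can be sketched in plain Python. Since the standard library has no AES, the sketch below uses SHA-256 in counter mode as a stand-in keystream (the repository's generators use AES-CTR), and a small batch size for brevity rather than the 10M-sample blocks used in the real generators. The class name and all internals here are illustrative, not the repository's API.

```python
import hashlib
import math
import struct

class BatchedCTRGaussianGenerator:
    """Sketch of a batched counter-mode CSPRNG for Gaussian noise.

    Stand-in: uses SHA-256 in counter mode because the Python standard
    library has no AES; the repository's generators use AES-CTR instead.
    """

    def __init__(self, key: bytes, batch_size: int = 4096):
        self._key = key
        self._counter = 0
        self._batch_size = batch_size
        self._cache = []  # cached Gaussian samples, consumed from the end

    def _keystream_block(self) -> bytes:
        # One 32-byte keystream block: H(key || counter), counter incremented
        # per block exactly as in CTR mode.
        block = hashlib.sha256(
            self._key + self._counter.to_bytes(16, "big")
        ).digest()
        self._counter += 1
        return block

    def _refill(self):
        # Batch generation: turn keystream bytes into uniforms in (0, 1),
        # then Box-Muller them into standard Gaussians and cache the result.
        uniforms = []
        while len(uniforms) < self._batch_size:
            block = self._keystream_block()
            for (u,) in struct.iter_unpack(">Q", block):
                uniforms.append((u + 1) / (2**64 + 1))
        for u1, u2 in zip(uniforms[0::2], uniforms[1::2]):
            r = math.sqrt(-2.0 * math.log(u1))
            self._cache.append(r * math.cos(2 * math.pi * u2))
            self._cache.append(r * math.sin(2 * math.pi * u2))

    def gauss(self) -> float:
        # Serve from the cache; regenerate a whole batch only when empty.
        if not self._cache:
            self._refill()
        return self._cache.pop()

gen = BatchedCTRGaussianGenerator(key=b"\x00" * 32)
samples = [gen.gauss() for _ in range(10_000)]
mean = sum(samples) / len(samples)
print(f"mean of 10k samples: {mean:.3f}")  # should be near 0
```

Amortizing the per-call overhead across large batches is what makes a secure generator competitive: each `gauss()` call is a cheap list pop, and the expensive keystream work happens once per batch.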
- First, set up your Python environment:
python -m venv dpsgd_rng_env
source dpsgd_rng_env/bin/activate # On Unix/macOS
pip install -r requirements.txt
- Run the benchmark script:
# Basic usage with default settings (Transformer model)
python dpsgd_rng_benchmark.py
# To benchmark ResNet-18
python dpsgd_rng_benchmark.py --model_type resnet
# To disable Poisson sampling
python dpsgd_rng_benchmark.py --poisson_sampling false
# To use the AES privacy engine
python dpsgd_rng_benchmark.py --privacy_engine_type aes
# To specify a different output directory
python dpsgd_rng_benchmark.py --output_dir custom_results
Results will be saved to the results directory, including:
- Detailed timing statistics
- Memory usage metrics
- Per-tensor noise generation times
- Raw data in CSV and NPZ formats
The benchmark generates several files in the results directory:
results/
└── run_YYYYMMDD_HHMMSS/
├── config.json # Benchmark configuration
├── summary.json # Summary statistics
├── timing_details.csv # Detailed timing data
├── noise_generation_details.csv # Per-tensor noise timing
├── tensor_stats_summary.csv # Tensor statistics
└── all_data.npz # All data in compressed format
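For post-processing, the JSON files in a run directory can be read with the standard library. The helper below is a minimal sketch assuming the layout shown above; `load_run_summary` is a hypothetical name, not a function shipped with the repository.

```python
import json
import pathlib

def load_run_summary(run_dir):
    """Load the configuration and summary statistics from a run directory.

    Assumes the run_YYYYMMDD_HHMMSS layout shown above, reading only
    config.json and summary.json.
    """
    run = pathlib.Path(run_dir)
    with open(run / "config.json") as f:
        config = json.load(f)
    with open(run / "summary.json") as f:
        summary = json.load(f)
    return config, summary
```

The CSV files can be read with `csv.DictReader` or pandas, and `all_data.npz` with `numpy.load`.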
If you use this benchmark in your research, please cite our paper:
@inproceedings{egan2024,
title={High-speed secure random number generator co-processors for privacy-preserving machine learning},
author={Egan, Shannon},
booktitle={Second Workshop on Machine Learning with New Compute Paradigms at NeurIPS},
year={2024},
url={https://openreview.net/pdf?id=8oraBCaYbm}
}