Benchmark between curl_cffi and other Python HTTP clients

Sync clients:
- curl_cffi
- requests
- pycurl
- python-tls-client
- httpx

Async clients:
- curl_cffi
- httpx
- aiohttp
All clients run with a session/client enabled, so connections are reused across requests.
Two distinct benchmarks are provided to evaluate the performance of the AsyncWebSocket implementation under different conditions.
- Simple Throughput Test (client, server): This is a lightweight, in-memory benchmark designed to measure the raw throughput and overhead of the WebSocket client. The server sends a repeating chunk of random bytes from memory, and the client receives it. This test is useful for quick sanity checks and for detecting performance regressions under ideal, CPU-cached conditions.
- Verified Streaming Test (code): This is a rigorous, end-to-end test. It first generates a multi-gigabyte file of random data and its SHA-256 hash. The benchmark then streams this file from disk over the WebSocket connection. The receiving end calculates the hash of the incoming stream and verifies it against the original, ensuring complete data integrity.
  It measures the performance of the entire system pipeline: disk I/O speed, CPU hashing speed, and network transfer. On many systems it will be bottlenecked by the CPU's hashing performance or by disk speed. When the file is not yet cached in RAM, the first run reflects disk read speed rather than code performance, so its result should be discarded.
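The generate-and-verify flow described above can be sketched with incremental SHA-256 hashing. This is a simplified illustration, not the actual `ws_bench_2.py` code; the function names are illustrative:

```python
import hashlib
import os

CHUNK = 1 << 20  # generate/hash in 1 MiB pieces

def generate(path: str, size: int) -> str:
    """Write `size` random bytes to `path`; return their SHA-256 hex digest."""
    h = hashlib.sha256()
    with open(path, "wb") as f:
        remaining = size
        while remaining:
            piece = os.urandom(min(CHUNK, remaining))
            h.update(piece)
            f.write(piece)
            remaining -= len(piece)
    return h.hexdigest()

def verify(path: str, expected: str) -> bool:
    """Re-hash the file incrementally, as the receiving end does for the stream."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while piece := f.read(CHUNK):
            h.update(piece)
    return h.hexdigest() == expected
```

Because the hash is updated chunk by chunk, neither side ever needs to hold the whole file in memory during verification.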
- Python 3.10+
- Pip packages:
  ```
  pip install aiohttp curl_cffi
  ```
  `uvloop` is highly recommended for performance on Linux and macOS. The benchmarks will automatically fall back to the standard asyncio event loop if it is not installed or on Windows.
- TLS certificate (optional): These benchmarks are configured to use WSS (secure WebSockets) by default on Linux and macOS. To generate a self-signed certificate:
  ```
  openssl req -x509 -newkey rsa:2048 -nodes -keyout localhost.key -out localhost.crt -days 365 -subj "/CN=localhost"
  ```
  Note: if you skip certificate generation (on any platform), the benchmarks will use insecure `ws://` instead.
- Configuration: The benchmark parameters (total data size, chunk size) can be modified by editing the `TestConfig` class. By default, both benchmarks are configured for `10 GiB` of data transfer.
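Two of the details above can be sketched in code: the uvloop fallback and an illustrative `TestConfig`. All field names and defaults below are assumptions for illustration; check the actual class in the benchmark scripts:

```python
import asyncio
import sys
from dataclasses import dataclass

def install_uvloop() -> bool:
    """Use uvloop when available (Linux/macOS); fall back to stock asyncio."""
    if sys.platform == "win32":
        return False
    try:
        import uvloop  # optional dependency
    except ImportError:
        return False
    asyncio.set_event_loop_policy(uvloop.EventLoopPolicy())
    return True

@dataclass
class TestConfig:
    # Field names and defaults are illustrative, not the scripts' real ones.
    total_bytes: int = 10 * 1024**3  # 10 GiB transferred per run
    chunk_size: int = 1024 * 1024    # bytes per WebSocket message
    send_queue: int = 16             # outgoing queue depth
    recv_queue: int = 16             # incoming queue depth
```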
It is recommended to run the server and client in different terminal windows.
- Start the Server:
  ```
  python ws_bench_1_server.py
  ```
- Run the Client:
  ```
  python ws_bench_1_client.py
  ```
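For intuition, the simple throughput benchmark boils down to a timed send/receive loop. The following is a stripped-down asyncio sketch over plain TCP, not the actual WebSocket scripts, with the transfer size scaled well below 10 GiB:

```python
import asyncio
import os
import time

CHUNK = os.urandom(64 * 1024)  # one cached chunk, resent repeatedly
TOTAL = 8 * 1024 * 1024        # scaled down for illustration

async def handle(reader, writer):
    # Server side: push the same in-memory chunk until TOTAL bytes are sent.
    sent = 0
    while sent < TOTAL:
        writer.write(CHUNK)
        await writer.drain()
        sent += len(CHUNK)
    writer.close()
    await writer.wait_closed()

async def main():
    server = await asyncio.start_server(handle, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]
    reader, writer = await asyncio.open_connection("127.0.0.1", port)
    # Client side: drain the socket and time it.
    received, start = 0, time.perf_counter()
    while data := await reader.read(1 << 16):
        received += len(data)
    elapsed = time.perf_counter() - start
    writer.close()
    server.close()
    await server.wait_closed()
    return received, received / elapsed / 1e6  # bytes, MB/s

if __name__ == "__main__":
    received, mb_s = asyncio.run(main())
    print(f"{received} bytes at {mb_s:.0f} MB/s")
```

Because the chunk stays hot in CPU caches and never touches disk, this style of test isolates client overhead rather than system throughput.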
- Generate Test File (Initial Setup): This command will create a large (`10 GiB`) file named `testdata.bin` and its hash:
  ```
  python ws_bench_2.py generate
  ```
  Ensure you have sufficient disk space available.
- Start the Server:
  ```
  python ws_bench_2.py server
  ```
- Run the Client (Choose one): This benchmark requires available RAM equal to the size of the random data (e.g. `10 GiB`).
  - To test download speed (server sends, client receives):
    ```
    python ws_bench_2.py client --test download
    ```
  - To test upload speed (client sends, server receives):
    ```
    python ws_bench_2.py client --test upload
    ```
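The subcommand-and-flag shape used above can be modeled with `argparse`. This is a hypothetical sketch of the CLI surface; the real script's option handling may differ:

```python
import argparse

# Mirrors the commands shown above: generate / server / client --test {download,upload}
parser = argparse.ArgumentParser(prog="ws_bench_2.py")
sub = parser.add_subparsers(dest="command", required=True)
sub.add_parser("generate")
sub.add_parser("server")
client = sub.add_parser("client")
client.add_argument("--test", choices=("download", "upload"), default="download")

args = parser.parse_args(["client", "--test", "upload"])
print(args.command, args.test)  # → client upload
```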
Benchmark results can vary significantly based on system-level factors. Keep the following in mind:
- Loopback Interface: These tests run on the loopback interface (`127.0.0.1`), which does not represent real-world internet conditions (latency, packet loss, etc.).
- CPU Affinity: For maximum consistency, especially on multi-core or multi-CPU (NUMA) systems, you can pin the server and client processes to specific CPU cores. This avoids performance penalties from processes migrating between cores or crossing CPU socket boundaries.
  On Linux, use `taskset` to specify a CPU core (e.g., core 0 for the server, core 1 for the client):
  ```
  # Terminal 1
  taskset -c 0 python ws_bench_1_server.py
  # Terminal 2
  taskset -c 1 python ws_bench_1_client.py
  ```
  On Windows, use the `start /affinity` command. The affinity mask is a hexadecimal number (`1` for CPU 0, `2` for CPU 1, `4` for CPU 2, etc.):
  ```
  # PowerShell/CMD 1
  start /affinity 1 python ws_bench_1_server.py
  # PowerShell/CMD 2
  start /affinity 2 python ws_bench_1_client.py
  ```
- Concurrent Tests: The `ws_bench_1_client.py` benchmark mode can be switched between download, upload, and concurrent by changing the `BenchmarkDirection` enum. A concurrent test completes when both directions finish.
- Queue Sizes: Adjust the `send_queue` and `recv_queue` sizes within the `TestConfig` class to observe the impact on performance and backpressure.
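The effect of a bounded queue on backpressure can be demonstrated with a plain `asyncio.Queue`. This is an illustrative sketch, unrelated to the benchmark scripts' internals:

```python
import asyncio

async def producer(q: asyncio.Queue, n: int) -> None:
    for i in range(n):
        await q.put(i)  # put() blocks when the queue is full: backpressure
    await q.put(None)   # sentinel: no more items

async def consumer(q: asyncio.Queue) -> int:
    count = 0
    while (item := await q.get()) is not None:
        await asyncio.sleep(0)  # pretend the receiving side is slower
        count += 1
    return count

async def run(depth: int) -> int:
    q = asyncio.Queue(maxsize=depth)  # analogous to send_queue/recv_queue sizes
    _, count = await asyncio.gather(producer(q, 100), consumer(q))
    return count

print(asyncio.run(run(depth=4)))  # → 100
```

A small `maxsize` throttles the producer to the consumer's pace instead of buffering unboundedly; all items still arrive, but peak memory use stays bounded by the queue depth.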