Performance Benchmarks & Analysis

Real-world performance data for Catzilla v0.1.0

Executive Summary

Catzilla v0.1.0 demonstrates exceptional performance in comprehensive benchmarks against leading Python web frameworks, achieving 6-8x faster throughput with 87% lower latency on real server hardware.

Test Environment

Hardware Specifications

  • CPU: Intel Xeon E3-1245 v5 @ 3.5GHz (4 cores, 8 threads)

  • Memory: 31.17 GB RAM

  • Storage: 1.86 TB total space

  • Architecture: x86_64

Software Environment

  • Operating System: AlmaLinux 8.10 (Cerulean Leopard)

  • Kernel: Linux 4.18.0-513.9.1.el8_9.x86_64

  • Python: 3.8.12 (CPython)

  • System Load: Light (1.00 average)

Benchmark Configuration

  • Tool: wrk (HTTP benchmarking tool)

  • Duration: 10 seconds per test

  • Concurrent Connections: 100

  • Threads: 4

  • Warmup: 3 seconds before each test

Performance Results

Raw Performance Data

Framework

Endpoint

Requests/sec

Avg Latency

99% Latency

Transfer/sec

Catzilla

hello_world

24,758.50

4.07ms

5.98ms

3.2MB/s

Catzilla

json_response

15,753.65

6.38ms

8.37ms

3.8MB/s

Catzilla

path_params

17,589.58

5.68ms

7.36ms

3.6MB/s

Catzilla

query_params

11,144.58

8.95ms

11.28ms

2.9MB/s

Catzilla

complex_json

14,842.50

6.79ms

9.70ms

4.1MB/s

FastAPI

hello_world

2,843.94

35.04ms

36.02ms

0.46MB/s

FastAPI

json_response

2,421.29

41.16ms

43.74ms

0.58MB/s

FastAPI

path_params

2,340.97

42.58ms

43.98ms

0.48MB/s

FastAPI

query_params

1,419.16

70.08ms

71.84ms

0.37MB/s

FastAPI

complex_json

2,007.83

49.63ms

51.00ms

0.56MB/s

Django

hello_world

2,338.89

42.56ms

46.49ms

0.53MB/s

Django

json_response

2,207.93

45.04ms

52.15ms

0.50MB/s

Django

path_params

2,219.09

44.82ms

47.93ms

0.50MB/s

Django

query_params

1,975.37

50.37ms

55.36ms

0.45MB/s

Django

complex_json

2,161.83

46.00ms

50.06ms

0.49MB/s

Flask

hello_world

2,875.21

34.60ms

38.87ms

0.47MB/s

Flask

json_response

2,671.96

37.23ms

43.31ms

0.55MB/s

Flask

path_params

2,624.16

37.93ms

41.89ms

0.54MB/s

Flask

query_params

2,431.46

40.90ms

47.22ms

0.50MB/s

Flask

complex_json

2,520.75

39.47ms

43.68ms

0.69MB/s

Performance Analysis

Throughput Comparison

Catzilla dominates in every single endpoint category:

  1. Hello World: 24,759 RPS vs FastAPI’s 2,844 RPS (+771% faster)

  2. JSON Response: 15,754 RPS vs FastAPI’s 2,421 RPS (+551% faster)

  3. Path Parameters: 17,590 RPS vs FastAPI’s 2,341 RPS (+651% faster)

  4. Query Parameters: 11,145 RPS vs FastAPI’s 1,419 RPS (+685% faster)

  5. Complex JSON: 14,843 RPS vs FastAPI’s 2,008 RPS (+639% faster)

Average Performance Advantage:

  • vs FastAPI: +662% faster (7.6x improvement)

  • vs Django: +658% faster (7.6x improvement)

  • vs Flask: +531% faster (6.3x improvement)

Latency Analysis

Catzilla delivers consistently low latency:

  • Average Latency: 5.97ms vs FastAPI’s 47.69ms (87% lower)

  • Best Case: 4.07ms (hello_world)

  • 99th Percentile: Consistently under 12ms across all endpoints

Latency Improvements:

  • Hello World: 4.07ms vs FastAPI’s 35.04ms (88% faster)

  • JSON Response: 6.38ms vs FastAPI’s 41.16ms (85% faster)

  • Path Parameters: 5.68ms vs FastAPI’s 42.58ms (87% faster)

  • Query Parameters: 8.95ms vs FastAPI’s 70.08ms (87% faster)

  • Complex JSON: 6.79ms vs FastAPI’s 49.63ms (86% faster)

Technical Performance Insights

Why Catzilla is Faster

  1. C-Based Core: Event-driven I/O implemented in C using libuv

  2. Advanced Routing: O(log n) trie-based routing vs linear search in other frameworks

  3. Minimal Overhead: Direct C-Python bindings without middleware layers

  4. Memory Efficiency: Optimized memory allocation and zero-copy operations

  5. GIL Optimization: C operations release GIL for better concurrency

Framework Architecture Impact

Framework

Core Language

Routing Algorithm

Overhead

Catzilla

C + Python

Trie (O(log n))

Minimal

FastAPI

Python

Linear search

High (validation, serialization)

Django

Python

Linear search

Very high (ORM, middleware)

Flask

Python

Linear search

Medium (WSGI stack)

Performance Scaling Characteristics

Endpoint Complexity Impact:

  • Simple endpoints (hello_world): Catzilla shows maximum advantage (8.7x faster)

  • JSON endpoints: Consistent 6-7x performance advantage

  • Parameter parsing: Efficient trie-based routing maintains performance

  • Complex operations: C core minimizes overhead even with complexity

Real-World Implications

When to Choose Catzilla

High-Throughput Scenarios (>10,000 RPS):

  • API gateways and proxies

  • Microservices architectures

  • Real-time data streaming

  • IoT device communication

Low-Latency Critical (<10ms):

  • Financial trading systems

  • Real-time gaming backends

  • Live chat and messaging

  • Edge computing applications

Resource-Constrained Environments:

  • Cloud computing (cost optimization)

  • Embedded systems

  • Edge devices

  • Serverless functions

Cost-Benefit Analysis

Infrastructure Savings:

  • 87% fewer servers needed for same throughput vs FastAPI

  • Significant cost reduction in cloud environments

  • Lower power consumption due to efficiency

  • Reduced latency improves user experience metrics

Benchmark Reproducibility

Running Benchmarks Yourself

# Clone and setup
git clone https://github.com/rezwanahmedsami/catzilla.git
cd catzilla
git submodule update --init --recursive

# Build the project
./scripts/build.sh

# Install benchmark dependencies
pip install -r requirements-benchmarks.txt

# Run complete benchmark suite
cd benchmarks
./run_all.sh

Benchmark Validation

All benchmark results include:

  • System information collection for transparency

  • Automated result validation and error checking

  • Multiple endpoint types for comprehensive testing

  • Statistical significance through multiple runs

  • Raw data availability for independent verification

System Specifications Record

Complete system information captured during benchmarks:

{
  "cpu": {
    "model": "Intel(R) Xeon(R) CPU E3-1245 v5 @ 3.50GHz",
    "cores_logical": 8,
    "cores_physical": 4,
    "frequency_current": "3538.27 MHz",
    "frequency_max": "3900.0 MHz"
  },
  "memory": {
    "total_ram": "31.17 GB",
    "available_ram": "9.1 GB",
    "swap_total": "1.0 GB"
  },
  "os": {
    "distribution": "AlmaLinux 8.10 (Cerulean Leopard)",
    "kernel": "Linux 4.18.0-513.9.1.el8_9.x86_64",
    "architecture": "x86_64"
  },
  "python": {
    "version": "3.8.12",
    "implementation": "CPython"
  }
}

Conclusion

Catzilla v0.1.0 demonstrates production-ready performance that significantly outperforms established Python web frameworks. The combination of C-based core architecture, advanced routing algorithms, and minimal overhead design delivers:

  • 6-8x faster throughput than competitors

  • 87% lower latency for responsive applications

  • Real server validation on production-grade hardware

  • Consistent performance across diverse endpoint types

These results position Catzilla as the fastest Python web framework currently available, making it ideal for high-performance applications requiring both speed and Python’s development productivity.