Why Rate Limiting Matters

Every public-facing web server is subject to traffic spikes, automated scrapers, brute-force login attempts, and distributed denial-of-service (DDoS) attacks. Without rate limiting, a single abusive client can exhaust your server’s CPU, memory, and file descriptors, making the service unavailable to legitimate users.

Nginx’s built-in rate limiting provides a lightweight, high-performance defense layer that operates at the reverse proxy level — before requests ever reach your application. Using the ngx_http_limit_req_module and ngx_http_limit_conn_module, you can enforce per-IP or per-user request rates, connection caps, and burst allowances with minimal configuration overhead.

This guide walks through practical rate limiting configurations covering API endpoints, login pages, static assets, and whole-site protection.

Prerequisites

  • Nginx 1.18+ installed (both modules are compiled in by default).
  • sudo or root access to edit Nginx configuration files.
  • Basic familiarity with Nginx configuration syntax (http, server, location blocks).
  • A test tool like curl, Apache Bench (ab), or wrk for validation.

Understanding the Leaky Bucket Algorithm

Nginx rate limiting uses the leaky bucket algorithm. Imagine a bucket with a small hole at the bottom:

  • Requests fill the bucket from the top.
  • The bucket drains at a fixed rate (your configured rate, e.g., 10 requests/second).
  • If the bucket overflows (too many requests arrive faster than the drain rate), excess requests are either rejected or queued (burst).

This model smooths out traffic spikes rather than enforcing a rigid per-second counter, making it well-suited for real-world traffic patterns.
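The drain-and-fill bookkeeping can be sketched in a few lines of Python (an illustrative model only, not Nginx's actual implementation; the class name and timestamp handling are ours):

```python
# Illustrative leaky-bucket model (a sketch, not Nginx's actual code).
class LeakyBucket:
    """Admit requests at `rate` per second, allowing up to `burst` extra."""

    def __init__(self, rate: float, burst: int):
        self.rate = rate        # drain rate in requests per second
        self.capacity = burst   # excess requests allowed above the rate
        self.level = 0.0        # current fill of the bucket
        self.last = 0.0         # timestamp of the previous request

    def allow(self, now: float) -> bool:
        # Drain the bucket for the time elapsed since the last request
        self.level = max(0.0, self.level - (now - self.last) * self.rate)
        self.last = now
        if self.level <= self.capacity:
            self.level += 1.0   # admit the request: one unit into the bucket
            return True
        return False            # bucket full: reject (Nginx would return 429)
```

With rate=10 and burst=0, requests spaced 100 ms apart are admitted, while a second request only 50 ms after the first is rejected; that is exactly the spacing a 10r/s zone enforces.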


Step-by-Step Solution

1. Define a Rate Limiting Zone

In your nginx.conf (inside the http block), define a shared memory zone:

http {
    # Zone "api": keyed by client IP, 10MB storage, 10 requests/second
    limit_req_zone $binary_remote_addr zone=api:10m rate=10r/s;

    # Zone "login": stricter, 5 requests/minute for login pages
    limit_req_zone $binary_remote_addr zone=login:10m rate=5r/m;

    # Zone "general": moderate, 30 requests/second for general traffic
    limit_req_zone $binary_remote_addr zone=general:10m rate=30r/s;
}

Key parameters:

  • $binary_remote_addr: the client IP as the key (4 bytes per IPv4 address, 16 per IPv6)
  • zone=name:size: the zone's name and shared memory size (a 10m zone holds roughly 160,000 IPv4 states)
  • rate=Nr/s or rate=Nr/m: the allowed request rate, per second or per minute

2. Apply Rate Limits to Locations

server {
    listen 80;
    server_name example.com;

    # General site-wide rate limit with burst
    limit_req zone=general burst=50 nodelay;
    limit_req_status 429;

    # Stricter limit on login endpoint
    location /api/auth/login {
        limit_req zone=login burst=3 nodelay;
        proxy_pass http://backend;
    }

    # API endpoints with moderate limiting
    location /api/ {
        limit_req zone=api burst=20 nodelay;
        proxy_pass http://backend;
    }

    # Static assets: there is no "limit_req off" directive, so the
    # server-level "general" limit is still inherited here. To exempt
    # static traffic, apply limits per-location instead of server-wide,
    # or map exempt requests to an empty key (see the geo block in step 4).
    location /static/ {
        alias /var/www/static/;
    }
}

Understanding burst and nodelay:

  • burst=N: Allows up to N extra requests above the rate limit to be queued instead of immediately rejected. Without burst, any request exceeding the rate is rejected.
  • nodelay: Processes burst requests immediately without throttling delay. Without nodelay, burst requests are metered out at the defined rate (requests are delayed in a queue).
  • delay=N: (Nginx 1.15.7+) A hybrid: the first N burst requests are processed immediately, and the rest are delayed.
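The delay=N hybrid can be sketched against the zones from step 1 (a configuration sketch; the burst and delay values are illustrative):

```nginx
# Two-stage limiting (Nginx 1.15.7+): the first 10 excess requests are
# served immediately, requests 11-20 are delayed to the zone's rate, and
# anything beyond burst=20 is rejected.
location /api/ {
    limit_req zone=api burst=20 delay=10;
    proxy_pass http://backend;
}
```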

3. Configure Connection Limiting

Connection limiting is separate from request rate limiting. It caps how many simultaneous TCP connections a single client can hold open:

http {
    limit_conn_zone $binary_remote_addr zone=addr:10m;

    server {
        # Max 10 simultaneous connections per IP
        limit_conn addr 10;

        # Return 429 instead of default 503
        limit_conn_status 429;

        location /downloads/ {
            # Stricter for download area
            limit_conn addr 2;
            limit_rate 500k;  # Bandwidth throttle: 500KB/s per connection
        }
    }
}

4. Whitelist Trusted IPs

Use a geo block to exempt monitoring systems, internal services, or office IPs from rate limiting:

http {
    geo $limit_key {
        default         $binary_remote_addr;
        10.0.0.0/8      "";   # Internal network — no limiting
        192.168.1.0/24  "";   # Office network — no limiting
        203.0.113.50    "";   # Monitoring server — no limiting
    }

    # Requests from whitelisted IPs get an empty key, which is not tracked
    limit_req_zone $limit_key zone=api:10m rate=10r/s;
}

5. Custom Error Responses

Replace the default error page with a JSON API response or a user-friendly HTML page:

# Return JSON for API endpoints
error_page 429 = @rate_limited;

location @rate_limited {
    default_type application/json;
    return 429 '{"error": "rate_limit_exceeded", "message": "Too many requests. Please retry after a few seconds.", "retry_after": 1}';
}
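Well-behaved clients can use a Retry-After header to know when to back off; a sketch extending the named location above (the one-second value mirrors the JSON body):

```nginx
# Named location that also advises clients when to retry;
# "always" attaches the header even on error responses
location @rate_limited {
    default_type application/json;
    add_header Retry-After 1 always;
    return 429 '{"error": "rate_limit_exceeded", "retry_after": 1}';
}
```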

6. Test the Configuration

# Validate syntax
sudo nginx -t

# Reload Nginx
sudo systemctl reload nginx

# Test rate limiting with Apache Bench (100 requests, 10 concurrent)
ab -n 100 -c 10 http://example.com/api/data

# Or use curl in a loop
for i in $(seq 1 20); do
    echo "Request $i: $(curl -s -o /dev/null -w '%{http_code}' http://example.com/api/data)"
done

You should see 200 responses up to the rate limit, then 429 responses for excess requests.
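To summarize the loop's output at a glance, you can tally the status codes (shown here against sample codes, since hitting the live host depends on your setup):

```shell
# Tally HTTP status codes; sample codes are piped in for illustration.
# In practice, pipe the curl loop's "%{http_code}" output instead.
printf '200\n200\n200\n429\n429\n' | sort | uniq -c
```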


Gotchas and Edge Cases

  • Reverse proxy chains: If Nginx is behind a load balancer or CDN, $binary_remote_addr will be the proxy’s IP, not the client’s. Use the realip module to restore the real client IP; keying on $http_x_forwarded_for directly also works, but that header is client-controlled and therefore spoofable.
  • Rate limit zones are shared, not per-worker: Shared memory zones (zone=name:10m) are shared across all Nginx worker processes, so limits are enforced consistently no matter which worker handles a request; no extra configuration is needed.
  • IPv6 addresses: $binary_remote_addr is 16 bytes for IPv6 (vs. 4 for IPv4). A 10MB zone holds ~160k IPv4 or ~40k IPv6 addresses. Size accordingly.
  • API keys instead of IPs: For authenticated APIs, you can key by a header or cookie: limit_req_zone $http_x_api_key zone=apikey:10m rate=100r/s; Note that requests with an empty key (a missing header) are not limited at all, so unauthenticated traffic bypasses this zone.
  • Logging: By default, rate-limited requests are logged at error level. Control this with limit_req_log_level notice; to reduce noise.
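For the reverse-proxy gotcha above, a minimal realip sketch (the trusted 10.0.0.0/8 range is an assumption; replace it with your load balancer's addresses):

```nginx
# Restore the real client IP before the limiter keys on it.
# Requires Nginx built with --with-http_realip_module (most
# distribution packages include it).
set_real_ip_from 10.0.0.0/8;        # trust only the load balancer range
real_ip_header   X-Forwarded-For;   # take the client IP from this header
real_ip_recursive on;               # walk past multiple trusted proxies
```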

Prevention

  • Layer rate limiting with a WAF: Combine Nginx rate limiting with ModSecurity, Cloudflare, or AWS WAF for defense in depth.
  • Use fail2ban: Automatically ban IPs that persistently hit rate limits by parsing Nginx error logs.
  • Monitor with metrics: Export Nginx status metrics to Prometheus using the stub_status module and alert on elevated 429 rates.
  • Tune gradually: Start with generous limits and tighten them based on observed traffic patterns. Overly aggressive limits will block legitimate users.
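The fail2ban suggestion above can be sketched with the nginx-limit-req filter that ships with fail2ban (the log path and thresholds are assumptions; adjust them for your distribution):

```ini
# /etc/fail2ban/jail.local: ban clients that repeatedly trip the limiter.
# findtime/maxretry below mean: 10 limit hits within 10 minutes
# trigger a one-hour ban.
[nginx-limit-req]
enabled  = true
filter   = nginx-limit-req
logpath  = /var/log/nginx/error.log
findtime = 600
maxretry = 10
bantime  = 3600
```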

Summary

  • Nginx rate limiting uses limit_req_zone and limit_req for request rates, and limit_conn_zone / limit_conn for connection caps.
  • The leaky bucket algorithm smooths traffic spikes rather than enforcing rigid per-second counts.
  • Always configure burst and nodelay for production workloads to handle legitimate traffic bursts gracefully.
  • Set limit_req_status 429 to return the standard HTTP 429 status code instead of the default 503.
  • Whitelist trusted IPs using geo blocks and test thoroughly with ab or wrk before deploying to production.