Why Rate Limiting Matters

Every public-facing web server is subject to traffic spikes, automated scrapers, brute-force login attempts, and distributed denial-of-service (DDoS) attacks. Without rate limiting, a single abusive client can exhaust your server’s CPU, memory, and file descriptors, making the service unavailable to legitimate users.

Nginx’s built-in rate limiting provides a lightweight, high-performance defense layer that operates at the reverse proxy level — before requests ever reach your application. Using the ngx_http_limit_req_module and ngx_http_limit_conn_module, you can enforce per-IP or per-user request rates, connection caps, and burst allowances with minimal configuration overhead.

This guide walks through practical rate limiting configurations covering API endpoints, login pages, static assets, and whole-site protection.

Prerequisites

  • Nginx 1.18+ installed (both modules are compiled in by default).
  • sudo or root access to edit Nginx configuration files.
  • Basic familiarity with Nginx configuration syntax (http, server, location blocks).
  • A test tool like curl, Apache Bench (ab), or wrk for validation.

Understanding the Leaky Bucket Algorithm

Nginx rate limiting uses the leaky bucket algorithm. Imagine a bucket with a small hole at the bottom:

  • Requests fill the bucket from the top.
  • The bucket drains at a fixed rate (your configured rate, e.g., 10 requests/second).
  • If the bucket overflows (too many requests arrive faster than the drain rate), excess requests are either rejected or queued (burst).

This model smooths out traffic spikes rather than enforcing a rigid per-second counter, making it well-suited for real-world traffic patterns.
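The drain-and-fill bookkeeping can be sketched in a few lines of Python (an illustrative model only, not Nginx's actual implementation; the class name and timestamp handling are ours):

```python
# Illustrative leaky-bucket model (a sketch, not Nginx's actual code).
class LeakyBucket:
    """Admit requests at `rate` per second, allowing up to `burst` extra."""

    def __init__(self, rate: float, burst: int):
        self.rate = rate        # drain rate in requests per second
        self.capacity = burst   # excess requests allowed above the rate
        self.level = 0.0        # current fill of the bucket
        self.last = 0.0         # timestamp of the previous request

    def allow(self, now: float) -> bool:
        # Drain the bucket for the time elapsed since the last request
        self.level = max(0.0, self.level - (now - self.last) * self.rate)
        self.last = now
        if self.level <= self.capacity:
            self.level += 1.0   # admit the request: one unit into the bucket
            return True
        return False            # bucket full: reject (Nginx would return 429)
```

With rate=10 and burst=0, requests spaced 100 ms apart are admitted, while a second request only 50 ms after the first is rejected; that is exactly the spacing a 10r/s zone enforces.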


Step-by-Step Solution

1. Define a Rate Limiting Zone

In your nginx.conf (inside the http block), define a shared memory zone:

http {
    # Zone "api": keyed by client IP, 10MB storage, 10 requests/second
    limit_req_zone $binary_remote_addr zone=api:10m rate=10r/s;

    # Zone "login": stricter, 5 requests/minute for login pages
    limit_req_zone $binary_remote_addr zone=login:10m rate=5r/m;

    # Zone "general": moderate, 30 requests/second for general traffic
    limit_req_zone $binary_remote_addr zone=general:10m rate=30r/s;
}

Key parameters:

  • $binary_remote_addr: the client IP as the key (4 bytes per IPv4 address, 16 per IPv6)
  • zone=name:size: the zone's name and shared memory size (a 10m zone holds roughly 160,000 IPv4 states)
  • rate=Nr/s or rate=Nr/m: the allowed request rate, per second or per minute

2. Apply Rate Limits to Locations

server {
    listen 80;
    server_name example.com;

    # General site-wide rate limit with burst
    limit_req zone=general burst=50 nodelay;
    limit_req_status 429;

    # Stricter limit on login endpoint
    location /api/auth/login {
        limit_req zone=login burst=3 nodelay;
        proxy_pass http://backend;
    }

    # API endpoints with moderate limiting
    location /api/ {
        limit_req zone=api burst=20 nodelay;
        proxy_pass http://backend;
    }

    # Static assets: there is no "limit_req off" directive, so the
    # server-level "general" limit is still inherited here. To exempt
    # static traffic, apply limits per-location instead of server-wide,
    # or map exempt requests to an empty key (see the geo block in step 4).
    location /static/ {
        alias /var/www/static/;
    }
}

Understanding burst and nodelay:

  • burst=N: Allows up to N extra requests above the rate limit to be queued instead of immediately rejected. Without burst, any request exceeding the rate is rejected.
  • nodelay: Processes burst requests immediately without throttling delay. Without nodelay, burst requests are metered out at the defined rate (requests are delayed in a queue).
  • delay=N: (Nginx 1.15.7+) A hybrid: the first N burst requests are processed immediately, and the rest are delayed.
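The delay=N hybrid can be sketched against the zones from step 1 (a configuration sketch; the burst and delay values are illustrative):

```nginx
# Two-stage limiting (Nginx 1.15.7+): the first 10 excess requests are
# served immediately, requests 11-20 are delayed to the zone's rate, and
# anything beyond burst=20 is rejected.
location /api/ {
    limit_req zone=api burst=20 delay=10;
    proxy_pass http://backend;
}
```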

3. Configure Connection Limiting

Connection limiting is separate from request rate limiting. It caps how many simultaneous TCP connections a single client can hold open:

http {
    limit_conn_zone $binary_remote_addr zone=addr:10m;

    server {
        # Max 10 simultaneous connections per IP
        limit_conn addr 10;

        # Return 429 instead of default 503
        limit_conn_status 429;

        location /downloads/ {
            # Stricter for download area
            limit_conn addr 2;
            limit_rate 500k;  # Bandwidth throttle: 500KB/s per connection
        }
    }
}

4. Whitelist Trusted IPs

Use a geo block to exempt monitoring systems, internal services, or office IPs from rate limiting:

http {
    geo $limit_key {
        default         $binary_remote_addr;
        10.0.0.0/8      "";   # Internal network — no limiting
        192.168.1.0/24  "";   # Office network — no limiting
        203.0.113.50    "";   # Monitoring server — no limiting
    }

    # Requests from whitelisted IPs get an empty key, which is not tracked
    limit_req_zone $limit_key zone=api:10m rate=10r/s;
}

5. Custom Error Responses

Replace the default error page with a JSON API response or a user-friendly HTML page:

# Return JSON for API endpoints
error_page 429 = @rate_limited;

location @rate_limited {
    default_type application/json;
    return 429 '{"error": "rate_limit_exceeded", "message": "Too many requests. Please retry after a few seconds.", "retry_after": 1}';
}
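Well-behaved clients can use a Retry-After header to know when to back off; a sketch extending the named location above (the one-second value mirrors the JSON body):

```nginx
# Named location that also advises clients when to retry;
# "always" attaches the header even on error responses
location @rate_limited {
    default_type application/json;
    add_header Retry-After 1 always;
    return 429 '{"error": "rate_limit_exceeded", "retry_after": 1}';
}
```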

6. Test the Configuration

# Validate syntax
sudo nginx -t

# Reload Nginx
sudo systemctl reload nginx

# Test rate limiting with Apache Bench (100 requests, 10 concurrent)
ab -n 100 -c 10 http://example.com/api/data

# Or use curl in a loop
for i in $(seq 1 20); do
    echo "Request $i: $(curl -s -o /dev/null -w '%{http_code}' http://example.com/api/data)"
done

You should see 200 responses up to the rate limit, then 429 responses for excess requests.
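To summarize the loop's output at a glance, you can tally the status codes (shown here against sample codes, since hitting the live host depends on your setup):

```shell
# Tally HTTP status codes; sample codes are piped in for illustration.
# In practice, pipe the curl loop's "%{http_code}" output instead.
printf '200\n200\n200\n429\n429\n' | sort | uniq -c
```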


Gotchas and Edge Cases

  • Reverse proxy chains: If Nginx is behind a load balancer or CDN, $binary_remote_addr will be the proxy’s IP, not the client’s. Use the realip module to restore the real client IP; keying on $http_x_forwarded_for directly also works, but that header is client-controlled and therefore spoofable.
  • Rate limit zones are shared, not per-worker: Shared memory zones (zone=name:10m) are shared across all Nginx worker processes, so limits are enforced consistently no matter which worker handles a request; no extra configuration is needed.
  • IPv6 addresses: $binary_remote_addr is 16 bytes for IPv6 (vs. 4 for IPv4). A 10MB zone holds ~160k IPv4 or ~40k IPv6 addresses. Size accordingly.
  • API keys instead of IPs: For authenticated APIs, you can key by a header or cookie: limit_req_zone $http_x_api_key zone=apikey:10m rate=100r/s; Note that requests with an empty key (a missing header) are not limited at all, so unauthenticated traffic bypasses this zone.
  • Logging: By default, rate-limited requests are logged at error level. Control this with limit_req_log_level notice; to reduce noise.
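For the reverse-proxy gotcha above, a minimal realip sketch (the trusted 10.0.0.0/8 range is an assumption; replace it with your load balancer's addresses):

```nginx
# Restore the real client IP before the limiter keys on it.
# Requires Nginx built with --with-http_realip_module (most
# distribution packages include it).
set_real_ip_from 10.0.0.0/8;        # trust only the load balancer range
real_ip_header   X-Forwarded-For;   # take the client IP from this header
real_ip_recursive on;               # walk past multiple trusted proxies
```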

Prevention

  • Layer rate limiting with a WAF: Combine Nginx rate limiting with ModSecurity, Cloudflare, or AWS WAF for defense in depth.
  • Use fail2ban: Automatically ban IPs that persistently hit rate limits by parsing Nginx error logs.
  • Monitor with metrics: Export Nginx status metrics to Prometheus using the stub_status module and alert on elevated 429 rates.
  • Tune gradually: Start with generous limits and tighten them based on observed traffic patterns. Overly aggressive limits will block legitimate users.
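The fail2ban suggestion above can be sketched with the nginx-limit-req filter that ships with fail2ban (the log path and thresholds are assumptions; adjust them for your distribution):

```ini
# /etc/fail2ban/jail.local: ban clients that repeatedly trip the limiter.
# findtime/maxretry below mean: 10 limit hits within 10 minutes
# trigger a one-hour ban.
[nginx-limit-req]
enabled  = true
filter   = nginx-limit-req
logpath  = /var/log/nginx/error.log
findtime = 600
maxretry = 10
bantime  = 3600
```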

Summary

  • Nginx rate limiting uses limit_req_zone and limit_req for request rates, and limit_conn_zone / limit_conn for connection caps.
  • The leaky bucket algorithm smooths traffic spikes rather than enforcing rigid per-second counts.
  • Always configure burst and nodelay for production workloads to handle legitimate traffic bursts gracefully.
  • Set limit_req_status 429 to return the standard HTTP 429 status code instead of the default 503.
  • Whitelist trusted IPs using geo blocks and test thoroughly with ab or wrk before deploying to production.