08.04 Rate Limiting and Quotas
SimpleRisk has no native API rate limiting or per-key quotas. Integrations can call the API as fast as application performance allows. For programs that need enforced limits (protecting against runaway integrations, satisfying audit requirements, controlling integrations of unknown trust), the standard pattern is a reverse proxy (nginx, HAProxy, AWS ALB, Cloudflare) in front of SimpleRisk that enforces per-IP or per-key limits before requests reach the application.
Why this is a reference article
This article documents the current state of API rate limiting and quotas in SimpleRisk. The state is brief: there isn't any native enforcement. The article covers what exists (nothing native), what doesn't (almost everything), and the standard reverse-proxy pattern programs use when they need enforcement.
What SimpleRisk natively enforces
Nothing. The v2 API codebase has no rate-limiting middleware, no per-key request quotas, no per-IP throttling, no per-endpoint limits. Each authenticated request executes immediately against the application without quota check.
The only rate-limit-adjacent feature is the test-email rate limit in the Notification Extra (5 test emails per hour per user, to prevent accidental SMTP service abuse during configuration debugging). That's specific to the Send Test Email button on the Mail Settings page, not the API in general.
Why this matters
For most programs, the lack of native rate limiting is fine. SimpleRisk installs are typically small enough (single-org, hundreds-to-thousands of users, internal-only network) that integrations are trusted and well-behaved. A well-coded integration polls every 5 minutes and creates negligible load; a misbehaving one might spike load briefly but operations notice and intervene.
The lack matters when:
- Untrusted integrations: SaaS-style multi-tenant deployments where customer integrations might misbehave. Without enforcement, one customer's runaway script can affect the platform for everyone.
- Audit requirements: SOC 2 Type 2 controls sometimes require documented rate-limit enforcement.
- DDoS surface: a public-internet-exposed SimpleRisk install with API access is a potential DDoS target via the API even with valid credentials.
- Defense-in-depth: even with trusted integrations, a misbehaving CI pipeline at 4am with 100 concurrent workers could produce real load.
For these scenarios, reverse-proxy-based limiting is the path.
Recommended: reverse-proxy enforcement
nginx
A standard pattern using nginx's limit_req_zone:
http {
    # Define a rate-limit zone keyed on the API key (extracted from the X-API-KEY header)
    limit_req_zone $http_x_api_key zone=api_per_key:10m rate=10r/s;

    # Or keyed on client IP
    limit_req_zone $binary_remote_addr zone=api_per_ip:10m rate=10r/s;

    server {
        location /api/v2/ {
            # Apply the per-key limit: 10 requests per second, burst of 20, no delay
            limit_req zone=api_per_key burst=20 nodelay;

            # Or apply the per-IP limit instead
            # limit_req zone=api_per_ip burst=20 nodelay;

            proxy_pass http://simplerisk_backend;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
        }
    }
}
Adjust rate=10r/s and burst=20 to your tolerance. By default nginx returns 503 Service Unavailable when the limit is exceeded; add limit_req_status 429; to return 429 Too Many Requests instead.
HAProxy
backend simplerisk_api
    stick-table type string len 64 size 100k expire 30s store http_req_rate(10s)
    http-request track-sc0 req.hdr(X-API-KEY)
    http-request deny status 429 if { sc_http_req_rate(0) gt 100 }
    server simplerisk simplerisk_backend:80
Returns 429 when a key exceeds 100 requests in 10 seconds.
Cloudflare / AWS WAF / Cloud-native
Cloudflare Rate Limiting Rules and AWS WAF rate-based rules both let you configure per-IP-and-path limits in the cloud edge. The configuration is in their UIs; the principle is the same: requests to /api/v2/* exceeding N per minute get rejected before reaching SimpleRisk.
For installs behind these services, configure the limit in the cloud edge rather than at the SimpleRisk-adjacent reverse proxy.
What rate limits to set
Rough starting points:
Internal trusted integrations
- Per-key rate: generous (100/sec, burst 200)
- Rationale: the limit exists mostly to catch runaway loops
Mixed trust (some external customer integrations)
- Per-key rate: moderate (10/sec, burst 30)
- Rationale: enough headroom for legitimate API use; throttles abuse
Public-internet API exposure
- Per-key rate: conservative (1/sec, burst 5)
- Rationale: protects against discovery-then-abuse
Tune based on what your legitimate integrations actually need. If a normal integration polls every 5 minutes (1 request per 300 seconds), the conservative profile is more than sufficient.
Per-endpoint vs global limits
A useful refinement: heavier endpoints (large list responses, expensive queries) get tighter limits than lightweight endpoints. Example nginx pattern:
# Zones api_per_key_heavy and api_per_key_light are defined with
# limit_req_zone in the http block, as in the earlier example.
location /api/v2/risks {
    # Heavier: the list endpoint may return megabytes of JSON
    limit_req zone=api_per_key_heavy burst=5 nodelay;
    proxy_pass http://simplerisk_backend;
}

location /api/v2/risks/ {
    # Per-record endpoints are cheaper
    limit_req zone=api_per_key_light burst=20 nodelay;
    proxy_pass http://simplerisk_backend;
}
The granularity is a function of how much you care; most programs find a single per-key limit at the right level is sufficient.
How integrations should behave
Even without enforcement, integrations should self-throttle:
- Don't poll faster than necessary. Most use cases tolerate 5-15 minute poll intervals; sub-minute polling is rarely required.
- Use modified-since filters where supported to reduce response sizes.
- Cache locally to avoid repeat requests for unchanged data.
- Implement exponential backoff on 5xx responses or connection errors.
- Coordinate with the SimpleRisk operations team for high-volume integrations so they can monitor.
A well-behaved integration produces negligible load even without rate limiting; a misbehaving integration with rate limiting is still a problem (it's just bounded).
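The self-throttling guidance above can be sketched as a minimal polling client. This is an illustrative Python sketch, not SimpleRisk-provided code: the hostname and API key are placeholders, and the /risks endpoint path and X-API-KEY header are taken from the proxy examples earlier in this article.

```python
import json
import time
import urllib.request

BASE_URL = "https://simplerisk.example.com/api/v2"  # hypothetical hostname
API_KEY = "your-api-key-here"                       # placeholder
POLL_INTERVAL = 300  # seconds; 5-minute polling is plenty for most use cases

def backoff_delays(base=1.0, cap=60.0):
    """Exponential backoff schedule: 1, 2, 4, ... seconds, capped at `cap`."""
    delay = base
    while True:
        yield delay
        delay = min(delay * 2, cap)

def fetch_risks():
    """One poll, backing off exponentially on connection errors and 5xx."""
    req = urllib.request.Request(
        BASE_URL + "/risks", headers={"X-API-KEY": API_KEY}
    )
    for delay in backoff_delays():
        try:
            with urllib.request.urlopen(req, timeout=30) as resp:
                return json.load(resp)
        except OSError:        # URLError and HTTPError both subclass OSError
            time.sleep(delay)  # back off before the next attempt
```

A real integration would wrap fetch_risks() in a loop that sleeps POLL_INTERVAL between successful polls and caches the previous response so unchanged data is not re-fetched.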
Quotas: also none native
Beyond per-second rate limiting, some platforms enforce monthly or daily request quotas (e.g., "this API key gets 10,000 requests per month"). SimpleRisk has no native quota mechanism. If you need quotas:
- Track usage externally (the integration logs its own request count) and self-enforce.
- Use reverse-proxy or API gateway tooling that supports quota enforcement (Kong, AWS API Gateway, etc.).
- Build custom quota tracking via a middleware integration.
The complexity scales fast; most programs don't need quotas (rate limiting alone covers the operational risk).
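To make the first option concrete, here is a sketch of client-side quota self-enforcement in Python. Nothing in SimpleRisk enforces this; the integration polices its own request budget over a rolling window. The class and method names are illustrative, not part of any SimpleRisk API.

```python
import time
from collections import deque

class ClientQuota:
    """Rolling-window request budget that an integration enforces on itself."""

    def __init__(self, limit, window_seconds):
        self.limit = limit            # e.g. 10_000 requests ...
        self.window = window_seconds  # ... per 30 days: 30 * 86400 seconds
        self.timestamps = deque()     # send times of counted requests

    def allow(self, now=None):
        """Record and permit a request if it fits in the window, else refuse."""
        now = time.time() if now is None else now
        # Evict timestamps that have aged out of the rolling window
        while self.timestamps and now - self.timestamps[0] >= self.window:
            self.timestamps.popleft()
        if len(self.timestamps) < self.limit:
            self.timestamps.append(now)
            return True
        return False
```

The integration calls allow() before each API request and defers work when it returns False. For a monthly quota the in-memory deque is small; persist it to disk if the process restarts frequently.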
What might come natively in the future
If SimpleRisk adds native rate limiting in a future release, the likely shape:
- A rate_limit_per_key setting (requests per minute or second).
- A rate_limit_per_ip setting.
- Per-endpoint overrides for heavy operations.
- Standard 429 Too Many Requests responses with Retry-After headers.
- Per-key request count tracking in the audit trail.
None of this exists today. File a feature request via Getting Help if your environment needs it.
Common pitfalls
A handful of patterns recur in rate-limiting conversations.
- Assuming SimpleRisk has rate limiting because most modern APIs do. It doesn't. Verify (test it) rather than assume.
- Building integrations that poll at high frequency because "the API will throttle us if we're too fast." It won't. Self-throttle.
- Setting reverse-proxy rate limits without testing legitimate integration behavior. A 10/sec limit might break a legitimate batch import that intentionally uses concurrency. Test before enforcing.
- Configuring rate limiting at the SimpleRisk-adjacent proxy and at the cloud edge with different limits. Both apply; the lower one wins. Coordinate the configurations.
- Treating 429 responses as transient and retrying immediately. 429 means "you exceeded the limit"; retry with backoff, not immediately.
- Logging the API key in rate-limit-rejection log entries. Don't. Log the user_id (which is derivable from the key after authentication), never the key itself.
- Setting per-IP limits when integrations all originate from the same IP (typical for cloud integrations behind NAT). A per-IP limit then throttles the whole NAT group; switch to per-key limits.
- Forgetting to test rate limits in non-production environments. Production is not where you discover the limit is too tight.
- Setting limits and never reviewing them. Integration patterns evolve; a limit that was right two years ago may now be too tight or too loose. Review annually.
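The 429-handling pitfalls above take only a few lines of client code to avoid. A Python sketch (function names are illustrative; the X-API-KEY header matches the proxy examples earlier):

```python
import time
import urllib.error
import urllib.request

def retry_wait(retry_after_header, attempt):
    """Seconds to wait after a 429: honor Retry-After when present,
    otherwise fall back to exponential backoff (1, 2, 4, ...)."""
    if retry_after_header:
        return float(retry_after_header)
    return float(2 ** attempt)

def get_with_backoff(url, api_key, max_attempts=5):
    """GET that backs off on 429 instead of retrying immediately."""
    req = urllib.request.Request(url, headers={"X-API-KEY": api_key})
    for attempt in range(max_attempts):
        try:
            with urllib.request.urlopen(req, timeout=30) as resp:
                return resp.read()
        except urllib.error.HTTPError as err:
            if err.code != 429:
                raise  # only 429 is retryable here
            # Retry-After may be absent (plain limit_req / deny responses
            # typically omit it), so fall back to exponential backoff.
            time.sleep(retry_wait(err.headers.get("Retry-After"), attempt))
    raise RuntimeError("still rate-limited after %d attempts" % max_attempts)
```

Note that retry_wait handles the missing-header case explicitly: a proxy-generated rejection often carries no Retry-After, and guessing zero would recreate the immediate-retry pitfall.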
Related
- API Overview
- Authentication and API Keys
- Permissions and the API
- Using the API for Integrations
- Securing the Web Server (the reverse-proxy layer where limits get enforced)
- Getting Help
Reference
- Native rate limiting: None in the v2 API.
- Native per-key quotas: None.
- Native per-IP throttling: None.
- Test-email rate limit: 5 emails per user per hour (specific to the Mail Settings test button, not the API).
- Recommended enforcement: Reverse proxy (nginx limit_req_zone, HAProxy stick-table, Cloudflare Rate Limiting Rules, AWS WAF rate-based rules), keyed per-key or per-IP.
- Standard rejection code: 429 Too Many Requests (configure your reverse proxy to return this rather than the default 503).
- Implementing files: n/a — no native implementation.