
Rate limit NGINX by User-Agent

Bots can be intelligent, but they can also be aggressive enough to become a real nuisance and hurt the performance of your system.

If you’re using NGINX it’s not always obvious how to handle this. I’ve seen several examples that use the client IP to block aggressive behavior, but bots are most likely crawling your site from several IPs. The following solution rate limits bots based on their User-Agent header instead.

conf.d/user-agent-rate-limit.conf

# 1 = soft, 2 = medium, 3 = hard
map $http_user_agent $rate_bot {
    default "";
    "~*\bgooglebot\b" 1;
    "~*\bbingbot\b"   3;
}
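The `~*` prefix makes the map patterns case-insensitive PCRE regexes, which matters because real User-Agent strings capitalize the bot name (e.g. "Googlebot/2.1"). A quick Python sketch, an approximation of nginx's matching using the same patterns, can sanity-check them against real User-Agent strings:

```python
import re

# Same patterns as the map block above; ~* in nginx means case-insensitive PCRE.
TIERS = {
    r"\bgooglebot\b": 1,  # soft
    r"\bbingbot\b": 3,    # hard
}

def classify(user_agent):
    """Return the rate tier for a User-Agent, or None (no limit applied)."""
    for pattern, tier in TIERS.items():
        if re.search(pattern, user_agent, re.IGNORECASE):
            return tier
    return None

print(classify("Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"))  # 1
print(classify("Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"))   # 3
print(classify("Mozilla/5.0 (X11; Linux x86_64) Firefox/115.0"))                             # None
```

Without the `*` (case-sensitive `~`), the `googlebot` pattern would never match Google's actual User-Agent.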

# http status to apply when rules are used
limit_req_status 429;

# soft rate limit
map $rate_bot $rate_bot_soft {
    default "";
    1 $http_user_agent;
}

limit_req_zone $rate_bot_soft zone=ratebot_soft:16m rate=5r/s;

# medium rate limit
map $rate_bot $rate_bot_medium {
    default "";
    2 $http_user_agent;
}

limit_req_zone $rate_bot_medium zone=ratebot_medium:16m rate=3r/s;

# hard rate limit
map $rate_bot $rate_bot_hard {
    default "";
    3 $http_user_agent;
}

limit_req_zone $rate_bot_hard zone=ratebot_hard:16m rate=1r/s;
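Under the hood, `limit_req` implements a leaky-bucket algorithm: requests "fill" the bucket, which drains at the configured rate, and anything that would overflow the burst allowance is rejected with the configured status. A rough Python simulation (an illustration of the idea, not nginx's actual code) of what `rate=1r/s` with no burst means:

```python
class LeakyBucket:
    """Simplified model of nginx's limit_req leaky bucket (illustration only)."""

    def __init__(self, rate, burst=0):
        self.rate = rate      # allowed requests per second
        self.burst = burst    # extra requests tolerated above the rate
        self.excess = 0.0     # how far ahead of the allowed rate we are
        self.last = None      # timestamp of the previous request

    def allow(self, now):
        # The bucket "leaks" at `rate` requests per second.
        if self.last is not None:
            self.excess = max(0.0, self.excess - (now - self.last) * self.rate)
        self.last = now
        self.excess += 1
        if self.excess - 1 > self.burst:
            self.excess -= 1  # rejected requests are not accounted
            return False      # nginx would answer with limit_req_status (429)
        return True

# rate=1r/s with no burst: requests arriving faster than one per second
# are rejected immediately.
bucket = LeakyBucket(rate=1.0)
print([bucket.allow(t) for t in (0.0, 0.2, 0.4, 1.5)])  # [True, False, False, True]
```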

conf.d/example-com.conf

server {
    server_name example.com;
    ...

    # apply ratebot rules; without burst, nodelay has no effect,
    # so allow a small burst and reject everything beyond it immediately
    limit_req zone=ratebot_soft   burst=10 nodelay;
    limit_req zone=ratebot_medium burst=5  nodelay;
    limit_req zone=ratebot_hard   burst=2  nodelay;
}

Happy blocking :)
