Aggressive bots

Blocking common WordPress vulnerability probes

My server was wasting a lot of resources humoring automated scripts that were trying to find known vulnerabilities in WordPress and other content management systems. This article covers how to configure the Apache httpd web server to recognize a few common probing patterns used to look for vulnerabilities in WordPress, themes, and plugins. After identifying such probing, the attacking source IP can be blocked automatically using temporary firewall rules.

Do not try to implement this unless you understand the effects of the configuration changes mentioned below. They’ll be explained as I go through them, but do not follow the instructions blindly, especially if you run any content management system other than WordPress. Even if you don’t run WordPress on your website, you’re likely to still see probes like these and would benefit from implementing the blocking techniques discussed here.

These instructions are for the Apache web server, but you can apply the logic on just about any web server. Unless noted otherwise, all code examples are for httpd configurations.

This will be a two-step process. Step one is to configure Apache to issue 403 responses when encountering common probes for WordPress vulnerabilities, and that is what I’ll cover in this article. This step is useful even when you’re not using WordPress, as the traffic patterns will be the same regardless. Step two is to block repeated 403 responses using Fail2Ban, as covered in the previous article.

The first file we’ll want to protect is the high-value target “wp-config.php”, which contains database passwords and other sensitive information about how your website runs. From my own experience, I’d say roughly ¾ of WordPress-specific attacks go after this file.

A common way of getting at the WordPress configuration file is by manipulating plugins and themes into serving up the file as a download. To do that, attackers craft URLs like “/wp-content/plugins/downloader/download.php?file=../../../wp-config.php” that try to exploit weak security in file download and upload plugins, as well as photo gallery and other plugins. Even if you don’t have these plugins installed, you’ll want to block these requests. What they all have in common is that their query string contains “wp-config.php”, which we’re very unlikely to ever see in a legitimate request.

# Deny any request under /wp-content/ whose query string mentions wp-config.php
<Location "/wp-content/">
    <If "%{QUERY_STRING} =~ /wp-config\.php/">
        Deny from all
    </If>
</Location>

If there ever is a problem with your web server, and especially with the PHP interpreter, the raw file with all its secrets could be served up simply by requesting “/wp-config.php” directly, without using plugins or other exploits. To reduce the risk of direct access, many articles instruct you to block direct requests for that specific file in Apache. Unfortunately, due to human error and habit, many variations of this file name are likely to exist temporarily on your server, and anything that is temporary runs the risk of being forgotten in place. These are just the top requested variants of this file on my server:

  • /wp-config.php~
  • /wp-config.php.bak
  • /wp-config.php.old
  • /wp-config.php2
  • /wp-config.php1
  • /wp-config.php.rpmnew
  • /wp-config.bak
  • /
  • /wp-config.php.rpmsave
  • /wp-config.php_old
  • /wp-config.phpold
  • /wp-config.php.tmp
  • /wp-config.php-ftp
  • /wp-config-backup.php
  • /wp-config.old
  • /wp-config.txt

You’ll likely recognize at least some of the above as automatic backup or auto-save files created by popular text editors, distribution-specific backup files, or copies made by hand before a file is edited. Clearly, blocking a direct match for “/wp-config.php” isn’t enough, and a broader match than what is usually recommended is needed:

# Deny every top-level path that starts with /wp-config
<Location "/wp-config*">
    Deny from all
</Location>

Given what we’ve just discussed, you’ll also want to go back to the first configuration example and adjust the query string match from wp-config.php to wp-config. I’ve seen no evidence indicating that any bots are actively trying to probe for files other than the WordPress configuration file, but we all know that it will happen eventually.
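The adjustment described above can be sketched as follows. This is simply the first example with the shorter pattern, so that query strings naming any of the backup variants listed earlier are caught as well:

```apache
# Sketch: broader query string match, "wp-config" instead of "wp-config.php"
<Location "/wp-content/">
    <If "%{QUERY_STRING} =~ /wp-config/">
        Deny from all
    </If>
</Location>
```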

Now that probing that targets the main WordPress configuration file is under control, we’ll take a quick look at the WordPress login system. It will already issue 403 responses following failed login attempts, so there isn’t much to do there. However, when you’ve disabled registrations in WordPress, only bots will ever try to register a new account. Block attempts at accessing the registration form directly:

# Deny direct requests for the registration form
<Location "/wp-login*">
    <If "%{QUERY_STRING} =~ /action=register/">
        Deny from all
    </If>
</Location>

Based on my own logs, this one alone should match against 6 % of all bot activity. Registering a new account is usually the third thing they try to do after aggressively trying to extract the WordPress configuration file.

I mentioned earlier that probes go after popular themes and not just plugins. This is mostly because some themes bundle plugins or include custom and exploitable PHP code. The hallmark of all of these probes is that they request files ending in “.php” inside the theme directory.

Security can be improved greatly by blocking direct invocation of .php files anywhere in your themes directory. That alone blocks most attacks against known vulnerabilities in popular themes. Still, your website will only be using one theme, so block access entirely to any theme but the one you’re using. Replace the default twentysixteen theme below with the name of the theme you’re currently using.

# Deny .php files in any theme, and everything outside the active theme
<LocationMatch "^/wp-content/themes/(.*\.php|(?!twentysixteen))">
    Deny from all
</LocationMatch>

Implementing this is a bit risky: if you’ve just changed the theme you’re using, there may still be references to images, styles, and scripts inside the old folder. You can avoid this by renaming the theme folder to “sitename” and always keeping the current theme in that folder.
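With the theme folder renamed as suggested, the rule might look like the sketch below. It assumes the folder is literally named “sitename”; the pattern is otherwise the same as the twentysixteen example.

```apache
# Sketch: same rule, with the active theme kept in a folder named "sitename".
# The lookahead is a prefix match, so a sibling folder such as
# "sitename-old" would also be let through; use (?!sitename/) to
# restrict the exception to that one folder.
<LocationMatch "^/wp-content/themes/(.*\.php|(?!sitename))">
    Deny from all
</LocationMatch>
```

The benefit of this layout is that the rule never needs editing when you switch themes, as long as the new theme is moved into the same folder.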

If you’re using any systems other than WordPress then this next one might not be for you. Be sure to test everything thoroughly after deploying it, in any case. Let’s block off attempts to access logins other than wp-admin, and let us just flat-out ban any non-PHP scripting languages. What matters isn’t whether you’re using them or whether they could be used; it’s that this cuts off the request at the HTTP layer before WordPress and PHP are even invoked to serve a 404 page. Like all the previous methods, this frees up server resources rather than letting attack bots waste them.

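# Deny cgi-bin and any admin-style path that is not under wp-admin
<LocationMatch "^/(cgi-bin|admin|(?!wp-admin).*admin)">
      Deny from all
</LocationMatch>
# Deny requests for non-PHP script extensions (ASP, ColdFusion, CGI, JSP, Perl, Python)
<LocationMatch "\.(?:as[px]{1,2}|cfm|cgi|jsp[x]?|p[ly])$">
      Deny from all
</LocationMatch>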

Now, every request that matches one of the above rules will get an HTTP 403 Forbidden error. This has three main benefits: 1) httpd will serve up a 403 much faster than PHP+MySQL+WordPress can serve up a 404, freeing up resources for real users. 2) Some automated attacks give up after just two attempts when you return 403 instead of the expected 404 or 200. 3) Blocked requests are much easier to keep track of and to act on.
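A note on versions: the examples in this article use the 2.2-style Deny from all directive, which Apache 2.4 only accepts when mod_access_compat is loaded, while the <If> sections themselves require 2.4. As a sketch, assuming Apache 2.4 with mod_authz_core, the same rules can be written with Require all denied, which also results in a 403:

```apache
# Sketch for Apache 2.4 (mod_authz_core): equivalent of "Deny from all"
<Location "/wp-config*">
    Require all denied
</Location>

<Location "/wp-login*">
    <If "%{QUERY_STRING} =~ /action=register/">
        Require all denied
    </If>
</Location>
```

The Apache documentation discourages mixing the old and new access control directives, so it is probably best to pick one style and use it for every rule.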

Returning a 403 doesn’t stop them from trying other addresses and potentially stumbling upon one that will work. Luckily, repeated 403 errors in log files will now be a big red flag that separates normal visitors from malicious bots and users. See the previous article, “Block repeated 403 responses using Fail2Ban”, for instructions on configuring such a setup.

Lastly, be sure to include the address patterns we’ve blocked above as Disallow directives in your /robots.txt file. This will prevent good bots that honor the robots directives file, like Bingbot and Googlebot, from inadvertently being identified as bad bots and temporarily locked out of your website. Here is a quick example:

User-Agent: *
Disallow: /cgi-bin/
Disallow: /wp-admin
Disallow: /wp-config*
Disallow: /wp-content/themes/*.php
Disallow: /wp-login.php
Disallow: /*.asp*
Disallow: /*.cfm
Disallow: /*.cgi
Disallow: /*.jsp*
Disallow: /*.pl
Disallow: /*.py

If you found this article interesting, you should move on to investigating a full-on application level firewall such as mod_security for the Apache web server.

Feature image “Bot!” © 2009 Eelke Dekker, used under the terms of CC BY 2.0.