I recently noticed something weird in my Apache access log files. There were entries like this:
abdussamad.com:80 localhost - - [09/May/2013:14:26:47 +0500] "POST /wp-login.php HTTP/1.0" 200 3784 "abdussamad.com/wp-login.php" "Mozilla/5.0 (Windows NT 6.1; rv:19.0) Gecko/20100101 Firefox/19.0"
Now the second column above is supposed to contain the remote host that made the request, i.e. the user’s computer. But here it is shown as localhost. One possibility is that the requests originated on my own server, but I ruled that out. So how is this possible?
The answer lies in the custom log format vhost_combined that is included by default in the Debian and Ubuntu Apache configuration file (/etc/apache2/apache2.conf):
LogFormat "%V:%p %h %l %u %t \"%r\" %>s %O \"%{Referer}i\" \"%{User-Agent}i\"" vhost_combined
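I draw your attention to %h, which is the remote host. When hostname lookups are in effect (HostnameLookups is On, or an access control rule refers to hosts by name), Apache does a reverse DNS lookup of the user’s IP address and logs the resulting hostname instead of the address. There are two problems with this: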
- You are assuming that the PTR record is accurate.
- It slows down Apache because it has to do DNS lookups for each request.
In my case it turned out that these were bots which had their pointer records set to localhost. The idea is that localhost is trusted so their activities go under the radar.
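One common way to judge whether a PTR record can be trusted is forward-confirmed reverse DNS: look up the hostname for the IP address, then resolve that hostname and check that it points back to the same address. The following is only a minimal sketch of that idea, not something Apache runs for you. The function name `forward_confirmed` and its injectable `reverse` and `forward` parameters are my own, and it handles IPv4 only because `socket.gethostbyname_ex` does not return IPv6 addresses.

```python
import socket


def forward_confirmed(ip, reverse=socket.gethostbyaddr,
                      forward=socket.gethostbyname_ex):
    """Return the PTR hostname for ip if it resolves back to ip, else None.

    reverse and forward default to the standard library resolvers; they are
    parameters only so the check can be exercised without real DNS.
    """
    try:
        # Reverse lookup: IP address -> hostname from the PTR record.
        hostname = reverse(ip)[0]
        # Forward lookup: hostname -> list of IPv4 addresses.
        addresses = forward(hostname)[2]
    except OSError:
        # Covers socket.herror and socket.gaierror (no record, lookup failed).
        return None
    # Trust the name only if it maps back to the address we started with.
    return hostname if ip in addresses else None
```

A bot whose PTR record says localhost fails this check, because localhost resolves to a loopback address and not to the bot's real IP address.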
The solution is to use %a instead of %h in the Apache log format:
LogFormat "%V:%p %a %l %u %t \"%r\" %>s %O \"%{Referer}i\" \"%{User-Agent}i\"" vhost_combined
This records IP addresses instead of hostnames so that there is no confusion. As a bonus it saves one DNS lookup per request.
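To see whether existing logs already contain hostnames in the client column, something like the following rough sketch could help. It assumes the vhost_combined layout shown above, where the client is the second whitespace-separated field. The function names are hypothetical ones I made up for illustration.

```python
import ipaddress


def client_field(line):
    """Return the second whitespace-separated field of a log line, or None."""
    parts = line.split()
    return parts[1] if len(parts) > 1 else None


def is_ip_address(value):
    """True if value parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(value)
    except ValueError:
        return False
    return True


def lines_with_hostnames(lines):
    """Return the log lines whose client field is a hostname, not an IP."""
    flagged = []
    for line in lines:
        field = client_field(line)
        if field is not None and not is_ip_address(field):
            flagged.append(line)
    return flagged
```

Passing an open log file to `lines_with_hostnames` returns every entry that was logged by name, such as the localhost entry at the top of this post.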
Also, if you want to prevent brute-force WordPress logins, you can use a separate log file that records IP addresses and not hostnames. There is a full guide here.