I hate spam. On WordPress, Akismet does an okay job of reducing the number of spam comments I have to deal with, but it doesn't come close to eliminating the problem. So, a few years ago, I wrote a plugin that considers a number of other variables (outside the scope of Akismet) in determining whether a comment should be considered legitimate. A variation of the PHP function discussed on this page forms the basis of a little personal spam-check service we operate.
The PHP function below checks an IP address against a list of defined DNSBLs (DNS-based Blackhole Lists) in an attempt to determine whether that IP is trustworthy.
There are dozens of DNSBLs in existence, which use a wide array of criteria for listing and delisting of addresses. These may include listing the addresses of zombie computers or other machines being used to send spam, listing the addresses of ISPs who willingly host spammers, or listing addresses which have sent spam to a honeypot system. You can read more about DNSBL on Wikipedia.
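It helps to see what a single lookup involves. A DNSBL is queried over ordinary DNS: the octets of the IPv4 address are reversed, the list's zone is appended, and the resulting name is resolved. An answer means the address is listed; no answer means it is not. The snippet below is a minimal sketch of the naming step only. The helper name dnsbl_query_name() and the zone dnsbl.example.org are illustrative placeholders, not part of the function discussed on this page.

```php
<?php
// Hypothetical helper (not from the original function): build the DNS name
// that a DNSBL lookup resolves for an IPv4 address.
function dnsbl_query_name(string $ip, string $zone): ?string
{
    // Only IPv4 is handled in this sketch; IPv6 uses a different reversed form.
    if (filter_var($ip, FILTER_VALIDATE_IP, FILTER_FLAG_IPV4) === false) {
        return null;
    }

    // 192.0.2.1 becomes 1.2.0.192
    $reversed = implode('.', array_reverse(explode('.', $ip)));

    // The trailing dot makes the name fully qualified, so the resolver
    // does not append a local search domain.
    return $reversed . '.' . trim($zone, '.') . '.';
}

// 192.0.2.1 is a reserved documentation address; the zone is a placeholder.
echo dnsbl_query_name('192.0.2.1', 'dnsbl.example.org'), PHP_EOL;
// prints 1.2.0.192.dnsbl.example.org.
```

In practice the resulting name would be handed to something like PHP's checkdnsrr($name, 'A'), with a successful lookup treated as "listed".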
Included in this article are a couple of practical examples of how our PHP function might be used to mitigate spam. It's not foolproof... but it'll be of value if used with other tools.
The PHP Code
The first function will query any defined DNSBL server and return either a Boolean result (true if the IP is listed) or an array containing the queried servers and their response. The function provides for a percentage of servers that have to list the IP address as "bad" before it'll return a true result.
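As a rough illustration of the behaviour just described, here is a minimal sketch. It is not the original function: the name dnsbl_check(), its parameter list, the ">=" comparison against the percentage, and the injectable $resolver are all assumptions made so the logic can be read and run without touching DNS. The zones in the demo are placeholders.

```php
<?php
/**
 * Sketch only, under the assumptions stated above.
 *
 * $zones     DNSBL zones to query.
 * $percent   share of zones that must list the IP before it is called spam.
 * $asBoolean false returns per-zone results, true returns a single verdict.
 * $resolver  hypothetical injection point; defaults to an A-record lookup
 *            with checkdnsrr(), where any answer counts as "listed".
 */
function dnsbl_check(string $ip, array $zones, float $percent = 10.0, bool $asBoolean = false, ?callable $resolver = null)
{
    if (filter_var($ip, FILTER_VALIDATE_IP, FILTER_FLAG_IPV4) === false) {
        throw new InvalidArgumentException('Expected an IPv4 address');
    }

    $resolver = $resolver ?? static function (string $name): bool {
        return checkdnsrr($name, 'A');
    };

    // Reverse the octets once, then append each zone in turn.
    $reversed = implode('.', array_reverse(explode('.', $ip)));

    $results = [];
    foreach ($zones as $zone) {
        $results[$zone] = (bool) $resolver($reversed . '.' . trim($zone, '.') . '.');
    }

    if (!$asBoolean) {
        return $results;
    }
    if ($results === []) {
        return false; // nothing queried, so nothing listed the IP
    }

    $listedPercent = 100 * count(array_filter($results)) / count($results);
    return $listedPercent >= $percent;
}

// Demo with a stub resolver so no DNS traffic is generated:
// only the placeholder zone bl-a.example "lists" the address.
$stub = static function (string $name): bool {
    return strpos($name, '.bl-a.example.') !== false;
};
$zones = ['bl-a.example', 'bl-b.example', 'bl-c.example', 'bl-d.example'];

var_dump(dnsbl_check('192.0.2.1', $zones, 10, true, $stub)); // 1 of 4 = 25%, so bool(true)
var_dump(dnsbl_check('192.0.2.1', $zones, 50, true, $stub)); // 25% is below 50%, so bool(false)
```

Passing the resolver in as a callable is purely a convenience for testing; leaving it out falls back to a live DNS lookup.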
Usage is easy. In the first example we'll return a Boolean result and print whether the IP address is considered dodgy. For the purpose of all the following examples I'm using an IP address that's listed in a large number of spam databases (at the time of writing).
You may alter your aggressiveness in identifying spam using beliefmedia_blacklist($ip, $p = '10', $array = 1), where $p = '10' is the percentage of (total) servers that have to return true before you identify the IP as spam.
Note that an array is returned from the function by default ($array = 0). Using print_r(beliefmedia_blacklist($ip)); (with the same IP of 220.127.116.11) returns the following:
For the final example, we've provided another function that takes the returned array and loops through the results to determine which servers list the IP address in question. We'll simply list each DNSBL server address and then display either an HTML entity tick or cross.
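A renderer along those lines might look like the sketch below. The array shape (server name mapped to true or false) and the choice of a cross for "listed" and a tick for "not listed" are assumptions for illustration, as are the function name and the placeholder servers.

```php
<?php
// Hypothetical renderer: expects [server => bool], where true means "listed".
function dnsbl_render_results(array $results): string
{
    $html = "<ul>\n";
    foreach ($results as $server => $listed) {
        // Assumption: a cross marks a server that lists the IP,
        // a tick marks a server that does not.
        $mark = $listed ? '&#10008;' : '&#10004;';
        $html .= '<li>' . htmlspecialchars((string) $server, ENT_QUOTES, 'UTF-8')
            . ' ' . $mark . "</li>\n";
    }
    return $html . "</ul>\n";
}

// Placeholder servers and made-up responses, for illustration only.
echo dnsbl_render_results([
    'bl-a.example' => true,
    'bl-b.example' => false,
]);
```

Escaping the server name costs nothing and keeps the output safe if the list of servers ever comes from configuration rather than from code.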
The result looks like this:
On our own IP website we render the results in three rows after checking about 40 servers, and we use images rather than HTML entity characters.
We apply the feature described above for the following:
- To filter comments into a spam folder in our own CMS.
- To completely block certain people from accessing certain websites.
- To scrutinize and block comments deemed to be spam in WordPress, by way of a plugin.
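The first of these uses boils down to a small decision step, sketched below. Everything here is illustrative: comment_status_for_ip() and the 'spam' and 'pending' labels are hypothetical names, and $isListed stands in for whatever Boolean blacklist check you have available.

```php
<?php
// Hypothetical helper: decide what to do with a comment based on its IP.
// $isListed is any callable that returns true when the IP should be
// treated as listed, for example a DNSBL check in Boolean mode.
function comment_status_for_ip(string $ip, callable $isListed, string $default = 'pending'): string
{
    // A value that is not a valid IP cannot be looked up, so leave the comment alone.
    if (filter_var($ip, FILTER_VALIDATE_IP) === false) {
        return $default;
    }
    return $isListed($ip) ? 'spam' : $default;
}

// Stub checker for demonstration only: treats one documentation address as listed.
$stub = static function (string $ip): bool {
    return $ip === '192.0.2.1';
};

echo comment_status_for_ip('192.0.2.1', $stub), PHP_EOL;    // spam
echo comment_status_for_ip('198.51.100.7', $stub), PHP_EOL; // pending
```

In WordPress, a verdict like this could be returned from a callback on the pre_comment_approved filter, which accepts 'spam' as a return value. That wiring is left out here because it depends on the rest of the plugin.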
Since each DNSBL database has its own criteria for determining spam, use only those that suit your needs. In addition, you should tweak your percentage as required until you find a balance that works.
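To get a feel for how the percentage changes the verdict, it can help to run a few thresholds against the same set of results. The numbers below are made up for illustration (3 of 20 placeholder servers listing the IP), the helper name is hypothetical, and the sketch assumes the IP is treated as spam when the listed share is greater than or equal to the threshold.

```php
<?php
// Hypothetical helper: percentage of queried servers that list the IP.
// Expects [server => bool], where true means "listed".
function dnsbl_listed_percent(array $results): float
{
    if ($results === []) {
        return 0.0;
    }
    return 100.0 * count(array_filter($results)) / count($results);
}

// Made-up results: 3 of 20 placeholder servers list the IP.
$results = [];
for ($i = 1; $i <= 20; $i++) {
    $results["bl{$i}.example"] = ($i <= 3);
}

$percent = dnsbl_listed_percent($results); // 15.0

foreach ([10, 15, 25] as $p) {
    printf("p=%d%%: %s\n", $p, $percent >= $p ? 'spam' : 'ok');
}
// p=10%: spam
// p=15%: spam
// p=25%: ok
```

A low percentage catches more spam at the cost of more false positives; raising it does the reverse.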
The fact that an IP is listed in a database doesn't mean that it's a source of spam - it is just an indication. Using multiple servers is the best means of assessing an IP to determine its legitimacy.