schlitt.info - php, photography and private stuff
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

:Author: Tobias Schlitt
:Date: Thu, 27 Nov 2008 18:00:29 +0100
:Revision: 2
:Copyright: CC by-nc-sa

==================================
Fighting trackback spam on PEARWeb
==================================

:Description:
    Trackback spammers needed only a couple of weeks to discover PEARWeb's new
    trackback feature and use it for their own purposes. Of course I had to do
    something about that, which sadly took a long time because I've been much
    too busy. Last week I finally managed to release Services_Trackback 0.5.0
    and migrated the PEARWeb code to the new version. So far, trackback spam
    seems to have stopped. Usually there was a large number of new spam
    trackbacks on PEARWeb per day; since last Tuesday there has not been a
    single one.

Trackback spammers needed only a couple of weeks to discover `PEARWeb's new
trackback feature`__ and use it for their own purposes. Of course I had to do
something about that, which sadly took a long time because I've been much too
busy. Last week I finally managed to `release Services_Trackback 0.5.0`__ and
migrated the PEARWeb code to the new version. So far, trackback spam seems to
have stopped. Usually there was a large number of new spam trackbacks on
PEARWeb per day; since last Tuesday there has not been a single one.

.. __: /opensource/blog/0297_trackbacks_on_pearweb.html
.. __: /opensource/blog/0332_finally_services_trackback_0_5_0.html

Since the new spam check features in Services_Trackback bundle some of the most
common defenses against trackback spam, the process PEARWeb `performs`__ to
check for spam is very simple:

::

    // Register the three spam check modules, each with its default
    // configuration.
    $trackback->createSpamCheck('Wordlist');
    $trackback->createSpamCheck('DNSBL');
    $trackback->createSpamCheck('SURBL');

    // Run the checks; a true result means the trackback is considered spam.
    $res = $trackback->checkSpam();
    if ($res) {
        echo Services_Trackback::getResponseError(
            'Your trackback seems to be spam. '
            . 'If it is not, please contact the webmaster of this site.',
            1
        );
        exit;
    }

The first three lines create three spam check modules and attach them to the
trackback that was just received. Services_Trackback supports giving each check
a priority to control the order of execution. I don't do that here, so the
checks are executed in the order they were added. All checks are used with
their default configuration.

.. __: http://cvs.php.net/co.php/pearweb/public_html/trackback/trackback.php?php=0518615793863083f5092ddb76311ebb&r=1.14#70

The *checkSpam()* method runs the added spam checks sequentially, stops as soon
as one of them reports spam, and returns true (the trackback is considered
spam). If none of the modules indicates spam, *checkSpam()* returns false.

In practice, this means the trackback is first scanned using the integrated
"bad word list", which should filter out almost 80% of the trackback spam.
Next, the host sending the trackback is checked against the `Spamcop`__ DNS
blacklist, which should catch another 5-10% of the spam trackbacks. Finally, a
`SURBL`__ check is performed on the links contained in the trackback. This
should (hopefully) keep the rest of the trackback spam from coming through.
Since checking stops as soon as one module indicates spam, few resources are
wasted: the more expensive DNSBL check runs only if a trackback passes the
Wordlist filter, and the even more expensive SURBL checks run only if it passes
the DNSBL check as well.

.. __: http://www.spamcop.net/
.. __: http://www.surbl.org/

I'm currently very confident that these checks will keep most trackback spam
away from PEARWeb. Nevertheless, I'm interested in implementing other (maybe
cheaper) spam protection methods. **So, does anyone out there have feedback or
ideas on Services_Trackback, its spam protection, or similar?**

..
   Local Variables:
   mode: rst
   fill-column: 79
   End:
   vim: et syn=rst tw=79

Trackbacks
==========

- Services_Trackback - Thoughts on trackback spam on Fri, 24 Jun 2005 16:35:35
  +0200 in Tobias Schlitt - Weblog

  A few weeks ago I announced the release of Services_Trackback 0.5.0, which
  has a new module system for integrating spam protections into your trackback
  mechanisms. While the most easy filter (the bad word list) worked quite well
  for the first time frame,