Referer spam in my web server's log files
Question: I found a strange web site in my web server's referrer logs. I visited that site and it seems to have no links pointing to me. How is that possible?
Answer: Either that site is what is called a 'Scraper Site', or you have found a case of 'Referrer Spam'.
A scraper site is a web site with automatically generated content. Each page on that site is about a few keywords, like 'radar detector'. A script runs a Google search for these keywords, extracts the results and presents these extracts as if they were content. Usually they add a lot of advertising to these pages to earn money.
Because their pages are constantly being updated, they rank pretty well in search engines.
You recognize these types of sites immediately when you see one. They are a nuisance, but they generally do not break any law, only the code of ethics.
If it is not a scraper site, then it probably was 'Referrer Spam'.
Referrer spam is when a spammer sends a lot of page requests with faked referrer information to your site. The goal for the spammer is to be displayed in the referral statistics at the target site. This only helps the spammer if your site automatically publishes its statistics, or an extract of them, on its own pages. Some blogs use scripts to display a link to the most frequent or most recent web site found in their referer logs. The spammer wants his site listed there.
To find targets like your site, the spammer does something like a Google query for the phrase "recent referers".
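To see why flooding a log works, here is a minimal sketch of the kind of "top referers" script such a blog might run. It assumes the log is in Apache's Combined Log Format; the function name `top_referrers` and the host names in the usage note are made up for illustration, and a real script would probably differ.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Combined Log Format:
# ip ident user [time] "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]*\] "[^"]*" \d{3} \S+ "(?P<referer>[^"]*)"'
)


def top_referrers(log_lines, my_host, limit=5):
    """Return the most frequent external referring hosts as (host, hits) pairs."""
    counts = Counter()
    for line in log_lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that are not in the expected format
        host = urlparse(match.group("referer")).hostname
        # Ignore empty referers ("-") and clicks within our own site.
        if host and host != my_host:
            counts[host] += 1
    return counts.most_common(limit)
```

Called as `top_referrers(open("access.log"), "www.mysite.example")`, this simply counts hits per referring host. Nothing in it checks whether the referring page really links to you, so whoever sends the most requests with a faked referrer ends up at the top of the list.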
How does the spammer generate so many hits, then? Couldn't I just block his IP number?
A common trick is to include an image tag in one of the spammer's most popular pages, with the SRC attribute of this image pointing to your web site's home page. No image is loaded, but every visitor to the spammer's site generates a hit on your site that is logged by your server. Assuming that the spammer already has some decent traffic, many distinct IP numbers will flood your log file that way, so blocking a single IP number does not help.
And of course spammers are not shy about using really dirty tricks like spyware, adware etc. A virus or other malware, maybe pretending to be a toolbar, can generate such HTTP requests easily and the user will never know.
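If you want to check a suspicious referrer yourself, you can save the referring page and look at how it refers to your site. The following is only a rough sketch using Python's standard `html.parser`; the names `LinkCollector` and `classify_referrer_page` and the three result strings are invented for this example. It assumes you already have the page's HTML as a string.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag and the src of every <img> tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []
        self.img_srcs = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            self.hrefs.append(attrs["href"])
        elif tag == "img" and attrs.get("src"):
            self.img_srcs.append(attrs["src"])


def _points_to(url, host):
    """True if url is an absolute URL on host or one of its subdomains."""
    hostname = urlparse(url).hostname or ""
    return hostname == host or hostname.endswith("." + host)


def classify_referrer_page(html, my_host):
    """Say how a referring page refers to my_host, if at all."""
    collector = LinkCollector()
    collector.feed(html)
    if any(_points_to(url, my_host) for url in collector.hrefs):
        return "real link"             # an ordinary clickable link
    if any(_points_to(url, my_host) for url in collector.img_srcs):
        return "hidden image request"  # the image trick described above
    return "no reference"              # referrer was probably faked outright
```

A page that contains `<img src="http://www.mysite.example/">` but no link to you would be reported as "hidden image request". A result of "no reference" fits the situation in the question: the site shows up in your logs but does not link to you at all.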
You can block those spammers by using Apache's mod_rewrite. Put the following in your .htaccess file: