How to Block Bad Traffic to Your Proxy Site

Posted by Vectro 7 December 2011

Overview

Some traffic to proxy sites serves no real purpose and wastes resources. However, weeding out harmful or useless traffic to your proxy can be difficult. The purpose of this entry is to outline the different types of bad traffic and the solution for each. There will be links to other entries with more specific information on each issue.

Out of Control Search Engine Crawlers

One problem with proxies is that every proxied page a user views gets its own URL on your site. This means the pages can be linked to later on. It also means search engine crawlers such as Googlebot or Bingbot might try to index these pages. Sometimes thousands of proxied URLs pile up, which can keep Google pretty busy. None of these pages are of any use in search results, so it’s better to tell the crawlers not to index them. The way to solve this in Glype is by adding an exclusion for browse.php to your robots.txt file. Unfortunately, this method is not compatible with PHProxy.
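
As a rough illustration of the robots.txt exclusion, the sketch below uses Python's standard urllib.robotparser module to check that a rule for browse.php keeps crawlers away from proxied pages while leaving the home page crawlable. The robots.txt content, the example.com host and the `u` query parameter are assumptions made for the example. Adjust the path if browse.php has been renamed or lives in a subdirectory.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt content: keep all crawlers out of the proxy script.
ROBOTS_TXT = """\
User-agent: *
Disallow: /browse.php
"""


def is_crawlable(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # example.com is a placeholder for your proxy's domain.
    for url in ("http://example.com/browse.php?u=abc", "http://example.com/"):
        print(url, is_crawlable(ROBOTS_TXT, "Googlebot", url))
```

Checking the rule this way before uploading it is a cheap guard against a typo that would either block the whole site or block nothing.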

Hotlinkers

Because proxied pages can be linked to, as mentioned above, people might try to hotlink your proxy. The strongest way to combat this is to set up server-side protection using your .htaccess file. This method involves copying and pasting some code and comes with a caveat, but it is well worth it if you take the time to set it up.
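
The .htaccess code itself is covered elsewhere, but the decision it makes can be sketched in a few lines of Python: compare the host in the Referer header against the hosts you trust. This is only an illustration of the logic, not the actual server rule. The ALLOWED_HOSTS values are hypothetical, and letting requests with an empty Referer through is a choice made for this sketch.

```python
from urllib.parse import urlparse

# Hypothetical: the hostnames your own proxy site is served from.
ALLOWED_HOSTS = {"example.com", "www.example.com"}


def is_hotlink(referer, allowed_hosts=ALLOWED_HOSTS):
    """Return True if the request was referred by a site other than your own."""
    if not referer:
        # No Referer header: treated as a direct visit in this sketch.
        return False
    host = urlparse(referer).hostname  # already lowercased, None if unparsable
    return host is None or host not in allowed_hosts


if __name__ == "__main__":
    print(is_hotlink("http://example.com/index.php"))    # your own page
    print(is_hotlink("http://other.example.net/forum"))  # someone else's page
```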

Bots and Scrapers

Automated bots might try to leech access from your proxy or use it for malicious purposes. Three things should be done to slow the typical onslaught of web junk. First, rename browse.php in Glype, as several bots specifically look for that file. Second, block known bad bots in your .htaccess file using a list which was compiled with the help of experienced webmasters. Third, enable CloudFlare on your account for free, provided your hosting company has it available. After it is enabled, a special domain redirect is required, which your host should be able to help you with. The redirect does not harm search engine rankings. Because your domain’s DNS points to CloudFlare, known bot traffic is filtered on CloudFlare’s network before the request ever reaches your web server. The data on known bots comes from Project Honey Pot, which was started by CloudFlare’s founders.
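
To show what a bad-bot block list does, here is a minimal Python sketch that matches the User-Agent header against a few substrings. The three markers are illustrative examples of scraper user agents, not the list referred to above, and the function name is made up for this example.

```python
# Illustrative markers only; a real block list is much longer.
BAD_BOT_MARKERS = ("httrack", "webcopier", "emailcollector")


def is_bad_bot(user_agent, markers=BAD_BOT_MARKERS):
    """Return True if the User-Agent contains any known bad-bot marker."""
    if not user_agent:
        # Empty user agents are let through in this sketch.
        return False
    ua = user_agent.lower()
    return any(marker in ua for marker in markers)


if __name__ == "__main__":
    print(is_bad_bot("Mozilla/4.5 (compatible; HTTrack 3.0x; Windows 98)"))
    print(is_bad_bot("Mozilla/5.0 (compatible; Googlebot/2.1)"))
```

Matching is case-insensitive because bots do not capitalize their names consistently.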

Known Spammers and Infected Computers

This is a major problem. Some people might attempt to connect to webmail services via your proxy and send spam. Others might use your proxy to post comment spam on blogs or forums. This can reflect poorly on your site. Even worse, some computers are infected with hidden viruses which allow hackers to control them. You wouldn’t want a hacker using one of those computers to access your proxy, since you don’t know what they might be up to. The best solution is to use a hosting provider who has implemented RBLs (real-time blocklists) in their web server. This will prevent access for known problematic users.
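
For the curious, an RBL check is an ordinary DNS lookup: the visitor's IPv4 address is reversed octet by octet, the blocklist's zone is appended, and the resulting name resolves only if the address is listed. The Python sketch below shows the idea. The zone dnsbl.example.org is a placeholder rather than a real blocklist, and both function names are made up for this example.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone):
    """Build the DNS name to look up for an IPv4 address in a blocklist zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on a malformed address
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip, zone):
    """Return True if the blocklist zone has an entry for this address."""
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        # No record (or the lookup failed): treated as not listed in this sketch.
        return False


if __name__ == "__main__":
    # dnsbl.example.org is a placeholder zone, not a real blocklist.
    print(dnsbl_query_name("192.0.2.99", "dnsbl.example.org"))
```

A host that has RBLs built into the web server does this for every visitor automatically, which is why it is the easier route.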

Leechers

Some people overuse proxies and never click ads or purchase anything. These individuals are called leechers. You can find leechers and ban them by checking your web traffic statistics and finding out which IP addresses are making the highest number of pageviews or using the most bandwidth. In AWStats, they can be found by clicking ‘Full list’ under ‘Hosts’ in the left-side menu. It is recommended that you block anyone who has accumulated 10,000 or more hits or used 100 MB or more of bandwidth. Once you have found some addresses, the best way to ban them is with cPanel, if you have it. Log into cPanel and go to ‘IP Deny Manager’ under the ‘Security’ section. Copy and paste the addresses there.
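
If you don't have AWStats, the same check can be run against the raw access log. The Python sketch below counts hits and bytes per IP address and returns the addresses that cross either threshold mentioned above. It assumes the log is in Apache's common or combined format. The function name and the log path in the usage example are hypothetical.

```python
import re
from collections import defaultdict

# Common/combined log format: IP, identity, user, [time], "request", status, bytes
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "[^"]*" \d{3} (\d+|-)')

HIT_LIMIT = 10_000                 # hits per address
BYTE_LIMIT = 100 * 1024 * 1024     # 100 MB per address


def find_leechers(lines, hit_limit=HIT_LIMIT, byte_limit=BYTE_LIMIT):
    """Return a sorted list of IPs at or above either the hit or byte limit."""
    hits = defaultdict(int)
    traffic = defaultdict(int)
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that are not in the expected format
        ip, size = match.groups()
        hits[ip] += 1
        if size != "-":  # "-" means no response body was sent
            traffic[ip] += int(size)
    return sorted(
        ip for ip in hits
        if hits[ip] >= hit_limit or traffic[ip] >= byte_limit
    )


if __name__ == "__main__":
    # Hypothetical path; use the access log your host provides.
    with open("access.log", encoding="utf-8", errors="replace") as log:
        for ip in find_leechers(log):
            print(ip)
```

The output is one address per line, ready to paste into the IP Deny Manager.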

There are also people who will attempt to download large files through your proxy. Glype and PHProxy both have settings to protect you from this. The setting in Glype is called max_filesize and the setting in PHProxy is called max_file_size.

Low-paying High-traffic Countries

Some countries certainly generate a lot of proxy traffic, especially those whose governments impose Internet censorship on citizens. As much as it is a good thing to help those people, some countries like China generate more traffic than servers can handle and don’t generate the same amount of AdSense revenue as traffic from other locations. Additionally, some countries like Nigeria have a particularly high rate of Internet fraud. The best option is to block China, Taiwan, Hong Kong, Iran and Nigeria. There are two possible ways to accomplish this. You can either generate and use cut-and-paste code from blockacountry.com or use GeoIP to block countries if your hosting provider has it available. Both of these methods involve editing your .htaccess file.

Summary

It might take some time to implement all of the suggestions above, but they are worth it in the end because they substantially cut back on abuse and misuse. The simplest way to accomplish everything is to host your proxies at x Proxy Host. Many of the features are built-in and don’t require any effort on your part. Those which are not will be implemented for you at no extra cost if you open a support ticket.
