Monday, February 7, 2011

How can I ban a whole company from my web site?

For reasons I won't go into, I wish to ban an entire company from accessing my web site. Checking the remote hostname in PHP using gethostbyaddr() works, but it slows down the page load too much. Large organizations (e.g. hp.com or microsoft.com) often have blocks of IP addresses. Is there any way I can get the full list, or am I stuck with the slow reverse-DNS lookup? If so, can I speed it up?

Edit: Okay, now I know I can use the .htaccess file to ban a range. Now, how can I figure out what that range should be for a given organization?

  • Take a look at .htaccess if you're using Apache: .htaccess tutorial

    From mrlinx
  • How about an .htaccess:

    Deny from x.x.x.x
    

    If you need to deny a range, say 192.168.0.x, then you would use:

    Deny from 192.168.0
    

    The same applies to hostnames:

    Deny from sub.domain.tld
    

    Or, if you want a PHP solution:

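    // Addresses to block (placeholders).
    $ips = array('1.1.1.1', '2.2.2.2', '3.3.3.3');
    // in_array() needs the list to search as its second argument.
    if (in_array($_SERVER['REMOTE_ADDR'], $ips)) { die(); }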
    

    For more info on the htaccess method see this page.

    Determining the range is going to be hard: most companies (unless they are big corporations) are going to have a dynamic IP just like you and me.
    This is a problem I have had to deal with before, and the best thing is either to ban the hostname or to ban the entire range. For example, if they are on 192.168.0.123, then ban 192.168.0.x. Unfortunately, you are going to catch a few innocent people with either method.

    John Millikin : If I'm reading the OP right, he wants a way to ban based on DNS name (not IP address).
    Unkwntech : Yah I just updated it.
    joelhardi : Unkwntech's got it. I'll add that it's better performance-wise to figure out the IP-block to ban (no DNS lookups). And even faster to put a firewall (or iptables) block in, so they can't even hit Apache. If they're on a dynamic IP, you could have a cronjob nslookup their domain and update the rule.
    From Unkwntech
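
As a rough sketch of the "figure out the IP block and skip DNS" idea from the comments above, the PHP check can compare the visitor's address against CIDR blocks instead of a list of single addresses. The function names and the example blocks are made up for illustration, and the code assumes IPv4 and a 64-bit PHP build.

```php
<?php
// Returns true when $ip (dotted IPv4) falls inside $cidr, e.g. "192.168.0.0/24".
// A bare address with no "/bits" suffix is treated as a /32.
function ipInCidr(string $ip, string $cidr): bool
{
    [$network, $bits] = explode('/', $cidr) + [1 => '32'];
    $ipLong = ip2long($ip);
    $netLong = ip2long($network);
    $bits = (int) $bits;
    if ($ipLong === false || $netLong === false || $bits < 0 || $bits > 32) {
        return false;
    }
    // Build the netmask: the top $bits bits set, the rest clear.
    $mask = $bits === 0 ? 0 : (0xFFFFFFFF << (32 - $bits)) & 0xFFFFFFFF;
    return ($ipLong & $mask) === ($netLong & $mask);
}

// Returns true when $ip falls inside any of the given CIDR blocks.
function isBanned(string $ip, array $blocks): bool
{
    foreach ($blocks as $cidr) {
        if (ipInCidr($ip, $cidr)) {
            return true;
        }
    }
    return false;
}

// Hypothetical blocks; substitute the ranges you actually want to ban.
$bannedBlocks = ['192.168.0.0/24', '10.1.0.0/16'];

// In a page handler this might be used as:
//   if (isBanned($_SERVER['REMOTE_ADDR'], $bannedBlocks)) { http_response_code(403); exit; }
```

No DNS lookup happens per request, which is the performance point made in the comments. A firewall rule would still be cheaper, since the request would never reach PHP at all.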
  • Do you have access to the actual server config? If so, depending on the server, you could do it in the configuration.

    See this thread for some information that may be helpful.

    From iros
  • Continue to use gethostbyaddr(), but behind a cache. You should only have to resolve it once per IP address, and then it would not be a significant performance issue. If you want, prime the cache from your server logs so returning users won't even hit the one-time slowdown.
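
A minimal sketch of the caching idea above, assuming the cache is a plain PHP array passed by reference. A real deployment would need to persist it between requests (for example in a file, a database, or a shared-memory cache); that part is left out here. The function names are hypothetical, and the resolver is injectable only so the logic can be tried without real DNS traffic.

```php
<?php
// Look up the hostname for $ip, consulting $cache first.
// $resolver defaults to PHP's gethostbyaddr().
function cachedHostname(string $ip, array &$cache, ?callable $resolver = null): string
{
    if (!isset($cache[$ip])) {
        $resolver = $resolver ?? 'gethostbyaddr';
        $host = $resolver($ip);
        // gethostbyaddr() returns the unmodified IP on failure and false on
        // malformed input; store the IP itself in both cases.
        $cache[$ip] = is_string($host) ? strtolower($host) : $ip;
    }
    return $cache[$ip];
}

// True when $host is $domain itself or any subdomain of it.
function hostBelongsTo(string $host, string $domain): bool
{
    $host = strtolower($host);
    $domain = strtolower($domain);
    return $host === $domain
        || substr($host, -strlen($domain) - 1) === '.' . $domain;
}

// In a page handler this might be used as:
//   $host = cachedHostname($_SERVER['REMOTE_ADDR'], $cache);
//   if (hostBelongsTo($host, 'example.com')) { http_response_code(403); exit; }
```

Each address is resolved at most once per cache lifetime, so only a visitor's first request pays the lookup cost. Matching on ".example.com" rather than a bare suffix avoids banning unrelated hosts such as "notexample.com".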

  • If you're practicing safe webhosting, then you have a firewall. Use it.

    Large companies have blocks of IP addresses, but even smaller companies rarely change their IP. So there's an easy way to do this without reducing your performance:

    Every month do a reverse lookup on all the IPs in your log and then put all the IPs used by that company in your firewall as deny.

    After a while you'll begin to see whether they have dynamic addresses or not. If they do, then you may have to do reverse lookups for each connection attempt, but unless they are a small company you shouldn't have to worry about it.

    From Adam Davis
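
The monthly log sweep described above could be sketched roughly as follows. It assumes each log line begins with the client's IPv4 address, as in Apache's common and combined log formats. The function names and the log path in the closing comment are hypothetical.

```php
<?php
// Collect the distinct client IPv4 addresses that start each log line.
function uniqueIpsFromLog(iterable $lines): array
{
    $seen = [];
    foreach ($lines as $line) {
        $first = explode(' ', ltrim($line), 2)[0];
        if (filter_var($first, FILTER_VALIDATE_IP, FILTER_FLAG_IPV4) !== false) {
            $seen[$first] = true;
        }
    }
    return array_keys($seen);
}

// Keep only the addresses whose reverse name is $domain or a subdomain of it.
// $resolver is any callable taking an IP and returning a hostname,
// for example 'gethostbyaddr'.
function ipsResolvingTo(array $ips, string $domain, callable $resolver): array
{
    $domain = strtolower($domain);
    $suffix = '.' . $domain;
    $matches = [];
    foreach ($ips as $ip) {
        $host = strtolower((string) $resolver($ip));
        if ($host === $domain || substr($host, -strlen($suffix)) === $suffix) {
            $matches[] = $ip;
        }
    }
    return $matches;
}

// Run from cron, this might look like (path and domain are placeholders):
//   $ips = uniqueIpsFromLog(file('/path/to/access.log'));
//   print_r(ipsResolvingTo($ips, 'example.com', 'gethostbyaddr'));
```

The slow reverse lookups happen offline, once per distinct address, and the resulting list can be turned into firewall deny rules by whatever means your firewall provides.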
  • First search for the company on whois.net. If you know they are just one domain, do a whois lookup. Otherwise, search for domains they own by keyword.

    You can find out the main IP ranges assigned to the company through whois queries, and then build your deny rule(s) accordingly.

    From jtimberman
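
Whois output often lists an allocation as a start and end address, while deny rules are usually written per block. As an illustrative sketch (the function name is made up, and it assumes IPv4 and a 64-bit PHP build), an inclusive range can be split into CIDR blocks like this:

```php
<?php
// Split the inclusive IPv4 range $start..$end into a minimal list of CIDR blocks.
// Returns an empty array for invalid or reversed input.
function rangeToCidrs(string $start, string $end): array
{
    $lo = ip2long($start);
    $hi = ip2long($end);
    if ($lo === false || $hi === false || $lo > $hi) {
        return [];
    }
    $blocks = [];
    while ($lo <= $hi) {
        // Start with a single address (/32) and keep doubling the block
        // while it stays aligned on $lo and does not run past $hi.
        $bits = 32;
        while ($bits > 0) {
            $size = 1 << (32 - ($bits - 1)); // size of the next larger block
            if ($lo % $size !== 0 || $lo + $size - 1 > $hi) {
                break;
            }
            $bits--;
        }
        $blocks[] = long2ip($lo) . '/' . $bits;
        $lo += 1 << (32 - $bits);
    }
    return $blocks;
}

// Example with a private range standing in for a real allocation:
//   rangeToCidrs('192.168.0.0', '192.168.2.255')
//   gives ['192.168.0.0/23', '192.168.2.0/24']
```

Each resulting block can then go on its own "Deny from" line in .htaccess, which accepts CIDR notation, or into the equivalent firewall rule.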
  • http://en.wikipedia.org/wiki/Rwhois

    telnet rwhois.arin.net 4321

    This used to work.

  • I know WikiScanner lets you search for a company or other organization, and then lists the IP address ranges belonging to them. Just as an example, here's all the IP addresses belonging to Google, at least according to WikiScanner.

    According to HowStuffWorks, they use something called "IP2Location".

  • If your goal in doing this is to make it slightly inconvenient for people from a company to access your site, follow the advice above. But you won't be able to completely ensure you're blocking every access because they could always be going through a proxy. And if it's accessible to the rest of the public, you'll have to worry about archive.org, search engine caches, etc.

    Probably not the answer you're looking for, but it's accurate.

  • The load shouldn't be put on the web server; you should put it on the firewall.

  • Note that, using the techniques above, it will never be possible to completely ban a specific company from accessing your website. It will still be possible for them to use proxy servers or look at your site from home.

    If you absolutely want to control who has access, you should only allow authenticated and authorized users to access your site.
