
Re: User Agent Java/1.x.x_xx

On Fri, 12 May 2006 17:15:08 +0100, Roy Schestowitz
<newsgroups@xxxxxxxxxxxxxxx> opined:
> __/ [ David Cary Hart ] on Friday 12 May 2006 17:01 \__
> 
> > Who is using the UA and why? I cannot even find where to obtain it.
> > This seems to be a Hoover yet I have yet to see an instance where
> > any have checked robots.txt. I have been redirecting these. Am I
> > losing legitimate traffic?
> 
> What's the nature of the site? MATLAB, which is heavily based on
> Java, can be used as a (fairly rudimentary) Web browser, so
> denying it might not be a good idea. It is primarily used for
> up-to-date documentation. That is why I am asking about the nature
> of your site.

Anti-spam and DNSBL.
> 
> Additionally, see if the paths (sequence of requested files)
> seem to characterise these as human visitors rather than some
> experimental bot, a lamer, or a leech.

Multiple, simultaneous gets. Never a referrer. I am watching it
closely for false positives. These are added to the firewall for 30
minutes (after notification) and generate an email to root.
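
For anyone who wants to reproduce the same check from logs, here is a rough Python sketch of the pattern described above: requests with a Java/ user agent and no referrer, counted per IP. It assumes the Apache "combined" log format, and the names `suspect_hits` and `offenders` and the threshold of 3 are hypothetical, chosen only for illustration. It is not the actual setup, which also handles the firewall entry and the email to root.

```python
import re
from collections import Counter

# Assumption: Apache "combined" log format. Adjust the pattern for other formats.
LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) \S+ '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def suspect_hits(lines):
    """Count, per IP, requests that carry a Java/ user agent and no referrer."""
    counts = Counter()
    for line in lines:
        m = LOG_RE.match(line)
        if not m:
            continue  # skip lines that do not fit the assumed format
        no_referrer = m.group("referrer") in ("", "-")
        if m.group("agent").startswith("Java/") and no_referrer:
            counts[m.group("ip")] += 1
    return counts


def offenders(lines, threshold=3):
    """Return the IPs with at least `threshold` matching requests, sorted."""
    return sorted(ip for ip, n in suspect_hits(lines).items() if n >= threshold)
```

Feeding it the lines of an access log, for example `offenders(open("access.log"))`, gives a list of candidate addresses. Whether to block them, and for how long, is a separate decision.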

> Don't neglect the
> possibility of spoofing. In the past week alone, two people in
> the search engine newsgroups reported being whacked by Google
> or Yahoo. Upon closer inspection, these were probably fakers (at
> least one of them confirmed).
> 
>         wget -r --user-agent="Java/1.2.4_55" your_site_URL

That's how I test the setup. Spoofs are more likely to generate a
Mozilla string. Leeches are easy to spot because they consistently
lack a referrer.
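
The same kind of test can be scripted. Below is a minimal Python sketch, assuming the standard-library `urllib.request` module is acceptable. The function names `build_probe` and `send_probe` are hypothetical, and the default agent string is simply the one from the wget line above. It sends a single GET with the chosen User-Agent and no Referer header, so it checks the user-agent rule rather than the multiple-simultaneous-gets pattern.

```python
import urllib.error
import urllib.request


def build_probe(url, agent="Java/1.2.4_55"):
    """Build (but do not send) a GET request with the given User-Agent and no Referer."""
    return urllib.request.Request(url, headers={"User-Agent": agent})


def send_probe(url, agent="Java/1.2.4_55", timeout=10):
    """Send the probe and return the HTTP status code of the final response.

    urlopen follows redirects by default, so a redirect rule shows up as the
    status of the redirect target, not as a 3xx code.
    """
    try:
        with urllib.request.urlopen(build_probe(url, agent), timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # e.g. 403 if the server refuses the agent outright
```

Calling `send_probe("http://your_site_URL/")` from an outside host should then trigger whatever rule is in place for that agent string. A firewall drop would appear as a timeout or connection error rather than a status code.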
> 

-- 
Displayed Email Address is a SPAM TRAP
Our DNSRBL - Eliminate Spam: http://www.TQMcube.com
Multi-RBL Check: http://www.TQMcube.com/rblcheck.php
The Dirty Dozen Spammiest Ranges: http://tqmcube.com/dirty12.php
