total
tag | mails | description |
ld | 10791 | non-crippled mailaddress, linked, with description |
li | 8461 | non-crippled mailaddress, linked, without description |
no | 6748 | non-crippled mailaddress, not linked |
rc | 6273 | additional string REMOVETHIS in local part |
ra | 5630 | additional string NOSPAM in local part |
hc | 2087 | several html comments in mailaddress |
rb | 757 | additional string NOSPAM in domain |
hd | 30 | some chars replaced by html entities |
ha | 17 | several html comments in mailaddress, not linked |
hb | 7 | some chars replaced by html entities, not linked |
sh | 1 | "@" replaced by "at" |
sg | 0 | "@" replaced by "at", not linked |
sf | 0 | "@" replaced by "#" |
se | 0 | "." and "@" replaced by "(dot)" and "(at)" |
sd | 0 | "." and "@" replaced by "dot" and "at" |
sc | 0 | not linked, "@" replaced by "#" |
sb | 0 | not linked, "." and "@" replaced by "(dot)" and "(at)" |
sa | 0 | not linked, "." and "@" replaced by "dot" and "at" |
rd | 0 | additional string REMOVETHIS in domain |
total | 40802 | |
Replacing "@" with "at" or some other string seems to be effective: almost no mail arrived at addresses obfuscated this way. Do not insert HTML comments into your mail address. I suspect the crawlers even speed up their processing by stripping HTML comments from the whole page before parsing it.
Note: the crawlers tried to remove REMOVETHIS and NOSPAM from the addresses, but they mostly didn't succeed, as the table of valid addresses below shows.
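The following is a minimal sketch of why HTML comments and HTML entities offer so little protection while replacing "@" does. It assumes a crawler that strips comments and decodes entities before matching addresses; the function name `harvest`, the regular expressions, and the sample addresses are illustrative assumptions, not the code of any observed crawler.

```python
import html
import re

# Assumed patterns: a non-greedy HTML comment and a simple mail address.
COMMENT_RE = re.compile(r"<!--.*?-->", re.DOTALL)
ADDRESS_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def harvest(page):
    """Return all mail addresses found after undoing cheap obfuscation."""
    without_comments = COMMENT_RE.sub("", page)   # drop <!-- ... -->
    decoded = html.unescape(without_comments)     # &#64; -> @, &#46; -> .
    return ADDRESS_RE.findall(decoded)


samples = {
    "html comments": "user<!-- x -->@<!-- y -->example.org",
    "html entities": "user&#64;example&#46;org",
    '"@" replaced by "at"': "user at example.org",
}

for label, text in samples.items():
    print(label, "->", harvest(text))
```

Under these assumptions the first two samples yield a usable address after two cheap preprocessing steps, whereas the third contains no "@" for the pattern to match.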
valid
tag | mails | valid percent | description |
ld | 9740 | 90.3% | non-crippled mailaddress, linked, with description |
li | 7457 | 88.1% | non-crippled mailaddress, linked, without description |
no | 4542 | 67.3% | non-crippled mailaddress, not linked |
hc | 2065 | 98.9% | several html comments in mailaddress |
rb | 757 | 100.0% | additional string NOSPAM in domain |
ra | 715 | 12.7% | additional string NOSPAM in local part |
hd | 30 | 100.0% | some chars replaced by html entities |
ha | 17 | 100.0% | several html comments in mailaddress, not linked |
hb | 7 | 100.0% | some chars replaced by html entities, not linked |
sh | 1 | 100.0% | "@" replaced by "at" |
sg | 0 | - | "@" replaced by "at", not linked |
sf | 0 | - | "@" replaced by "#" |
se | 0 | - | "." and "@" replaced by "(dot)" and "(at)" |
sd | 0 | - | "." and "@" replaced by "dot" and "at" |
sc | 0 | - | not linked, "@" replaced by "#" |
sb | 0 | - | not linked, "." and "@" replaced by "(dot)" and "(at)" |
sa | 0 | - | not linked, "." and "@" replaced by "dot" and "at" |
rd | 0 | - | additional string REMOVETHIS in domain |
rc | 0 | 0.0% | additional string REMOVETHIS in local part |
total | 25331 | 62.1% | |
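The "valid percent" column can be reproduced from the two tables as valid mails divided by total mails for the same tag. The sketch below assumes that definition and one-decimal rounding; the helper name `valid_percent` is hypothetical. A tag that received no mail at all has no defined percentage, whereas a tag that received mail but none at a valid address comes out at 0.0%.

```python
def valid_percent(valid, total):
    """Format the share of valid mails, or "-" when no mail was received."""
    if total == 0:
        return "-"
    return f"{100 * valid / total:.1f}%"


# (valid, total) pairs taken from the "valid" and "total" tables.
print("ra", valid_percent(715, 5630))    # NOSPAM in local part
print("rc", valid_percent(0, 6273))      # REMOVETHIS in local part
print("sg", valid_percent(0, 0))         # "@" replaced by "at", not linked
```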
crawler IPs showing the first 20 of 114
IP | mails | DNS reverse lookup |
217.153.147.50 | 5189 | unable to resolve |
207.242.44.207 | 4298 | unable to resolve |
82.40.135.121 | 3478 | 82-40-135-121.cable.ubr02.pert.blueyonder.co.uk. |
66.36.79.250 | 2425 | unable to resolve |
216.130.32.5 | 2324 | unable to resolve |
210.44.196.73 | 1961 | unable to resolve |
221.223.253.222 | 1840 | unable to resolve |
80.5.90.174 | 1822 | unable to resolve |
66.36.73.146 | 1356 | unable to resolve |
66.36.77.211 | 1271 | unable to resolve |
67.49.19.48 | 1204 | cpe-67-49-19-48.socal.res.rr.com. |
203.181.87.120 | 1160 | p203-181-87-120.sub.ne.jp. |
65.19.54.107 | 834 | qwest107-dsl10.cybermesa.com. |
217.153.147.50 | 424 | unable to resolve |
195.56.138.164 | 369 | budaors-37.dialin.datanet.hu. |
216.136.138.173 | 323 | unable to resolve |
66.47.249.148 | 246 | user-112vuck.biz.mindspring.com. |
216.136.138.172 | 238 | unable to resolve |
68.72.142.239 | 204 | adsl-68-72-142-239.dsl.chcgil.ameritech.net. |
24.135.112.14 | 201 | unable to resolve |
65.19.54.107 | 200 | qwest107-dsl10.cybermesa.com. |
66.47.246.111 | 184 | user-112vtjf.biz.mindspring.com. |
66.43.176.10 | 149 | uslec-66-43-176-10.cust.uslec.net. |
61.213.92.202 | 119 | j092202.ppp.asahi-net.or.jp. |
202.169.155.188 | 116 | pc-202-169-155-188.cable.kumin.ne.jp. |
222.226.47.251 | 107 | KHP222226047251.ppp-bb.dion.ne.jp. |
221.221.221.218 | 106 | unable to resolve |
66.44.196.69 | 93 | ETCDSL-196-69.ellijay.com. |
210.32.6.108 | 74 | unable to resolve |
210.32.6.108 | 74 | unable to resolve |
Compare this plot to the plot at the very top of this page. You will
notice that very few crawling results from the last year were used to
generate spam. The amount of spam, however, exploded within the last year.
It seems that certain old crawling results are used to generate most of the
spam. Because of this, there is little chance of reducing the spam in the
near future just by taking a mail address off the web.
Interpretation: The databases of a few crawlers are used to generate most of the spam.
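One way to put a number on this concentration is to sum the mails per crawler IP and compare the largest contributors with the overall total. The sketch below uses the first 14 rows of the crawler IP table and assumes that the per-IP counts are part of the 40802 mails in the "total" table, which the page does not state explicitly. The helper names `mails_per_ip` and `top_share` are hypothetical.

```python
from collections import Counter

TOTAL_MAILS = 40802  # "total" row of the first table (assumed to be the base)

# First 14 rows of the crawler IP table; one IP appears twice.
rows = [
    ("217.153.147.50", 5189),
    ("207.242.44.207", 4298),
    ("82.40.135.121", 3478),
    ("66.36.79.250", 2425),
    ("216.130.32.5", 2324),
    ("210.44.196.73", 1961),
    ("221.223.253.222", 1840),
    ("80.5.90.174", 1822),
    ("66.36.73.146", 1356),
    ("66.36.77.211", 1271),
    ("67.49.19.48", 1204),
    ("203.181.87.120", 1160),
    ("65.19.54.107", 834),
    ("217.153.147.50", 424),
]


def mails_per_ip(rows):
    """Sum the mail counts of rows that share the same IP."""
    counts = Counter()
    for ip, mails in rows:
        counts[ip] += mails
    return counts


def top_share(rows, k, total):
    """Fraction of `total` attributed to the k largest IPs."""
    top = mails_per_ip(rows).most_common(k)
    return sum(mails for _, mails in top) / total


print(f"top 5 IPs: {100 * top_share(rows, 5, TOTAL_MAILS):.1f}% of all mails")
```

If the assumption about the base holds, five of the 114 crawler IPs account for well over a third of all received mails, which is consistent with the interpretation above.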