
Blocking spider crawling with Apache, IIS6, and IIS7 rules

  • Source: 大宇网络
  • Author: 大宇云
  • Time: 2016-9-9 7:57:11

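If the visits come from legitimate search engine spiders, blocking them is not recommended: the site would lose its indexing and ranking in Baidu and other search engines, which in turn costs customers. Consider first upgrading the virtual hosting plan to get a larger traffic quota, or upgrading to a cloud server (unmetered traffic).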



On Linux, the rules go in .htaccess (create the .htaccess file manually in the site root directory).

<IfModule mod_rewrite.c>
RewriteEngine On
# Block spiders: match the User-Agent against the names below, case-insensitively ([NC])
RewriteCond %{HTTP_USER_AGENT} "Webdup|AcoonBot|AhrefsBot|Ezooms|EdisterBot|EC2LinkFinder|jikespider|Purebot|MJ12bot|WangIDSpider|WBSearchBot|Wotbox|xbfMozilla|Yottaa|YandexBot|Jorgee|SWEBot|spbot|TurnitinBot-Agent|mail.RU|curl|perl|Python|Wget|Xenu|ZmEu" [NC]
# Answer 403 Forbidden ([F]) for every URL except robots.txt
RewriteRule !(^robots\.txt$) - [F]
</IfModule>
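
To see which requests the rule above rejects, its matching logic can be mirrored in a few lines of Python. This is only a sketch under stated assumptions: `is_forbidden` is a hypothetical helper (not part of Apache), the condition is treated as an unanchored, case-insensitive regex search as with `[NC]`, and the path is compared without its leading slash, as in `.htaccess` context.

```python
import re

# Same alternation as the RewriteCond above; re.IGNORECASE mirrors the [NC] flag.
BLOCKED_UA = re.compile(
    "Webdup|AcoonBot|AhrefsBot|Ezooms|EdisterBot|EC2LinkFinder|jikespider|"
    "Purebot|MJ12bot|WangIDSpider|WBSearchBot|Wotbox|xbfMozilla|Yottaa|"
    "YandexBot|Jorgee|SWEBot|spbot|TurnitinBot-Agent|mail.RU|curl|perl|"
    "Python|Wget|Xenu|ZmEu",
    re.IGNORECASE,
)


def is_forbidden(path, user_agent):
    """Hypothetical helper: True when the rule would answer 403 Forbidden."""
    # Assumption: in .htaccess context the leading slash is not part of the match.
    relative = path[1:] if path.startswith("/") else path
    if relative == "robots.txt":
        return False  # robots.txt stays readable for every client
    # search() finds the name anywhere in the header, like an unanchored CondPattern.
    return BLOCKED_UA.search(user_agent) is not None


if __name__ == "__main__":
    bot = "Mozilla/5.0 (compatible; AhrefsBot/7.0; +http://ahrefs.com/robot/)"
    print(is_forbidden("/index.html", bot))
    print(is_forbidden("/robots.txt", bot))
```

Because the match is a substring search, short tokens such as `perl` or `curl` also hit any User-Agent that merely contains those letters, so it is worth running real User-Agent strings from the access log through a check like this before deploying the rule.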

On Windows 2003, the rules go in httpd.conf (in the virtual host control panel, use "ISAPI Filter Custom Settings" to enable custom URL rewriting with ISAPI_Rewrite 3.1).

 

# Block spiders: match the User-Agent against the names below, case-insensitively ([NC])
RewriteCond %{HTTP_USER_AGENT} (Webdup|AcoonBot|AhrefsBot|Ezooms|EdisterBot|EC2LinkFinder|jikespider|Purebot|MJ12bot|WangIDSpider|WBSearchBot|Wotbox|xbfMozilla|Yottaa|YandexBot|Jorgee|SWEBot|spbot|TurnitinBot-Agent|mail.RU|curl|perl|Python|Wget|Xenu|ZmEu) [NC]
# Answer 403 Forbidden ([F]) for every URL except /robots.txt
RewriteRule !(^/robots\.txt$) - [F]

On Windows 2008, the rules go in web.config.

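<!-- Place inside system.webServer/rewrite/rules in web.config; requires the IIS URL Rewrite module -->
<rule name="Block spider">
      <!-- negate="true": the rule applies to every URL except robots.txt -->
      <match url="(^robots\.txt$)" ignoreCase="false" negate="true" />
      <conditions>
        <add input="{HTTP_USER_AGENT}" pattern="Webdup|AcoonBot|AhrefsBot|Ezooms|EdisterBot|EC2LinkFinder|jikespider|Purebot|MJ12bot" ignoreCase="true" />
      </conditions>
      <!-- Answer matching requests with 403 Forbidden -->
      <action type="CustomResponse" statusCode="403" statusReason="Forbidden" statusDescription="Forbidden" />
</rule>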



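Note: by default the rules block a number of unidentified spiders. To block other spiders, add their names to the pattern in the same way.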
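
Adding a spider means appending its name to the `|`-separated alternation used in each rule. A minimal Python sketch of that step follows. `add_spiders` and `ExampleBot` are hypothetical names used only for illustration, and the names are inserted verbatim, so any regex metacharacters in them keep their regex meaning.

```python
def add_spiders(pattern, new_names):
    """Hypothetical helper: append names to a '|'-separated User-Agent pattern.

    Names already present are skipped; the comparison ignores case because
    the rules themselves match case-insensitively.
    """
    names = pattern.split("|")
    seen = {name.lower() for name in names}
    for name in new_names:
        if name.lower() not in seen:
            names.append(name)
            seen.add(name.lower())
    return "|".join(names)


if __name__ == "__main__":
    # "ExampleBot" is a made-up name; "acoonbot" is skipped as a duplicate.
    print(add_spiders("Webdup|AcoonBot", ["ExampleBot", "acoonbot"]))
```

The resulting string can then be pasted into the User-Agent pattern of the .htaccess, httpd.conf, or web.config rule, which keeps the list identical across the three servers.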
Appendix: names of the major spiders:
Google spider: googlebot
Baidu spider: baiduspider
Yahoo spider: slurp
Alexa spider: ia_archiver
MSN spider: msnbot
Bing spider: bingbot
AltaVista spider: scooter
Lycos spider: lycos_spider_(t-rex)
AllTheWeb spider: fast-webcrawler
Inktomi spider: slurp
Youdao spider: YodaoBot and OutfoxBot
Retu (热土) spider: Adminrtspider
Sogou spider: sogou spider
SOSO spider: sosospider
360 Search spider: 360spider
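
Since blocking legitimate search engine spiders is discouraged, a block pattern can be checked against the names listed above before it is deployed. The following Python sketch does that. `spiders_hit_by` is a hypothetical helper, and matching against the bare names rather than full User-Agent strings is a simplification.

```python
import re

# Names taken from the list above.
MAJOR_SPIDERS = [
    "googlebot", "baiduspider", "slurp", "ia_archiver", "msnbot", "bingbot",
    "scooter", "lycos_spider_(t-rex)", "fast-webcrawler", "YodaoBot",
    "OutfoxBot", "Adminrtspider", "sogou spider", "sosospider", "360spider",
]


def spiders_hit_by(pattern):
    """Hypothetical helper: list the major spiders a block pattern would catch."""
    # Case-insensitive, unanchored search, like the rules in this article.
    rx = re.compile(pattern, re.IGNORECASE)
    return [name for name in MAJOR_SPIDERS if rx.search(name)]


if __name__ == "__main__":
    # A pattern as broad as "spider" would lock out several search engines.
    print(spiders_hit_by("spider"))
    print(spiders_hit_by("AhrefsBot|MJ12bot|YandexBot"))
```

An empty result means the pattern does not touch any spider on the list. A non-empty result is a signal to narrow the pattern before putting it into the rules.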


